[SERVER-27127] drop() regression heisenbug Created: 19/Nov/16  Updated: 22/Nov/16  Resolved: 21/Nov/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.4.0-rc1, 3.4.0-rc3, 3.4.0-rc4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Golden Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File server27127.js     File server27127.cpp     Text File server27127.log    
Operating System: ALL
Steps To Reproduce:

Install libmongoc 1.5.0-rc6, check out the branch from the GitHub link in the description, build the mongocxx driver, and run the tests repeatedly.

Sprint: Storage 2016-12-12
Participants:

 Description   

While developing mongocxx, I stumbled across a bug where a collection drop followed immediately by a count on that collection with an empty document filter doesn't always return zero. This occurs in fixture code, so the collection is repeatedly filled and dropped between tests.

The problem was seen on some (but not all) test runs on 3.4-rc1, 3.4-rc3, and 3.4-rc4. (3.4-rc2 was not tested.) The problem did not occur on 3.2.9 or 3.2.11.

Replacing the drop with a multi-document delete does not exhibit the problem on any of the 3.4 RCs.

I don't have time to do a more complete investigation, but I wanted to file a ticket so that it wasn't lost and others could explore it.

The failing test code can be seen here: https://github.com/xdg/mongo-cxx-driver/blob/CXX-894-server-bug-demo/src/mongocxx/test/collection.cpp#L98-L103
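
For reference, the failing fixture reduces to roughly the following shape (a minimal sketch against the mongocxx 3.x API, not the linked test verbatim; database and collection names are placeholders):

    #include <bsoncxx/builder/stream/document.hpp>
    #include <mongocxx/client.hpp>
    #include <mongocxx/instance.hpp>
    #include <mongocxx/uri.hpp>

    int main() {
        using bsoncxx::builder::stream::document;
        using bsoncxx::builder::stream::finalize;

        mongocxx::instance inst{};
        mongocxx::client conn{mongocxx::uri{}};  // mongodb://localhost:27017
        auto coll = conn["test"]["server27127"];

        // Fixture shape: drop the collection, then verify it is empty with
        // an empty-filter count before the next test refills it.
        coll.drop();
        auto n = coll.count(document{} << finalize);

        // Expected 0; on 3.4.0-rc1/rc3/rc4 this intermittently reports the
        // documents inserted by a previous unacknowledged fill.
        return n == 0 ? 0 : 1;
    }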

Tests were run on my MacBook.



 Comments   
Comment by Eric Milkie [ 22/Nov/16 ]

I think the crux of the problem is defining "writes arriving prior to time X", since "unacknowledged write concern" is not something one can use with strict ordering guarantees. When do you define a write as having arrived?
If you use "acknowledged write concern", I don't believe there is any issue between writes and drops.
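
For example, the fixture's fill could request acknowledgement explicitly; a minimal sketch against the mongocxx 3.x API (the helper name is just for illustration):

    #include <mongocxx/options/insert.hpp>
    #include <mongocxx/write_concern.hpp>

    // Sketch: insert options requesting an acknowledged write, so the fill
    // is fully applied on the server before a later drop can be processed.
    mongocxx::options::insert acknowledged_insert_opts() {
        mongocxx::write_concern wc;
        wc.acknowledge_level(mongocxx::write_concern::level::k_acknowledged);
        mongocxx::options::insert opts;
        opts.write_concern(wc);
        return opts;
    }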

Comment by David Golden [ 21/Nov/16 ]

FWIW, I agree the 3.4 behavior is technically "correct" (sequential consistency) but may be surprising to customers who still tend to think transactionally.

Given historical scrutiny of our consistency guarantees, the bigger question is what model we want to provide for collection drops. Do we think customers want a linearizable drop? That is, "At time X, I dropped the collection, so I know that no writes arriving prior to time X are in the collection."

Under the current 3.4 behavior, how could someone achieve an equivalent? Spin on drop until a count with linearizable read concern is zero? Should we document that in the release notes?
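
To make that concrete, the workaround I'm imagining would be something like the sketch below (against the mongocxx 3.x API; the function name is illustrative, and linearizable read concern requires a replica set, so this wouldn't apply to a standalone as-is):

    #include <bsoncxx/builder/stream/document.hpp>
    #include <mongocxx/collection.hpp>
    #include <mongocxx/read_concern.hpp>

    // Sketch of "spin on drop until a linearizable count reads zero".
    void drop_and_wait_until_empty(mongocxx::collection coll) {
        using bsoncxx::builder::stream::document;
        using bsoncxx::builder::stream::finalize;

        mongocxx::read_concern rc;
        rc.acknowledge_level(mongocxx::read_concern::level::k_linearizable);
        coll.read_concern(rc);  // applies to this copy of the collection

        do {
            coll.drop();
        } while (coll.count(document{} << finalize) != 0);
    }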

I'm happy either way we choose, but I'd like to make sure that if we're changing behavior relative to expectations, we're doing so intentionally, not coincidentally.

Comment by Eric Milkie [ 21/Nov/16 ]

I believe the 3.4 behavior is still okay. Thanks for looking into the code, rassi.

Comment by J Rassi [ 21/Nov/16 ]

In the C++ reproduction case, the unacknowledged inserts on line 50 (performed by server thread N) are racing with the collection drop on line 27 (performed by server thread N+1). When the drop wins the race, the "Count after drop was 2" message is printed. If the inserts win the race, the "Count after drop was 0" message is printed.

Though the reproduction case contains a client-side race, I can confirm that the server version does reliably affect which thread wins this race. I've diagnosed the difference in 3.2 versus 3.4 server behavior to the following:

  • In 3.2, when an insert to a non-existent collection is attempted, the inserting thread both creates the collection and performs the insert under a single database X lock.
  • In 3.4, when an insert to a non-existent collection is attempted, the inserting thread creates the collection under a database X lock, and then drops the lock and performs the insert under a database IX lock (if the collection no longer exists when the IX lock is acquired, the process is repeated).

The server changes were made under SERVER-23128, and I can confirm that the repro fails on d819ac6, and passes on that commit's parent.

When I reproduce locally on a build not incorporating the fix for SERVER-23128 (3.3.4 and below), the inserts often win the race, and the events are sequenced as written in the test.

When I reproduce locally on a build incorporating the fix for SERVER-23128 (3.3.5 and above), the drop is sequenced right after the insert drops its X lock from the collection creation. The insert operation then creates the collection a second time and performs the insert after the drop, which explains the difference in the collection count. I've uploaded a log level 1 server log file (server27127.log) demonstrating the latter behavior; it shows the first collection creation attempt at 14:21:40.827 and the second collection creation attempt (post-drop) at 14:21:40.854, both for the same unacknowledged insert operation.

See also the attached shell repro (server27127.js), which is similar (but not identical) to the C++ driver repro. The shell repro also exhibits different behavior on 3.3.4 and below versus 3.3.5 and above.

milkie/redbeard0531: could one of you take a look at David's C++ repro case and say whether we're okay with shipping the 3.4 behavior? I personally don't see any correctness issue with dropping locks in an insert between implicit collection creation and the actual insert, but would like one of you to speak on behalf of the server team here.

Comment by David Golden [ 19/Nov/16 ]

I have a repro I'll attach – it occurs after an unordered bulk insert with write concern set to unacknowledged. The repro also demonstrates that deleting all documents can occasionally fail in the same way, though apparently less often.

Also, note that the drop/count() occurs on a different client connection than the bulk insert (roughly as sketched after the list). I.e.:

  • connection n: drop, count (correct), bulk insert
  • connection n+1: drop, count (wrong), ...
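
The shape of the repro is roughly the following (a sketch against the mongocxx 3.x API, not the attached server27127.cpp verbatim):

    #include <vector>

    #include <bsoncxx/builder/stream/document.hpp>
    #include <mongocxx/client.hpp>
    #include <mongocxx/instance.hpp>
    #include <mongocxx/options/insert.hpp>
    #include <mongocxx/uri.hpp>
    #include <mongocxx/write_concern.hpp>

    int main() {
        using bsoncxx::builder::stream::document;
        using bsoncxx::builder::stream::finalize;

        mongocxx::instance inst{};

        // Unordered bulk insert with unacknowledged write concern, as in
        // the fixture.
        mongocxx::write_concern wc;
        wc.acknowledge_level(mongocxx::write_concern::level::k_unacknowledged);
        mongocxx::options::insert opts;
        opts.ordered(false);
        opts.write_concern(wc);

        for (int i = 0; i < 2; ++i) {  // connection n, then connection n+1
            mongocxx::client conn{mongocxx::uri{}};
            auto coll = conn["test"]["server27127"];

            coll.drop();
            // Correct (0) on connection n; intermittently wrong on n+1,
            // when the previous connection's inserts are still in flight.
            auto count = coll.count(document{} << finalize);
            (void)count;

            std::vector<bsoncxx::document::value> docs;
            docs.push_back(document{} << "x" << 1 << finalize);
            docs.push_back(document{} << "x" << 2 << finalize);
            coll.insert_many(docs, opts);
        }
    }
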
Comment by David Golden [ 19/Nov/16 ]

Standalone, enterprise edition – just my normal "keep mongod in background for development" arrangement.

Config file:

dbpath = /Users/david/mongodb/var/db
logpath = /Users/david/mongodb/var/mongodb.log
pidfilepath = /Users/david/mongodb/localhost/mongodb.pid
logappend = false
port = 27017
ipv6=true
nohttpinterface = true

Startup options: --setParameter enableTestCommands=1 --enableMajorityReadConcern --fork

When confirming the bug, I started mongod with an empty data directory.

Comment by Andy Schwerin [ 19/Nov/16 ]

Are these tests run against a standalone, a single-node replset, a multi-node replset, or a sharded configuration?
