Priority: Critical - P2
Affects Version/s: 3.0.0
Fix Version/s: 3.0.4
Steps To Reproduce:
The program I wrote to expose the bug is at: https://github.com/jeffhj11/MongoBug.git
1. Set up a cluster with three shards, each with WiredTiger as the storage engine.
2. Run the program to insert records into an unsharded collection. Wait for it to complete.
3. Shard the collection on the uuid field
4. Insert one record and wait for the balancer to start.
5. Run the program again while the balancer is running. On most occasions, we lost a few records.
Sprint: Sharding 5 06/26/16
There appears to be a bug where data can go missing in MongoDB. The problem seems to occur when doing a large number of concurrent inserts into a sharded cluster while the balancer is running. In all instances where we lost data, WiredTiger was the storage engine and the shard key was effectively a random UUID, so inserts were going to all shards.
The test description and program show the error with a Java application against a MongoDB cluster, but we also hit the problem (less frequently) with a Python script.
The bug seems to be some sort of concurrency/race condition problem. It is not guaranteed to happen on any one run, but we were able to replicate it fairly consistently. The number of missing documents ranged from 1 to 1,000 over 400,000-800,000 inserts.
Also note that we counted documents in two ways to determine that we had lost documents. The first was by running an aggregate to count the documents, in an effort to avoid the way count() behaves while the balancer is running. We also waited for the balancer to finish and then ran both a count() and an aggregate to confirm that documents were missing.
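For illustration, the two counts looked roughly like the following, using the 3.0 Java driver. This is a minimal sketch, not the code from our test program; the host, database, and collection names are placeholders.
{code:java}
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.Arrays;

public class CountCheck {
    public static void main(String[] args) {
        // Connect through the mongos (host and port are placeholders).
        MongoClient client = new MongoClient("mongos-host", 27017);
        MongoCollection<Document> coll =
                client.getDatabase("test").getCollection("documents");

        // count() can be inaccurate on a sharded cluster while the balancer
        // is migrating chunks, which is why we also counted via an aggregate.
        long metadataCount = coll.count();

        // Counting through an aggregate scans the documents themselves.
        Document result = coll.aggregate(Arrays.asList(
                new Document("$group",
                        new Document("_id", null)
                                .append("n", new Document("$sum", 1)))))
                .first();
        long aggregateCount =
                result == null ? 0L : ((Number) result.get("n")).longValue();

        System.out.printf("count(): %d, aggregate: %d%n",
                metadataCount, aggregateCount);
        client.close();
    }
}
{code}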
System Configuration & Setup
We tested with the following versions:
MongoDB Versions: 3.0.1, 3.0.2, 3.0.3. For any given test, the instances were all running the same version of MongoDB.
OS Version: CentOS 6.6 for 3.0.1 and 3.0.2, Mac OS X for 3.0.3.
Write Concern: Acknowledged, Majority. On a few tests we had journaling enabled as well.
Java version: 8u25
Java Driver: 3.0.0, 2.12.2
We had the error occur in two different configurations: one with multiple servers and one mongod on each, and one with all of the mongo processes on a single server.
Please note that the servers we were running on were virtual and did not have particularly high IOPS.
3 shards, each shard was a replica set with only one mongod instance, run with options:
1 mongos, run with options:
3 shards, each shard was a replica set with two mongod instances and one arbiter, run with options:
The replica sets were configured and primaries were elected prior to running the tests.
The program can be pulled from GitHub at https://github.com/jeffhj11/MongoBug.git
The Java application simply starts up a number of threads and each thread inserts a number of documents. There are options in the application to specify a message to more easily determine what data went missing.
The document structure is similar to the one shown in the sketch below.
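The following is a minimal sketch of what such a program looks like, assuming each document carries a random uuid string (the eventual shard key) and the message used to tag a run. The host, database, and collection names, the thread count, and the per-thread document count are placeholders; the authoritative code is in the repository linked above.
{code:java}
import com.mongodb.MongoClient;
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.UUID;

public class InsertWorker implements Runnable {
    private final MongoCollection<Document> coll;
    private final int docsToInsert;
    private final String message;

    InsertWorker(MongoCollection<Document> coll, int docsToInsert, String message) {
        this.coll = coll;
        this.docsToInsert = docsToInsert;
        this.message = message;
    }

    @Override
    public void run() {
        for (int i = 0; i < docsToInsert; i++) {
            // Each document carries a random UUID (the eventual shard key)
            // and a message identifying the test run.
            coll.insertOne(new Document("uuid", UUID.randomUUID().toString())
                    .append("message", message));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MongoClient client = new MongoClient("mongos-host", 27017);
        MongoCollection<Document> coll = client.getDatabase("test")
                .getCollection("documents")
                .withWriteConcern(WriteConcern.ACKNOWLEDGED);

        // e.g. 20 threads x 20,000 documents = 400,000 inserts
        Thread[] threads = new Thread[20];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(new InsertWorker(coll, 20_000, "run-1"));
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        client.close();
    }
}
{code}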
To run a test, we would run the Java application to insert 400,000 documents into an unsharded collection. After that completed, we would log into the mongos and shard the collection on the uuid field.
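We have not reproduced the exact shell commands here; in essence they enabled sharding on the database and sharded the collection on the uuid key. A rough equivalent issued through the same Java driver used elsewhere in this report (database and collection names are placeholders):
{code:java}
import com.mongodb.MongoClient;
import org.bson.Document;

public class ShardCollection {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("mongos-host", 27017);

        // Admin commands issued against the mongos; equivalent to
        // sh.enableSharding(...) and sh.shardCollection(...) in the shell.
        client.getDatabase("admin").runCommand(
                new Document("enableSharding", "test"));
        client.getDatabase("admin").runCommand(
                new Document("shardCollection", "test.documents")
                        .append("key", new Document("uuid", 1)));

        client.close();
    }
}
{code}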
When that was completed, we would manually insert one document to start the balancer. We would watch the logs to ensure that the balancer was working. Once the balancer started, we would run our application again, so that multiple threads in the application were inserting documents into Mongo while the balancer was running.
Common Themes and What We Noticed
The following seemed to be the common factors whenever MongoDB lost data. Please keep in mind that these are not necessarily the actual causes; they are simply our observations and speculation.
• We only lost data when WiredTiger was the storage engine, never with MMAP.
• We only lost data when the shard key was effectively a random UUID, whether the collection was sharded on the ‘uuid’ field or on the ‘_id’ field with ‘_id’ set to our UUID.
• We only lost data when the balancer was running and we were inserting data at that time. The only data lost was data being inserted while the balancer was running; the data that was present before the balancer started was always there at the end of the test.
• On tests where we did something to slow down the inserts, the likelihood of problems decreased.
• On one test we did a find() immediately after each insert and matched every field to ensure the exact document had been written (a sketch of that check appears after this list). That test still lost data: a record went missing even though our application had run a find() to retrieve it right after the insert and that find() was successful.
• We noticed the problem with both Python and Java, but it occurred much more often with our Java program. We are also better Java programmers, and our Java program runs much faster than our Python program.
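For the insert-then-verify test mentioned above, the per-document check looked roughly like the following. This is a sketch rather than the exact code from the repository; field names follow the structure described earlier, and the host, database, and collection names are placeholders.
{code:java}
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

import java.util.UUID;

public class InsertAndVerify {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("mongos-host", 27017);
        MongoCollection<Document> coll =
                client.getDatabase("test").getCollection("documents");

        String uuid = UUID.randomUUID().toString();
        coll.insertOne(new Document("uuid", uuid).append("message", "run-1"));

        // Read the document straight back and compare every field.
        // In our test this find() succeeded, yet the document was still
        // missing after the balancer finished.
        Document found = coll.find(Filters.eq("uuid", uuid)).first();
        if (found == null
                || !uuid.equals(found.getString("uuid"))
                || !"run-1".equals(found.getString("message"))) {
            System.err.println("Document not found or fields do not match: " + uuid);
        }

        client.close();
    }
}
{code}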