[SERVER-18211] MongoDB fails to correctly roll back collection creation Created: 26/Apr/15  Updated: 13/May/16  Resolved: 29/Apr/15

Status: Closed
Project: Core Server
Component/s: Replication, Storage
Affects Version/s: 2.4.14, 2.6.9, 3.0.2, 3.1.2
Fix Version/s: 2.6.10, 3.0.3, 3.1.3

Type: Bug Priority: Critical - P2
Reporter: Ronan Bohan Assignee: Matt Dannenberg
Resolution: Done Votes: 0
Labels: ET
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File rollback-repro.sh    
Issue Links:
Duplicate
is duplicated by SERVER-20597 Save data that is rolled back to the ... Closed
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Steps To Reproduce:

Run the attached rollback-repro.sh script (which uses 'mlaunch' to start and manage a 3 node PSA replica set).

It runs through a test that forces a rollback. With MongoDB 3.0.x or 3.1.x, the 'rsBackgroundSync' thread fails to find the data and does not write it to disk.

Earlier versions of MongoDB, e.g. 2.6.9, correctly handle this case.

Sprint: RPL 3 05/15/15
Participants:

 Description   

If MongoDB rolls back an explicit collection creation, it will not record the dropped data to disk.

This issue was made more acute in MongoDB 3.0, when all implicit collection creation was changed to explicitly create an oplog entry. Thus, downstream replicating nodes now create all collections explicitly.



 Comments   
Comment by Eric Milkie [ 30/Apr/15 ]

Hi David,
In all versions prior to version 3.0, only explicit collection creation (via the create command) is affected by this issue. In version 3.0, there was a change made that caused all collection creation, both implicit and explicit, to be affected by this issue.
-Eric

Comment by David Murphy [ 30/Apr/15 ]

Can anyone tell me whether 2.4 is also affected, or was this introduced in 2.6+?

Thanks
David


Comment by Githook User [ 30/Apr/15 ]

Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com>

Message: SERVER-18211 write to disk all documents in a collection when rolling back createCollection
Branch: v2.6
https://github.com/mongodb/mongo/commit/29271e11331d0cefca2330352c496da3c06c095b

Comment by Githook User [ 29/Apr/15 ]

Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com>

Message: SERVER-18211 write to disk all documents in a collection when rolling back createCollection

(cherry picked from commit aa54b581e9afaf7444846a35bbd1adc8262d1330)
Branch: v3.0
https://github.com/mongodb/mongo/commit/998f82764ec8c52a1c478623c0f1e2e61b692ab5

Comment by Githook User [ 29/Apr/15 ]

Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com>

Message: SERVER-18211 do not log a message when rolling back an insert for a dropped collection

(cherry picked from commit 91a530bf170231a0f3024e130e1c798a544f5375)

Conflicts:
src/mongo/db/repl/rs_rollback.cpp
Branch: v3.0
https://github.com/mongodb/mongo/commit/ab323b2f6607f265352b68c7efea971d54007207

Comment by Githook User [ 29/Apr/15 ]

Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com>

Message: SERVER-18211 write to disk all documents in a collection when rolling back createCollection
Branch: master
https://github.com/mongodb/mongo/commit/aa54b581e9afaf7444846a35bbd1adc8262d1330

Comment by Githook User [ 29/Apr/15 ]

Author: matt dannenberg (dannenberg) <matt.dannenberg@10gen.com>

Message: SERVER-18211 do not log a message when rolling back an insert for a dropped collection
Branch: master
https://github.com/mongodb/mongo/commit/91a530bf170231a0f3024e130e1c798a544f5375

Comment by Eric Milkie [ 28/Apr/15 ]

Indeed, this problem exists even in 2.6 (not logging collection data before rolling back a create-collection metadata operation), but it is much more noticeable in 3.0+ because, as Matt mentioned, we are now logging all collection creation, both implicit and explicit. In 2.6, only explicitly created collections are vulnerable to this issue.

Comment by David Murphy [ 28/Apr/15 ]

Thanks Matt, it sounds like the second part is the one we might want to patch manually in 3.0.x until MongoDB officially pushes a fix to 3.1/3.2. If you could update the ticket when you have a patch for that, it would be great. Right now the permanent loss of the data sounds pretty bad to us.

Comment by Matt Dannenberg [ 28/Apr/15 ]

There are two problems here: logging a misleading error message and not recording the documents contained in a collection that is dropped during rollback.

This behavior began when we started logging implicit createCollection calls (i.e., when we do the first insert on a collection, a createCollection oplog entry is created). During rollback, the createCollection is rolled back first, dropping the collection. As a result, when we attempt to roll back the insert, the document cannot be found in the collection (as the collection no longer exists) and we log an error. We should not log this error if the collection does not exist, as the document has already been dropped along with it.

The other problem is that we do not log the contents of a collection that is dropped during rollback. This can lead to data being permanently lost. We should start dumping the contents of collections we drop during rollback. I will use this ticket to track both of these problems and their solutions, and will update the ticket to reflect this.
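
To make the ordering described above concrete, here is a minimal in-memory sketch. It is a toy model, not MongoDB's rollback code: the `roll_back` function, the `fixed` flag, and the tuple layout of each operation are all hypothetical names chosen for illustration. With `fixed=False` it mimics the reported behavior (the collection is dropped first, the insert lookup then fails, and nothing is saved). With `fixed=True` it mimics the two proposed changes (dump the collection before dropping it, and stay silent when the collection is already gone).

```python
from typing import Dict, List, Tuple

# Toy in-memory model of the ordering described above. This is NOT
# MongoDB's rollback code; all names here are hypothetical.
Op = Tuple[str, str, dict]  # (kind, namespace, document)


def roll_back(
    collections: Dict[str, List[dict]],
    ops: List[Op],
    fixed: bool,
) -> Tuple[Dict[str, List[dict]], List[str]]:
    """Undo ops; return (documents saved for recovery, logged errors)."""
    saved: Dict[str, List[dict]] = {}
    errors: List[str] = []

    # Step 1: undo createCollection entries by dropping the collection.
    for kind, ns, _ in ops:
        if kind == "create" and ns in collections:
            docs = collections.pop(ns)
            if fixed:
                # Proposed behavior: dump the contents before they vanish.
                saved.setdefault(ns, []).extend(docs)

    # Step 2: undo inserts by removing each document by _id.
    for kind, ns, doc in ops:
        if kind != "insert":
            continue
        if ns not in collections:
            if not fixed:
                # Reported behavior: the lookup fails and an error is logged.
                errors.append(f"rollback cannot find object by id in {ns}")
            continue
        for existing in collections[ns]:
            if existing["_id"] == doc["_id"]:
                collections[ns].remove(existing)
                saved.setdefault(ns, []).append(existing)
                break

    return saved, errors


ops = [("create", "test.test", {}), ("insert", "test.test", {"_id": 1})]

before = roll_back({"test.test": [{"_id": 1}]}, ops, fixed=False)
after = roll_back({"test.test": [{"_id": 1}]}, ops, fixed=True)
print(before)
print(after)
```

In the `before` case the saved-documents map is empty and one error is recorded; in the `after` case the document is preserved and no error is recorded.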

Comment by David Murphy [ 28/Apr/15 ]

Ronan,

This is pretty critical for us. Can you share the commit link and hash so we can look at the patch on GitHub and figure out whether we can apply it to 3.0.4+ for our needs? The loss of rollback data is logical corruption and really a sev-0 type issue, as we won't know how to fix the data set without this data.

Thanks
David

Comment by Ronan Bohan [ 26/Apr/15 ]

I should also point out - I have tried this with MongoDB 3.0.2 using both MMAPv1 and WiredTiger storage engines - both show the same behavior.

Comment by Ronan Bohan [ 26/Apr/15 ]

Running the script against different versions of MongoDB results in the following entries in the log file (for Node0).

MongoDB 2.6.9

2015-04-26T21:09:10.896+0100 [rsBackgroundSync] replSet ROLLBACK
2015-04-26T21:09:10.921+0100 [rsBackgroundSync] replSet rollback 4.7
2015-04-26T21:09:10.922+0100 [rsBackgroundSync] replSet rollback 5 d:2 u:0

MongoDB 3.0.2

2015-04-26T21:10:32.337+0100 I REPL     [ReplicationExecutor] transition to ROLLBACK
2015-04-26T21:10:32.349+0100 I REPL     [rsBackgroundSync] rollback 4.7
2015-04-26T21:10:32.349+0100 E REPL     [rsBackgroundSync] rollback cannot find object by id

MongoDB 3.1.1

2015-04-26T21:15:46.137+0100 I REPL     [ReplicationExecutor] transition to ROLLBACK
2015-04-26T21:15:46.140+0100 I REPL     [rsBackgroundSync] rollback 4.7
2015-04-26T21:15:46.140+0100 E REPL     [rsBackgroundSync] rollback cannot find object: { _id: ObjectId('553d474d9ee692483172ef42') } in namespace test.test
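
The 3.0.x and 3.1.x excerpts above share one line layout (timestamp, severity letter, component, bracketed thread name, message), so the failure can be spotted mechanically. The following is a rough sketch only: `parse_line` and `rollback_errors` are hypothetical helper names, the regular expression is inferred from the lines quoted in this ticket, and it deliberately does not match the older 2.6 layout, which has no severity or component column.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class LogLine(NamedTuple):
    timestamp: str
    severity: str
    component: str
    context: str
    message: str


# Layout inferred from the 3.0.2 / 3.1.1 lines quoted in this ticket.
LINE_RE = re.compile(r"^(\S+)\s+([A-Z])\s+(\S+)\s+\[([^\]]+)\]\s+(.*)$")


def parse_line(line: str) -> Optional[LogLine]:
    """Split one log line into fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return LogLine(*match.groups()) if match else None


def rollback_errors(lines: Iterable[str]) -> List[LogLine]:
    """Return the error-level REPL lines that report a failed rollback lookup."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if (
            parsed is not None
            and parsed.severity == "E"
            and parsed.component == "REPL"
            and parsed.message.startswith("rollback cannot find object")
        ):
            found.append(parsed)
    return found


sample = [
    "2015-04-26T21:10:32.337+0100 I REPL     [ReplicationExecutor] transition to ROLLBACK",
    "2015-04-26T21:10:32.349+0100 I REPL     [rsBackgroundSync] rollback 4.7",
    "2015-04-26T21:10:32.349+0100 E REPL     [rsBackgroundSync] rollback cannot find object by id",
]
errors = rollback_errors(sample)
print(len(errors), errors[0].context)
```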

Rollback data
The failure in MongoDB 3.0.x and 3.1.x prevents the rollback process from continuing and, therefore, no rollback data is written to disk. In contrast, the rollback directory for MongoDB 2.6.9 looks as follows:

$ ls .../rollback
test.test.2015-04-26T20-09-10.0.bson
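
The same check can be scripted when comparing several versions. This is a minimal sketch, assuming only that rollback data appears as `.bson` files directly inside the rollback directory, as in the listing above. The function name `list_rollback_files` is hypothetical, and the directory is passed in by the caller because its full path is not given in this ticket.

```python
from pathlib import Path
from typing import List, Union


def list_rollback_files(rollback_dir: Union[str, Path]) -> List[str]:
    """Return the sorted names of .bson files in rollback_dir.

    An empty list means no rollback data was written (or the
    directory does not exist at all).
    """
    directory = Path(rollback_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.bson") if p.is_file())
```

An empty result after a forced rollback would correspond to the 3.0.x and 3.1.x behavior reported here; a non-empty one corresponds to the 2.6.9 listing.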

Other versions
I've tested this against other versions of MongoDB prior to 2.6.9 and they all work as expected, so the problem appears to be a regression in 3.0.
