[SERVER-12061] Do not silently ignore read errors when syncing a replica set node
Created: 12/Dec/13  Updated: 21/Sep/17  Resolved: 17/Oct/14

Status: Closed
Project: Core Server
Component/s: Replication, Stability
Affects Version/s: 2.4.8
Fix Version/s: 2.6.6, 2.7.8

Type: Improvement
Priority: Major - P3
Reporter: Alexander Komyagin
Assignee: Eric Milkie
Resolution: Done
Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-17903 When corruption detected, server cont... Closed
Related
related to SERVER-17903 When corruption detected, server cont... Closed
is related to DOCS-4315 Document that documents will be skipp... Closed
is related to SERVER-1558 Documents should write checksum on wr... Closed
Backwards Compatibility: Minor Change

 Description   

When a new, clean node is added to a replica set and the source for initial sync has corrupted data, it seems that we sync whatever we can and silently skip every record we cannot fetch.

While this "best effort" behavior makes sense, it can lead to significant data inconsistency within the replica set. We should not ignore data access errors during initial sync.
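The difference between the two behaviors can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: `makeSource`, `cloneBestEffort`, and `cloneStrict` are hypothetical names operating on an in-memory stand-in for a sync source, not the server's cloner code.

```javascript
// Hypothetical in-memory stand-in for a sync source; not MongoDB's cloner.
function makeSource(total, corruptIds) {
  const corrupt = new Set(corruptIds);
  return {
    total: total,
    // Reading a corrupt record throws, mimicking a data access error.
    read: function (id) {
      if (corrupt.has(id)) {
        throw new Error("cannot read record " + id);
      }
      return { _id: id };
    },
  };
}

// Best effort: copy what is readable and drop everything else without a trace.
function cloneBestEffort(source) {
  const copied = [];
  for (let id = 0; id < source.total; id++) {
    try {
      copied.push(source.read(id));
    } catch (err) {
      // Error swallowed: the caller never learns a record was skipped.
    }
  }
  return copied;
}

// Fail fast: the first read error aborts the whole clone.
function cloneStrict(source) {
  const copied = [];
  for (let id = 0; id < source.total; id++) {
    copied.push(source.read(id)); // a read error propagates to the caller
  }
  return copied;
}

const source = makeSource(5, [1, 3]);
console.log(cloneBestEffort(source).length); // 3: two records are silently missing
try {
  cloneStrict(source);
} catch (err) {
  console.log("clone aborted: " + err.message); // clone aborted: cannot read record 1
}
```

With the best-effort variant the copy finishes and looks healthy even though it is short by two records; with the strict variant the operator sees the failure immediately and can choose a different sync source.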

Example behavior (PRIMARY is the node with corruption in the dummy.acl namespace; SECONDARY is the newly synced secondary):

X:SECONDARY> rs.slaveOk()
X:SECONDARY> use dummy
switched to db dummy
X:SECONDARY> db.acl.count()
101
X:SECONDARY> exit
bye
AD-MAC10G:ff alexander$ mongo
MongoDB shell version: 2.4.8
connecting to: test
X:PRIMARY> use dummy
switched to db dummy
X:PRIMARY> db.acl.count()
10002
X:PRIMARY> exit
bye
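
The manual check above can be generalized to several namespaces. The following is a rough sketch, assuming the per-namespace counts have already been collected from each member (for example with `db.<collection>.count()`); `findCountMismatches` is a hypothetical helper, not a shell built-in.

```javascript
// Hypothetical helper: both arguments map a namespace to its document count,
// e.g. values gathered by running db.<collection>.count() on each member.
function findCountMismatches(primaryCounts, secondaryCounts) {
  const mismatches = [];
  for (const ns of Object.keys(primaryCounts)) {
    const expected = primaryCounts[ns];
    // A namespace absent from the secondary is treated as holding 0 documents.
    const present = Object.prototype.hasOwnProperty.call(secondaryCounts, ns);
    const actual = present ? secondaryCounts[ns] : 0;
    if (actual !== expected) {
      mismatches.push({
        ns: ns,
        primary: expected,
        secondary: actual,
        missing: expected - actual,
      });
    }
  }
  return mismatches;
}

// Counts taken from the session above.
const mismatches = findCountMismatches({ "dummy.acl": 10002 }, { "dummy.acl": 101 });
console.log(JSON.stringify(mismatches));
// [{"ns":"dummy.acl","primary":10002,"secondary":101,"missing":9901}]
```

Matching counts do not prove the members are consistent, and on a set that is taking writes the counts can differ briefly because of replication lag, so a check like this is only a first indication.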



 Comments   
Comment by Githook User [ 24/Nov/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-12061 massert on failure

(cherry picked from commit 5a74508369bfb820b18746623708688a76ae3419)
Branch: v2.6
https://github.com/mongodb/mongo/commit/a3ab21753b0dbf20cc98a6085a9c5af4e8af856d

Comment by Githook User [ 24/Nov/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-12061 fix assert code

(cherry picked from commit 525b5ef94f72c376c4d831539be3988b8ff0a0ec)
Branch: v2.6
https://github.com/mongodb/mongo/commit/bc21c23cbe79344034081bb8e4910fb0711b4ecc

Comment by Githook User [ 24/Nov/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-12061 optionally abort cloner when corruption detected from source

(cherry picked from commit 6556243ff038a3f9cdc52e6a97684203c8c9ec54)

Conflicts:
src/mongo/db/cloner.cpp
Branch: v2.6
https://github.com/mongodb/mongo/commit/a724bbb7eb6c046887c1a61516f273cef652f29c

Comment by Githook User [ 15/Oct/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-12061 massert on failure
Branch: master
https://github.com/mongodb/mongo/commit/5a74508369bfb820b18746623708688a76ae3419

Comment by Githook User [ 15/Oct/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-12061 fix assert code
Branch: master
https://github.com/mongodb/mongo/commit/525b5ef94f72c376c4d831539be3988b8ff0a0ec

Comment by Githook User [ 15/Oct/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-12061 optionally abort cloner when corruption detected from source
Branch: master
https://github.com/mongodb/mongo/commit/6556243ff038a3f9cdc52e6a97684203c8c9ec54
