[SERVER-26943] Non-replacement updates to the config.shards collection can crash the CSRS secondary after rollback Created: 07/Nov/16  Updated: 19/Nov/16  Resolved: 10/Nov/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.4.0-rc2
Fix Version/s: 3.4.0-rc4

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Esha Maharishi (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

No deterministic way. Hit through the continuous stepdown suite.

Sprint: Sharding 2016-11-21
Participants:
Linked BF Score: 0

 Description   

The config servers have a special opObserver insert hook to intercept updates from a legacy v3.2 mongos to the config.shards collection and maintain the shard identity.

This hook always expects that a complete shard document is inserted (which is correct on the primaries). However, on a secondary that is recovering from a rollback, if an update is followed by a delete, the secondary may end up applying the update after the deletion has already been applied. The update is then converted to an upsert, which produces an incomplete shard document and trips an invariant.

For example, the following sequence:

c23012| 2016-11-07T19:01:26.092+0000 D ASIO     [NetworkInterfaceASIO-RS-0] Request 286 finished with response: { cursor: { firstBatch: [ { ts: Timestamp 1478545283000|1, t: 4, h: 565510199539623323, v: 2, op: "u", ns: "config.shards", o2: { _id: "shard0001" }, o: { $set: { draining: true } } }, { ts: Timestamp 1478545285000|8, t: 4, h: -4558147567226493446, v: 2, op: "d", ns: "config.shards", o: { _id: "shard0001" } }, ok: 1.0 }
 
c23012| 2016-11-07T19:01:26.092+0000 I REPL     [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: { ts: Timestamp 1478545283000|1, t: 3 }. source's GTE: { ts: Timestamp 1478545283000|1, t: 4 } hashes: (-6821259113153738378/565510199539623323)
 
c23012| 2016-11-07T19:01:26.107+0000 D ASIO     [rsBackgroundSync] startCommand: RemoteCommand 298 -- target:ip-10-152-38-201:23013 db:local expDate:2016-11-07T19:01:31.107+0000 cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp 1478545272000|5 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 4 }
 
c23012| 2016-11-07T19:01:26.108+0000 D ASIO     [NetworkInterfaceASIO-RS-0] Request 298 finished with response: { cursor: { firstBatch: [ { ts: Timestamp 1478545283000|1, t: 4, h: 565510199539623323, v: 2, op: "u", ns: "config.shards", o2: { _id: "shard0001" }, o: { $set: { draining: true } } }, { ts: Timestamp 1478545285000|8, t: 4, h: -4558147567226493446, v: 2, op: "d", ns: "config.shards", o: { _id: "shard0001" } }, ok: 1.0 }

Results in this fatal exception:

c23012| 2016-11-07T19:01:26.109+0000 F REPL     [repl writer worker 15] writer worker caught exception: 4 Missing expected field "host" on: { ts: Timestamp 1478545283000|1, t: 4, h: 565510199539623323, v: 2, op: "u", ns: "config.shards", o2: { _id: "shard0001" }, o: { $set: { draining: true } } }
c23012| 2016-11-07T19:01:26.109+0000 I -        [repl writer worker 15] Fatal assertion 16359 NoSuchKey: Missing expected field "host" at src/mongo/db/repl/sync_tail.cpp 1054
c23012| 2016-11-07T19:01:26.109+0000 I -        [repl writer worker 15]
c23012|
c23012| ***aborting after fassert() failure
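
The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not MongoDB code: a plain dict stands in for config.shards, and `on_insert_hook`, `apply_oplog_entry`, `replay`, and the host value `example-host:27018` are hypothetical names introduced only for illustration. It assumes that applying an update to a missing document behaves as an upsert, as the description states, and that "host" is the required field, as the fassert message suggests.

```python
# Toy model, not MongoDB code: a dict keyed by _id stands in for config.shards.
REQUIRED_SHARD_FIELDS = ("_id", "host")  # assumption: "host" comes from the fassert message


class MissingField(Exception):
    pass


def on_insert_hook(doc):
    """Hypothetical stand-in for the insert hook: demands a complete shard document."""
    for field in REQUIRED_SHARD_FIELDS:
        if field not in doc:
            raise MissingField('Missing expected field "%s"' % field)


def apply_oplog_entry(collection, entry):
    """Apply one simplified oplog entry; an update on a missing document becomes an upsert."""
    if entry["op"] == "d":
        collection.pop(entry["o"]["_id"], None)
    elif entry["op"] == "u":
        key = entry["o2"]["_id"]
        if key in collection:
            collection[key].update(entry["o"]["$set"])
        else:
            # Upsert: the new document holds only _id plus the $set fields.
            doc = {"_id": key}
            doc.update(entry["o"]["$set"])
            on_insert_hook(doc)  # the hook sees an insert of an incomplete document
            collection[key] = doc


def replay(collection, entries):
    for entry in entries:
        apply_oplog_entry(collection, entry)


update = {"op": "u", "ns": "config.shards",
          "o2": {"_id": "shard0001"}, "o": {"$set": {"draining": True}}}
delete = {"op": "d", "ns": "config.shards", "o": {"_id": "shard0001"}}

shards = {"shard0001": {"_id": "shard0001", "host": "example-host:27018"}}
replay(shards, [update, delete])      # first application: update, then delete

try:
    replay(shards, [update, delete])  # same entries fetched again after rollback
    outcome = "ok"
except MissingField as exc:
    outcome = str(exc)

print(outcome)
```

In this model the second replay fails on the update, because the earlier delete has already removed the document and the upserted document carries only `_id` and `draining`.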



 Comments   
Comment by Githook User [ 10/Nov/16 ]

Author: Esha Maharishi (EshaMaharishi) <esha.maharishi@mongodb.com>

Message: SERVER-26943 make OpObservers for config.shards handle for writes while non-primary correctly
Branch: master
https://github.com/mongodb/mongo/commit/e24e12cfd678296e68adab52bfb40e862a66d0fa

Comment by Spencer Brody (Inactive) [ 07/Nov/16 ]

That sounds like a good way to fix it to me.

Comment by Kaloian Manassiev [ 07/Nov/16 ]

spencer, I am thinking of fixing this by moving the check for primary into the onInsert handler instead of the recovery unit change handler. Does this seem like a proper solution, or is there something better we could do if we thread through knowledge that this is a rollback?
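
One possible reading of the proposal in this comment, as a rough sketch only: the handler decides whether the node is primary before it inspects the inserted document, so a non-primary node never validates documents produced by oplog application. `ShardInsertObserver`, `is_primary`, and `accepted` are hypothetical names that do not correspond to real MongoDB classes, and the committed fix may be structured differently.

```python
class ShardInsertObserver:
    """Hypothetical observer; names do not correspond to real MongoDB classes."""

    def __init__(self, is_primary):
        self.is_primary = is_primary  # callable returning True on a primary
        self.accepted = []

    def on_insert(self, doc):
        # Check primary status first, so a non-primary node returns before
        # any validation of the (possibly incomplete) shard document.
        if not self.is_primary():
            return False
        if "host" not in doc:
            raise KeyError('Missing expected field "host"')
        self.accepted.append(doc["_id"])
        return True


secondary = ShardInsertObserver(is_primary=lambda: False)
primary = ShardInsertObserver(is_primary=lambda: True)

# Incomplete document, as produced by an update converted to an upsert.
partial = {"_id": "shard0001", "draining": True}
skipped = secondary.on_insert(partial)

# Complete document, as inserted on a primary ("example-host:27018" is made up).
handled = primary.on_insert({"_id": "shard0001", "host": "example-host:27018"})
```

Here the secondary skips the incomplete document instead of failing, while the primary still enforces the complete-document expectation.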
