[SERVER-21096] 3.2 pv0 logs an n-op on promotion to primary which can cause problems with 3.0 nodes
Created: 23/Oct/15  Updated: 25/Nov/15  Resolved: 20/Nov/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.2.0-rc0
Fix Version/s: 3.2.0-rc4

Type: Bug
Priority: Major - P3
Reporter: Matt Dannenberg
Assignee: Scott Hernandez (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Repl B (10/30/15), Repl C (11/20/15), Repl D (12/11/15)
Participants:

 Description   

To support readConcern, 3.2 nodes log an n-op (a no-op entry) to the oplog on promotion to primary, effectively creating a floor for what they will consider committed. 3.0 nodes ignore n-ops while processing the oplog. This can lead to a situation in which no node believes itself to be electable:

  • a low-priority 3.2 node is elected and logs an n-op
  • that node steps down from primary due to a heartbeat from a higher-priority node
  • that higher-priority node (3.0) will not stand for election because it is not the freshest (it ignored the n-op, so its optime is behind)
  • the 3.2 node will not stand for election because it is not the highest priority
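
The sequence above can be sketched as a small simulation. This is a toy model rather than the server's election code: the `Node` class, the `will_stand` rule (a node stands only if it is both freshest and highest priority), and the integer optimes are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str     # "3.0" or "3.2"
    priority: int
    optime: int = 0  # last optime the node believes it has applied

    def apply(self, entry):
        """Apply one oplog entry given as (optime, op_type)."""
        ts, op = entry
        # Behavior described in the ticket: 3.0 skips "n" entries, so they
        # do not advance the node's view of its own optime.
        if self.version == "3.0" and op == "n":
            return
        self.optime = max(self.optime, ts)


def will_stand(node, members):
    """Simplified (assumed) rule: stand only if freshest AND highest priority."""
    freshest = all(node.optime >= m.optime for m in members)
    highest = all(node.priority >= m.priority for m in members)
    return freshest and highest


def simulate():
    a = Node("A", "3.0", priority=2, optime=10)  # higher-priority 3.0 node
    b = Node("B", "3.2", priority=1, optime=10)  # lower-priority 3.2 node
    members = [a, b]
    # B wins an election and logs an n-op, which then replicates to A.
    entry = (11, "n")
    b.apply(entry)
    a.apply(entry)
    # B steps down after A's heartbeat; check who is willing to stand.
    return {n.name: will_stand(n, members) for n in members}


print(simulate())
```

In this model neither node stands: A is behind B's optime because it skipped the n-op, and B has the lower priority.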

Log containing example for mixed_storage_version_replication.js: https://logkeeper.mongodb.org/build/56295972be07c42d836c7ac3/test/5629599abe07c42d836c7ec7?raw=1



 Comments   
Comment by Githook User [ 20/Nov/15 ]

Author: Scott Hernandez (scotthernandez) <scotthernandez@gmail.com>

Message: SERVER-21096: only record election win in PV1
Branch: master
https://github.com/mongodb/mongo/commit/bfbe3dd12eb0aca46db1eacd2c4424f388b1b528

Comment by Scott Hernandez (Inactive) [ 19/Nov/15 ]

Yes, we are not going to record the oplog n-op when becoming primary under PV0. This means that readConcern.majority will not work until a user write is recorded after the election. Even after reconfiguring to PV1, a writeConcern.majority write should be done before the readConcern.majority read.

No changes to 3.0 are needed for this solution.
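
A rough sketch of the behavior described in this comment, for illustration only. The function names and the tuple-based oplog entries are hypothetical and are not taken from the actual commit; the "commit floor" check is a deliberately simplified stand-in for how majority reads become available.

```python
def entries_on_election_win(protocol_version, next_optime):
    """Oplog entries a new primary writes on winning (assumed model)."""
    # Only PV1 records the election win as an n-op; PV0 writes nothing,
    # so 3.0 members never see an entry they would skip.
    if protocol_version == 1:
        return [(next_optime, "n")]
    return []


def has_commit_floor(entries_since_election):
    """Simplified: majority reads need at least one entry since the election."""
    return len(entries_since_election) > 0


# PV0: no floor until the first user write after the election.
pv0_entries = entries_on_election_win(0, 11)
print(has_commit_floor(pv0_entries))   # False
pv0_entries.append((11, "i"))          # first user write
print(has_commit_floor(pv0_entries))   # True

# PV1: the n-op itself provides the floor.
print(has_commit_floor(entries_on_election_win(1, 11)))  # True
```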

Comment by Eric Milkie [ 02/Nov/15 ]

Question: if we made logging an 'n' op for elections occur only for pv1, would that also fix this issue?

Comment by Scott Hernandez (Inactive) [ 31/Oct/15 ]

I added code to update the optime at the end of each batch, and everything tests fine. The patch is minor.

Comment by Eric Milkie [ 23/Oct/15 ]

Let's investigate making 'n' ops update the optime in 3.0, to see if that solves this problem without introducing other issues.

Comment by Scott Hernandez (Inactive) [ 23/Oct/15 ]

When the 3.0 node restarts, it loads its last optime from the oplog, even if that entry has op="n". This is a 3.0 bug we need to fix: all oplog entries need to cause the system to update its optime, since those entries are still part of the persisted oplog.
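
To make the inconsistency concrete, here is a minimal sketch that compares the optime a running node would track with the one it would load after a restart. It assumes an oplog modeled as a list of (optime, op) tuples; both function names are hypothetical.

```python
def optime_while_applying(oplog):
    """Optime tracked by a running node that skips "n" entries (3.0 behavior as described)."""
    optime = None
    for ts, op in oplog:
        if op == "n":
            continue  # no-ops do not advance the tracked optime
        optime = ts
    return optime


def optime_after_restart(oplog):
    """Optime loaded on restart: the last persisted entry, whatever its op type."""
    return oplog[-1][0] if oplog else None


oplog = [(9, "i"), (10, "u"), (11, "n")]
print(optime_while_applying(oplog))  # 10
print(optime_after_restart(oplog))   # 11
```

The same node reports a different optime depending on whether it has restarted, which is the mismatch this comment calls a 3.0 bug.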
