[SERVER-55371] 4.2.13 removed rs.initiate() return object time values Created: 19/Mar/21  Updated: 27/Oct/23  Resolved: 04/Aug/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.13
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Anna Henningsen
Assignee: Jack Mulrow
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-47568 No keys found for HMAC in RECOVERING ... Closed
is related to SERVER-56471 Donor's listDatabases response missin... Closed
Operating System: ALL
Sprint: Sharding 2021-04-05, Sharding 2021-04-19, Sharding 2021-05-03
Participants:

 Description   

Problem Description

The return value of rs.initiate() includes $clusterTime and operationTime properties in all other server versions (4.0.x, 4.4.x, latest-alpha, and 4.2.x up to 4.2.12). 4.2.13 omits them, which in turn breaks our integration tests.

Steps to Reproduce

Spin up mongod servers with a shared replset id and no other configuration. Connect to them with the mongo shell and run rs.initiate() with minimal configuration.

Expected Results

As on 4.2.12:

MongoDB Enterprise > rs.initiate({ _id: 'rs0', members: [ { _id: 0, host: 'localhost:42121', priority: 1 }, { _id: 1, host: 'localhost:42122', priority: 1 } ] })
{
	"ok" : 1,
	"$clusterTime" : {
		"clusterTime" : Timestamp(1616176864, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	},
	"operationTime" : Timestamp(1616176864, 1)
}
MongoDB Enterprise rs0:SECONDARY>

Actual Results

On 4.2.13:

MongoDB Enterprise > rs.initiate({ _id: 'rs0', members: [ { _id: 0, host: 'localhost:42131', priority: 1 }, { _id: 1, host: 'localhost:42132', priority: 1 } ] })
{ "ok" : 1 }
MongoDB Enterprise rs0:SECONDARY>

Additional Notes

Not a critical change for us, but surprising. Happens with both community and enterprise servers.



 Comments   
Comment by Jack Mulrow [ 04/Aug/21 ]

Cluster time metadata is only returned when cluster time signing keys are available, which isn't guaranteed immediately after initiating a replica set. This has been the behavior since causal consistency was introduced; the change in SERVER-47568 should only have widened that window so that it more reliably extends beyond the time it takes for the replSetInitiate command to execute. Closing this as "works as designed."
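
For integration tests affected by this, a minimal sketch of a check that treats the time fields as optional might look like the following. The helper name summarizeInitiateResponse is hypothetical (it is not part of the mongo shell or any driver), and the sample objects only mirror the response shapes quoted in the description, with placeholder strings standing in for the Timestamp values.

```javascript
// Hypothetical helper (an assumption for illustration, not a shell or driver API).
// Summarizes a replSetInitiate / rs.initiate() response without assuming that
// $clusterTime or operationTime are present.
function summarizeInitiateResponse(res) {
  // Only "ok" is relied on; anything else is treated as a failed initiate.
  if (!res || res.ok !== 1) {
    throw new Error("replSetInitiate failed: " + JSON.stringify(res));
  }
  const has = (key) => Object.prototype.hasOwnProperty.call(res, key);
  return {
    ok: true,
    hasClusterTime: has("$clusterTime"),
    hasOperationTime: has("operationTime"),
  };
}

// Response shape reported on 4.2.13: only "ok".
console.log(summarizeInitiateResponse({ ok: 1 }));

// Response shape reported on 4.2.12 (time values replaced with placeholders).
console.log(summarizeInitiateResponse({
  ok: 1,
  $clusterTime: { clusterTime: "<Timestamp>", signature: {} },
  operationTime: "<Timestamp>",
}));
```

A test written this way asserts only on ok and records whether the time fields happened to be present, so it should pass on both the 4.2.12 and 4.2.13 response shapes.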

Comment by Daniel Gottlieb (Inactive) [ 23/Mar/21 ]

Thanks for that information jack.mulrow! I'll further vet my claim and see whether doing that makes sense.

Comment by Jack Mulrow [ 23/Mar/21 ]

The problem SERVER-47568 was meant to address is that a node in an unreadable state could return a cluster time validation error which broke things like cloud monitoring. The fix Misha originally proposed disabled gossiping cluster times in and out when in an unreadable state, but to fix the specific problem we really only need to disable gossiping times in, so restoring the operationTime and clusterTime for replSetInitiate is definitely feasible. My main concern is that if a node isn't gossiping in times, it will reject afterClusterTime/atClusterTime reads for a cluster time it hasn't seen even if it is valid, and gossiping out will make such reads more likely because the client's latest notion of cluster time will advance through gossiping, but server nodes' won't. If we scope a fix to just this command though, then that might not be a problem because the node should be guaranteed to be in a readable state after completing it, so it should also have re-enabled gossiping.

So all that is to say I do think your idea is reasonable, so long as after executing replSetInitiate the node is guaranteed to be in a readable state.

Comment by Daniel Gottlieb (Inactive) [ 22/Mar/21 ]

Thanks jack.mulrow. I believe the command causes mongod to move from unreadable -> readable state, but I couldn't say whether that's synchronous with the command returning.

The replSetInitiate command traditionally returns with an operationTime that matches the "initiating replica set" oplog entry (and clusterTime, but I'm not sure if that's really a guarantee), which I think is worth keeping.

From the perspective of what SERVER-47568 set out to accomplish, would it be reasonable for replSetInitiate to manually tack on the operationTime + clusterTime? Or does that reintroduce a problematic code path?

Comment by Jack Mulrow [ 22/Mar/21 ]

The goal of SERVER-47568 was to no longer return cluster time metadata or validate it when a node is in an unreadable state (because doing so may require reading from a local collection), so if replSetInitiate is completing before the node is in a readable state, then this is the intended behavior, although it wasn't specifically my goal to affect replSetInitiate.

I backported SERVER-47568 to 4.4.5 and 4.0.24 so yes this same behavior should show up in those releases without changes. I'm going to make the same change to master soon (there was a significant refactoring to cluster time gossiping that makes that change more involved), so eventually this behavior will be on that branch as well.

Comment by Daniel Gottlieb (Inactive) [ 22/Mar/21 ]

Also jack.mulrow, can you confirm that, without any changes, the same behavior of omitting cluster times will show up in the next releases of 4.4 and 4.0?

Comment by Daniel Gottlieb (Inactive) [ 22/Mar/21 ]

Hey anna.henningsen, thanks for filing a ticket. I'm able to reproduce/confirm the mongod server sends this response back in 4.2.13.

I've traced this back to SERVER-47568. jack.mulrow, was it intended to remove optimes from the replSetInitiate command, or was that just an unintended consequence?
