[SERVER-39218] Simultaneous index builds improvements to currentOp output for createIndexes Created: 28/Jan/19  Updated: 27/Oct/23  Resolved: 13/Apr/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Backlog - Storage Execution Team
Resolution: Gone away Votes: 0
Labels: techdebt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-43044 IndexBuildsCoordinator does not repor... Closed
related to SERVER-46621 Empty "ns" field in currentOp for ind... Closed
is related to SERVER-44821 retrieving storage stats for currentO... Closed
is related to SERVER-36759 Overhaul currentOp output for createI... Closed
is related to SERVER-43988 shutdown ({force:false}) should refus... Closed
Assigned Teams:
Storage Execution
Participants:

 Description   

Add the new fields to currentOp detailed in the design document.

Note, the field indicating that the index build has become recoverable on restart must only be set when the index build's phase transition to 'draining' has been checkpointed.



 Comments   
Comment by Benety Goh [ 13/Apr/20 ]

When a secondary is applying a commitIndexBuild/abortIndexBuild oplog entry, the db.currentOp() output of the associated index build will include a field (commitIndexBuild=true/abortIndexBuild=true). See SERVER-44821.
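A minimal sketch of how a test could read that field from one `inprog` entry. The comment above names the fields (commitIndexBuild, abortIndexBuild) but does not show where they sit in the document, so this helper checks both the top level of the entry and its `command` sub-document; both placements are assumptions, and `indexBuildOplogState` is a hypothetical name.

```javascript
// Hypothetical helper: classify a currentOp "inprog" entry by the
// commitIndexBuild / abortIndexBuild flag described in the comment above.
// ASSUMPTION: the flag is either a top-level field or lives under "command";
// the ticket does not show its exact location.
function indexBuildOplogState(op) {
  const holders = [op, op.command || {}];
  if (holders.some(h => h.commitIndexBuild === true)) return "committing";
  if (holders.some(h => h.abortIndexBuild === true)) return "aborting";
  return "none";
}

// Made-up entries for illustration only, not captured server output.
const committingOp = { opid: 1, ns: "test.t", commitIndexBuild: true };
const abortingOp = { opid: 2, ns: "test.t", command: { abortIndexBuild: true } };
const plainOp = { opid: 3, ns: "test.t", command: { createIndexes: "t" } };
```

In a shell session the helper could be mapped over `db.currentOp({ns: 'test.t'}).inprog` on a secondary.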

Comment by Benety Goh [ 13/Apr/20 ]

Part of the motivation for this ticket, described in the design document, was to support improved shutdown semantics for downstream systems. The changes to the shutdown command in SERVER-43988 make some of these proposed changes less relevant now.

Comment by Benety Goh [ 13/Apr/20 ]

As of commit e5b21a774777fd1a1d89959b9e78c6eb18580c32, the db.currentOp() output looks like:

primary:

foo:PRIMARY> db.currentOp({ns: 'test.t'})
{
	"inprog" : [
		{
			"type" : "op",
			"host" : "benetymbp3.fios-router.home:30001",
			"desc" : "IndexBuildsCoordinatorMongod-0",
			"active" : true,
			"currentOpTime" : "2020-04-13T14:25:30.281-04:00",
			"opid" : 1185,
			"secs_running" : NumberLong(151),
			"microsecs_running" : NumberLong(151964560),
			"op" : "command",
			"ns" : "test.t",
			"command" : {
				"createIndexes" : "t",
				"indexes" : [
					{
						"v" : 2,
						"key" : {
							"a" : 1
						},
						"name" : "a_1"
					}
				],
				"lsid" : {
					"id" : UUID("e34da77e-3858-453a-9f07-8dbcace3d8be")
				},
				"$clusterTime" : {
					"clusterTime" : Timestamp(1586802160, 1),
					"signature" : {
						"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
						"keyId" : NumberLong(0)
					}
				},
				"$db" : "test"
			},
			"msg" : "Index Build: scanning collection Index Build: scanning collection: 1/1 100%",
			"progress" : {
				"done" : 1,
				"total" : 1
			},
			"numYields" : 229112,
			"locks" : {
				"ReplicationStateTransition" : "w",
				"Global" : "w",
				"Database" : "w",
				"Collection" : "w"
			},
			"waitingForLock" : false,
			"lockStats" : {
				"ReplicationStateTransition" : {
					"acquireCount" : {
						"w" : NumberLong(229115)
					}
				},
				"Global" : {
					"acquireCount" : {
						"w" : NumberLong(229115)
					}
				},
				"Database" : {
					"acquireCount" : {
						"w" : NumberLong(229115)
					}
				},
				"Collection" : {
					"acquireCount" : {
						"w" : NumberLong(229114),
						"W" : NumberLong(1)
					}
				},
				"Mutex" : {
					"acquireCount" : {
						"r" : NumberLong(3)
					}
				}
			},
			"waitingForFlowControl" : false,
			"flowControlStats" : {
				"acquireCount" : NumberLong(229114),
				"timeAcquiringMicros" : NumberLong(326892)
			}
		},
		{
			"type" : "op",
			"host" : "benetymbp3.fios-router.home:30001",
			"desc" : "conn11",
			"connectionId" : 11,
			"client" : "127.0.0.1:57506",
			"appName" : "MongoDB Shell",
			"clientMetadata" : {
				"application" : {
					"name" : "MongoDB Shell"
				},
				"driver" : {
					"name" : "MongoDB Internal Client",
					"version" : "0.0.0"
				},
				"os" : {
					"type" : "Darwin",
					"name" : "Mac OS X",
					"architecture" : "x86_64",
					"version" : "18.7.0"
				}
			},
			"active" : true,
			"currentOpTime" : "2020-04-13T14:25:30.281-04:00",
			"opid" : 1184,
			"lsid" : {
				"id" : UUID("e34da77e-3858-453a-9f07-8dbcace3d8be"),
				"uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
			},
			"secs_running" : NumberLong(152),
			"microsecs_running" : NumberLong(152000272),
			"op" : "command",
			"ns" : "test.t",
			"command" : {
				"createIndexes" : "t",
				"indexes" : [
					{
						"key" : {
							"a" : 1
						},
						"name" : "a_1"
					}
				],
				"lsid" : {
					"id" : UUID("e34da77e-3858-453a-9f07-8dbcace3d8be")
				},
				"$clusterTime" : {
					"clusterTime" : Timestamp(1586802160, 1),
					"signature" : {
						"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
						"keyId" : NumberLong(0)
					}
				},
				"$db" : "test"
			},
			"numYields" : 0,
			"waitingForLatch" : {
				"timestamp" : ISODate("2020-04-13T18:22:58.420Z"),
				"captureName" : "FutureResolution"
			},
			"locks" : {
				
			},
			"waitingForLock" : false,
			"lockStats" : {
				"ParallelBatchWriterMode" : {
					"acquireCount" : {
						"r" : NumberLong(3)
					}
				},
				"ReplicationStateTransition" : {
					"acquireCount" : {
						"w" : NumberLong(5)
					}
				},
				"Global" : {
					"acquireCount" : {
						"r" : NumberLong(1),
						"w" : NumberLong(4)
					}
				},
				"Database" : {
					"acquireCount" : {
						"r" : NumberLong(1),
						"w" : NumberLong(3)
					}
				},
				"Collection" : {
					"acquireCount" : {
						"w" : NumberLong(1),
						"W" : NumberLong(1)
					}
				},
				"Mutex" : {
					"acquireCount" : {
						"r" : NumberLong(4)
					}
				}
			},
			"waitingForFlowControl" : false,
			"flowControlStats" : {
				"acquireCount" : NumberLong(3),
				"timeAcquiringMicros" : NumberLong(8)
			}
		}
	],
	"ok" : 1,
	"$clusterTime" : {
		"clusterTime" : Timestamp(1586802327, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	},
	"operationTime" : Timestamp(1586802327, 1)
}

secondary:

foo:SECONDARY> db.currentOp({ns: 'test.t'})
{
	"inprog" : [
		{
			"type" : "op",
			"host" : "benetymbp3.fios-router.home:30002",
			"desc" : "IndexBuildsCoordinatorMongod-0",
			"active" : true,
			"currentOpTime" : "2020-04-13T14:26:07.106-04:00",
			"opid" : 5352,
			"secs_running" : NumberLong(188),
			"microsecs_running" : NumberLong(188746613),
			"op" : "command",
			"ns" : "test.t",
			"command" : {
				"createIndexes" : "t",
				"indexes" : [
					{
						"v" : 2,
						"key" : {
							"a" : 1
						},
						"name" : "a_1"
					}
				]
			},
			"msg" : "Index Build: scanning collection Index Build: scanning collection: 1/1 100%",
			"progress" : {
				"done" : 1,
				"total" : 1
			},
			"numYields" : 283709,
			"locks" : {
				"ReplicationStateTransition" : "w",
				"Global" : "w",
				"Database" : "w",
				"Collection" : "w"
			},
			"waitingForLock" : false,
			"lockStats" : {
				"ReplicationStateTransition" : {
					"acquireCount" : {
						"w" : NumberLong(283711)
					}
				},
				"Global" : {
					"acquireCount" : {
						"w" : NumberLong(283711)
					}
				},
				"Database" : {
					"acquireCount" : {
						"w" : NumberLong(283711)
					}
				},
				"Collection" : {
					"acquireCount" : {
						"w" : NumberLong(283710),
						"W" : NumberLong(1)
					}
				},
				"Mutex" : {
					"acquireCount" : {
						"r" : NumberLong(2)
					}
				}
			},
			"waitingForFlowControl" : false,
			"flowControlStats" : {
				"acquireCount" : NumberLong(283711),
				"timeAcquiringMicros" : NumberLong(408548)
			}
		}
	],
	"ok" : 1,
	"$clusterTime" : {
		"clusterTime" : Timestamp(1586802357, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	},
	"operationTime" : Timestamp(1586802357, 1)
}
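The outputs above are long, so a test or script may want to reduce them to the index build entries only. The following is a sketch, not part of the server: `summarizeIndexBuilds` is a hypothetical helper, and selecting entries by a `desc` that starts with "IndexBuildsCoordinator" is an assumption based solely on the sample output above. The `sample` document is a trimmed copy of the primary output.

```javascript
// Hypothetical helper: reduce a currentOp result to its index build entries.
// ASSUMPTION: index build threads are identified by a "desc" beginning with
// "IndexBuildsCoordinator", as seen in the sample output in this ticket.
function summarizeIndexBuilds(currentOpResult) {
  return (currentOpResult.inprog || [])
    .filter(op => op.command && op.command.createIndexes !== undefined &&
                  typeof op.desc === "string" &&
                  op.desc.startsWith("IndexBuildsCoordinator"))
    .map(op => {
      const p = op.progress;
      return {
        opid: op.opid,
        ns: op.ns,
        indexes: (op.command.indexes || []).map(ix => ix.name),
        msg: op.msg,
        // Percentage from the "progress" sub-document, or null if absent.
        percent: p && p.total > 0 ? Math.round(100 * p.done / p.total) : null
      };
    });
}

// Trimmed copy of the primary output above: the coordinator thread and the
// client connection that issued createIndexes.
const sample = {
  inprog: [
    {
      desc: "IndexBuildsCoordinatorMongod-0", opid: 1185, ns: "test.t",
      command: { createIndexes: "t", indexes: [{ v: 2, key: { a: 1 }, name: "a_1" }] },
      msg: "Index Build: scanning collection Index Build: scanning collection: 1/1 100%",
      progress: { done: 1, total: 1 }
    },
    {
      desc: "conn11", opid: 1184, ns: "test.t",
      command: { createIndexes: "t", indexes: [{ key: { a: 1 }, name: "a_1" }] }
    }
  ],
  ok: 1
};

const summary = summarizeIndexBuilds(sample);
```

In a shell session the same helper could be applied directly, for example `summarizeIndexBuilds(db.currentOp({ns: 'test.t'}))`. The client connection entry (conn11) is dropped on purpose, since it only waits on the coordinator thread.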

Comment by Benety Goh [ 23/Sep/19 ]

This enhancement is beneficial mostly for testing. Moving out of project and assigning 'techdebt' label.

Comment by Dianna Hohensee (Inactive) [ 28/Jan/19 ]

Ensuring that the state transition is caught in the checkpoint prior to shutdown is tricky because of the 'ghost' timestamps we use on secondaries for index build writes (SERVER-38986). It is possible to select a ghost timestamp T that has already been checkpointed: a replicated write occurs at time T and is checkpointed, and then, after the checkpoint is taken, we use the same time T for an index build write (updating the index catalog entry's buildPhase field from scanning to draining). The result is that the index build write at timestamp T is not in the checkpoint, but appears to be when T is compared to the checkpoint_timestamp. This all works out for correct recovery to a stable timestamp, but not for the server restart guarantees we wish to make in currentOp.
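The scenario can be illustrated with a toy model. This is not storage engine code: `takeCheckpoint`, the `seq` field (order in which writes were applied), and the write names are all hypothetical, chosen only to show why a timestamp comparison gives the wrong answer here.

```javascript
// Toy model (hypothetical, not a storage engine API): a checkpoint captures
// every write applied up to a given point in apply order ("seq"), and its
// checkpoint timestamp is the largest logical timestamp among those writes.
function takeCheckpoint(writes, uptoSeq) {
  const included = writes.filter(w => w.seq <= uptoSeq);
  return {
    included: new Set(included.map(w => w.name)),
    checkpointTimestamp: Math.max(...included.map(w => w.ts))
  };
}

const writes = [
  // A replicated write at time T = 10, applied first.
  { name: "replicated write", ts: 10, seq: 1 },
  // The index build write picks ghost timestamp T = 10 but is applied
  // after the checkpoint below has been taken.
  { name: "buildPhase scanning -> draining", ts: 10, seq: 2 }
];

// Checkpoint taken after the first write only.
const checkpoint = takeCheckpoint(writes, 1);
const phaseWrite = writes[1];

// Naive check: compare the write's timestamp to the checkpoint timestamp.
const looksCheckpointed = phaseWrite.ts <= checkpoint.checkpointTimestamp; // true
// Ground truth: was the write actually captured by the checkpoint?
const actuallyCheckpointed = checkpoint.included.has(phaseWrite.name); // false
```

The two booleans disagree, which is the mismatch described above: the timestamp comparison reports the buildPhase write as durable even though a restart from this checkpoint would not see it.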

Viable option thus far:
Wait to set the currentOp flag until voteCommitIndexBuild has succeeded and the secondary can see its own vote, i.e. the vote has replicated back to the secondary that cast it. This ensures that a majority write made after the state transition can be in the checkpoint taken on clean shutdown, which means the older state transition write will be in that checkpoint as well.

Generated at Thu Feb 08 04:51:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.