[SERVER-33890] Insertion into system.indexes on mongos causes 6 index builds on shard Created: 14/Mar/18  Updated: 29/Oct/23  Resolved: 16/Apr/18

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Sharding
Affects Version/s: None
Fix Version/s: 3.7.4

Type: Bug Priority: Major - P3
Reporter: Tess Avitabile (Inactive) Assignee: Matthew Saltz (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Steps To Reproduce:

(function() {
  let st = ShardingTest({shards: 1});
  let db = st.s.getDB("test");
  st.shard0.getDB("test").setLogLevel(1);
  assert.commandWorked(db.adminCommand({enableSharding: "test"}));
  assert.commandWorked(db.adminCommand({shardCollection: "test.coll", key: {_id: 1}}));
 
  // Fails.
  assert.commandWorked(db.system.indexes.insert({ns: "test.coll", v: 2, key: {_id: 1, a: 1}, name: "a_1"}));
 
  st.stop();
})();
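
For comparison, the legacy document inserted in the repro carries the same information as a createIndexes command. The following is a minimal sketch of that mapping, not code from the server or the shell: toCreateIndexesCommand is a hypothetical helper name, and it assumes the ns field has the form "<db>.<collection>".

```javascript
// Hypothetical helper (not part of the mongo shell or server): converts a
// legacy system.indexes document into the equivalent createIndexes command.
// Assumes legacySpec.ns has the form "<db>.<collection>".
function toCreateIndexesCommand(legacySpec) {
  const { ns, ...indexSpec } = legacySpec;
  if (typeof ns !== "string") {
    throw new Error("legacy index spec is missing a string ns field");
  }
  const dot = ns.indexOf(".");
  if (dot <= 0 || dot === ns.length - 1) {
    throw new Error("ns must look like <db>.<collection>: " + ns);
  }
  return {
    dbName: ns.slice(0, dot),
    // The ns field is dropped here; the target collection is named by the
    // command itself.
    command: { createIndexes: ns.slice(dot + 1), indexes: [indexSpec] },
  };
}

// The document from the repro above.
const converted = toCreateIndexesCommand({
  ns: "test.coll",
  v: 2,
  key: { _id: 1, a: 1 },
  name: "a_1",
});
```

In the shell, the result could then be sent with db.getSiblingDB(converted.dbName).runCommand(converted.command) instead of inserting into system.indexes.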

Sprint: Sharding 2018-04-23
Participants:

 Description   

Attempting to create an index on a sharded collection using insertion into system.indexes fails with the following error:

Error: write failed with error: {
 	"nInserted" : 0,
 	"writeError" : {
 		"code" : 82,
 		"errmsg" : "no progress was made executing batch write op in test.system.indexes after 5 rounds (0 ops completed in 6 rounds total)"
 	}
}

mongos forwards the insertion to the shard six times. Each time, the shard builds the index, but then uasserts due to an epoch mismatch on system.indexes and reverts the index build.

build index on: test.coll properties: { v: 2, key: { a: 1.0 }, name: "a_1", ns: "test.coll" }
building index using bulk method; build may temporarily use up to 500 megabytes of RAM
bulk commit starting for index: a_1
done building bottom layer, going to commit
build index done.  scanned 0 total records. 0 secs
User Assertion: StaleConfig{ ns: "test.system.indexes", vReceived: Timestamp(1, 0), vReceivedEpoch: ObjectId('5aa977e5b6b4bb7493628419'), vWanted: Timestamp(0, 0), vWantedEpoch: ObjectId('000000000000000000000000') }: shard version not ok: version epoch mismatch detected for test.system.indexes, the collection may have been dropped and recreated src/mongo/db/s/collection_sharding_state.cpp 260

The shard does not end up with the index built. It seems acceptable for insertion into system.indexes to fail on mongos, but it would be better not to perform any index builds on the shard before failing.
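
The six builds are consistent with a retry loop that gives up only once the count of rounds without progress exceeds five. The following is a toy model of that behavior in plain JavaScript, not the actual mongos implementation: the constant MAX_ROUNDS_WITHOUT_PROGRESS is inferred from the error text, and makeStaleShard and executeBatchWrite are hypothetical names used only for illustration.

```javascript
// Toy model only; names and structure are assumptions, not mongos code.
// Limit inferred from "after 5 rounds" in the error message.
const MAX_ROUNDS_WITHOUT_PROGRESS = 5;

// Hypothetical stand-in for the shard: it builds the index first, then
// fails the version check and rolls the build back.
function makeStaleShard() {
  const shard = { indexBuilds: 0, indexes: [] };
  shard.insertIndexSpec = function (spec) {
    shard.indexBuilds += 1;
    shard.indexes.push(spec.name); // index build happens...
    shard.indexes.pop();           // ...and is reverted on the stale-version error
    return { ok: false, error: "StaleConfig" };
  };
  return shard;
}

// Retries the write until it succeeds or too many rounds pass without progress.
function executeBatchWrite(shard, ns, spec) {
  let rounds = 0;
  let roundsWithoutProgress = 0;
  const completedOps = 0; // stays 0 on every failing path
  while (true) {
    rounds += 1;
    const res = shard.insertIndexSpec(spec);
    if (res.ok) {
      return { nInserted: 1, rounds };
    }
    roundsWithoutProgress += 1;
    if (roundsWithoutProgress > MAX_ROUNDS_WITHOUT_PROGRESS) {
      return {
        nInserted: 0,
        rounds,
        writeError: {
          code: 82,
          errmsg:
            `no progress was made executing batch write op in ${ns} after ` +
            `${MAX_ROUNDS_WITHOUT_PROGRESS} rounds (${completedOps} ops ` +
            `completed in ${rounds} rounds total)`,
        },
      };
    }
  }
}

const shard = makeStaleShard();
const result = executeBatchWrite(shard, "test.system.indexes", { name: "a_1" });
```

In this model the shard runs six index builds and keeps none of them, which is why validating the request before starting the build would avoid the wasted work.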



 Comments   
Comment by Githook User [ 16/Apr/18 ]

Author: Matthew Saltz (matthew.saltz@mongodb.com)

Message: SERVER-33890 Unblacklist cannot_create_system_dot_indexes.js from sharded_causally_consistent_jscore_passthrough
Branch: master
https://github.com/mongodb/mongo/commit/f9f33ab038f03076581eef1db9e22c577e815110

Comment by Matthew Saltz (Inactive) [ 16/Apr/18 ]

Code review: https://mongodbcr.appspot.com/201170001/

Comment by Tess Avitabile (Inactive) [ 16/Apr/18 ]

Actually, before closing this ticket, we should unblacklist cannot_create_system_dot_indexes.js from sharded_causally_consistent_jscore_passthrough. This would also give us test coverage to ensure this does not regress.

Comment by Tess Avitabile (Inactive) [ 16/Apr/18 ]

You are correct, this does not currently fail on master. It also does not fail on the 3.6 branch. It does fail on this commit, which was the state of the master branch when I filed the ticket. I think it is safe to close this as Gone Away, unless the Sharding team is interested in figuring out what caused or fixed this, or adding a test to ensure this does not regress.

Comment by Matthew Saltz (Inactive) [ 13/Apr/18 ]

I'm observing the same behavior on master. The test passes for me. I just added it to the sharding directory and ran it with the sharding suite, and it passed. tess.avitabile, was there any extra setup involved with this? Maybe something has gotten fixed since this ticket was created?

Comment by Kaloian Manassiev [ 13/Apr/18 ]

If it worked in 3.6, it should work in 4.0 as well. However, I just tried to run the attached repro script and it passes for me. I do see 3 indexes being built, but the first two are for the config.cache.chunks collection:

d20000| 2018-04-12T20:26:20.010-0400 I INDEX    [ShardServerCatalogCacheLoader-1] build index on: config.cache.chunks.test.coll properties: { v: 2, key: { lastmod: 1 }, name: "lastmod_1", ns: "config.cache.chunks.test.coll" }
d20000| 2018-04-12T20:26:20.010-0400 I INDEX    [ShardServerCatalogCacheLoader-1] 	 building index using bulk method; build may temporarily use up to 500 megabytes of RAM
d20000| 2018-04-12T20:26:20.010-0400 D INDEX    [ShardServerCatalogCacheLoader-1] 	 bulk commit starting for index: lastmod_1
d20000| 2018-04-12T20:26:20.011-0400 D INDEX    [ShardServerCatalogCacheLoader-1] 	 done building bottom layer, going to commit

Does this bug still exist at all?

Comment by Matthew Saltz (Inactive) [ 12/Apr/18 ]

kaloian.manassiev Should we make this work or just explicitly disallow creating indexes on sharded collections this way?
