[SERVER-20764] create_index_gle.js intermittently fails, getLastError returns error "could not target full range of test.user; metadata not found" Created: 02/Oct/15 Updated: 25/Jan/17 Resolved: 06/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 3.1.9 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | J Rassi | Assignee: | Andy Schwerin |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Sharding A (10/09/15) |
| Participants: | |
| Description |
|
create_index_gle.js has been failing intermittently since yesterday afternoon. The symptom is that a getLastError command returns the error "could not target full range of test.user; metadata not found". I am unable to reproduce this failure on my desktop. A list of recent failures follows:
Linux:
SSL Ubuntu 14.04:
SSL Amazon Linux:
SSL RHEL 7.0:
Excerpt from first failure in above list:
|
| Comments |
| Comment by Githook User [ 06/Oct/15 ] |
|
Author: Andy Schwerin (andy10gen, schwerin@mongodb.com)
Message: The problem with create_index_gle.js that causes |
| Comment by Andy Schwerin [ 05/Oct/15 ] |
|
Beyond the dependency on |
| Comment by Andy Schwerin [ 02/Oct/15 ] |
|
The root of this problem is that sharding code sends some commands to shard member nodes that might be secondaries without setting $secondaryOk. It manifests now because the switch to PV1 has apparently introduced some extra (spurious?) failover or stepdown events. In this particular manifestation, while choosing a shard on which to place a new database, mongos attempts to find out the storage size of every shard via the listDatabases command. It uses a primary-preferred read preference for target selection but does not set $secondaryOk in the command metadata, so if a secondary is selected (for example, during a failover) it rejects the command. This is the result of an API deficiency that makes this mistake too easy to make. I'll file and link a server ticket. |
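
The API deficiency described in this comment can be illustrated with a minimal sketch. It is not the mongos implementation (which is C++); `TargetedCommand` and `build_targeted_command` are hypothetical names invented for illustration. The idea shown is one possible remedy, assuming the fix is to derive the `$secondaryOk` metadata from the same read preference used for target selection, so the two cannot drift apart:

```python
from dataclasses import dataclass

# Standard MongoDB read preference modes.
READ_PREF_MODES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


@dataclass(frozen=True)
class TargetedCommand:
    """Hypothetical container pairing a command with its targeting info."""
    db: str
    command: dict
    read_pref_mode: str
    metadata: dict


def build_targeted_command(db, command, read_pref_mode="primary"):
    """Build a command whose metadata is derived from the read preference.

    Hypothetical helper: any mode other than "primary" may select a
    secondary, so $secondaryOk is set automatically instead of relying
    on each caller to remember it.
    """
    if read_pref_mode not in READ_PREF_MODES:
        raise ValueError("unknown read preference mode: %r" % (read_pref_mode,))
    metadata = {}
    if read_pref_mode != "primary":
        metadata["$secondaryOk"] = 1
    return TargetedCommand(db, dict(command), read_pref_mode, metadata)


if __name__ == "__main__":
    # The listDatabases case from this ticket: primary-preferred targeting.
    cmd = build_targeted_command("admin", {"listDatabases": 1}, "primaryPreferred")
    print(cmd.metadata)
```

With a single construction point like this, a caller cannot pick a primary-preferred target and forget the matching metadata, which is the mistake the comment says the existing API makes too easy.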
| Comment by J Rassi [ 02/Oct/15 ] |
|
I am also unable to reproduce this issue on a Linux spawnhost (rhel55-test host type, binaries from 466f0596, command line: python buildscripts/resmoke.py --repeat=10 --storageEngine=mmapv1 --executor=gle_auth jstests/gle/create_index_gle.js). |
| Comment by J Rassi [ 02/Oct/15 ] |
|
Assigning to Spencer for initial triage. Spencer, could you provide an interpretation of this error message? I suspect that one of the replication commits from yesterday is the actual culprit, but thought you might be able to identify next steps and next assignee. |