[SERVER-34733] mongos needs to also refresh db version after receiving onImplicitCreate error Created: 27/Apr/18 Updated: 29/Oct/23 Resolved: 14/Feb/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.7.9 |
| Fix Version/s: | 4.1.9 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Randolph Tan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Steps To Reproduce: | see db_ver.js |
||||
| Sprint: | Sharding 2018-08-13, Sharding 2018-09-10, Sharding 2018-09-24, Sharding 2018-10-08, Sharding 2018-10-22, Sharding 2018-11-05, Sharding 2018-11-19, Sharding 2019-02-25 | ||||
| Participants: | |||||
| Linked BF Score: | 17 | ||||
| Description |
|
When a mongos with a stale db version sends a write for an unsharded collection to the wrong shard, the write triggers the onImplicitCreate error, since the collection does not exist on that shard. The mongos then refreshes the collection version and retries, but because the database version was not refreshed, the retry goes to the same wrong shard and hits the same error. This repeats until the mongos runs out of retries. |
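The retry loop described above can be illustrated with a toy model. This is a minimal sketch and not mongos code: `ToyRouter`, the shard names, the `MAX_RETRIES` budget, and the Python exception class are all hypothetical stand-ins, and the real routing and versioning logic is considerably more involved.

```python
class CannotImplicitlyCreateCollection(Exception):
    """Toy stand-in for the error a shard raises when the collection is absent."""


# Hypothetical cluster state: the collection lives on shardB, the current primary.
SHARD_COLLECTIONS = {"shardA": set(), "shardB": {"test.coll"}}
MAX_RETRIES = 5  # assumed retry budget, not the real mongos value


class ToyRouter:
    def __init__(self):
        # Stale cached database entry: still points at the old primary shard.
        self.cached_primary = {"test": "shardA"}
        self.collection_refreshes = 0

    def refresh_collection(self, ns):
        # Only the collection entry is refreshed; the database entry stays stale.
        self.collection_refreshes += 1

    def write(self, ns):
        db = ns.split(".")[0]
        attempts = []
        for _ in range(MAX_RETRIES):
            shard = self.cached_primary[db]
            attempts.append(shard)
            try:
                if ns not in SHARD_COLLECTIONS[shard]:
                    raise CannotImplicitlyCreateCollection(ns)
                return "ok", attempts
            except CannotImplicitlyCreateCollection:
                self.refresh_collection(ns)
        return "retries exhausted", attempts


router = ToyRouter()
status, attempts = router.write("test.coll")
print(status, attempts)
```

Every attempt in this model targets the same shard, because nothing in the error path updates the cached database entry.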
| Comments |
| Comment by Githook User [ 14/Feb/19 ] |
|
Author: {'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'} Message: |
| Comment by Randolph Tan [ 08/Feb/19 ] |
|
sounds good to me |
| Comment by Kaloian Manassiev [ 08/Feb/19 ] |
|
This is happening quite a bit (there are 200+ occurrences), so I am marking it as WFBF-day. renctan, I am thinking that if we add a check for CannotImplicitlyCreateCollection at this point (or perhaps inside the targeter itself, so we don't cause unit-test errors) and, if there is an error, call CatalogCache::invalidateDatabaseEntry(), this should fix the issue. What do you think? |
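The idea proposed in this comment can be sketched with a toy model. This is only an illustration under stated assumptions, not the actual server change: `ToyCatalogCache`, `send_write`, `routed_write`, and the shard names are hypothetical, and `invalidate_database_entry` merely mirrors the name of `CatalogCache::invalidateDatabaseEntry()` mentioned above.

```python
class CannotImplicitlyCreateCollection(Exception):
    """Toy stand-in for the error raised when the targeted shard lacks the collection."""


class ToyCatalogCache:
    def __init__(self, authoritative, stale):
        self._authoritative = authoritative  # what the config servers would report
        self._cached = dict(stale)           # possibly stale routing information

    def get_primary(self, db):
        # Reload from the authoritative source only when no entry is cached.
        if db not in self._cached:
            self._cached[db] = self._authoritative[db]
        return self._cached[db]

    def invalidate_database_entry(self, db):
        # Drop the cached entry so the next lookup reloads it.
        self._cached.pop(db, None)


def send_write(shard_collections, shard, ns):
    if ns not in shard_collections[shard]:
        raise CannotImplicitlyCreateCollection(ns)
    return shard


def routed_write(cache, shard_collections, ns, max_retries=3):
    db = ns.split(".")[0]
    for attempt in range(1, max_retries + 1):
        shard = cache.get_primary(db)
        try:
            return send_write(shard_collections, shard, ns), attempt
        except CannotImplicitlyCreateCollection:
            # The proposed check: on this error, also invalidate the database entry.
            cache.invalidate_database_entry(db)
    raise RuntimeError("retries exhausted")


cache = ToyCatalogCache({"test": "shardB"}, {"test": "shardA"})
shards = {"shardA": set(), "shardB": {"test.coll"}}
target, attempts = routed_write(cache, shards, "test.coll")
print(target, attempts)
```

In this model the first attempt goes to the stale shard and fails, the invalidation forces a reload, and the second attempt reaches the shard that actually holds the collection.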
| Comment by Kaloian Manassiev [ 09/May/18 ] |
|
I am fine with moving this ticket to 4.1 Required; however, we may have to live with the associated BF for a while. I am not sure how hard it is to pinpoint that this is the cause of an incoming BF versus doing this fix. Is it only one test which fails? |
| Comment by Randolph Tan [ 09/May/18 ] |
|
Note: this won't be needed once we finish the track unsharded collections project, which is currently planned for late 4.2. |