[SERVER-50192] mongod --replSet --repair fails with NotMasterNoSlaveOk error
Created: 08/Aug/20  Updated: 29/Oct/23  Resolved: 09/Aug/21

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 4.0.19, 4.2.8, 4.4.0
Fix Version/s: 5.1.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Bruce Lucas (Inactive)
Assignee: Jackson Xie (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: r0.log, r1.log, r2.log
Issue Links:
Documented
is documented by DOCS-14721 [SERVER]Investigate changes in SERVER... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2021-08-09, Execution Team 2021-08-23
Participants:
Case:

 Description   

Create a 3-node replica set, shut it down, then restart with --repair. The repair does not complete, failing part way through with the following error:

{"t":{"$date":"2020-08-08T08:02:45.266-04:00"},"s":"E", "c":"STORAGE", "id":20557, "ctx":"initandlisten","msg":"DBException in initAndListen, terminating","attr":{"error":"NotMasterNoSlaveOk: not master and slaveOk=false"}}

This happens on all 3 nodes. Debug level 5 logs attached.
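
For anyone scanning the attached logs, the error line quoted in the description can be picked out programmatically. The following is a minimal sketch, assuming the JSON log format shown above; the helper name `startup_error` is hypothetical, and the match on severity "E" plus id 20557 is taken from the quoted line only, not from any documented contract.

```python
import json

# Log line from the description, with the export's stray backslashes removed.
LOG_LINE = (
    '{"t":{"$date":"2020-08-08T08:02:45.266-04:00"},"s":"E", "c":"STORAGE", '
    '"id":20557, "ctx":"initandlisten",'
    '"msg":"DBException in initAndListen, terminating",'
    '"attr":{"error":"NotMasterNoSlaveOk: not master and slaveOk=false"}}'
)


def startup_error(line):
    """Return the error text if `line` is the fatal initAndListen entry, else None.

    Hypothetical helper: it matches on the severity and id seen in the
    quoted log line.
    """
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None  # not a JSON log line (for example, a plain-text log)
    if not isinstance(entry, dict):
        return None
    if entry.get("s") == "E" and entry.get("id") == 20557:
        return entry.get("attr", {}).get("error")
    return None


if __name__ == "__main__":
    print(startup_error(LOG_LINE))
```

Running `startup_error` over each line of r0.log, r1.log, and r2.log would be one way to confirm that all three nodes stop on the same error.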



 Comments   
Comment by Githook User [ 09/Aug/21 ]

Author: Jackson Xie <jackson.xie@mongodb.com> (jacksonx9)

Message: SERVER-50192: use better return error message when --replSet and --repair used together
Branch: master
https://github.com/mongodb/mongo/commit/811251543cb0ad333c6018f5b53717784a863ef3

Comment by Eric Milkie [ 10/Aug/20 ]

Hmm. I did try by starting, exiting, and then running repair. But I also initiated the replica set as part of the initial startup/shutdown, which would more likely represent the data on servers in production.

Comment by Bruce Lucas (Inactive) [ 10/Aug/20 ]

I saw those errors if I started a brand new fresh mongod with both --replSet and --repair, which is a very unusual scenario. I found that if I started mongod, then exited, then started again with --replSet and --repair, the repair worked on 4.0 and 4.2 without problems. Are you seeing something different?

In that sense I think this is a bit of a regression in 4.4.

Comment by Eric Milkie [ 10/Aug/20 ]

Same error on 4.0.

Comment by Eric Milkie [ 10/Aug/20 ]

The error on 4.2:

exception in initAndListen: NotMaster: Not primary while creating collection admin.system.version, terminating

Comment by Bruce Lucas (Inactive) [ 08/Aug/20 ]

This only happens if mongod is run with both --replSet and --repair, so it could be considered user error. However, we should probably detect that condition and exit with a helpful error message.
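
The kind of up-front check suggested here could look roughly like the sketch below. It is an illustration only, not the server's actual implementation: the function name `check_startup_options` and the message wording are hypothetical, and the only facts taken from this ticket are the two option names.

```python
def check_startup_options(argv):
    """Return an error message if --replSet and --repair are combined, else None.

    Hypothetical illustration of the proposed validation. The message
    text is invented for this sketch.
    """
    # Accept both "--replSet rs0" and "--replSet=rs0" spellings.
    names = {arg.split("=", 1)[0] for arg in argv if arg.startswith("--")}
    if "--replSet" in names and "--repair" in names:
        return (
            "Cannot specify both --replSet and --repair; "
            "restart without --replSet to run repair"
        )
    return None


if __name__ == "__main__":
    import sys

    error = check_startup_options(sys.argv[1:])
    if error is not None:
        # Fail fast with a readable message instead of a late NotMaster error.
        sys.exit(error)
```

Doing the check before any storage initialization means the user sees the conflict immediately, rather than a replication error part way through the repair.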

Generated at Thu Feb 08 05:22:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.