[SERVER-20796] Warning on replica that collection lacks a unique index on _id, though index is present Created: 07/Oct/15 Updated: 09/Jun/16 Resolved: 13/Dec/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 2.6.9, 2.6.10 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Marcin Gozdalik | Assignee: | Wan Bachtiar |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
I set up a replica against a master server. The master has a few databases.
The warnings are incorrect: the collections have the default index on _id, and it is not possible to insert a document with a duplicate _id on the master:
The indexes are visible on replica:
If I create a new database, the replica does not complain, so something seems to be off with the old databases, although they were created with the default settings and the _id index was never dropped.
I'm not sure whether any of this matters, but here is all of the information for completeness. There is a lot of data in the DBs:
and setting up the replica sometimes fails (probably because the oplog is too small). Replication configuration:
Replication status:
Master is running:
Replica is running:
|
| Comments |
| Comment by Wan Bachtiar [ 13/Dec/15 ] | |||||||||||||||||
|
Hi Marcin, We haven’t heard back from you for some time, so I’m Regards, Wan. | |||||||||||||||||
| Comment by Wan Bachtiar [ 25/Nov/15 ] | |||||||||||||||||
|
Hi Marcin, As Ramon has previously mentioned, the warnings appear after an unclean shutdown. The member received a termination signal 15 (SIGTERM):
The member was terminated while it was synchronising the oplog, before the index-building stage. As a result, some collections existed without indexes. After the member was restarted, a startup check discovered that some collections lacked the _id index, which triggered the warning messages. The startup process then reset the initial sync by dropping all the databases and restarting the cloning process from the primary:
This time the initial-sync process completed successfully, and the member entered the SECONDARY state.
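As a rough illustration of the kind of check described above, the following Python sketch flags collections whose index list contains no index on _id. It is not the server's actual startup code. The function names, the `sample` data, and the namespaces in it are hypothetical; the index documents only mimic the general shape of what `getIndexes()` returns.

```python
def has_id_index(index_specs):
    """Return True if any index spec is keyed on exactly {_id: 1}."""
    return any(dict(spec.get("key", {})) == {"_id": 1} for spec in index_specs)


def collections_missing_id_index(indexes_by_collection):
    """Return the sorted namespaces whose index list has no _id index."""
    return sorted(
        ns for ns, specs in indexes_by_collection.items()
        if not has_id_index(specs)
    )


# Hypothetical data: one collection was cloned but its indexes were never
# built, as would happen if the member was stopped before index building.
sample = {
    "app.users": [{"v": 1, "key": {"_id": 1}, "name": "_id_", "ns": "app.users"}],
    "app.events": [],
}

missing = collections_missing_id_index(sample)
for ns in missing:
    print("collection %s lacks an index on _id" % ns)
```

With data like this, only the collection that was interrupted before index building is reported, which matches the observation that the warnings appeared once after the restart and not after the successful resync.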
If the primary's oplog were too small, the primary would be vetoed as a sync source and there would be corresponding entries in the log file. In this case, no entries in the log file suggest that the primary was vetoed. The log entries would look similar to:
Can you upload the primary log between 2015-09-02T02:42:04 and 2015-09-03T06:02:07? This is to confirm whether there are lots of short-lived connections caused by replication updates on deleted documents (as described in
Based on the output of your rs.conf(), you only have two members in the replica set, which means that a failure of either member will leave the set with no primary. The minimum recommended replica set deployment for a production system is a three-member replica set. Adding a third member to your replica set will provide fault tolerance and high availability. See Three Member Replica Set for more information. Kind Regards, Wan. | |||||||||||||||||
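The fault-tolerance point in the comment above can be made concrete with a little arithmetic. This is a minimal sketch that assumes every member is a voting member and that electing a primary requires a strict majority of votes. The function names are illustrative and are not part of any MongoDB API.

```python
def majority(voting_members):
    """Smallest number of votes that is a strict majority."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many members can fail while a majority is still reachable."""
    return voting_members - majority(voting_members)


for n in (2, 3):
    print("%d members: majority=%d, tolerated failures=%d"
          % (n, majority(n), fault_tolerance(n)))
```

Under these assumptions a two-member set needs both members to elect a primary and so tolerates no failures, while a three-member set still has a majority after losing one member.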
| Comment by Marcin Gozdalik [ 29/Oct/15 ] | |||||||||||||||||
| |||||||||||||||||
| Comment by Ramon Fernandez Marina [ 28/Oct/15 ] | |||||||||||||||||
|
Thanks for uploading the logs gozdal, and apologies it took a while to go through them. What I see is that these warnings appear once after what seems to be an unclean shutdown, so I need to ask you:
Thanks, | |||||||||||||||||
| Comment by Marcin Gozdalik [ 14/Oct/15 ] | |||||||||||||||||
|
ramon.fernandez can I do something more? | |||||||||||||||||
| Comment by Marcin Gozdalik [ 08/Oct/15 ] | |||||||||||||||||
|
I have attached the logs. They are huge (7.5GB uncompressed) because of many replication errors. We had to apply the workaround described in https://jira.mongodb.org/browse/SERVER-18721 to be able to synchronize with the primary, and the first attempts failed. | |||||||||||||||||
| Comment by Marcin Gozdalik [ 08/Oct/15 ] | |||||||||||||||||
|
The actual log file is compressed with bzip2 but split into two parts because of Jira's 150MB file size limit.
WARNING: the unpacked file is 7.5GB. | |||||||||||||||||
| Comment by Marcin Gozdalik [ 08/Oct/15 ] | |||||||||||||||||
|
Here is rs.conf():
| |||||||||||||||||
| Comment by Ramon Fernandez Marina [ 07/Oct/15 ] | |||||||||||||||||
|
gozdal, can you please send the logs for the secondary that's missing indexes from the time it was started until it became a secondary? Also, can you please post the output of rs.conf()? Thanks, |