[SERVER-30842] Don't try to set last optime for client backwards after rollback Created: 25/Aug/17  Updated: 30/Oct/23  Resolved: 28/Aug/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.4.10, 3.5.13

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Spencer Brody (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File optime_log_messages.png    
Issue Links:
Backports
Depends
Related
is related to SERVER-47695 Write commands run by threads that ca... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v3.4
Sprint: Repl 2017-09-11
Participants:
Linked BF Score: 0

 Description   

A write that fails may result in a call to setLastOpToSystemLastOpTime. We currently invariant that the last optime on a client never moves backwards, but the system optime can legitimately move backwards after a rollback. If that happens we should simply refuse to set the client's last optime backwards; we shouldn't crash.
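
For illustration, here is a minimal, self-contained sketch of the guard described above: if the system's current optime is behind what the client has already recorded, log a message and return instead of tripping the invariant. The OpTime struct, the ReplClientInfo class shape, and the console logging below are simplified stand-ins, not the actual server code.

#include <cstdint>
#include <iostream>

// Simplified stand-in for repl::OpTime: a (term, timestamp) pair.
struct OpTime {
    int64_t term = -1;
    uint64_t timestamp = 0;

    bool operator<(const OpTime& other) const {
        return term < other.term ||
               (term == other.term && timestamp < other.timestamp);
    }
};

// Simplified stand-in for the per-client replication info that tracks the
// last optime this client has observed.
class ReplClientInfo {
public:
    // Before the fix, moving backwards would trip an invariant and crash the
    // server. After the fix we log a message and skip the update instead.
    void setLastOpToSystemLastOpTime(const OpTime& systemOpTime) {
        if (systemOpTime < _lastOp) {
            std::cout << "Not setting the last OpTime for this Client "
                         "backwards; this should only happen if there was a "
                         "rollback recently\n";
            return;  // Refuse to move backwards, but do not crash.
        }
        _lastOp = systemOpTime;
    }

    const OpTime& lastOp() const { return _lastOp; }

private:
    OpTime _lastOp;
};

int main() {
    ReplClientInfo client;
    client.setLastOpToSystemLastOpTime({9, 606});  // normal forward progress

    // After a rollback the system optime can be behind the client's optime,
    // e.g. (t: 9, ts: 588) vs. the recorded (t: 9, ts: 606).
    client.setLastOpToSystemLastOpTime({9, 588});  // logged and skipped
    return 0;
}

Running this prints the log-and-skip message for the second call, mirroring the behaviour shown by the real log lines quoted in the comments below.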



 Comments   
Comment by Scott Glajch [ 21/Jun/18 ]

I am referencing that log line.

Given your explanation, it's really odd that we are seeing these messages at all, much less this often.  We run two mongo clusters: one for our main storage (where we store data in a number of ways), and one for our raw events, which has only one type of collection, essentially one document per raw event.  The raw event cluster is the one we have migrated to 3.6 and the one having this problem (we migrate that cluster first since it is less mission critical, so we can sniff out upgrade problems there).  Documents are therefore written once and never updated (so no $sets there), and then occasionally read, either individually or in aggregations for some simple reporting.  The collection uses ObjectIDs for its _id, so I doubt DuplicateKeyErrors are happening.

However, I have found a pattern in the logs.  These log lines are coming from the primary of a shard's replica set, and right before they appear, I see something like this:

2018-06-21T01:11:57.742+0000 I SHARDING [ConfigServerCatalogCacheLoader-299] Cache loader remotely refreshed for collection config.system.sessions from version 0|0||000000000000000000000000 and no metadata was found.
2018-06-21T01:11:57.742+0000 I SHARDING [ConfigServerCatalogCacheLoader-299] Refresh for collection config.system.sessions took 4 ms and found the collection is not sharded
2018-06-21T01:11:57.742+0000 I COMMAND  [ShardServerCatalogCacheLoader-328] CMD: drop config.cache.chunks.config.system.sessions
2018-06-21T01:12:03.665+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Connecting to p1ecfg4.prod.evergage.com:27017
2018-06-21T01:12:03.672+0000 I SHARDING [ConfigServerCatalogCacheLoader-299] Cache loader remotely refreshed for collection databaseName.collectionName from collection version 0|0||000000000000000000000000 and found collection version 313|1||54ee6b224c0a4ab1d04d28eb
2018-06-21T01:12:03.672+0000 I SHARDING [ConfigServerCatalogCacheLoader-299] Cache loader found enqueued metadata from 6|4||54ee6b224c0a4ab1d04d28eb to 313|1||54ee6b224c0a4ab1d04d28eb and no persisted metadata, GTE cache version 0|0||000000000000000000000000
2018-06-21T01:12:03.673+0000 I STORAGE  [ShardServerCatalogCacheLoader-328] createCollection: config.cache.chunks.databaseName.collectionName with no UUID.
2018-06-21T01:12:03.674+0000 I SHARDING [conn7438] Marking collection databaseName.collectionName as sharded with collection version: 313|1||54ee6b224c0a4ab1d04d28eb, shard version: 313|1||54ee6b224c0a4ab1d04d28eb
2018-06-21T01:12:03.685+0000 I INDEX    [ShardServerCatalogCacheLoader-328] build index on: config.cache.chunks.databaseName.collectionName properties: { v: 2, key: { lastmod: 1 }, name: "lastmod_1", ns: "config.cache.chunks.databaseName.collectionName" }
2018-06-21T01:12:03.685+0000 I INDEX    [ShardServerCatalogCacheLoader-328]      building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2018-06-21T01:12:03.688+0000 I INDEX    [ShardServerCatalogCacheLoader-328] build index done.  scanned 0 total records. 0 secs
2018-06-21T01:12:03.689+0000 I REPL     [ShardServerCatalogCacheLoader-328] Not setting the last OpTime for this Client from { ts: Timestamp(1529543523, 606), t: 9 } to the current system time of { ts: Timestamp(1529543523, 588), t: 9 } as that would be moving the OpTime backwards.  This should only happen if there was a rollback recently
2018-06-21T01:12:03.689+0000 I REPL     [ShardServerCatalogCacheLoader-328] Not setting the last OpTime for this Client from { ts: Timestamp(1529543523, 606), t: 9 } to the current system time of { ts: Timestamp(1529543523, 588), t: 9 } as that would be moving the OpTime backwards.  This should only happen if there was a rollback recently

Maybe config.system.sessions not being sharded is a problem?  Or maybe the index builds on these config.cache.chunks collections do writes on "lastmod" that aren't unique?

 

Comment by Spencer Brody (Inactive) [ 20/Jun/18 ]

I assume by "the log line" you are referring to this message?
That log message should only occur after a replication rollback, and only for write operations that turn out to be no-ops (such as an update that uses $set to set a field to the value it already has, or an insert that hits a DuplicateKeyError). Assuming you're seeing it only around rollbacks, the next question to ask would be why you're having so many unplanned failovers that are triggering rollbacks.
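
To sketch why a no-op or failed write reaches this code path at all (a simplified illustration under assumed names, not the actual server code): when a statement produces no oplog entry of its own, the server falls back to stamping the client with the system's last optime so that a later write-concern wait still has something to wait on. The WriteResult shape and the onWriteFinished helper below are hypothetical; only setLastOpToSystemLastOpTime corresponds to the call named in this ticket.

#include <functional>
#include <iostream>

// Hypothetical result of executing one write statement.
struct WriteResult {
    long long nModified = 0;  // 0 for, e.g., a $set to an unchanged value
    bool failed = false;      // e.g. an insert that hit a DuplicateKeyError
};

// Hypothetical post-write hook: if the statement wrote nothing of its own,
// fall back to the system's last optime so that waiting for write concern
// still works for this client. The setter is passed in as a callback to keep
// the sketch self-contained.
void onWriteFinished(const WriteResult& result,
                     const std::function<void()>& setLastOpToSystemLastOpTime) {
    if (result.failed || result.nModified == 0) {
        setLastOpToSystemLastOpTime();
    }
}

int main() {
    WriteResult noOpUpdate{0, false};  // $set to a value the field already has
    onWriteFinished(noOpUpdate, [] {
        std::cout << "falling back to the system last optime\n";
    });
    return 0;
}

If a rollback has just moved the system optime behind what the client already recorded, that fallback is exactly what produces the "Not setting the last OpTime for this Client" message instead of a crash.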

Comment by Scott Glajch [ 20/Jun/18 ]

Here is a screenshot showing how often these messages happened during the rolling upgrade from 3.4 (mostly after the stepdowns on the shard nodes).  They still happen from time to time after the upgrade, though they usually max out at a few hundred per minute rather than 100K.

Comment by Scott Glajch [ 20/Jun/18 ]

This fix has caused the log line to be logged a lot on our systems after upgrading from 3.4.9 to 3.6.5, to the point where I'd consider doing a custom build to remove the log line or raise the level at which it is logged.  At one point it made it hard for our log parser to keep up with the volume of lines being logged.

Is there any plan to reduce how often this gets logged?  And if I'm seeing this message a lot, should I be worried?

Comment by Githook User [ 05/Oct/17 ]

Author:

{'email': 'spencer@mongodb.com', 'name': 'Spencer T Brody', 'username': 'stbrody'}

Message: SERVER-30842 Don't try to set last optime for client backwards after rollback

(cherry picked from commit cee116558d7771b520c0fd4dd239c0a987dcc2ec)
Branch: v3.4
https://github.com/mongodb/mongo/commit/88049437b3bdf8df78051b9885b9600892640141

Comment by Githook User [ 28/Aug/17 ]

Author:

{'username': 'stbrody', 'name': 'Spencer T Brody', 'email': 'spencer@mongodb.com'}

Message: SERVER-30842 Don't try to set last optime for client backwards after rollback
Branch: master
https://github.com/mongodb/mongo/commit/cee116558d7771b520c0fd4dd239c0a987dcc2ec
