[SERVER-77828] listIndexes should report expireAfterSeconds as a float Created: 06/Jun/23  Updated: 11/Jan/24  Resolved: 11/Jan/24

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.3.0-rc0

Type: Bug Priority: Minor - P4
Reporter: Felipe Gasper Assignee: Daotang Yang
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
Related
related to SERVER-81192 Introduce background maintenance task... Closed
is related to SERVER-73442 The type of the "bits" field can diff... Backlog
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution NAMR Team 2023-07-24, Execution NAMR Team 2023-08-21, Execution NAMR Team 2023-09-04, Execution Team 2023-12-11, Execution Team 2023-12-25, Execution Team 2024-01-08, Execution Team 2024-01-22
Participants:
Linked BF Score: 135

 Description   

Per a conversation with Max just now, it seems chunk migration uses `listIndexes` to fetch index stats, which (again, per Max) reports expireAfterSeconds as an int. This value, though, is internally a floating-point, which can trigger rounding problems.

This seems to have caused REP-2567, in which the source cluster’s shards built the `x_before_epoch` index with different expireAfterSeconds values.

That happened because ttl_index_options.js sends a UNIX timestamp. Assumedly that value, these days, is just high enough that rounding errors will happen, where previously they may not have.



 Comments   
Comment by Githook User [ 11/Jan/24 ]

Author:

{'name': 'mongodt', 'email': '146988481+mongodt@users.noreply.github.com', 'username': 'mongodt'}

Message: SERVER-77828 Ensure expireAfterSeconds is stored as integer type on-disk (#17983)

GitOrigin-RevId: ee7b5dbe682a4381d1445b7fdb2dc5f3766dd199
Branch: master
https://github.com/mongodb/mongo/commit/62dcd52509c0bfecd50fbbccf8fbf86d04281c45

Comment by Githook User [ 12/Sep/23 ]

Author:

{'name': 'Dan Larkin-York', 'email': 'dan.larkin-york@mongodb.com', 'username': 'dhly-etc'}

Message: Revert "SERVER-77828 Ensure expireAfterSeconds is stored as integer type on-disk"

This reverts commit 07d91ec750e7ad0942f7bd535606f486a3647118.
Branch: master
https://github.com/mongodb/mongo/commit/555e288f0ccc22a2ce80e28bad4cd30cfb0a4697

Comment by Githook User [ 12/Sep/23 ]

Author:

{'name': 'Dan Larkin-York', 'email': 'dan.larkin-york@mongodb.com', 'username': 'dhly-etc'}

Message: Revert "SERVER-77828 Ensure expireAfterSeconds is stored as integer type on-disk"

This reverts commit 07d91ec750e7ad0942f7bd535606f486a3647118.
Branch: v7.1
https://github.com/mongodb/mongo/commit/39b086c4b28a9d07a603cfacbd68fb902a0e4d2a

Comment by Githook User [ 25/Aug/23 ]

Author:

{'name': 'Dan Larkin-York', 'email': 'dan.larkin-york@mongodb.com', 'username': 'dhly-etc'}

Message: SERVER-77828 Ensure expireAfterSeconds is stored as integer type on-disk
Branch: master
https://github.com/mongodb/mongo/commit/07d91ec750e7ad0942f7bd535606f486a3647118

Comment by Dan Larkin-York [ 02/Aug/23 ]

max.hirschhorn@mongodb.com I'll be digging into this in the next few weeks. The plan you suggested (coerce on create + upgrade rewrite) sounds like it might be an option, but I'll have to chase down the details to be sure.

As to why it cares about the current time value, there's an explanation in the comment with the code you linked. The restriction only applies to clustered collections, to handle the fact that time series collections are clustered by OID, which has a 32-bit unsigned timestamp. It's definitely janky, and could almost certainly be handled another way (e.g. by doing the restriction at TTL-delete time rather than index creation time. I suspect though that changing this doesn't have a great deal of value for most customers, as it would certainly be uncommon to have an expireAfterSeconds value that large.

Comment by Max Hirschhorn [ 02/Aug/23 ]

dan.larkin-york@mongodb.com, connie.chen@mongodb.com, has a path forward on SERVER-77828 been defined? I will reiterate that I do not consider the problem here to reside in how chunk migration fetches the index specifications from the donor shard.

My expectation for SERVER-77828 is for the Storage Execution team to do the following:

  1. Change the create, createIndexes, and collMod commands to implicitly truncate their expireAfterSeconds to an integral value. The TTL monitor thread calls BSONElement::safeNumberLong() and so this truncation was already happening for how the expiry is applied.
  2. Perform a one-time FCV upgrade transition to rewrite any stored expireAfterSeconds values in the local _mdb_catalog.wt as an integral value.

Separately it would be nice to receive an explanation for why the expireAfterSeconds validation cares about the current Unix epoch time value. It doesn't make immediate sense to me why a create command at time T can fail with InvalidOptions yet at time T+1 can succeed without issue. What am I missing?

Comment by Felipe Gasper [ 09/Jun/23 ]

Actually, int32 should round-trip safely through an IEEE 754 double:

#include <stdint.h>
#include <stdio.h>
 
int main() {
    int32_t myval = INT32_MAX;
    double valf = myval;
    int32_t myval2 = valf;
 
    printf("numbers: %d -> %f -> %d\n", myval, valf, myval2);
 
    return 0;
}

So, as long as `Math.floor(Date.now() / 1000) + 1000` (from ttl_index_options.js) fits within an int32, it shouldn’t explain the disparity seen in REP-2567.

Comment by Max Hirschhorn [ 06/Jun/23 ]

Chunk migration should fetch expireAfterSeconds as a float

I prefer we change the listIndexes command to return the true value stored in the local mdb catalog and have asked this ticket be assigned to the Storage Execution team for this reason. If we aren't going to have the createIndexes command use the NewIndexSpec struct to coerce the index specification to have an integral expireAfterSeconds then I don't see a reason for the listIndexes command to do the coercion. The implemented behavior is misleading. For example, running a createIndexes command with the output of the listIndexes command can return an ok:0 error response rather than an ok:1 "all indexes already exist" response.

Generated at Thu Feb 08 06:36:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.