Type: Spec Change
Resolution: Unresolved
Priority: Unknown
Component/s: BSON
Summary
Determine and clarify whether the BSON Datetime element is a Unix timestamp or a UTC timestamp. Audit implementations to ensure consistency.
Motivation
Who is the affected end user?
All implementations of the BSON specification, and users of said implementations, which implement the BSON Datetime element as one when it should be the other.
How does this affect the end user?
The bsonspec.org specification currently defines the BSON element with type ID 9 as a "UTC datetime", with the following note:
UTC datetime - The int64 is UTC milliseconds since the Unix epoch.
As a duration, there is no distinction between UTC and Unix, as they share the same epoch and the same millisecond-based increment (for dates later than 1972-01-01; it's complicated). However, as a time point (that is, when this duration is interpreted as a calendrical date), there is an important distinction. Quoting the Unix Time Wikipedia page:
Unix time differs from both Coordinated Universal Time (UTC) ... in its handling of leap seconds. UTC includes leap seconds that adjust for the discrepancy between precise time, as measured by atomic clocks, and solar time, relating to the position of the earth in relation to the sun. ... In Unix time, every day contains exactly 86400 seconds. Each leap second uses the timestamp of a second that immediately precedes or follows it.
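To make the distinction concrete, here is a minimal Python sketch of the Unix interpretation, in which the int64 is treated as a plain millisecond offset from 1970-01-01T00:00:00Z and every day has exactly 86400 seconds. The helper names are hypothetical and are not taken from any driver. UTC inserted a leap second (23:59:60) at the end of 2016-12-31, yet the millisecond values on either side of it differ by only 1000:

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_ms_to_datetime(ms: int) -> datetime:
    """Interpret a BSON datetime int64 as Unix time (no leap seconds)."""
    return EPOCH + timedelta(milliseconds=ms)


def datetime_to_unix_ms(dt: datetime) -> int:
    """Inverse of unix_ms_to_datetime (hypothetical helper)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


# The last ordinary second of 2016 and the first second of 2017.
before = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Under the Unix interpretation the leap second between them has no
# millisecond value of its own.
gap_ms = datetime_to_unix_ms(after) - datetime_to_unix_ms(before)
print(gap_ms)
```

Under a strict UTC interpretation, two seconds elapsed between those two time points, so the gap would be 2000 rather than 1000.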
The current Drivers specification for the Extended JSON format also references ISO-8601 (UTC) and RFC 3339 (UTC), but makes no explicit mention of leap seconds (and per git history, never has):
- Datetime [year from 1970 to 9999 inclusive]: ... ISO-8601 Internet Date/Time Format as described in RFC-3339 with maximum time precision of milliseconds as a string ...
- Datetime [year before 1970 or after 9999]: ... Same as Canonical Extended JSON ...
Very few of the current Drivers specification tests exercise the calendrical date representation of BSON datetime elements, and none include edge cases like "23:59:60Z" (leap second).
In short, the specification and some implementations disagree on whether BSON datetime should account for leap seconds (among other UTC-specific intricacies) when the "milliseconds since the Unix epoch" (duration) is converted to/from a calendrical date (time point). The specification implies "yes" (UTC), but some implementations suggest "no" (Unix). For example, quoting comments in SERVER-101955:
We're likely to have the first negative leap second in 2029 (irrelevant here except that they aren't dead yet). What happens when we get one? I don't know. ... I don't think accepting these timestamps adds ambiguity. The ambiguity is already there. ... I think we should try to think about how to handle it in a sensible way that doesn't reject valid input. ... If there's a 23:59:59 stamp on a negative leap second, specifying an ISO second that doesn't exist, we don't reject it. Just naively map it to a Unix time that does exist. No code change required for this behavior. ... I don't think there's any answer without downsides. We should think about it and do something about it.
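As an illustration of the "naively map it to a Unix time that does exist" approach, here is a hedged Python sketch of a lenient parser that accepts a ":60" seconds field and folds it onto the following second. The function name and the restricted "YYYY-MM-DDTHH:MM:SSZ" pattern are assumptions made for this example; this is not the behavior of time_support.cpp or of any driver, and it does not check that a leap second was actually scheduled at that moment.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified pattern: whole seconds, "Z" suffix only (an assumption).
_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})Z")


def parse_utc_lenient(text: str) -> datetime:
    """Parse a UTC date string, folding second 60 onto the next second."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported date string: {text!r}")
    year, month, day, hour, minute, second = map(int, match.groups())
    carry = 0
    if second == 60:
        # datetime() rejects second=60, so parse :59 and add one second.
        second, carry = 59, 1
    parsed = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return parsed + timedelta(seconds=carry)


leap = parse_utc_lenient("2016-12-31T23:59:60Z")
following = parse_utc_lenient("2017-01-01T00:00:00Z")
print(leap == following)
```

The leap-second string and the following midnight collapse to the same value, so the mapping accepts valid input but cannot round-trip it. That is the ambiguity the quoted comment describes.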
How likely is it that this problem or use case will occur?
Unclear. Many implementations seem to currently treat BSON Date as a Unix timestamp despite the specification's current wording indicating it should be a UTC timestamp. Depending on the BSON implementation being used, as well as how such an implementation is being used to convert a BSON Datetime into a calendrical date (i.e. as a string representation in extended JSON, logs, etc.), there may be several (leap) seconds of discrepancy between UTC and Unix (currently 27 seconds in total?).
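For a rough sense of scale, the sketch below counts how many leap seconds had been inserted into UTC by a given Unix millisecond value. The table is an assumption: it lists the leap seconds inserted from 1972 through the end of 2016 as best known here, and it should be checked against an authoritative source (such as IERS announcements) before being relied on. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# (year, month) of the first UTC day following each inserted leap second.
# Assumed complete through the end of 2016; verify before use.
LEAP_EFFECTIVE = [
    (1972, 7), (1973, 1), (1974, 1), (1975, 1), (1976, 1), (1977, 1),
    (1978, 1), (1979, 1), (1980, 1), (1981, 7), (1982, 7), (1983, 7),
    (1985, 7), (1988, 1), (1990, 1), (1991, 1), (1992, 7), (1993, 7),
    (1994, 7), (1996, 1), (1997, 7), (1999, 1), (2006, 1), (2009, 1),
    (2012, 7), (2015, 7), (2017, 1),
]


def leap_seconds_elapsed(unix_ms: int) -> int:
    """Count table entries that took effect at or before unix_ms."""
    moment = EPOCH + timedelta(milliseconds=unix_ms)
    return sum(
        1
        for year, month in LEAP_EFFECTIVE
        if datetime(year, month, 1, tzinfo=timezone.utc) <= moment
    )


# 1483228800000 is 2017-01-01T00:00:00Z under the Unix interpretation.
print(leap_seconds_elapsed(1483228800000))
```

Under these assumptions, the result is the number of seconds by which a Unix-interpreted value undercounts the seconds actually elapsed in UTC since 1972.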
If the problem does occur, what are the consequences and how severe are they?
Unclear. This may be a consistency issue across BSON implementations and user-facing APIs. The scope and consequences of this discrepancy between Unix time and UTC time depend on the needs of the user. How many users have applications which actually require the BSON Datetime to be interpreted as a UTC time point instead of a Unix time point? Are they fine with a difference of nearly half a minute between the observed time point and the expected UTC (or local-converted) time point?
If most (all?) implementations are currently interpreting this value as a Unix time point rather than a UTC time point, the specification should be updated to more accurately describe real-world practice. Regardless, implementations may need to be carefully audited to ensure consistent and clearly disambiguated behavior when this BSON element is converted to/from calendrical date representations as either a UTC time point, a Unix time point, or both.
Is this issue urgent?
Unsure.
Is this ticket required by a downstream team?
This ticket is required by any team implementing or using the BSON specification, specifically concerning the interpretation of the BSON Datetime element as a time point (calendrical date).
Is this ticket only for tests?
No, this is an "intent of the specification" clarification and update ticket which may require spec changes ("UTC" -> "Unix") or implementation audits (Unix -> UTC).
Acceptance Criteria
- Determine the current BSON implementation and usage situation.
  - How many implementations currently interpret BSON Datetime as Unix? as UTC?
  - What is the scope and impact of "fixing" implementations to switch to, or add support for, one or the other?
- Decide and clarify whether BSON Datetime is Unix or UTC.
- Audit BSON implementations and usage to ensure consistency with the clarified BSON specification as either UTC or Unix.
is related to: SERVER-101955 time_support.cpp: dateFromISOString rejects leap seconds (In Progress)