[SERVER-71937] Validating a time-series view cannot find the bucket collection Created: 07/Dec/22  Updated: 29/Oct/23  Resolved: 25/Jan/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.3.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Yuhong Zhang Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-72709 Should dropDatabase remove the views ... Closed
Assigned Teams:
Storage Execution
Backwards Compatibility: Fully Compatible
Sprint: Execution Team 2023-02-06
Participants:
Linked BF Score: 5

 Description   

Because the time-series bucket collection and view are dropped in two separate storage transactions, and the bucket collection is dropped first, validation can run at a point where the time-series view still exists but its underlying bucket collection cannot be found. We should make this a warning instead of an error.



 Comments   
Comment by Githook User [ 25/Jan/23 ]

Author: Dianna Hohensee <dianna.hohensee@mongodb.com> (username: DiannaHohensee)

Message: SERVER-71937 Change dropDatabase to drop the views collection first to ensure time-series deletion oplog order of view THEN buckets
Branch: master
https://github.com/mongodb/mongo/commit/3770a31b999e708bca5fb7500267bf829ddc2a2e

Comment by Githook User [ 25/Jan/23 ]

Author: Dianna Hohensee <dianna.hohensee@mongodb.com> (username: DiannaHohensee)

Message: SERVER-71937 Change dropDatabase to drop the views collection first to ensure time-series deletion oplog order of view THEN buckets
Branch: master
https://github.com/10gen/mongo-enterprise-modules/commit/8cb48ab0796a6246f629af546213d8fdb692723a

Comment by Dianna Hohensee (Inactive) [ 12/Jan/23 ]

Spoke offline with Yuhong. We think the best approach is to change dropDatabase to drop the views collection first, before the other collections, instead of in random order. That way we can ensure that a buckets collection can exist without a view entry, but not the other way around. Then validate will no longer be able to find a view, subsequently fail to find its buckets collection, and return an error.
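
The ordering idea above can be sketched in a few lines of plain JavaScript. This is only an illustration of the intended drop order, not the server's implementation: the helper name orderForDropDatabase and the sample collection names are hypothetical, and the real change lives in the commits linked on this ticket.

```javascript
// Hypothetical helper (not a server API): order the collection names of one
// database so that "system.views" is dropped before every other collection.
function orderForDropDatabase(collectionNames) {
  const views = collectionNames.filter((name) => name === "system.views");
  const others = collectionNames.filter((name) => name !== "system.views");
  // Views first, then the remaining collections in their original order.
  return [...views, ...others];
}

// Sample input is an assumption made for illustration.
const dropOrder = orderForDropDatabase(["system.buckets.ts", "other", "system.views"]);
console.log(dropOrder); // [ 'system.views', 'system.buckets.ts', 'other' ]
```

With this order, any state observed between two drops can contain a buckets collection without its view entry, but never a view entry without its buckets collection.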

Comment by Dianna Hohensee (Inactive) [ 10/Jan/23 ]

We appear to hold a MODE_X database lock across dropDatabase for the duration of all collection drops.

I think this leads to the conclusion that dropDatabase and dropCollection both block access to both the view and buckets namespaces while the time-series collection state is removed. So the validate code should only see either both or neither. If validate finds neither a collection nor a view, the code throws a NamespaceNotFound error. It seems like it would make sense to throw a NamespaceNotFound error with a different error message if neither the view nor the buckets collection is found (checked under a continuous lock). I don't know what's appropriate when validate finds one of the view or the buckets collection but not the other: what should the user do in that case? I imagine the oplog could be chopped between the two operations by rollback or crash recovery.

Comment by Dianna Hohensee (Inactive) [ 10/Jan/23 ]

The dropCollection logic removes both namespaces under locks on both namespaces. So in that case the validate cmd should never find one without the other because it checks under at least one lock. I'm not sure about the dropDatabase cmd yet.

Comment by Dianna Hohensee (Inactive) [ 10/Jan/23 ]

Ah. This failure scenario is specific to dropDatabase (which is what the BF failure logs show occurred), not dropCollection. dropDatabase drops the buckets collection first, then deletes the view, whereas dropCollection deletes the view first, then drops the buckets collection. Makes sense now that it's an OK scenario for the buckets collection to be missing when the view is still present.

// create
 
// Timestamp(1673386580, 1)
{ "op" : "c", "ns" : "godb.$cmd", "ui" : UUID("9d2bb203-b2e3-4fbe-a56a-98adad321128"), "o" : { "create" : "system.buckets.ts", "validator" : { "$jsonSchema" : { "bsonType" : "object", "required" : [ "_id", "control", "data" ], "properties" : { "_id" : { "bsonType" : "objectId" }, "control" : { "bsonType" : "object", "required" : [ "version", "min", "max" ], "properties" : { "version" : { "bsonType" : "number" }, "min" : { "bsonType" : "object", "required" : [ "t" ], "properties" : { "t" : { "bsonType" : "date" } } }, "max" : { "bsonType" : "object", "required" : [ "t" ], "properties" : { "t" : { "bsonType" : "date" } } }, "closed" : { "bsonType" : "bool" }, "count" : { "bsonType" : "number", "minimum" : 1 } }, "additionalProperties" : false }, "data" : { "bsonType" : "object" }, "meta" : {  } }, "additionalProperties" : false } }, "clusteredIndex" : true, "timeseries" : { "timeField" : "t", "granularity" : "minutes", "bucketMaxSpanSeconds" : 86400 } }, "ts" : Timestamp(1673386580, 1), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:36:20.226Z") }
 
// Timestamp(1673386580, 2)
{ "op" : "i", "ns" : "godb.system.views", "ui" : UUID("bff8ee7d-88d2-4088-9f4b-2e063808e2d3"), "o" : { "_id" : "godb.ts", "viewOn" : "system.buckets.ts", "pipeline" : [ { "$_internalUnpackBucket" : { "timeField" : "t", "bucketMaxSpanSeconds" : 86400 } } ] }, "o2" : { "_id" : "godb.ts" }, "ts" : Timestamp(1673386580, 2), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:36:20.227Z") }
 
 
// dropDatabase
 
// Timestamp(1673386591, 1) -- buckets
{ "op" : "c", "ns" : "godb.$cmd", "ui" : UUID("9d2bb203-b2e3-4fbe-a56a-98adad321128"), "o" : { "drop" : "system.buckets.ts" }, "o2" : { "numRecords" : 0 }, "ts" : Timestamp(1673386591, 1), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:36:31.142Z") }
 
// Timestamp(1673386591, 2) -- view
{ "op" : "c", "ns" : "godb.$cmd", "ui" : UUID("bff8ee7d-88d2-4088-9f4b-2e063808e2d3"), "o" : { "drop" : "system.views" }, "o2" : { "numRecords" : 1 }, "ts" : Timestamp(1673386591, 2), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:36:31.144Z") }
 
// Timestamp(1673386591, 3) -- database
{ "op" : "c", "ns" : "godb.$cmd", "o" : { "dropDatabase" : 1 }, "ts" : Timestamp(1673386591, 3), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:36:31.236Z") }
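
The difference in ordering between the two commands can be checked mechanically. The sketch below is a hedged illustration in plain JavaScript: viewRemovedBeforeBuckets is a hypothetical helper, and the entries are simplified copies of the oplog entries quoted in this ticket, keeping only the op, ns and o fields and assuming they are already in oplog order.

```javascript
// Hypothetical helper: given simplified oplog entries in oplog order, report
// whether the view for time-series collection `tsName` was removed before its
// buckets collection was dropped. Returns null if either event is missing.
function viewRemovedBeforeBuckets(entries, tsName) {
  const bucketsName = "system.buckets." + tsName;
  const isViewRemoval = (e) =>
    // dropCollection deletes the view document from <db>.system.views ...
    (e.op === "d" && e.ns.endsWith(".system.views") &&
      typeof e.o._id === "string" && e.o._id.endsWith("." + tsName)) ||
    // ... while dropDatabase drops the whole system.views collection.
    (e.op === "c" && e.o.drop === "system.views");
  const isBucketsDrop = (e) => e.op === "c" && e.o.drop === bucketsName;

  const viewIdx = entries.findIndex(isViewRemoval);
  const bucketsIdx = entries.findIndex(isBucketsDrop);
  if (viewIdx === -1 || bucketsIdx === -1) return null;
  return viewIdx < bucketsIdx;
}

// Simplified from the dropDatabase entries quoted in this comment.
const dropDatabaseOplog = [
  { op: "c", ns: "godb.$cmd", o: { drop: "system.buckets.ts" } },
  { op: "c", ns: "godb.$cmd", o: { drop: "system.views" } },
  { op: "c", ns: "godb.$cmd", o: { dropDatabase: 1 } },
];

// Simplified from the dropCollection entries quoted elsewhere on this ticket.
const dropCollectionOplog = [
  { op: "d", ns: "godb.system.views", o: { _id: "godb.ts" } },
  { op: "c", ns: "godb.$cmd", o: { drop: "system.buckets.ts" } },
];

console.log(viewRemovedBeforeBuckets(dropDatabaseOplog, "ts"));   // false
console.log(viewRemovedBeforeBuckets(dropCollectionOplog, "ts")); // true
```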

Comment by Dianna Hohensee (Inactive) [ 10/Jan/23 ]

Ah. The validate logic releases its lock and then locks a new namespace before the check, so the time-series view namespace might also be gone by that point; we don't recheck for it.

Comment by Dianna Hohensee (Inactive) [ 10/Jan/23 ]

db.oplog.rs.find({}).limit(20).sort({ts:-1})
 
// create
 
// Timestamp(1673385204, 1)
{ "op" : "c", "ns" : "godb.$cmd", "ui" : UUID("5b2a849b-af1c-47be-bd6e-d060040c66d9"), "o" : { "create" : "system.buckets.ts", "validator" : { "$jsonSchema" : { "bsonType" : "object", "required" : [ "_id", "control", "data" ], "properties" : { "_id" : { "bsonType" : "objectId" }, "control" : { "bsonType" : "object", "required" : [ "version", "min", "max" ], "properties" : { "version" : { "bsonType" : "number" }, "min" : { "bsonType" : "object", "required" : [ "t" ], "properties" : { "t" : { "bsonType" : "date" } } }, "max" : { "bsonType" : "object", "required" : [ "t" ], "properties" : { "t" : { "bsonType" : "date" } } }, "closed" : { "bsonType" : "bool" }, "count" : { "bsonType" : "number", "minimum" : 1 } }, "additionalProperties" : false }, "data" : { "bsonType" : "object" }, "meta" : {  } }, "additionalProperties" : false } }, "clusteredIndex" : true, "timeseries" : { "timeField" : "t", "granularity" : "minutes", "bucketMaxSpanSeconds" : 86400 } }, "ts" : Timestamp(1673385204, 1), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:13:24.434Z") }
 
// Timestamp(1673385204, 2)
{ "op" : "c", "ns" : "godb.$cmd", "ui" : UUID("bff8ee7d-88d2-4088-9f4b-2e063808e2d3"), "o" : { "create" : "system.views", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_" } }, "ts" : Timestamp(1673385204, 2), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:13:24.458Z") }
 
// Timestamp(1673385204, 3)
{ "op" : "i", "ns" : "godb.system.views", "ui" : UUID("bff8ee7d-88d2-4088-9f4b-2e063808e2d3"), "o" : { "_id" : "godb.ts", "viewOn" : "system.buckets.ts", "pipeline" : [ { "$_internalUnpackBucket" : { "timeField" : "t", "bucketMaxSpanSeconds" : 86400 } } ] }, "o2" : { "_id" : "godb.ts" }, "ts" : Timestamp(1673385204, 3), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:13:24.466Z") }
 
..........
 
// drop
 
// Timestamp(1673385313, 1)
{ "op" : "d", "ns" : "godb.system.views", "ui" : UUID("bff8ee7d-88d2-4088-9f4b-2e063808e2d3"), "o" : { "_id" : "godb.ts" }, "ts" : Timestamp(1673385313, 1), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:15:13.066Z") }
 
// Timestamp(1673385313, 2)
{ "op" : "c", "ns" : "godb.$cmd", "ui" : UUID("5b2a849b-af1c-47be-bd6e-d060040c66d9"), "o" : { "drop" : "system.buckets.ts" }, "o2" : { "numRecords" : 0 }, "ts" : Timestamp(1673385313, 2), "t" : NumberLong(8), "v" : NumberLong(2), "wall" : ISODate("2023-01-10T21:15:13.070Z") }
 

Comment by Dianna Hohensee (Inactive) [ 10/Jan/23 ]

It looks like the view is dropped before the underlying buckets collection of a time-series namespace. So a buckets collection could exist without a view registered on the time-series namespace.

It isn't yet obvious how we can run into the "Cannot validate a time-series collection without its bucket collection" error, which we need to understand in order to be sure that being in such a state is not error-worthy. I'll keep digging.

Generated at Thu Feb 08 06:20:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.