[SERVER-26854] LockStats for sub-operations no longer include time for previous sub ops Created: 31/Oct/16  Updated: 26/Dec/18  Resolved: 08/Sep/18

Status: Closed
Project: Core Server
Component/s: Concurrency, Storage
Affects Version/s: None
Fix Version/s: 3.6.9, 4.0.4, 4.1.3

Type: Bug Priority: Major - P3
Reporter: Geert Bosch Assignee: Xiangyu Yao (Inactive)
Resolution: Done Votes: 0
Labels: nyc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6
Steps To Reproduce:

Something involving multiple DBDirectClient calls doing writes.

Sprint: Storage NYC 2018-09-10
Participants:

 Description   
Issue Status as of Oct 01, 2018

ISSUE DESCRIPTION AND IMPACT
This ticket fixes some previously redundant accounting on operation times related to sub-operations (operations triggered by another operation rather than a user command):

  • Before this change, lock statistics were not re-set when starting new sub-operations, causing each sub-operation to report the aggregate lock time/lock count of itself and all preceding operations.
  • After this change, each sub-operation will only report its own lock statistics in both log and currentOp() output.

To illustrate the change, below are the before and after log snippets for an aggregation operation using $out blocked by a fsyncLock for 2 seconds (for illustration purposes). Before this change:

2018-09-10T10:27:22.344-0400 I COMMAND  [conn2] command test.tmp.agg_out.1 appName: "MongoDB Shell" command: create { create: "tmp.agg_out.1", temp: true, $db: "test" } numYields:0 reslen:38 locks:{ Global: { acquireCount: { r: 7, w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 5, W: 1 } }, Collection: { acquireCount: { r: 4 } } } protocol:op_msg 2230ms
2018-09-10T10:27:22.344-0400 I COMMAND  [conn2] command test.system.indexes appName: "MongoDB Shell" command: insert { insert: "system.indexes", ordered: true, $db: "test" } ninserted:0 numYields:0 locks:{ Global: { acquireCount: { r: 8, w: 2 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 5, W: 2 } }, Collection: { acquireCount: { r: 4 } } } protocol:op_msg 0ms
2018-09-10T10:27:22.344-0400 I COMMAND  [conn2] command admin.$cmd appName: "MongoDB Shell" command: getLastError { getlasterror: 1, $db: "admin" } numYields:0 reslen:95 locks:{ Global: { acquireCount: { r: 8, w: 2 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 5, W: 2 } }, Collection: { acquireCount: { r: 4 } } } protocol:op_msg 0ms
2018-09-10T10:27:22.345-0400 I COMMAND  [conn2] command test.tmp.agg_out.1 appName: "MongoDB Shell" command: insert { insert: "tmp.agg_out.1", ordered: true, $db: "test" } ninserted:1 keysInserted:1 numYields:0 locks:{ Global: { acquireCount: { r: 11, w: 3 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 7, w: 1, W: 2 } }, Collection: { acquireCount: { r: 6, w: 1 } } } protocol:op_msg 0ms
2018-09-10T10:27:22.345-0400 I COMMAND  [conn2] command admin.$cmd appName: "MongoDB Shell" command: getLastError { getlasterror: 1, $db: "admin" } numYields:0 reslen:95 locks:{ Global: { acquireCount: { r: 11, w: 3 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 7, w: 1, W: 2 } }, Collection: { acquireCount: { r: 6, w: 1 } } } protocol:op_msg 0ms
2018-09-10T10:27:22.345-0400 I COMMAND  [conn2] command test.$cmd appName: "MongoDB Shell" command: listCollections { listCollections: 1, filter: { name: " favorite" }, cursor: {}, $readPreference: { mode: "secondaryPreferred" }, $db: "test" } numYields:0 reslen:302 locks:{ Global: { acquireCount: { r: 14, w: 3, W: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 8, w: 1, W: 2 } }, Collection: { acquireCount: { r: 7, w: 1 } } } protocol:op_msg 0ms
2018-09-10T10:27:22.345-0400 I COMMAND  [conn2] command test.favorite appName: "MongoDB Shell" command: listIndexes { listIndexes: "favorite", cursor: {}, $db: "test" } numYields:0 reslen:190 locks:{ Global: { acquireCount: { r: 15, w: 3, W: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 9, w: 1, W: 2 } }, Collection: { acquireCount: { r: 8, w: 1 } } } protocol:op_msg 0ms
2018-09-10T10:27:22.345-0400 I COMMAND  [conn2] renameCollectionForCommand: rename test.tmp.agg_out.1 to test.favorite and drop test.favorite.
2018-09-10T10:27:22.345-0400 I STORAGE  [conn2] Finishing collection drop for test.favorite (0cc548b7-6bfb-4f56-be1f-80d73b3663e0).
2018-09-10T10:27:22.345-0400 I STORAGE  [conn2] renameCollection: renaming collection 25304fc9-3257-49f6-bb9f-5b43d1884850 from test.tmp.agg_out.1 to test.favorite
2018-09-10T10:27:22.362-0400 I COMMAND  [conn2] command test.tmp.agg_out.1 appName: "MongoDB Shell" command: renameCollection { renameCollection: "test.tmp.agg_out.1", to: "test.favorite", dropTarget: true, $db: "admin" } numYields:0 reslen:38 locks:{ Global: { acquireCount: { r: 16, w: 4, W: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 9, w: 1, W: 3 } }, Collection: { acquireCount: { r: 8, w: 1 } } } protocol:op_msg 16ms
2018-09-10T10:27:22.362-0400 I COMMAND  [conn2] command test.favorite appName: "MongoDB Shell" command: aggregate { aggregate: "books", pipeline: [ { $match: { title: "1984" } }, { $out: "favorite" } ], cursor: {}, lsid: { id: UUID("0956a2cb-cfd4-4438-8842-a9c58ae0afe7") }, $db: "test" } planSummary: COLLSCAN keysExamined:0 docsExamined:3 cursorExhausted:1 numYields:1 nreturned:0 reslen:99 locks:{ Global: { acquireCount: { r: 16, w: 4, W: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2211786 } }, Database: { acquireCount: { r: 9, w: 1, W: 3 } }, Collection: { acquireCount: { r: 8, w: 1 } } } protocol:op_msg 2250ms

After this change, note that each sub-operation does not report the

timeAcquiringMicros: {w: 2213296}

of the entire operation:

2018-09-07T21:06:43.061-0400 I COMMAND  [conn2] command test.tmp.agg_out.2 appName: "MongoDB Shell" command: create { create: "tmp.agg_out.2", temp: true,$db: "test" } numYields:0 reslen:38 locks:{ Global: { acquireCount: { r: 1, w: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2213296 } }, Database: { acquireCount: { W: 1 } } } protocol:op_msg 2232ms
2018-09-07T21:06:43.062-0400 I COMMAND  [conn2] command test.tmp.agg_out.2 appName: "MongoDB Shell" command: createIndexes { createIndexes: "tmp.agg_out.2", indexes: [ { v: 2, key: { _id: 1 }, name: "_id_", ns: "test.tmp.agg_out.2" } ], $db: "test" } numYields:0 reslen:150 locks:{ Global: { acquireCount: { r: 3, w: 1 } }, Database: { acquireCount: { r: 1, W: 1 } } } protocol:op_msg 0ms
2018-09-07T21:06:43.064-0400 I COMMAND  [conn2] command test.$cmd appName: "MongoDB Shell" command: listCollections { listCollections: 1, filter: { name: "favorite" }, cursor: {}, $readPreference: { mode: "secondaryPreferred" }, $db: "test" } numYields:0 reslen:302 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } }} protocol:op_msg 0ms
2018-09-07T21:06:43.064-0400 I COMMAND  [conn2] command test.favorite appName: "MongoDB Shell" command: listIndexes { listIndexes: "favorite", cursor: {},$db: "test" } numYields:0 reslen:190 locks:{ Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_msg 0ms
2018-09-07T21:06:43.065-0400 I COMMAND  [conn2] renameCollectionForCommand: rename test.tmp.agg_out.2 to test.favorite and drop test.favorite.
2018-09-07T21:06:43.065-0400 I STORAGE  [conn2] Finishing collection drop for test.favorite (068270d6-084f-4fb3-aa3d-b091d53109e2).
2018-09-07T21:06:43.066-0400 I STORAGE  [conn2] renameCollection: renaming collection fca8ced0-fb60-4ca1-a74f-b58001db3c88 from test.tmp.agg_out.2 to test.favorite
2018-09-07T21:06:43.073-0400 I COMMAND  [conn2] command test.tmp.agg_out.2 appName: "MongoDB Shell" command: renameCollection { renameCollection: "test.tmp.agg_out.2", to: "test.favorite", dropTarget: true, $db: "admin" } numYields:0 reslen:38 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_msg 8ms
2018-09-07T21:06:43.073-0400 I COMMAND  [conn2] CMD: drop test.tmp.agg_out.2
2018-09-07T21:06:43.074-0400 I COMMAND  [conn2] command test.$cmd appName: "MongoDB Shell" command: drop { drop: "tmp.agg_out.2", writeConcern: { w: 0, wtimeout: 0 }, $db: "test" } numYields:0 ok:0 errMsg:"ns not found" errName:NamespaceNotFound errCode:26 reslen:105 locks:{ Global: { acquireCount: { r: 1, w: 1 } }, Database: { acquireCount: { W: 1 } } } protocol:op_msg 0ms
2018-09-07T21:06:43.074-0400 I COMMAND  [conn2] command test.tmp.agg_out.2 appName: "MongoDB Shell" command: aggregate { aggregate: "books", pipeline: [ {$match: { title: "1984" } }, { $out: "favorite" } ], cursor: {}, lsid: { id: UUID("6da968ae-a6ae-445d-b11e-265eb2d31cb8") }, $db: "test" } planSummary: COLLSCAN keysExamined:0 docsExamined:3 ninserted:1 cursorExhausted:1 keysInser$ed:1 numYields:1 nreturned:0 queryHash:6E0D6672 reslen:99 locks:{ Global: { acquireCount: { r: 19, w: 5, W: 1 }, acquireWaitCount: { w: 1 }, timeAcquiringMicros: { w: 2213296 } }, Database: { acquireCount: { r: 10, w: 1, W: 4 } },Collection: { acquireCount: { r: 8, w: 1 } } } protocol:op_msg 2248ms

AFFECTED VERSIONS
This change is available in MongoDB 4.2.

Original description

Because we don't reset LockStats when starting new sub-operations, each sub-operation will report the aggregate time of itself and all preceding operations, as well as any locking time incurred by profiling. This means the timeAcquiringMicros can quickly exceed elapsed time for the sub-operation and lead to confusion, especially as the log messages do not even indicate that the operation is a sub-operation.



 Comments   
Comment by Githook User [ 04/Oct/18 ]

Author:

{'name': 'Xiangyu Yao', 'email': 'xiangyu.yao@mongodb.com', 'username': 'xy24'}

Message: SERVER-26854 LockStats for sub-operations should not include wait time for previous operations

(cherry picked from commit b5ca922655a1d0a438c39c38b2c4434016069cb7)
Branch: v4.0
https://github.com/mongodb/mongo/commit/9eb0287a1cdad5395b0bd0853141c67e65d5874f

Comment by Githook User [ 02/Oct/18 ]

Author:

{'name': 'Xiangyu Yao', 'email': 'xiangyu.yao@mongodb.com', 'username': 'xy24'}

Message: SERVER-26854 LockStats for sub-operations should not include wait time for previous operations

(cherry picked from commit b5ca922655a1d0a438c39c38b2c4434016069cb7)
Branch: v3.6
https://github.com/mongodb/mongo/commit/d8402f63a60fa45bd24b43dfe6959b5af4905cc7

Comment by Xiangyu Yao (Inactive) [ 10/Sep/18 ]

Yes, I think this fix can be backported to 4.0.
milkie Do we want to create a backport ticket now?

Comment by Alexander Gorrod [ 10/Sep/18 ]

xiangyu.yao did you consider whether we should backport this change? I think it'd be nice to get this fix in 4.0.

Comment by Githook User [ 08/Sep/18 ]

Author:

{'name': 'Xiangyu Yao', 'email': 'xiangyu.yao@mongodb.com', 'username': 'xy24'}

Message: SERVER-26854 LockStats for sub-operations should not include wait time for previous operations
Branch: master
https://github.com/mongodb/mongo/commit/b5ca922655a1d0a438c39c38b2c4434016069cb7

Generated at Thu Feb 08 04:13:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.