[SERVER-39350] OpMsg passed to ServiceEntryPoint::handleRequest() in AsyncWorkScheduler::scheduleRemoteCommand should outlive CurOp
Created: 01/Feb/19  Updated: 29/Oct/23  Resolved: 03/Apr/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.1.10

Type: Bug
Priority: Major - P3
Reporter: James Wahlin
Assignee: Kaloian Manassiev
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

To reproduce, apply the following patch and run the sharding_csrs_continuous_config_stepdown suite in Evergreen against Enterprise RHEL 6.2. The invariant added to builder.h fires when BufBuilder is asked to grow past its size limit, which is how the problem shows up in this test.

diff --git a/etc/evergreen.yml b/etc/evergreen.yml
index a26137de34..7a8bbbbb6b 100644
--- a/etc/evergreen.yml
+++ b/etc/evergreen.yml
@@ -6227,7 +6227,7 @@ tasks:
   - func: "do setup"
   - func: "run tests"
     vars:
-      resmoke_args: --suites=sharding_continuous_config_stepdown --storageEngine=wiredTiger
+      resmoke_args: --repeat=10 --suites=sharding_continuous_config_stepdown --storageEngine=wiredTiger jstests/sharding/txn_commit_coordination_is_robust_to_killop.js
 
 - name: sharding_csrs_continuous_config_stepdown_gen
   commands:
@@ -6236,7 +6236,7 @@ tasks:
       task: sharding_csrs_continuous_config_stepdown
       suite: sharding_continuous_config_stepdown
       use_large_distro: "true"
-      resmoke_args: --storageEngine=wiredTiger
+      resmoke_args: --storageEngine=wiredTiger --repeat=10 jstests/sharding/txn_commit_coordination_is_robust_to_killop.js
       fallback_num_sub_suites: 22
 
 - <<: *task_template
diff --git a/src/mongo/bson/util/builder.h b/src/mongo/bson/util/builder.h
index b316fee068..c2b25de3c4 100644
--- a/src/mongo/bson/util/builder.h
+++ b/src/mongo/bson/util/builder.h
@@ -346,6 +346,7 @@ private:
         if (minSize > BufferMaxSize) {
             std::stringstream ss;
             ss << "BufBuilder attempted to grow() to " << minSize << " bytes, past the 64MB limit.";
+            invariant(0);
             msgasserted(13548, ss.str().c_str());
         }
 

 

Sprint: Sharding 2019-02-25, Sharding 2019-03-25, Sharding 2019-04-08
Participants:
Linked BF Score: 18

 Description   

In AsyncWorkScheduler::scheduleRemoteCommand(), we create an OpMsg object and pass it to ServiceEntryPoint::handleRequest() for execution. This OpMsg object is destroyed after we return the response status to the caller, which leaves the CurOp::_opDescription field pointing at an unowned BSONObj whose backing memory no longer exists. The BSONObj is unowned because copying is expensive and we expect that any OpMsg passed to ServiceEntryPoint::handleRequest() will outlive the client's CurOp object.
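The lifetime issue can be illustrated with a small standalone model. This is only a sketch and not MongoDB code: `Doc` is a hypothetical stand-in for an owned-or-unowned BSONObj, `OpTracker` is a hypothetical stand-in for CurOp, and `describeAfterRequestDestroyed` is an invented helper. It shows the safe direction, where the request owns its bytes, so a description stored by the tracker remains valid after the request object itself is gone.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a BSONObj-like document: a view over bytes that
// may or may not share ownership of the buffer behind the view.
class Doc {
public:
    // Unowned: only points at bytes that someone else must keep alive.
    static Doc unowned(std::string_view bytes) {
        return Doc(nullptr, bytes);
    }

    // Owned: copies the bytes into a shared buffer that lives as long as
    // any Doc still references it.
    static Doc owned(std::string_view bytes) {
        auto buf = std::make_shared<const std::string>(std::string(bytes));
        return Doc(buf, std::string_view(*buf));
    }

    bool isOwned() const { return static_cast<bool>(_buffer); }

    // Returns this Doc if it already owns its bytes, otherwise an owned copy.
    Doc getOwned() const { return isOwned() ? *this : owned(_view); }

    std::string_view view() const { return _view; }

private:
    Doc(std::shared_ptr<const std::string> buffer, std::string_view view)
        : _buffer(std::move(buffer)), _view(view) {}

    std::shared_ptr<const std::string> _buffer;  // null when unowned
    std::string_view _view;
};

// Hypothetical stand-in for CurOp: stores the description as given, without
// copying the bytes. With an unowned Doc this is only safe while the source
// buffer is alive.
struct OpTracker {
    Doc description = Doc::unowned({});
    void setDescription(const Doc& d) { description = d; }
};

// The request owns its bytes, so the tracker's description shares the buffer
// and stays readable after the request and the wire string are destroyed.
std::string describeAfterRequestDestroyed() {
    OpTracker op;
    {
        std::string wire = "{ping: 1}";
        Doc request = Doc::owned(wire);
        op.setDescription(request);
    }  // wire and request are destroyed here; the shared buffer survives
    return std::string(op.description.view());
}
```

Had the request been built with `Doc::unowned(wire)` instead, reading `op.description` after the inner scope would access freed memory, which is the failure mode described above.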

 

 



 Comments   
Comment by Githook User [ 03/Apr/19 ]

Author: Kaloian Manassiev (kaloianm, kaloian.manassiev@mongodb.com)

Message: SERVER-39350 Make `opMsgRequestFromAnyProtocol` return owned request messages
Branch: master
https://github.com/mongodb/mongo/commit/8804404d94ccada2b1060b13f1ca7b9c24692178

Comment by Esha Maharishi (Inactive) [ 11/Feb/19 ]

greg.mckeon this is a pretty bad server bug; I'm pulling it into my current sprint (not just wfbf day).
