[SERVER-13339] One constructor for Client::Context does not properly set the curop Created: 25/Mar/14 Updated: 03/Jan/18 Resolved: 07/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Diagnostics, Internal Code |
| Affects Version/s: | 2.4.9, 2.6.0-rc2 |
| Fix Version/s: | 3.2.0-rc0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Linda Qin | Assignee: | Kaloian Manassiev |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Steps To Reproduce: |
|
||||
| Sprint: | Sharding A (10/09/15) | ||||
| Participants: | |||||
| Case: | (copied to CRM) | ||||
| Description |
|
The two-argument constructor for Client::Context does not call enter() on the curop object, so the current op is never properly updated. This causes some weird behavior, such as the entries with empty field names in the output of the "top" command. I suspect (though have not confirmed) that this also has an impact on the profiler and the output of the inprog command (db.currentOp()). This problem was exacerbated in 2.6 by the introduction of more callers of this 2-argument form of the Client::Context constructor. PREVIOUS DESCRIPTION: In 2.4, if a node is/was a secondary, there is a sub-document with empty field name in the output of top command as below. It has records for "writeLock" and "total".
In 2.6, it shows up on all the nodes in the replica set, with records for "readLock", "writeLock" (only for secondaries) and "total".
|
| Comments |
| Comment by Githook User [ 07/Oct/15 ] |
|
Author: {u'username': u'kaloianm', u'name': u'Kaloian Manassiev', u'email': u'kaloian.manassiev@mongodb.com'}Message: |
| Comment by Kaloian Manassiev [ 05/Oct/15 ] |
|
In 3.0 and 3.2, we tightened the rules for when CurOp is being entered and instantiated, so this bug should not be present anymore. In 3.2 this constructor is no longer used, so I'll remove it. |
| Comment by Githook User [ 20/Feb/15 ] |
|
Author: {u'username': u'kaloianm', u'name': u'Kaloian Manassiev', u'email': u'kaloian.manassiev@mongodb.com'}Message: |
| Comment by Githook User [ 13/Feb/15 ] |
|
Author: {u'username': u'kaloianm', u'name': u'Kaloian Manassiev', u'email': u'kaloian.manassiev@mongodb.com'}Message: |
| Comment by Spencer Brody (Inactive) [ 04/Apr/14 ] |
|
An alternative patch that may be slightly safer is to not call of of _finishInit from that constructor, but just add a call to _client_curOp->enter(this), which is the part of _finishInit that sets the namespace to the CurOp object properly. There is already another form of the Context constructor that does just that: https://github.com/mongodb/mongo/blob/master/src/mongo/db/client.cpp#L347 MCI run with this smaller change: https://mci.10gen.com/ui/version/533ef3d53ff1221570000004_0 Edit: The first patch seems to have seriously broken things, and the second seems to have broken profiling, at least. It seems like neither of these approaches are correct. Someone with a better understanding of the implications of changes to the Client::Context constructors should take a look. |
| Comment by Spencer Brody (Inactive) [ 04/Apr/14 ] |
|
Okay, it looks like the data that goes into the object returned by "top" comes from the UsageMap maintained in top.cpp. Entries get added into that map from Top::record, which is called by CurOp::recordGlobalTime, which is called in the destructor for Client::Context. The namespace added to the map is set in CurOp::recordGlobalTime and comes from the _ns stored in the CurOp object. The problem seems to be coming from the fact that there is one constructor of Client::Context that doesn't call _finishInit and thus doesn't call _client->_curOp->enter(this), which is what sets the _ns field in the CurOp object. This is the two-argument form of the Context contructor, that takes a string ns and a Database* db. Anyone who uses this form of Client::Context will result in adding entries to the Top::UsageMap with an empty string as the namespace, as the underlying CurOp object never gets initialized to the namespace of the Context, but the Context's destructor still calls _client->_curOp->recordGlobalTime. The weird thing is that this behavior seems to be intentional? This comment seems to suggest that this form of the Context constructor not calling _finishInit is intentional, but it's not clear why. The main caller of that form of the Context constructor that I was seeing is from the cursor management code in collection_cursor_cache.cpp, though there are other callers as well. It seems like that was introduced in https://github.com/mongodb/mongo/commit/5548dbb7a13936911b1b482ddd417cf5879e01b4. Adding _finishInit() to the end of the two-argument Client::Context constructor seems to fix the problem, but I'm worried what other effects it may have, especially as I don't understand why it was omitted in the first place. I'm running a patch that does that through MCI right now: https://mci.10gen.com/ui/version/533ef1923ff1221570000003_0 |