[SERVER-2672] output of db.currentOp() is messed up Created: 03/Mar/11  Updated: 16/Nov/21  Resolved: 04/Mar/11

Status: Closed
Project: Core Server
Component/s: Shell
Affects Version/s: 1.6.5
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Sergei Tulentsev Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

Sometimes operations have wrong collection names. Take a look.
First of all, there's a leading question mark instead of first letter (that should be 'p').
Apart from that, first entry looks ok.
Second entry is wrong. This query is one of several that are ran against pravdorub_production.answers collection. Also, it claims to be an 'update', but doesn't look like one (where's actual update part of the query?)
Third entry is also wrong. Answers' _id type is ObjectId, not Integer (vk_users has integer _id).

I am pretty positive that my code queries appropriate collections.

Also note that first and third queries are the same, only collection name is different.

{
"opid" : "moscow:1562954",
"active" : false,
"lockType" : "read",
"waitingForLock" : true,
"op" : "query",
"ns" : "?ravdorub_production.vk_users",
"query" :

{ "_id" : 13871986 }

,
"client" : "10.0.0.6:42150",
"desc" : "conn"
},
{
"opid" : "moscow:1562963",
"active" : false,
"lockType" : "write",
"waitingForLock" : true,
"op" : "update",
"ns" : "?ravdorub_production.vk_users",
"query" : {
"_id" : ObjectId("4d6ebda99b3b49619400014d"),
"uiq" :

{ "$in" : [ 96474834, "96474834" ] }

},
"client" : "10.0.0.6:42157",
"desc" : "conn"
},
{
"opid" : "moscow:1562955",
"active" : false,
"lockType" : "read",
"waitingForLock" : true,
"op" : "query",
"ns" : "?ravdorub_production.answers",
"query" :

{ "_id" : 13871986 }

,
"client" : "10.0.0.6:42174",
"desc" : "conn"
},



 Comments   
Comment by Eliot Horowitz (Inactive) [ 14/Oct/11 ]

When the ? mark is there the namespace isn't meant to be accurate, it could be outdated.
This was an old debugging tool that has been removed in 2.0, now it'll just be blank when not relevant.

Comment by eli jones [ 13/Oct/11 ]

I can also verify that this is an issue on 1.8.3.

db.currentOp() will sometimes display the wrong ns.

an op will show something like this:

{
"opid" : 1212121212121,
"active" : false,
"lockType" : "write",
"waitingForLock" : true,
"op" : "update",
"ns" : "?ydbname.mycoll.1211",
"query" :

{ "some_id" : 3333, "email" : "test@email.com" }

,
"client" : "12.12.12.12:38271",
"desc" : "conn"
}

where db = 'mydbname' and the collection name = 'mycoll.1211'

BUT, that op is actually running against the collection name = 'mycoll'

I know this because the 'some_id' value in the query is used to route the update to the correct collection. Certain 'some_id' values have their own specially defined collection.. while the bulk of them just exist in the 'mycoll' collection.

I tripled checked this because.. as you can imagine.. I freaked the hell out when I saw that in the currentOp() output.. it would mean all of my updates were randomly going to the wrong collection. So, I put in specific code to verify that the ultimate collection I was sending an update to was the actual collection that I expected.. and I verified that no weird updates were going to these wrongly shown collections.

I have not received a single notification of a mis-routed update.

Like Michael mentioned, I was able to easily notice this because we had a brief span of time yesterday when our active dataset wash pushed out of RAM.. so I was examining all of the piling up ops. And, it was trivial to see the seeming mis-routed updates since.. an update routed to "mycoll.1111" would have a query with the 'some_id' property removed.

I'd say, if you created 10 processes that scattered updates across two collections with similar names.. but one with a ".somethingelse" at the end.. on a mongodb with not enough ram to hold the dataset.. you should see it fairly easily.

Comment by Michael Schurter [ 10/Oct/11 ]

I've seen this on MongoDB 1.8.1 but only when the server is under heavy load (75% lock, thousands of queries/updates per second).

I'm afraid I don't have the time to spend to build a test case capable of reproducing these circumstances.

Comment by Eliot Horowitz (Inactive) [ 01/Jun/11 ]

Can you provide an example or a reproducible case?

Comment by Alexey Palazhchenko [ 01/Jun/11 ]

Same issue with collection name happens in Graylog2. For updates Mongo 1.8.1 sometimes reports ns as graylog2.messages (we write there heavy), while it should be server_values. Looks like a race. Please re-open this ticket.

Comment by Eliot Horowitz (Inactive) [ 04/Apr/11 ]

If you are seeing an issue, can you post the output or something we can look at

Comment by Stas Krichevsky [ 04/Apr/11 ]

We're seeing similar issue with version 1.8.1-rc1. Regular (not sharded) setup. When we run just one thread that executes queries, the db.currentOp() output is correct. But if we run more than one thread, the collection names are mixed up.
It's seems that the issue is just in the currentOp() output. The queries are executed on correct collections.

Comment by Sergei Tulentsev [ 04/Mar/11 ]

Collection name. Mongos is asked to look in answers, but the command specifies vk_users.

Comment by Eliot Horowitz (Inactive) [ 04/Mar/11 ]

What looks wrong to you?

Comment by Sergei Tulentsev [ 04/Mar/11 ]

Here's proof.

Correct query came to mongos:
Fri Mar 4 21:04:21 [conn149] Request::process ns: pravdorub_production.answers msg id:267 attempt: 0
Fri Mar 4 21:04:21 [conn149] shard query: pravdorub_production.answers { $query: { hdn:

{ $ne: 1 }

, uiq:

{ $in: [ "122197136", 122197136 ] }

, res:

{ $in: [ 1, -1, 10 ] }

}, $orderby:

{ _id: -1 }

}

And here's what we have in current ops
http://pastebin.com/cUpptR5M

Comment by Eliot Horowitz (Inactive) [ 04/Mar/11 ]

I would check your code carefully.

The ? is there on purpose.
Once something is finished, it puts that there to prevent corruption.

We don't put the actual update in the display because that is often very large.

Generated at Thu Feb 08 03:00:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.