[SERVER-10056] Shard not returning some data when querying from MongoS Created: 28/Jun/13  Updated: 10/Dec/14  Resolved: 12/Aug/13

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.4.1, 2.4.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tamer Hassan Assignee: David Hows
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

linux, AWS


Operating System: Linux
Participants:

 Description   

Setup: 30 shards, no replication, AWS Hi1.4x Large with SSD high IOPs
Database: sharded ouput of a routine map reduce operation (merged)
Problem: Some of the records of this sharded collection will only return when I'm logged into the particular shard. In MongoS, I get no results, except when I do a count(), then I get an expected return. Example:

I run the following query on MongoS -

var d = new Date(2013,5,26);
db.hours.find({"_id.date" : {$gt : d}})

I expect several records but nothing returns. It just goes back to a blank command line.
However, when I run a count on the same query, I get the expected result -

mongos> db.hours.find({"date" : {$gt : d}}).count()
11318035

When I log directly into the MongoD shard that has this date range of records and run the query there I get a cursor of documents back, as expected. It appears I can only access these records by logging into the particular shard directly. This is the case for all records after the 25th of June.

The map reduce query that creates and updates this sharded collection is as follows:

//MAP REDUCE COMMAND
db.sessions.mapReduce (
mapFxn,
reduceFxn,
{ out :

{ merge : "hour", db : "summary", sharded : true, nonAtomic : true }

,
query: { "date" : {$gte : lastRunTime, $lte : transactionTime}}

}
);

-----------------------------------------
Looking at my sharding status -

summary.hour
shard key:

{ "_id" : 1 }

chunks:
shard0001 3
shard0009 3
shard0002 3
shard0010 3
shard0003 3
shard0011 3
shard0004 3
shard0005 3
shard0014 3
shard0013 3
shard0008 3
shard0012 3
shard0015 3
shard0016 2
shard0017 2
shard0018 2
shard0019 2
shard0020 2
shard0021 2
shard0022 2
shard0023 2
shard0024 2
shard0025 2
shard0026 2
shard0027 2
shard0028 2
shard0029 2
shard0030 2
shard0031 2
shard0006 3
shard0007 3
shard0000 3
too many chunks to print, use verbose if you want to force print

Shard0000 is the primary shard and also the shard that contains these records. Here is a sample of the verbose listing of chunks for that collection. The date key does not seem to be updating past the 25th of June, even though there are records all the way through the 27th -

on : shard0000

{ "t" : 77, "i" : 2 }

{

"_id" :

{ "date" : ISODate("2013-06-18T02:00:00Z"), "ci" : "4352", "site" : "foo.com", "page" : "home", "Origin_Domain" : "bar.com", "pl" : "pub", "v" : "1.6", } }

-->> {

"_id" :

{ "date" : ISODate("2013-06-25T11:00:00Z"), "ci" : "4543", "site" : "foo.com", "page" : "home", "Origin_Domain" : "unavailable", "pl" : "pub", "v" : "1.6" }

} on : shard0000

{ "t" : 77, "i" : 3 }

{

"_id" :

{ "date" : ISODate("2013-06-25T11:00:00Z"), "ci" : "3453", "site" : "bar.com", "page" : "home", "Origin_Domain" : "unavailable", "pl" : "pub", "v" : "1.6" }

} -->> { "_id" :

{ "$maxKey" : 1 }

} on : shard0012

{ "t" : 76, "i" : 0 }

{ "_id" : "summmary", "partitioned" : false, "primary" : "shard0021" }

========================



 Comments   
Comment by David Hows [ 16/Jul/13 ]

Hi Tamer,

Sorry for the delay in getting back to you.

Has your rebuild fixed this issue?

If not, then it looks like the next place to examine would be the MongoD's - as none of the shards seem to be returning a cursor to begin retrieving documents.

Here is my reproduction i get results that look like the following, which are basically the same, except one of the shards returns a non-empty cursor, which is where the results come from. My next question would be, why is nothing returned by the MongoD?

Tue Jul 16 12:03:15.980 [conn4] pcursorcreating pcursor over QSpec { ns: "test.test2", n2skip: 0, n2return: 0, options: 0, query: {}, fields: {} } and CInfo { v_ns: "", filter: {} }
Tue Jul 16 12:03:15.980 [conn4] pcursorinitializing over 3 shards required by [test.test2 @ 19|1||51de4f46ac52fae2143a86ef]
Tue Jul 16 12:03:15.980 [conn4] pcursorinitializing on shard shard0000:Pixl.local:30000, current connection state is { state: {}, retryNext: false, init: false, finish: false, errored: false }
Tue Jul 16 12:03:15.982 [conn4] pcursorinitialized query (lazily) on shard shard0000:Pixl.local:30000, current connection state is { state: { conn: "Pixl.local:30000", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Tue Jul 16 12:03:15.982 [conn4] pcursorinitializing on shard shard0001:Pixl.local:30001, current connection state is { state: {}, retryNext: false, init: false, finish: false, errored: false }
Tue Jul 16 12:03:15.983 [conn4] pcursorinitialized query (lazily) on shard shard0001:Pixl.local:30001, current connection state is { state: { conn: "Pixl.local:30001", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Tue Jul 16 12:03:15.983 [conn4] pcursorinitializing on shard shard0002:Pixl.local:30002, current connection state is { state: {}, retryNext: false, init: false, finish: false, errored: false }
Tue Jul 16 12:03:15.983 [conn4] pcursorinitialized query (lazily) on shard shard0002:Pixl.local:30002, current connection state is { state: { conn: "Pixl.local:30002", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Tue Jul 16 12:03:15.983 [conn4] pcursorfinishing over 3 shards
Tue Jul 16 12:03:15.983 [conn4] pcursorfinishing on shard shard0000:Pixl.local:30000, current connection state is { state: { conn: "Pixl.local:30000", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Tue Jul 16 12:03:15.983 [conn4] pcursorfinished on shard shard0000:Pixl.local:30000, current connection state is { state: { conn: "(done)", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: { _id: ObjectId('51de4f5506cd8150cda334b3'), loc: "MWN", ser_num: 2403.0 }, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Tue Jul 16 12:03:15.984 [conn4] pcursorfinishing on shard shard0001:Pixl.local:30001, current connection state is { state: { conn: "Pixl.local:30001", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Tue Jul 16 12:03:15.984 [conn4] pcursorfinished on shard shard0001:Pixl.local:30001, current connection state is { state: { conn: "(done)", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Tue Jul 16 12:03:15.984 [conn4] pcursorfinishing on shard shard0002:Pixl.local:30002, current connection state is { state: { conn: "Pixl.local:30002", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: false, errored: false }
Tue Jul 16 12:03:15.984 [conn4] pcursorfinished on shard shard0002:Pixl.local:30002, current connection state is { state: { conn: "(done)", vinfo: "test.test2 @ 19|1||51de4f46ac52fae2143a86ef", cursor: "(empty)", count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }

Please let me know where you are with this issue and we can continue from there.

Thanks,
David

Comment by Tamer Hassan [ 10/Jul/13 ]

a close look at the query in some of the log records from above -

Wed Jul 10 11:04:44.016 [conn36] [pcursor] creating pcursor over QSpec { ns: "summary.$cmd", n2skip: 0, n2return: 1, options: 0, query: { count: "hour", query: { _id.date:

{ $gte: new Date(1373241600000) }

} }, fields: {} } and CInfo { v_ns: "summary.hour", filter: { _id.date:

{ $gte: new Date(1373241600000) }

} }
Wed Jul 10 11:04:44.016 [conn36] [pcursor] initializing over 8 shards required by [summary.hour @ 8|1||51d464ff93fab1f02c087a5d]
Wed Jul 10 11:04:44.016 [conn36] [pcursor] initialized command (lazily) on shard shard0000:queen0.ozoki.com:11000, current connection state is { state:

{ conn: "queen0.ozoki.com:11000", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.016 [conn36] [pcursor] initialized command (lazily) on shard shard0001:queen0.ozoki.com:11001, current connection state is { state:

{ conn: "queen0.ozoki.com:11001", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] initialized command (lazily) on shard shard0002:queen0.ozoki.com:11002, current connection state is { state:

{ conn: "queen0.ozoki.com:11002", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] initialized command (lazily) on shard shard0003:queen0.ozoki.com:11003, current connection state is { state:

{ conn: "queen0.ozoki.com:11003", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] initialized command (lazily) on shard shard0004:queen0.ozoki.com:11004, current connection state is { state:

{ conn: "queen0.ozoki.com:11004", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] initialized command (lazily) on shard shard0005:queen0.ozoki.com:11005, current connection state is { state:

{ conn: "queen0.ozoki.com:11005", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] initialized command (lazily) on shard shard0006:queen0.ozoki.com:11006, current connection state is { state:

{ conn: "queen0.ozoki.com:11006", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] initialized command (lazily) on shard shard0010:queen0.ozoki.com:11010, current connection state is { state:

{ conn: "queen0.ozoki.com:11010", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finishing on shard shard0000:queen0.ozoki.com:11000, current connection state is { state:

{ conn: "queen0.ozoki.com:11000", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finished on shard shard0000:queen0.ozoki.com:11000, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 77.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finishing on shard shard0001:queen0.ozoki.com:11001, current connection state is { state:

{ conn: "queen0.ozoki.com:11001", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finished on shard shard0001:queen0.ozoki.com:11001, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 0.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finishing on shard shard0002:queen0.ozoki.com:11002, current connection state is { state:

{ conn: "queen0.ozoki.com:11002", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finished on shard shard0002:queen0.ozoki.com:11002, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 0.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finishing on shard shard0003:queen0.ozoki.com:11003, current connection state is { state:

{ conn: "queen0.ozoki.com:11003", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finished on shard shard0003:queen0.ozoki.com:11003, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 0.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.017 [conn36] [pcursor] finishing on shard shard0004:queen0.ozoki.com:11004, current connection state is { state:

{ conn: "queen0.ozoki.com:11004", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.018 [conn36] [pcursor] finished on shard shard0004:queen0.ozoki.com:11004, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 0.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.018 [conn36] [pcursor] finishing on shard shard0005:queen0.ozoki.com:11005, current connection state is { state:

{ conn: "queen0.ozoki.com:11005", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.254 [conn36] [pcursor] finished on shard shard0005:queen0.ozoki.com:11005, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 1276334.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.254 [conn36] [pcursor] finishing on shard shard0006:queen0.ozoki.com:11006, current connection state is { state:

{ conn: "queen0.ozoki.com:11006", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.254 [conn36] [pcursor] finished on shard shard0006:queen0.ozoki.com:11006, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 0.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:44.254 [conn36] [pcursor] finishing on shard shard0010:queen0.ozoki.com:11010, current connection state is { state:

{ conn: "queen0.ozoki.com:11010", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:44.254 [conn36] [pcursor] finished on shard shard0010:queen0.ozoki.com:11010, current connection state is { state: { conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor:

{ n: 0.0, ok: 1.0 }

, count: 0, done: false }, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:45.653 [Balancer] collection : summary.hour
Wed Jul 10 11:04:46.246 [conn36] Request::process begin ns: summary.hour msg id: 39 op: 2004 attempt: 0
Wed Jul 10 11:04:46.246 [conn36] shard query: summary.hour { _id.date:

{ $gte: new Date(1373241600000) }

}
Wed Jul 10 11:04:46.246 [conn36] [pcursor] creating pcursor over QSpec { ns: "summary.hour", n2skip: 0, n2return: 0, options: 0, query: { _id.date:

{ $gte: new Date(1373241600000) }

}, fields: {} } and CInfo { v_ns: "", filter: {} }
Wed Jul 10 11:04:46.246 [conn36] [pcursor] initializing over 8 shards required by [summary.hour @ 8|1||51d464ff93fab1f02c087a5d]
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0000:queen0.ozoki.com:11000, current connection state is { state:

{ conn: "queen0.ozoki.com:11000", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0001:queen0.ozoki.com:11001, current connection state is { state:

{ conn: "queen0.ozoki.com:11001", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0002:queen0.ozoki.com:11002, current connection state is { state:

{ conn: "queen0.ozoki.com:11002", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0003:queen0.ozoki.com:11003, current connection state is { state:

{ conn: "queen0.ozoki.com:11003", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0004:queen0.ozoki.com:11004, current connection state is { state:

{ conn: "queen0.ozoki.com:11004", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0005:queen0.ozoki.com:11005, current connection state is { state:

{ conn: "queen0.ozoki.com:11005", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.247 [conn36] [pcursor] initialized query (lazily) on shard shard0006:queen0.ozoki.com:11006, current connection state is { state:

{ conn: "queen0.ozoki.com:11006", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] initialized query (lazily) on shard shard0010:queen0.ozoki.com:11010, current connection state is { state:

{ conn: "queen0.ozoki.com:11010", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finishing on shard shard0000:queen0.ozoki.com:11000, current connection state is { state:

{ conn: "queen0.ozoki.com:11000", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finished on shard shard0000:queen0.ozoki.com:11000, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finishing on shard shard0001:queen0.ozoki.com:11001, current connection state is { state:

{ conn: "queen0.ozoki.com:11001", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finished on shard shard0001:queen0.ozoki.com:11001, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finishing on shard shard0002:queen0.ozoki.com:11002, current connection state is { state:

{ conn: "queen0.ozoki.com:11002", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finished on shard shard0002:queen0.ozoki.com:11002, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finishing on shard shard0003:queen0.ozoki.com:11003, current connection state is { state:

{ conn: "queen0.ozoki.com:11003", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finished on shard shard0003:queen0.ozoki.com:11003, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finishing on shard shard0004:queen0.ozoki.com:11004, current connection state is { state:

{ conn: "queen0.ozoki.com:11004", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finished on shard shard0004:queen0.ozoki.com:11004, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:46.248 [conn36] [pcursor] finishing on shard shard0005:queen0.ozoki.com:11005, current connection state is { state:

{ conn: "queen0.ozoki.com:11005", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:50.480 [conn36] [pcursor] finished on shard shard0005:queen0.ozoki.com:11005, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:50.480 [conn36] [pcursor] finishing on shard shard0006:queen0.ozoki.com:11006, current connection state is { state:

{ conn: "queen0.ozoki.com:11006", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:50.480 [conn36] [pcursor] finished on shard shard0006:queen0.ozoki.com:11006, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:50.480 [conn36] [pcursor] finishing on shard shard0010:queen0.ozoki.com:11010, current connection state is { state:

{ conn: "queen0.ozoki.com:11010", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: false, errored: false }
Wed Jul 10 11:04:50.481 [conn36] [pcursor] finished on shard shard0010:queen0.ozoki.com:11010, current connection state is { state:

{ conn: "(done)", vinfo: "summary.hour @ 8|1||51d464ff93fab1f02c087a5d", cursor: "(empty)", count: 0, done: false }

, retryNext: false, init: true, finish: true, errored: false }
Wed Jul 10 11:04:50.481 [conn36] Request::process end ns: summary.hour msg id: 39 op: 2004 attempt: 0 4234ms

Comment by Tamer Hassan [ 10/Jul/13 ]

David, I'm going to have to tear down and rebuild this db soon due to our inability to provide reporting on our dashboard after 7/4 (due to this bug). Is there any other info you'd like me to give you before I do that?

Comment by Tamer Hassan [ 10/Jul/13 ]

Also, looking at stats -

mongos> db.hour.stats()
{
"sharded" : true,
"ns" : "summary.hour",
"count" : 3316196,
"numExtents" : 53,
"size" : 1176244000,
"storageSize" : 1662603264,
"totalIndexSize" : 1677568032,
"indexSizes" :

{ "_id.date_1__id.ci_1" : 185137344, "_id_" : 1492430688 }

,
"avgObjSize" : 354.696767018596,
"nindexes" : 2,
"nchunks" : 8,
"shards" : {
"shard0000" : {
"ns" : "summary.hour",
"count" : 37235,
"size" : 11347744,
"avgObjSize" : 304.76014502484225,
"storageSize" : 37797888,
"numExtents" : 8,
"nindexes" : 2,
"lastExtentSize" : 15290368,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 13065248,
"indexSizes" :

{ "_id_" : 10661504, "_id.date_1__id.ci_1" : 2403744 }

,
"ok" : 1
},
"shard0001" : {
"ns" : "summary.hour",
"count" : 0,
"size" : 0,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 8192,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 16352,
"indexSizes" :

{ "_id_" : 8176, "_id.date_1__id.ci_1" : 8176 }

,
"ok" : 1
},
"shard0002" : {
"ns" : "summary.hour",
"count" : 24672,
"size" : 7474128,
"avgObjSize" : 302.9396887159533,
"storageSize" : 11182080,
"numExtents" : 6,
"nindexes" : 2,
"lastExtentSize" : 8388608,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 11250176,
"indexSizes" :

{ "_id_" : 10031952, "_id.date_1__id.ci_1" : 1218224 }

,
"ok" : 1
},
"shard0003" : {
"ns" : "summary.hour",
"count" : 31068,
"size" : 7704608,
"avgObjSize" : 247.9917600103,
"storageSize" : 11182080,
"numExtents" : 6,
"nindexes" : 2,
"lastExtentSize" : 8388608,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 8920016,
"indexSizes" :

{ "_id_" : 6949600, "_id.date_1__id.ci_1" : 1970416 }

,
"ok" : 1
},
"shard0004" : {
"ns" : "summary.hour",
"count" : 30174,
"size" : 7701200,
"avgObjSize" : 255.22635381454234,
"storageSize" : 11182080,
"numExtents" : 6,
"nindexes" : 2,
"lastExtentSize" : 8388608,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 8069712,
"indexSizes" :

{ "_id_" : 6990480, "_id.date_1__id.ci_1" : 1079232 }

,
"ok" : 1
},
"shard0005" : {
"ns" : "summary.hour",
"count" : 3163919,
"size" : 1134186320,
"avgObjSize" : 358.4751442751853,
"storageSize" : 1580060672,
"numExtents" : 19,
"nindexes" : 2,
"lastExtentSize" : 415145984,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 1628013296,
"indexSizes" :

{ "_id_" : 1450610448, "_id.date_1__id.ci_1" : 177402848 }

,
"ok" : 1
},
"shard0006" : {
"ns" : "summary.hour",
"count" : 29127,
"size" : 7829808,
"avgObjSize" : 268.816149963951,
"storageSize" : 11182080,
"numExtents" : 6,
"nindexes" : 2,
"lastExtentSize" : 8388608,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 8216880,
"indexSizes" :

{ "_id_" : 7170352, "_id.date_1__id.ci_1" : 1046528 }

,
"ok" : 1
},
"shard0010" : {
"ns" : "summary.hour",
"count" : 1,
"size" : 192,
"avgObjSize" : 192,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 8192,
"paddingFactor" : 1,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 16352,
"indexSizes" :

{ "_id_" : 8176, "_id.date_1__id.ci_1" : 8176 }

,
"ok" : 1
}
},
"ok" : 1
}

Comment by David Hows [ 10/Jul/13 ]

Hi Tamer,

Can you attach the logs for the MongoS in question?

That it spends a few seconds running the query only to return nothing is a little strange, I'd like to see what that MongoS reports.

I'm going to see if I can reproduce this locally in the meantime.

Thanks for the updated details.

Regards,
David

Comment by Tamer Hassan [ 10/Jul/13 ]

summary.hour
shard key:

{ "_id" : 1 }

chunks:
shard0001 1
shard0003 1
shard0004 1
shard0006 1
shard0000 1
shard0002 1
shard0005 1
shard0010 1

Comment by Tamer Hassan [ 10/Jul/13 ]

If I log directly into a shard and run the same find() query I get expected documents returned.

Comment by Tamer Hassan [ 10/Jul/13 ]

Sorry for the discrepancies here. I was doing a write up, not a direct cut and paste of the queries, so these discrepancies are typos. The collection name is "hour". The date field is under "_id.date" as the result of the map/reduce. More details to follow.

Comment by Tamer Hassan [ 10/Jul/13 ]

Thank you!

Comment by David Hows [ 10/Jul/13 ]

Hi Tamer,

Not a problem and congratulations!

Il leave this open waiting for you.

Regards,
David

Comment by Tamer Hassan [ 10/Jul/13 ]

David, please leave this open. I was offline for the past week (had a baby) and am syncing back up now.

I dropped the database and rebuilt from scratch and the issue recurred. I'll reply to your specific requests later this evening.

Comment by David Hows [ 01/Jul/13 ]

Hi Tamer,

I've just looked over what you posted and I'm a little confused. From the output of your sh.status() and the map reduce it appears that you have the "hour" collection sharded. But when you run the two queries above they are on the "hours" collection (note the "s" on the end).

I've also noticed that you run the find against "_id.date" and the count against "date". Was this deliberate? If not, does the difference in syntax explain the discrepancy?

You also say that you do get data back from the MongoD when you run these queries? Can you confirm which of the two above queries you run to get these results?

Would it be possible for you to do the following:

  • Confirm if the collection is "hour" or "hours". If the collection is "hours" can you attach the sharding details for that collection as only "hour" is posted above
  • Confirm the correct query syntax "_id.date" or "date"?
  • Run the correct query on the correct collection via the MongoS and attach the output
  • Run the correct query on the correct collection directly via the MongoD and attach the output
  • Can you provide a sample document from the correct collection mentioned above

Thanks,
David

Comment by Tamer Hassan [ 30/Jun/13 ]

Hi Ian,
Any thoughts on this?

Thanks for taking a look.

Tamer

Generated at Thu Feb 08 03:22:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.