[SERVER-2237] Loss Data with Sharding Created: 16/Dec/10 Updated: 17/Mar/11 Resolved: 17/Dec/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Stability |
| Affects Version/s: | 1.6.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Krishna Maddireddy | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Linux CentOS 5 |
||
| Operating System: | Linux |
| Participants: |
| Description |
|
While testing sharding, I can see data loss.

Trace from shell:
);

Errors from the mongos log:
Thu Dec 16 14:41:39 [conn1] autosplitting demo_contacts.contacts size: 1048648 shard: ns:demo_contacts.contacts at: shard0000:localhost:30001 lastmod: 101|31 min: { aid: 4.0, contact_id: 676362.0 }max: { aid: 5.0, contact_id: 1283.0 }on: { aid: 4.0, contact_id: 680372.0 }(splitThreshold 1048576)
max: { aid: 5.0, contact_id: 1283.0 }on: { aid: 4.0, contact_id: 681072.0 }(splitThreshold 1048576)
max: { aid: 5.0, contact_id: 1283.0 }on: { aid: 4.0, contact_id: 681771.0 }(splitThreshold 1048576)
max: { aid: 5.0, contact_id: 1283.0 }on: { aid: 4.0, contact_id: 682471.0 }(splitThreshold 1048576)
max: { aid: 5.0, contact_id: 1283.0 }on: { aid: 4.0, contact_id: 683170.0 }(splitThreshold 1048576)

mongod log from one of the shards:
Thu Dec 16 14:43:05 [conn12] query admin.$cmd ntoreturn:1 command: { moveChunk: "demo_contacts.contacts", from: "localhost:30001", to: "localhost:30002", min: { aid: 4.0, contact_id: 249176.0 }, max: { aid: 4.0, contact_id: 253273.0 }, shardId: "demo_contacts.contacts-aid_4.0contact_id_249176.0", configdb: "localhost:20001" } reslen:53 1084ms
Thu Dec 16 14:43:07 [conn9] Assertion: 13388:[demo_contacts.contacts] shard version not ok in Client::Context: your version is too old ns: demo_contacts.contacts global: 116|1 client: 106|1
, max: { aid: 4.0, contact_id: 256729.0 }, shardId: "demo_contacts.contacts-aid_4.0contact_id_253273.0", configdb: "localhost:20001" }
, max: { aid: 4.0, contact_id: 256729.0 }, s |
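For readers of the assertion in the mongod log above: the `global: 116|1 client: 106|1` pair reads as two `major|minor` chunk versions, and the message "your version is too old" is consistent with the client's version being lower than the shard's. The sketch below only illustrates that reading of the log line. The function names `parseVersion` and `isStale` are hypothetical, and the ordering rule (major first, then minor) is an assumption, not mongod's actual implementation.

```javascript
// Hypothetical helper: parse a "major|minor" version string as printed in the log.
function parseVersion(text) {
  const [major, minor] = text.split("|").map(Number);
  return { major, minor };
}

// Assumed ordering: compare major first, then minor.
// Returns true when the client's version is older than the shard's global version.
function isStale(clientText, globalText) {
  const client = parseVersion(clientText);
  const global = parseVersion(globalText);
  if (client.major !== global.major) {
    return client.major < global.major;
  }
  return client.minor < global.minor;
}

// Values taken from the assertion line in the mongod log.
console.log(isStale("106|1", "116|1"));
```

Under this reading, the connection that issued the request held chunk metadata from before several splits or migrations, which would fit the autosplit and moveChunk activity logged in the preceding minutes.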
| Comments |
| Comment by Krishna Maddireddy [ 17/Dec/10 ] |
|
After switching to the CentOS MongoDB packages, I don't see the issue any more. |
| Comment by Eliot Horowitz (Inactive) [ 16/Dec/10 ] |
|
Are you sure it's not an off-by-one error? |
| Comment by Krishna Maddireddy [ 16/Dec/10 ] |
|
Yes, it should be 2200000. In this case one record is lost, but I have seen major data loss in other tests. Yes, here are the indexes: > db.contacts.getIndexes() |
| Comment by Eliot Horowitz (Inactive) [ 16/Dec/10 ] |
|
Are you sure the expected number is 2200000? An occasionally higher count is expected right now. |