[SERVER-4950] Assertion on BSON size (0xEEEEEEEE) Created: 13/Feb/12 Updated: 30/Mar/12 Resolved: 16/Feb/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code |
| Affects Version/s: | 2.0.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Yosef Dinerstein | Assignee: | Ben Becker |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Windows, Azure |
||
| Issue Links: |
|
||||
| Operating System: | Windows | ||||
| Participants: | |||||
| Description |
|
Shard server log repeatedly shows assertion on invalid object size. Looks like corruption: Mon Feb 13 07:15:55 [rsSync] replSet syncing to: mongo1:27000 , _id: ObjectId('4f38461bbf5d8f040700b119') } } This repeats every 30 seconds and mentions the same object. |
| Comments |
| Comment by Ben Becker [ 16/Feb/12 ] | |||||||||||||
|
@Viktor, thanks for the details. It sounds like the issue is a bit different from what was initially reported, although the symptoms are certainly similar. I will keep an eye out for similar issues, but if you are ever able to reproduce this issue, please create a new issue or reopen this one. | |||||||||||||
| Comment by Viktor Petersson [ 16/Feb/12 ] | |||||||||||||
|
@Ben, no it was installed from scratch and we added it into the replica set. Once it was up to speed as a secondary, we promoted it to primary (by bumping up its priority). I resolved the issue for now by simply re-syncing the FreeBSD-box from the primary. The FreeBSD-box is currently in secondary mode and is working. | |||||||||||||
| Comment by Ben Becker [ 16/Feb/12 ] | |||||||||||||
|
Apologies for the confusion. @Viktor, when the machine was migrated from FreeBSD to Ubuntu, was the filesystem kept in tact, or did it change from something like UFS to Ext3/4/XFS/etc.? | |||||||||||||
| Comment by Viktor Petersson [ 16/Feb/12 ] | |||||||||||||
|
Ben,
See log-snippet below.
On the primary (Ubuntu):
On the secondary (FreeBSD):
Not exactly sure what you mean, but the data is transferred over a private LAN, and the servers are all running in a public cloud. On a related note, I did a few benchmarks and compared the performance on MongoDB/GridFS on CentOS, FreeBSD and Ubuntu and posted them here. This was the primary reason why we switched away from FreeBSD on the Primary. | |||||||||||||
| Comment by Yosef Dinerstein [ 16/Feb/12 ] | |||||||||||||
|
In the logs there are no other assertions on the same mongo server in general and from rsSync thread in particular. The primary server for the same replica set shows the following assertion: Thu Feb 16 11:10:40 [conn2411] local.oplog.rs warning assertion failure !e->firstRecord.isNull() db\oplog.cpp 478 The last one was not always accompanying the 1st one, but recently it started to appear as well. A few more details about the data:
The question about FreeBSD->UBuntu was to Victor. | |||||||||||||
| Comment by Ben Becker [ 15/Feb/12 ] | |||||||||||||
|
The error message 'Invalid BSONObj size: -286331154 (0xEEEEEEEE)' generally indicates a record has either been deleted or read/created from an invalid location. @Yosef: Can you check for any other assertions originating from the '[rsSync]' thread and paste them (including the backtrace)? Anything from the primary's log file around the same time? How was the data transferred from the FreeBSD primary to the Ubuntu primary? @Viktor: Regarding your second post, is that the only assertion failure you've seen on the config server? Could you also check the primary's log around Mon Feb 13 07:15:55 and attach a bit more surrounding those entries in the log files? | |||||||||||||
| Comment by Viktor Petersson [ 15/Feb/12 ] | |||||||||||||
|
I can confirm a similar error. My setup is an Ubuntu 10.04 server with MongoDB 2.0.2 from 10gen as primary and a FreeBSD 9.0 server with MongoDB 2.0.2 from Ports as secondary. We just migrated from using FreeBSD on the primary to Ubuntu, and that is when this started occurring. Wed Feb 15 13:04:30 [rsSync] Assertion: 10334:Invalid BSONObj size: -286331154 (0xEEEEEEEE) first element: _id: ObjectId('4f3af4e47573df1109000394') } Restarting MongoDB on the secondary server did not do the trick either. | |||||||||||||
| Comment by Yosef Dinerstein [ 13/Feb/12 ] | |||||||||||||
|
Restarting servers does not eliminate the assertion. | |||||||||||||
| Comment by Yosef Dinerstein [ 13/Feb/12 ] | |||||||||||||
|
Similar assertion occurred once on configuration server: Mon Jan 30 19:20:35 [conn1008] Assertion: 10334:Invalid BSONObj size: -286331154 (0xEEEEEEEE) first element: _id: "RD00155D328113:26997:1327514447:41" , ping: { $lt: new Date(1327605634446) }} exception: Invalid BSONObj size: -286331154 (0xEEEEEEEE) first element: _id: "RD00155D328113:26997:1327514447:41" code:10334 0ms |