Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-16696

WiredTigerRecordStore is corrupted from power cycle

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Storage
    • Labels:
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide

      Repetitive power cycle test (over 10 minutes) , where the mongod host is power cycled (non-clean shutdown) randomly (every 30 to 120 seconds) , while clients are sending various CRUD operations.

      The operations which produce this error are:

      coll.update({x: {$gte: 34}}, {$inc: {x: 10000}, $set: {n: "hello"}},{upsert: true, multi: true})
      coll.remove({x: {$gt: 34}})

      Interesting enough, other remove & update operations succeed without error.

      Show
      Repetitive power cycle test (over 10 minutes) , where the mongod host is power cycled (non-clean shutdown) randomly (every 30 to 120 seconds) , while clients are sending various CRUD operations. The operations which produce this error are: coll.update({x: {$gte: 34}}, {$inc: {x: 10000}, $set: {n: "hello"}},{upsert: true, multi: true}) coll.remove({x: {$gt: 34}}) Interesting enough, other remove & update operations succeed without error.

      Description

      The WiredTigerRecordStore was corrupted during power cycle testing. Certain operations from the client fail with the following message in the mongod log:

      2014-12-30T14:45:39.550-0500 I -        [conn2] Assertion: 28556:Didn't find RecordId in WiredTigerRecordStore
      2014-12-30T14:45:39.559-0500 I CONTROL  [conn2]
       0xf133b9 0xebd1d1 0xea30df 0xd35ea5 0x8ee276 0x9d7bfb 0x9e50f6 0xa0e58c 0xbae50a 0xbaf29d 0x98d8c0 0x98eb12 0x98f055 0x99110d 0x9ad054 0x9adf93 0x9aea4b 0xb7ca1a 0xa8fcd5 0x7e41f0 0xed1381 0x7f2291fb2851 0x7f229115494d
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"B133B9"},{"b":"400000","o":"ABD1D1"},{"b":"400000","o":"AA30DF"},{"b":"400000","o":"935EA5"},{"b":"400000","o":"4EE276"},{"b":"400000","o":"5D7BFB"},{"b":"400000","o":"5E50F6"},{"b":"400000","o":"60E58C"},{"b":"400000","o":"7AE50A"},{"b":"400000","o":"7AF29D"},{"b":"400000","o":"58D8C0"},{"b":"400000","o":"58EB12"},{"b":"400000","o":"58F055"},{"b":"400000","o":"59110D"},{"b":"400000","o":"5AD054"},{"b":"400000","o":"5ADF93"},{"b":"400000","o":"5AEA4B"},{"b":"400000","o":"77CA1A"},{"b":"400000","o":"68FCD5"},{"b":"400000","o":"3E41F0"},{"b":"400000","o":"AD1381"},{"b":"7F2291FAB000","o":"7851"},{"b":"7F229106C000","o":"E894D"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc4", "gitVersion" : "3ad571742911f04b307f0071979425511c4f2570", "uname" : { "sysname" : "Linux", "release" : "2.6.32-358.23.2.el6.x86_64", "version" : "#1 SMP Wed Oct 16 18:37:12 UTC 2013", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFF6FAFF000", "elfType" : 3 }, { "b" : "7F2291FAB000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7F2291DA3000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7F2291B9F000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7F2291899000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7F2291615000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7F22913FF000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7F229106C000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7F22921C8000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf133b9]
       mongod(_ZN5mongo10logContextEPKc+0xE1) [0xebd1d1]
       mongod(_ZN5mongo11msgassertedEiPKc+0xAF) [0xea30df]
       mongod(_ZNK5mongo21WiredTigerRecordStore7dataForEPNS_16OperationContextERKNS_8RecordIdE+0xC5) [0xd35ea5]
       mongod(_ZNK5mongo10Collection6docForEPNS_16OperationContextERKNS_8RecordIdE+0x16) [0x8ee276]
       mongod(_ZN5mongo10FetchStage4workEPm+0x2BB) [0x9d7bfb]
       mongod(_ZN5mongo18KeepMutationsStage4workEPm+0xE6) [0x9e50f6]
       mongod(_ZN5mongo11UpdateStage4workEPm+0x7C) [0xa0e58c]
       mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x9A) [0xbae50a]
       mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbaf29d]
       mongod(_ZN5mongo18WriteBatchExecutor10execUpdateERKNS_12BatchItemRefEPNS_7BSONObjEPPNS_16WriteErrorDetailE+0x610) [0x98d8c0]
       mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestEPSt6vectorIPNS_19BatchedUpsertDetailESaIS6_EEPS4_IPNS_16WriteErrorDetailESaISB_EE+0x1F2) [0x98eb12]
       mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x395) [0x98f055]
       mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x15D) [0x99110d]
       mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9ad054]
       mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC13) [0x9adf93]
       mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9aea4b]
       mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERNS_5CurOpES3_b+0x76A) [0xb7ca1a]
       mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortEb+0xB25) [0xa8fcd5]
       mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xE0) [0x7e41f0]
       mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x411) [0xed1381]
       libpthread.so.0(+0x7851) [0x7f2291fb2851]
       libc.so.6(clone+0x6D) [0x7f229115494d]
      -----  END BACKTRACE  -----

      The output from validate:

      {
      	"ns" : "test.foo",
      	"nrecords" : 0,
      	"wiredTiger" : {
      		"uri" : "statistics:table:collection-41-2378667660490186394",
      		"LSM" : {
      			"bloom filters in the LSM tree" : 0,
      			"bloom filter false positives" : 0,
      			"bloom filter hits" : 0,
      			"bloom filter misses" : 0,
      			"bloom filter pages evicted from cache" : 0,
      			"bloom filter pages read into cache" : 0,
      			"total size of bloom filters" : 0,
      			"sleep for LSM checkpoint throttle" : 0,
      			"chunks in the LSM tree" : 0,
      			"highest merge generation in the LSM tree" : 0,
      			"queries that could have benefited from a Bloom filter that did not exist" : 0,
      			"sleep for LSM merge throttle" : 0
      		},
      		"block-manager" : {
      			"file allocation unit size" : 4096,
      			"blocks allocated" : 2453085,
      			"checkpoint size" : 782336,
      			"allocations requiring file extension" : 148330,
      			"blocks freed" : 2500780,
      			"file magic number" : 120897,
      			"file major version number" : 1,
      			"minor version number" : 0,
      			"file bytes available for reuse" : 1046859776,
      			"file size in bytes" : 1046880256
      		},
      		"btree" : {
      			"column-store variable-size deleted values" : 0,
      			"column-store fixed-size leaf pages" : 0,
      			"column-store internal pages" : 0,
      			"column-store variable-size leaf pages" : 0,
      			"pages rewritten by compaction" : 0,
      			"number of key/value pairs" : 0,
      			"fixed-record size" : 0,
      			"maximum tree depth" : 134,
      			"maximum internal page key size" : 307,
      			"maximum internal page size" : 4096,
      			"maximum leaf page key size" : 2457,
      			"maximum leaf page size" : 32768,
      			"maximum leaf page value size" : 1048576,
      			"overflow pages" : 0,
      			"row-store internal pages" : 0,
      			"row-store leaf pages" : 0
      		},
      		"cache" : {
      			"bytes read into cache" : NumberLong("1400608251528"),
      			"bytes written from cache" : 50045358719,
      			"checkpoint blocked page eviction" : 70,
      			"unmodified pages evicted" : 66059020,
      			"modified pages evicted" : 2255273,
      			"data source pages selected for eviction unable to be evicted" : 133080,
      			"hazard pointer blocked page eviction" : 4808,
      			"internal pages evicted" : 1188,
      			"in-memory page splits" : 43,
      			"overflow values cached in memory" : 0,
      			"pages read into cache" : 68313416,
      			"overflow pages read into cache" : 0,
      			"pages written from cache" : 2452820
      		},
      		"compression" : {
      			"raw compression call failed, no additional data available" : 0,
      			"raw compression call failed, additional data available" : 0,
      			"raw compression call succeeded" : 0,
      			"compressed pages read" : 68311810,
      			"compressed pages written" : 2435989,
      			"page written failed to compress" : 0,
      			"page written was too small to compress" : 16831
      		},
      		"cursor" : {
      			"create calls" : 132,
      			"insert calls" : 2318006,
      			"bulk-loaded cursor-insert calls" : 0,
      			"cursor-insert key and value bytes inserted" : 23822147740,
      			"next calls" : 169814932,
      			"prev calls" : 1,
      			"remove calls" : 2402522,
      			"cursor-remove key bytes removed" : 9610088,
      			"reset calls" : 188154658,
      			"search calls" : 185123426,
      			"search near calls" : 700694,
      			"update calls" : 2750722,
      			"cursor-update value bytes updated" : 28293926492
      		},
      		"reconciliation" : {
      			"dictionary matches" : 0,
      			"internal page multi-block writes" : 94,
      			"leaf page multi-block writes" : 342,
      			"maximum blocks required for a page" : 4999,
      			"internal-page overflow keys" : 0,
      			"leaf-page overflow keys" : 0,
      			"overflow values written" : 0,
      			"pages deleted" : 1570111,
      			"page checksum matches" : 204439,
      			"page reconciliation calls" : 2926944,
      			"page reconciliation calls for eviction" : 2188478,
      			"leaf page key bytes discarded using prefix compression" : 0,
      			"internal page key bytes discarded using suffix compression" : 1382043
      		},
      		"session" : {
      			"object compaction" : 0,
      			"open cursor count" : 1680
      		},
      		"transaction" : {
      			"update conflicts" : 148456
      		}
      	},
      	"nIndexes" : 2,
      	"keysPerIndex" : {
      		"test.foo.$_id_" : 0,
      		"test.foo.$x_1" : 6
      	},
      	"indexDetails" : {
      		"test.foo.$_id_" : {
      			"metadata" : {
      				"formatVersion" : 1,
      				"infoObj" : "{ \"v\" : 1, \"key\" : { \"_id\" : 1 }, \"name\" : \"_id_\", \"ns\" : \"test.foo\" }"
      			},
      			"creationString" : "allocation_size=4KB,app_metadata=(formatVersion=1,infoObj={ \"v\" : 1, \"key\" : { \"_id\" : 1 }, \"name\" : \"_id_\", \"ns\" : \"test.foo\" }),block_allocation=best,block_compressor=,cache_resident=0,checkpoint=(WiredTigerCheckpoint.184=(addr=\"01808080808080c54181e4cc3a79fc808080e3677fc0e30b3fc0\",order=184,time=1419968721,size=745472,write_gen=13212)),checkpoint_lsn=(137,95205632),checksum=uncompressed,collator=mongo_index,columns=,dictionary=0,format=btree,huffman_key=,huffman_value=,id=48,internal_item_max=0,internal_key_max=0,internal_key_truncate=,internal_page_max=4KB,key_format=u,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=16k,leaf_value_max=0,memory_page_max=5MB,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=,prefix_compression_min=4,split_pct=75,value_format=u,version=(major=1,minor=1)",
      			"type" : "file",
      			"uri" : "statistics:table:index-42-2378667660490186394",
      			"LSM" : {
      				"bloom filters in the LSM tree" : 0,
      				"bloom filter false positives" : 0,
      				"bloom filter hits" : 0,
      				"bloom filter misses" : 0,
      				"bloom filter pages evicted from cache" : 0,
      				"bloom filter pages read into cache" : 0,
      				"total size of bloom filters" : 0,
      				"sleep for LSM checkpoint throttle" : 0,
      				"chunks in the LSM tree" : 0,
      				"highest merge generation in the LSM tree" : 0,
      				"queries that could have benefited from a Bloom filter that did not exist" : 0,
      				"sleep for LSM merge throttle" : 0
      			},
      			"block-manager" : {
      				"file allocation unit size" : 4096,
      				"blocks allocated" : 6520,
      				"checkpoint size" : 745472,
      				"allocations requiring file extension" : 913,
      				"blocks freed" : 6670,
      				"file magic number" : 120897,
      				"file major version number" : 1,
      				"minor version number" : 0,
      				"file bytes available for reuse" : 6782976,
      				"file size in bytes" : 6791168
      			},
      			"btree" : {
      				"column-store variable-size deleted values" : 0,
      				"column-store fixed-size leaf pages" : 0,
      				"column-store internal pages" : 0,
      				"column-store variable-size leaf pages" : 0,
      				"pages rewritten by compaction" : 0,
      				"number of key/value pairs" : 0,
      				"fixed-record size" : 0,
      				"maximum tree depth" : 4,
      				"maximum internal page key size" : 307,
      				"maximum internal page size" : 4096,
      				"maximum leaf page key size" : 1228,
      				"maximum leaf page size" : 16384,
      				"maximum leaf page value size" : 6144,
      				"overflow pages" : 0,
      				"row-store internal pages" : 0,
      				"row-store leaf pages" : 0
      			},
      			"cache" : {
      				"bytes read into cache" : 1878534919,
      				"bytes written from cache" : 44322318,
      				"checkpoint blocked page eviction" : 0,
      				"unmodified pages evicted" : 257917,
      				"modified pages evicted" : 4773,
      				"data source pages selected for eviction unable to be evicted" : 137,
      				"hazard pointer blocked page eviction" : 36,
      				"internal pages evicted" : 76,
      				"in-memory page splits" : 2,
      				"overflow values cached in memory" : 0,
      				"pages read into cache" : 262158,
      				"overflow pages read into cache" : 0,
      				"pages written from cache" : 6273
      			},
      			"compression" : {
      				"raw compression call failed, no additional data available" : 0,
      				"raw compression call failed, additional data available" : 0,
      				"raw compression call succeeded" : 0,
      				"compressed pages read" : 0,
      				"compressed pages written" : 0,
      				"page written failed to compress" : 0,
      				"page written was too small to compress" : 0
      			},
      			"cursor" : {
      				"create calls" : 67,
      				"insert calls" : 2318000,
      				"bulk-loaded cursor-insert calls" : 0,
      				"cursor-insert key and value bytes inserted" : 62586000,
      				"next calls" : 168413214,
      				"prev calls" : 3,
      				"remove calls" : 2405685,
      				"cursor-remove key bytes removed" : 45708015,
      				"reset calls" : 4724459,
      				"search calls" : 0,
      				"search near calls" : 775,
      				"update calls" : 0,
      				"cursor-update value bytes updated" : 0
      			},
      			"reconciliation" : {
      				"dictionary matches" : 0,
      				"internal page multi-block writes" : 91,
      				"leaf page multi-block writes" : 285,
      				"maximum blocks required for a page" : 10,
      				"internal-page overflow keys" : 0,
      				"leaf-page overflow keys" : 0,
      				"overflow values written" : 0,
      				"pages deleted" : 4896,
      				"page checksum matches" : 722,
      				"page reconciliation calls" : 8445,
      				"page reconciliation calls for eviction" : 4556,
      				"leaf page key bytes discarded using prefix compression" : 63906915,
      				"internal page key bytes discarded using suffix compression" : 0
      			},
      			"session" : {
      				"object compaction" : 0,
      				"open cursor count" : 1613
      			},
      			"transaction" : {
      				"update conflicts" : 1370
      			}
      		},
      		"test.foo.$x_1" : {
      			"metadata" : {
      				"formatVersion" : 1,
      				"infoObj" : "{ \"v\" : 1, \"key\" : { \"x\" : 1 }, \"name\" : \"x_1\", \"ns\" : \"test.foo\" }"
      			},
      			"creationString" : "allocation_size=4KB,app_metadata=(formatVersion=1,infoObj={ \"v\" : 1, \"key\" : { \"x\" : 1 }, \"name\" : \"x_1\", \"ns\" : \"test.foo\" }),block_allocation=best,block_compressor=,cache_resident=0,checkpoint=(WiredTigerCheckpoint.194=(addr=\"01c24781e42d2d3f0cc24d81e407bf6fa7c2b981e4a00c9d98808080e3b0ffc0e30bffc0\",order=194,time=1419968721,size=794624,write_gen=128400)),checkpoint_lsn=(137,95205632),checksum=uncompressed,collator=mongo_index,columns=,dictionary=0,format=btree,huffman_key=,huffman_value=,id=49,internal_item_max=0,internal_key_max=0,internal_key_truncate=,internal_page_max=4KB,key_format=u,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=16k,leaf_value_max=0,memory_page_max=5MB,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=,prefix_compression_min=4,split_pct=75,value_format=u,version=(major=1,minor=1)",
      			"type" : "file",
      			"uri" : "statistics:table:index-43-2378667660490186394",
      			"LSM" : {
      				"bloom filters in the LSM tree" : 0,
      				"bloom filter false positives" : 0,
      				"bloom filter hits" : 0,
      				"bloom filter misses" : 0,
      				"bloom filter pages evicted from cache" : 0,
      				"bloom filter pages read into cache" : 0,
      				"total size of bloom filters" : 0,
      				"sleep for LSM checkpoint throttle" : 0,
      				"chunks in the LSM tree" : 0,
      				"highest merge generation in the LSM tree" : 0,
      				"queries that could have benefited from a Bloom filter that did not exist" : 0,
      				"sleep for LSM merge throttle" : 0
      			},
      			"block-manager" : {
      				"file allocation unit size" : 4096,
      				"blocks allocated" : 76509,
      				"checkpoint size" : 794624,
      				"allocations requiring file extension" : 1675,
      				"blocks freed" : 76306,
      				"file magic number" : 120897,
      				"file major version number" : 1,
      				"minor version number" : 0,
      				"file bytes available for reuse" : 11583488,
      				"file size in bytes" : 11608064
      			},
      			"btree" : {
      				"column-store variable-size deleted values" : 0,
      				"column-store fixed-size leaf pages" : 0,
      				"column-store internal pages" : 0,
      				"column-store variable-size leaf pages" : 0,
      				"pages rewritten by compaction" : 0,
      				"number of key/value pairs" : 0,
      				"fixed-record size" : 0,
      				"maximum tree depth" : 4,
      				"maximum internal page key size" : 307,
      				"maximum internal page size" : 4096,
      				"maximum leaf page key size" : 1228,
      				"maximum leaf page size" : 16384,
      				"maximum leaf page value size" : 6144,
      				"overflow pages" : 0,
      				"row-store internal pages" : 0,
      				"row-store leaf pages" : 0
      			},
      			"cache" : {
      				"bytes read into cache" : 1559966632,
      				"bytes written from cache" : 235312667,
      				"checkpoint blocked page eviction" : 21,
      				"unmodified pages evicted" : 260962,
      				"modified pages evicted" : 104439,
      				"data source pages selected for eviction unable to be evicted" : 13246,
      				"hazard pointer blocked page eviction" : 147,
      				"internal pages evicted" : 49,
      				"in-memory page splits" : 0,
      				"overflow values cached in memory" : 0,
      				"pages read into cache" : 325598,
      				"overflow pages read into cache" : 0,
      				"pages written from cache" : 76243
      			},
      			"compression" : {
      				"raw compression call failed, no additional data available" : 0,
      				"raw compression call failed, additional data available" : 0,
      				"raw compression call succeeded" : 0,
      				"compressed pages read" : 0,
      				"compressed pages written" : 0,
      				"page written failed to compress" : 0,
      				"page written was too small to compress" : 0
      			},
      			"cursor" : {
      				"create calls" : 132,
      				"insert calls" : 4920197,
      				"bulk-loaded cursor-insert calls" : 0,
      				"cursor-insert key and value bytes inserted" : 113164531,
      				"next calls" : 208875635,
      				"prev calls" : 13,
      				"remove calls" : 5006589,
      				"cursor-remove key bytes removed" : 115151547,
      				"reset calls" : 14239718,
      				"search calls" : 0,
      				"search near calls" : 4309767,
      				"update calls" : 0,
      				"cursor-update value bytes updated" : 0
      			},
      			"reconciliation" : {
      				"dictionary matches" : 0,
      				"internal page multi-block writes" : 113,
      				"leaf page multi-block writes" : 686,
      				"maximum blocks required for a page" : 5,
      				"internal-page overflow keys" : 0,
      				"leaf-page overflow keys" : 0,
      				"overflow values written" : 0,
      				"pages deleted" : 20616,
      				"page checksum matches" : 230,
      				"page reconciliation calls" : 130012,
      				"page reconciliation calls for eviction" : 113199,
      				"leaf page key bytes discarded using prefix compression" : 313612525,
      				"internal page key bytes discarded using suffix compression" : 0
      			},
      			"session" : {
      				"object compaction" : 0,
      				"open cursor count" : 1678
      			},
      			"transaction" : {
      				"update conflicts" : 1870
      			}
      		}
      	},
      	"valid" : true,
      	"errors" : [ ],
      	"ok" : 1
      }

        Attachments

        1. mongod.log.gz
          411 kB
        2. powercycle.js
          2 kB
        3. wt.tar.gz
          7.43 MB

          Issue Links

            Activity

              People

              Assignee:
              mark.benvenuto Mark Benvenuto
              Reporter:
              jonathan.abrahams Jonathan Abrahams
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: