  1. Core Server
  2. SERVER-27565

Secondary Mongod failed with read checksum error

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.2.5
    • Component/s: WiredTiger
    • Labels: None
    • ALL

      On a 28-shard cluster, with each shard being a 3-node replica set running MongoDB 3.2.5 with WiredTiger, I saw a single secondary mongod fail with a read checksum error, as shown below.
      The environment is CentOS Linux release 7.0.1406, and the mongod process writes to local disk. I am also attaching the full log file of the secondary that failed.

      2016-12-28T07:11:01.763-0500 I INDEX    [repl writer worker 13] build index done.  scanned 6723 total records. 0 secs
      2016-12-28T07:11:07.614-0500 I COMMAND  [conn18188] command local.oplog.rs command: getMore { getMore: 14529216539, collection: "oplog.rs", maxTimeMS: 5000, term: 41, lastKnownCommittedOpTime: { ts: Timestamp 1482927067000|217, t: 41 } } cursorid:14529216539 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:3 reslen:105611 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 287484 } }, oplog: { acquireCount: { r: 1 } } } protocol:op_command 287ms
      2016-12-28T07:11:19.374-0500 E STORAGE  [thread2] WiredTiger (0) [1482927079:374811][17531:0x7faa199e1700], file:trancheinfodb_20161228/collection-392--4692130608470797293.wt, WT_SESSION.checkpoint: read checksum error for 4096B block at offset 339968: block header checksum of 1570021396 doesn't match expected checksum of 111389135
      2016-12-28T07:11:19.374-0500 E STORAGE  [thread2] WiredTiger (0) [1482927079:374861][17531:0x7faa199e1700], file:trancheinfodb_20161228/collection-392--4692130608470797293.wt, WT_SESSION.checkpoint: trancheinfodb_20161228/collection-392--4692130608470797293.wt: encountered an illegal file format or internal value
      2016-12-28T07:11:19.374-0500 E STORAGE  [thread2] WiredTiger (-31804) [1482927079:374871][17531:0x7faa199e1700], file:trancheinfodb_20161228/collection-392--4692130608470797293.wt, WT_SESSION.checkpoint: the process must exit and restart: WT_PANIC: WiredTiger library panic
      2016-12-28T07:11:19.374-0500 I -        [thread2] Fatal Assertion 28558
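
      When scanning the attached log for similar failures, the fields of the checksum error line (file, block size, offset, and the two checksums) can be pulled out with a small script. The following is a minimal sketch, assuming the message keeps exactly the wording seen in the excerpt above; the pattern and the function name `parse_checksum_error` are illustrative and not part of MongoDB or WiredTiger.

```python
import re

# Pattern written against the wording of the log line in this report.
# Assumption: other WiredTiger versions may phrase the message differently.
CHECKSUM_RE = re.compile(
    r"file:(?P<file>\S+), "
    r"(?P<op>[\w.]+): read checksum error for (?P<size>\d+)B block "
    r"at offset (?P<offset>\d+): block header checksum of (?P<found>\d+) "
    r"doesn't match expected checksum of (?P<expected>\d+)"
)


def parse_checksum_error(line):
    """Return the fields of a read checksum error line, or None if no match."""
    match = CHECKSUM_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("size", "offset", "found", "expected"):
        info[key] = int(info[key])
    return info
```

      Applied to each line of the log, this returns a dictionary for the checksum error line and `None` for every other line, which makes it easy to list the affected files and offsets.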
      

            Assignee: Kelsey Schubert (kelsey.schubert@mongodb.com)
            Reporter: Darshan Shah (darshan.shah@interactivedata.com)
            Votes: 0
            Watchers: 6

              Created:
              Updated:
              Resolved: