Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-10578

got writeback waitfor for older id (Sharded Environment)

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.4.5
    • Component/s: None
    • Labels:
      None
    • Environment:
      CentOS 6.4 x86_64 + pymongo 2.5.2
    • Linux

      The issue only came about when we moved from a single mongod instance to a sharded environment. After a few million inserts I start getting the following OperationFailure error and inserts slow down to a crawl.

      I assume this is a bug as the source states:
      // it should be impossible because once we call GLE, no other
      // writebacks should happen with that connection id

      OperationFailure Errors:
      ERROR: error inserting 1 documents to shard shard0002:mongo3:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback waitfor for older id oid: 5212d98f0000000000000021 s.id: 52133f4c0000000000000082 ident: localhost.localdomain:2
      ERROR: error inserting 1 documents to shard shard0001:mongo2:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback waitfor for older id oid: 5212d975000000000000001c s.id: 52133f4c0000000000000083 ident: localhost.localdomain:5
      ERROR: error inserting 3 documents to shard shard0001:mongo2:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback waitfor for older id oid: 5212d98f0000000000000020 s.id: 52133f4c000000000000007d ident: localhost.localdomain:36
      (It happens over and over)

      mongos log output:
      Tue Aug 20 16:36:00.735 [conn58] warning: exception during insert (continue on
      error set) :: caused by :: 16460 error inserting 1 documents to shard shard000
      0:mongo1:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got
      writeback waitfor for older id oid: 5212d975000000000000001c s.id: 52133f4c00
      00000000000083 ident: localhost.localdomain:5
      Tue Aug 20 16:36:00.735 [conn58] warning: swallowing exception during insert
      :: caused by :: 16460 error inserting 1 documents to shard
      shard0000:mongo1:2701
      7 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback
      waitfor for older id oid: 5212d975000000000000001c s.id:
      52133f4c000000000000008
      3 ident: localhost.localdomain:5
      Tue Aug 20 16:36:00.736 [conn58] Assertion: 14041:got writeback waitfor for
      older id oid: 5212d975000000000000001c s.id: 52133f4c0000000000000083 ident:
      loc
      alhost.localdomain:5
      0xa86db1 0xa4cbbb 0xa4d0fc 0x9f2b4f 0x8d376f 0x8d7413 0x9c2960 0x9c522c
      0x994011 0x66a161 0xa7351e 0x3bb3e07851 0x3bb36e890d
      mongos(_ZN5mongo15printStackTraceERSo+0x21) [0xa86db1]
      mongos(_ZN5mongo11msgassertedEiPKc+0x9b) [0xa4cbbb]
      mongos() [0xa4d0fc]
      mongos(_ZN5mongo17WriteBackListener7waitForERKNS0_15ConnectionIdentERKNS_3OIDE+0xa7f)
      [0x9f2b4f]
      mongos(_ZN5mongo10ClientInfo17_handleWriteBacksERKSt6vectorINS0_6WBInfoESaIS2_EEb+0x9f)
      [0x8d376f]
      mongos(_ZN5mongo10ClientInfo12getLastErrorERKSsRKNS_7BSONObjERNS_14BSONObjBuilderERSsb+0x15d3)
      [0x8d7413]
      mongos(_ZN5mongo13ShardStrategy7_insertERKSsRNS_9DbMessageEiRNS_7RequestE+0x640)
      [0x9c2960]
      mongos(_ZN5mongo13ShardStrategy7writeOpEiRNS_7RequestE+0x53c) [0x9c522c]
      mongos(_ZN5mongo7Request7processEi+0xd1) [0x994011]
      mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71)
      [0x66a161]
      mongos(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xa7351e]
      /lib64/libpthread.so.0() [0x3bb3e07851]
      /lib64/libc.so.6(clone+0x6d) [0x3bb36e890d]

      It's probably also worth noting that the python script doing these inserts is multiprocess.

            Assignee:
            Unassigned Unassigned
            Reporter:
            bchivari Brad C.
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: