  1. Core Server
  2. SERVER-10768

Add proper support for SIGSTOP and SIGCONT (currently, suspending a replica set primary can cause data loss)

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical - P2
    • Affects Version/s: 2.4.6
    • Component/s: Replication
    • Labels:
    • Environment:
      Distributor ID: Ubuntu
      Description: Ubuntu 12.04.2 LTS
      Release: 12.04
      Codename: precise
    • Linux

      Untar the archive and put all files under MONGODB. Running MONGODB/my.py
      then reproduces the bug. It triggers the following sequence of events:

      • Start a replica set of three replicas: S1, S2, S3.
        Say S1 is the primary and S2/S3 are backups.
      • Insert x:1 into S1.
      • Suspend S1 (i.e. send the mongod process a SIGSTOP signal).
      • Wait for S2/S3 to elect a new leader, say S2.
      • Insert x:2 (sent to S2) with w=2.
      • Resume S1 (i.e. send the mongod process a SIGCONT signal).
      • S2 and S3 roll back x:2! But x:2 is supposed to be durable!
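
The suspend and resume steps above can be sketched in Python roughly as follows. This is only an illustration, not the contents of my.py: the helper names `suspend`, `resume`, and `demo` are hypothetical, and a sleeping child process stands in for the S1 mongod so the sketch runs without a replica set. It assumes a POSIX system such as the Linux environment listed in this report.

```python
import os
import signal
import subprocess
import sys


def suspend(pid):
    """Freeze a process with SIGSTOP (the signal cannot be caught or ignored)."""
    os.kill(pid, signal.SIGSTOP)


def resume(pid):
    """Let a stopped process continue running with SIGCONT."""
    os.kill(pid, signal.SIGCONT)


def demo():
    # Stand-in child process (assumption); in the report this is the S1 mongod.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        suspend(child.pid)
        # WUNTRACED makes waitpid report a child that is stopped, not exited.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        resume(child.pid)
        # WCONTINUED makes waitpid report a child that has been resumed.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        continued = os.WIFCONTINUED(status)
        return stopped, continued
    finally:
        # Clean up the stand-in process.
        child.kill()
        child.wait()
```

Note that nothing in the suspended process observes the pause: from its own point of view no time has passed, which is presumably why a resumed S1 may still consider itself primary.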

      I am not sure whether the following "problem" is assumed to be unrealistic or whether it is a bug in MongoDB. The problem is that MongoDB may discard data that has been replicated to a majority of servers. These are terrible semantics (note that nothing crashes!).

            Assignee: Matt Dannenberg (matt.dannenberg)
            Reporter: Yandong Mao (ydmao)
            Votes: 1
            Watchers: 13
