Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-28366

mongod --repair fails on a secondary when --replSet is passed.

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Done
    • Affects Version/s: 3.2.12
    • Fix Version/s: None
    • Component/s: Admin
    • Labels:
      None
    • Environment:
      GNU/Linux Debian Wheezy x86_64
      Kernel 4.4.2
      xfs file system
    • Operating System:
      ALL
    • Steps To Reproduce:
      Hide
      1. create replica set with auth enabled

        mkdir -p /data/repairtest/primary /data/repairtest/secondary_mmapv1 /data/repairtest/secondary_wiredTiger
        openssl rand -base64 741 >/data/repairtest/keyfile
        mongod --port 27066 --storageEngine mmapv1 --dbpath /data/repairtest/primary --replSet repairtest --smallfiles --noprealloc --oplogSize=100 --auth --keyFile /data/repairtest/keyfile
        mongod --port 27067 --storageEngine mmapv1 --dbpath /data/repairtest/secondary_mmapv1 --replSet repairtest --smallfiles --noprealloc --oplogSize=100 --auth --keyFile /data/repairtest/keyfile
        mongod --port 27068 --storageEngine wiredTiger --dbpath /data/repairtest/secondary_wiredTiger --replSet repairtest --oplogSize=100 --auth --keyFile /data/repairtest/keyfile
        echo 'rs.initiate({_id:"repairtest",version:1,members:[{ _id:0,host:"localhost:27066"}]})' | mongo --port=27066 --host localhost
        echo 'db.createUser( { user: "root", pwd: "genericpassword", roles: [ { role: "root", db: "admin" } ] } )' | mongo admin --port 27066 --host localhost
        echo 'rs.add("localhost:27067")' | mongo admin --port 27066 --host localhost -u root -p genericpassword
        echo 'rs.add("localhost:27068")' | mongo admin --port 27066 --host localhost -u root -p genericpassword
        

      2. shutdown mmapv1 secondary and restart it with --repair

        mongod --port 27067 --storageEngine mmapv1 --dbpath /data/repairtest/secondary_mmapv1 --replSet repairtest --smallfiles --noprealloc --oplogSize=100 --auth --keyFile /data/repairtest/keyfile --repair
        

      Show
      create replica set with auth enabled mkdir -p /data/repairtest/primary /data/repairtest/secondary_mmapv1 /data/repairtest/secondary_wiredTiger openssl rand -base64 741 > /data/repairtest/keyfile mongod --port 27066 --storageEngine mmapv1 --dbpath /data/repairtest/primary --replSet repairtest --smallfiles --noprealloc --oplogSize=100 --auth --keyFile /data/repairtest/keyfile mongod --port 27067 --storageEngine mmapv1 --dbpath /data/repairtest/secondary_mmapv1 --replSet repairtest --smallfiles --noprealloc --oplogSize=100 --auth --keyFile /data/repairtest/keyfile mongod --port 27068 --storageEngine wiredTiger --dbpath /data/repairtest/secondary_wiredTiger --replSet repairtest --oplogSize=100 --auth --keyFile /data/repairtest/keyfile echo 'rs.initiate({_id:"repairtest",version:1,members:[{ _id:0,host:"localhost:27066"}]})' | mongo --port=27066 --host localhost echo 'db.createUser( { user: "root", pwd: "genericpassword", roles: [ { role: "root", db: "admin" } ] } )' | mongo admin --port 27066 --host localhost echo 'rs.add("localhost:27067")' | mongo admin --port 27066 --host localhost -u root -p genericpassword echo 'rs.add("localhost:27068")' | mongo admin --port 27066 --host localhost -u root -p genericpassword shutdown mmapv1 secondary and restart it with --repair mongod --port 27067 --storageEngine mmapv1 --dbpath /data/repairtest/secondary_mmapv1 --replSet repairtest --smallfiles --noprealloc --oplogSize=100 --auth --keyFile /data/repairtest/keyfile --repair

      Description

      We use repairDatabase (https://docs.mongodb.com/v3.2/reference/command/repairDatabase/) as the way to reclaim disk space. After upgrade to 3.2.12 (from 3.0.8) it doesn't work with mmapv1 and repair process ends with fatal assertion. With wiredTiger storage engine repair works.

      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] MongoDB starting : pid=16501 port=27067 dbpath=/data/repairtest/secondary_mmapv1 64-bit host=somegenerichost
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] db version v3.2.12
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] git version: ef3e1bc78e997f0d9f22f45aeb1d8e3b6ac14a14
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e 11 Feb 2013
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] allocator: tcmalloc
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] modules: none
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] build environment:
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten]     distmod: debian71
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten]     distarch: x86_64
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten]     target_arch: x86_64
      2017-03-17T12:36:57.822+0100 I CONTROL  [initandlisten] options: { net: { port: 27067 }, repair: true, replication: { oplogSizeMB: 100, replSet: "repairtest" }, security: { authorization: "enabled", keyFile: "/data/repairtest/keyfile" }, storage: { dbPath: "/data/repairtest/secondary_mmapv1", engine: "mmapv1", mmapv1: { preallocDataFiles: false, smallFiles: true } } }
      2017-03-17T12:36:57.892+0100 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
      2017-03-17T12:36:57.892+0100 I CONTROL  [initandlisten]
      2017-03-17T12:36:57.893+0100 I STORAGE  [initandlisten] repairDatabase local
      2017-03-17T12:36:57.896+0100 I INDEX    [initandlisten] allocating new ns file /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/local.ns, filling with zeroes...
      2017-03-17T12:36:57.926+0100 I STORAGE  [FileAllocator] allocating new datafile /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/local.0, filling with zeroes...
      2017-03-17T12:36:57.926+0100 I STORAGE  [FileAllocator] creating directory /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/_tmp
      2017-03-17T12:36:57.927+0100 I STORAGE  [FileAllocator] done allocating datafile /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/local.0, size: 16MB,  took 0 secs
      2017-03-17T12:36:57.929+0100 I INDEX    [initandlisten] build index on: local.me properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "local.me" }
      2017-03-17T12:36:57.929+0100 I INDEX    [initandlisten] 	 building index using bulk method
      2017-03-17T12:36:57.930+0100 I STORAGE  [FileAllocator] allocating new datafile /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/local.1, filling with zeroes...
      2017-03-17T12:36:57.930+0100 I STORAGE  [FileAllocator] done allocating datafile /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/local.1, size: 128MB,  took 0 secs
      2017-03-17T12:36:57.931+0100 I INDEX    [initandlisten] build index on: local.replset.minvalid properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "local.replset.minvalid" }
      2017-03-17T12:36:57.931+0100 I INDEX    [initandlisten] 	 building index using bulk method
      2017-03-17T12:36:57.931+0100 I INDEX    [initandlisten] build index on: local.startup_log properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "local.startup_log" }
      2017-03-17T12:36:57.931+0100 I INDEX    [initandlisten] 	 building index using bulk method
      2017-03-17T12:36:57.932+0100 I INDEX    [initandlisten] build index on: local.system.replset properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "local.system.replset" }
      2017-03-17T12:36:57.932+0100 I INDEX    [initandlisten] 	 building index using bulk method
      2017-03-17T12:36:57.943+0100 I STORAGE  [initandlisten] repairDatabase admin
      2017-03-17T12:36:57.945+0100 I INDEX    [initandlisten] allocating new ns file /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/admin.ns, filling with zeroes...
      2017-03-17T12:36:57.974+0100 I STORAGE  [FileAllocator] allocating new datafile /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/admin.0, filling with zeroes...
      2017-03-17T12:36:57.974+0100 I STORAGE  [FileAllocator] creating directory /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/_tmp
      2017-03-17T12:36:57.975+0100 I STORAGE  [FileAllocator] done allocating datafile /data/repairtest/secondary_mmapv1/_tmp_repairDatabase_0/admin.0, size: 16MB,  took 0 secs
      2017-03-17T12:36:57.977+0100 F REPL     [initandlisten] logOp() but can't accept write to collection admin.$cmd
      2017-03-17T12:36:57.977+0100 I -        [initandlisten] Fatal Assertion 17405
      2017-03-17T12:36:57.977+0100 I -        [initandlisten]
       
      ***aborting after fassert() failure
       
       
      2017-03-17T12:36:57.996+0100 F -        [initandlisten] Got signal: 6 (Aborted).
       
       0x1315b12 0x1314a39 0x1315242 0x7fb3cf3070a0 0x7fb3cef9d125 0x7fb3cefa03a0 0x12981d2 0xe81dcc 0xe837b3 0xe83972 0xcd341b 0xa9cfd7 0x10491ac 0xe23cd7 0x9462e3 0x9481b0 0x949a3d 0x7fb3cef89ead 0x942549
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"400000","o":"F15B12","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"F14A39"},{"b":"400000","o":"F15242"},{"b":"7FB3CF2F8000","o":"F0A0"},{"b":"7FB3CEF6B000","o":"32125","s":"gsignal"},{"b":"7FB3CEF6B000","o":"353A0","s":"abort"},{"b":"400000","o":"E981D2","s":"_ZN5mongo13fassertFailedEi"},{"b":"400000","o":"A81DCC"},{"b":"400000","o":"A837B3","s":"_ZN5mongo4repl6_logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_bRKSsNS0_22ReplicationCoordinator4ModeEb"},{"b":"400000","o":"A83972","s":"_ZN5mongo4repl5logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_b"},{"b":"400000","o":"8D341B","s":"_ZN5mongo10OpObserver18onCreateCollectionEPNS_16OperationContextERKNS_15NamespaceStringERKNS_17CollectionOptionsE"},{"b":"400000","o":"69CFD7","s":"_ZN5mongo8Database16createCollectionEPNS_16OperationContextENS_10StringDataERKNS_17CollectionOptionsEb"},{"b":"400000","o":"C491AC","s":"_ZN5mongo12MMAPV1Engine14repairDatabaseEPNS_16OperationContextERKSsbb"},{"b":"400000","o":"A23CD7","s":"_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKSsbb"},{"b":"400000","o":"5462E3"},{"b":"400000","o":"5481B0","s":"_ZN5mongo13initAndListenEi"},{"b":"400000","o":"549A3D","s":"main"},{"b":"7FB3CEF6B000","o":"1EEAD","s":"__libc_start_main"},{"b":"400000","o":"542549"}],"processInfo":{ "mongodbVersion" : "3.2.12", "gitVersion" : "ef3e1bc78e997f0d9f22f45aeb1d8e3b6ac14a14", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.2-lscs-2", "version" : "#1 SMP Fri Feb 19 11:08:38 CET 2016", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "A9B18E620F02487C1F2B5F355EB119AC75B94CD7" }, { "b" : "7FFF62A70000", "elfType" : 3, "buildId" : "08F635781C80161ACACC1542E77EF77D0A04D08D" }, { "b" : "7FB3D01B1000", "path" : "/usr/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "16DA62C831D8080E3BCB9A1E7738D399EA3A89C0" }, { "b" : "7FB3CFDB8000", "path" : "/usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "079D784F9F4A88B6D87FC3E6B0B07BFBE209540E" }, { "b" : "7FB3CFBB0000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "77CB0070BB75470C626B899800FFCF8C8184E32A" }, { "b" : "7FB3CF9AC000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "32CCD52B7228B03E969C21587F14C2322D4A9544" }, { "b" : "7FB3CF72A000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "98FAFAB6E776EA40FBFFFBCB22859A8D54028EC0" }, { "b" : "7FB3CF514000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "F980B1188708F8D8B5C35D185444AF4CB939AA1E" }, { "b" : "7FB3CF2F8000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "F260CD217A3C0D89FD80BCC80D486C1C2AD32DCA" }, { "b" : "7FB3CEF6B000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "BD7458ADC60E906DBAABB72A7DA5015D5BC8269F" }, { "b" : "7FB3D0410000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "B66F918D37C1E4122FC91C74210F2112A6A5148E" }, { "b" : "7FB3CED54000", "path" : "/lib/x86_64-linux-gnu/libz.so.1", "elfType" : 3, "buildId" : "1EFEB71FD4999C2307570D673A724EA4E1D85267" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x1315b12]
       mongod(+0xF14A39) [0x1314a39]
       mongod(+0xF15242) [0x1315242]
       libpthread.so.0(+0xF0A0) [0x7fb3cf3070a0]
       libc.so.6(gsignal+0x35) [0x7fb3cef9d125]
       libc.so.6(abort+0x180) [0x7fb3cefa03a0]
       mongod(_ZN5mongo13fassertFailedEi+0x82) [0x12981d2]
       mongod(+0xA81DCC) [0xe81dcc]
       mongod(_ZN5mongo4repl6_logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_bRKSsNS0_22ReplicationCoordinator4ModeEb+0x213) [0xe837b3]
       mongod(_ZN5mongo4repl5logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_b+0x62) [0xe83972]
       mongod(_ZN5mongo10OpObserver18onCreateCollectionEPNS_16OperationContextERKNS_15NamespaceStringERKNS_17CollectionOptionsE+0x31B) [0xcd341b]
       mongod(_ZN5mongo8Database16createCollectionEPNS_16OperationContextENS_10StringDataERKNS_17CollectionOptionsEb+0x2F7) [0xa9cfd7]
       mongod(_ZN5mongo12MMAPV1Engine14repairDatabaseEPNS_16OperationContextERKSsbb+0xF6C) [0x10491ac]
       mongod(_ZN5mongo14repairDatabaseEPNS_16OperationContextEPNS_13StorageEngineERKSsbb+0xF77) [0xe23cd7]
       mongod(+0x5462E3) [0x9462e3]
       mongod(_ZN5mongo13initAndListenEi+0x10) [0x9481b0]
       mongod(main+0x15D) [0x949a3d]
       libc.so.6(__libc_start_main+0xFD) [0x7fb3cef89ead]
       mongod(+0x542549) [0x942549]
      -----  END BACKTRACE  -----
      Aborted
      

        Attachments

        1. logfile.txt
          9 kB
        2. steps_to_reproduce.txt
          1 kB

          Activity

            People

            Assignee:
            mark.agarunov Mark Agarunov
            Reporter:
            filip.havlicek@livesport.eu Filip Havlí?ek
            Participants:
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved: