  1. Core Server
  2. SERVER-5900

Shutting down then restarting shard can crash mongos

Details

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 2.1.1
    • Component/s: Sharding, Stability
    • Labels: None
    • Operating System: ALL

    Description

      The attached test file reproduces a problem: if you restart a shard and then do an insert, mongos hits an assertion failure:

       m30999| Tue May 22 16:36:06 [conn2]   Assertion failure _addr.size() src/mongo/s/shard.h 81
       m30999| 0x9ec27f 0xb2339c 0x92ba3c 0x7e6446 0xb46dc8 0x7c99f8 0x7d82ee 0x8106a9 0x8244e9 0x99875c 0x7fcd11214d8c 0x7fcd105b6c2d 
       m30999|  /ssd/mongo2/mongos(_ZN5mongo15printStackTraceERSo+0x27) [0x9ec27f]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo12sayDbContextEPKc+0x5e) [0xb2339c]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo12verifyFailedEPKcS1_j+0x122) [0x92ba3c]
       m30999|  /ssd/mongo2/mongos(_ZNK5mongo5Shard13getConnStringEv+0x46) [0x7e6446]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo15ShardConnectionC2ERKNS_5ShardERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEE+0x4e) [0xb46dc8]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo8Strategy7doWriteEiRNS_7RequestERKNS_5ShardEb+0xb0) [0x7c99f8]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo13ShardStrategy7writeOpEiRNS_7RequestE+0x284) [0x7d82ee]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo7Request7processEi+0x249) [0x8106a9]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xe3) [0x8244e9]
       m30999|  /ssd/mongo2/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x374) [0x99875c]
       m30999|  /lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c) [0x7fcd11214d8c]
       m30999|  /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fcd105b6c2d]
       m30999| Tue May 22 16:36:06 [conn2] 
       m30999| 
       m30999| ***aborting after verify() failure as this is a debug/test build
       m30999| 
       m30999| 
       m30999| Received signal 6
       m30999| Backtrace: 0x94a400 0x94a4ca 0x7fcd10503d80 0x7fcd10503d05 0x7fcd10507ab6 0x92bb59 0x7e6446 0xb46dc8 0x7c99f8 0x7d82ee 0x8106a9 0x8244e9 0x99875c 0x7fcd11214d8c 0x7fcd105b6c2d 
       m30999| /ssd/mongo2/mongos[0x94a400]
       m30999| /ssd/mongo2/mongos(_ZN5mongo17printStackAndExitEi+0x52)[0x94a4ca]
       m30999| /lib/x86_64-linux-gnu/libc.so.6(+0x33d80)[0x7fcd10503d80]
       m30999| /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fcd10503d05]
       m30999| /lib/x86_64-linux-gnu/libc.so.6(abort+0x186)[0x7fcd10507ab6]
       m30999| /ssd/mongo2/mongos(_ZN5mongo12verifyFailedEPKcS1_j+0x23f)[0x92bb59]
       m30999| /ssd/mongo2/mongos(_ZNK5mongo5Shard13getConnStringEv+0x46)[0x7e6446]
       m30999| /ssd/mongo2/mongos(_ZN5mongo15ShardConnectionC2ERKNS_5ShardERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEE+0x4e)[0xb46dc8]
       m30999| /ssd/mongo2/mongos(_ZN5mongo8Strategy7doWriteEiRNS_7RequestERKNS_5ShardEb+0xb0)[0x7c99f8]
       m30999| /ssd/mongo2/mongos(_ZN5mongo13ShardStrategy7writeOpEiRNS_7RequestE+0x284)[0x7d82ee]
       m30999| /ssd/mongo2/mongos(_ZN5mongo7Request7processEi+0x249)[0x8106a9]
       m30999| /ssd/mongo2/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xe3)[0x8244e9]
       m30999| /ssd/mongo2/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x374)[0x99875c]
       m30999| /lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c)[0x7fcd11214d8c]
       m30999| /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fcd105b6c2d]
       m30999| ===
       m29000| Tue May 22 16:36:06 [conn4] end connection 127.0.0.1:42103 (4 connections now open)
       m29000| Tue May 22 16:36:06 [conn6] end connection 127.0.0.1:42105 (4 connections now open)
       m31101| Tue May 22 16:36:06 [conn7] end connection 127.0.0.1:36524 (6 connections now open)
       m31200| Tue May 22 16:36:06 [conn5] end connection 127.0.0.1:37848 (7 connections now open)
       m31200| Tue May 22 16:36:06 [conn8] end connection 127.0.0.1:37855 (7 connections now open)
       m31201| Tue May 22 16:36:06 [conn5] end connection 127.0.0.1:42880 (6 connections now open)
       m31100| Tue May 22 16:36:06 [conn13] end connection 127.0.0.1:52630 (10 connections now open)
       m29000| Tue May 22 16:36:06 [conn3] end connection 127.0.0.1:42096 (4 connections now open)
       m31100| Tue May 22 16:36:06 [conn15] end connection 127.0.0.1:52634 (10 connections now open)
       m31100| Tue May 22 16:36:06 [conn14] end connection 127.0.0.1:52632 (10 connections now open)
       m31201| Tue May 22 16:36:06 [conn7] end connection 127.0.0.1:42887 (5 connections now open)
      Tue May 22 16:36:06 Socket recv() errno:104 Connection reset by peer 127.0.1.1:30999
      Tue May 22 16:36:06 SocketException: remote: 127.0.1.1:30999 error: 9001 socket exception [1] server [127.0.1.1:30999] 
      Tue May 22 16:36:06 DBClientCursor::init call() failed
      Tue May 22 16:36:06 query failed : test.$cmd { getLastError: 1.0 } to: ubuntu:30999
      Tue May 22 16:36:06 Error: error doing query: failed src/mongo/shell/collection.js:155
      failed to load: /ssd/mongo2/jstests/sharding/remove3.js
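
      The frames in the trace above are mangled C++ symbols, which makes the call path hard to read. The following is a minimal sketch, not part of the original report, that pulls the symbol out of each frame line and decodes only its leading nested name. The names `FRAME_RE`, `qualified_name`, and `call_chain` are hypothetical helpers introduced for illustration, and the decoder covers only the `_ZN...E` / `_ZNK...E` forms that appear in this log. It is not a full demangler; a tool such as `c++filt` is the better choice when one is available.

```python
import re

# Matches frame lines such as:
#   m30999|  /ssd/mongo2/mongos(_ZN5mongo8Strategy7doWriteE...+0xb0) [0x7c99f8]
# Frames without a mangled symbol, e.g. "(clone+0x6d)", are skipped.
FRAME_RE = re.compile(r"\((_Z[A-Za-z0-9_]+)\+0x[0-9a-f]+\)\s*\[0x[0-9a-f]+\]")


def qualified_name(mangled):
    """Decode only the leading nested name of a mangled symbol.

    Assumption: the symbol uses the `_ZN` / `_ZNK` nested-name form seen in
    the log. Argument types are ignored. Returns None for anything else.
    """
    prefix = re.match(r"_ZNK?", mangled)
    if not prefix:
        return None
    pos = prefix.end()
    parts = []
    # Each component is encoded as <decimal length><identifier>.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    # C1/C2 mark a constructor, which is named after its class.
    if parts and mangled[pos:pos + 2] in ("C1", "C2"):
        parts.append(parts[-1])
    return "::".join(parts) or None


def call_chain(log_text):
    """Return one decoded name per matching frame, in log order."""
    names = []
    for line in log_text.splitlines():
        frame = FRAME_RE.search(line)
        if frame:
            symbol = frame.group(1)
            names.append(qualified_name(symbol) or symbol)
    return names


# Four frames copied from the trace above.
sample = """\
 m30999|  /ssd/mongo2/mongos(_ZN5mongo12verifyFailedEPKcS1_j+0x122) [0x92ba3c]
 m30999|  /ssd/mongo2/mongos(_ZNK5mongo5Shard13getConnStringEv+0x46) [0x7e6446]
 m30999|  /ssd/mongo2/mongos(_ZN5mongo15ShardConnectionC2ERKNS_5ShardERKSsN5boost10shared_ptrIKNS_12ChunkManagerEEE+0x4e) [0xb46dc8]
 m30999|  /ssd/mongo2/mongos(_ZN5mongo8Strategy7doWriteEiRNS_7RequestERKNS_5ShardEb+0xb0) [0x7c99f8]
"""

for name in call_chain(sample):
    print(name)
```

      Read innermost first, the decoded frames show the write path (`mongo::Strategy::doWrite`) constructing a `mongo::ShardConnection`, which calls `mongo::Shard::getConnString`, where the `_addr.size()` check reported at `src/mongo/s/shard.h` line 81 fails.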

          People

            Assignee: Unassigned
            Reporter: Spencer Brody (Inactive), spencer@mongodb.com
            Votes: 0
            Watchers: 2
