Uploaded image for project: 'Ruby Driver'
  1. Ruby Driver
  2. RUBY-1798

Retrieval of a document containing code field via a cursor from 3.0 server triggers failure handling

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 2.9.0.rc0
    • Affects Version/s: 2.8.0
    • Component/s: Query
    • Environment:

      Using the latest Ruby Driver (2.8.0) against a MongoDB Server 3.0.x, we are bumping into the following exception when querying a collection:

      	 6: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/collection/view/iterable.rb:40:in `each'
      	 5: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/retryable.rb:43:in `read_with_retry'
      	 4: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/collection/view/iterable.rb:43:in `block in each'
      	 3: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/collection/view/iterable.rb:43:in `new'
      	 2: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/cursor.rb:72:in `initialize'
      	 1: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/cursor.rb:277:in `kill_cursors_op_spec'
      /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/server/description.rb:220:in `features': An unknown server's features are not known (ArgumentError)
      

      We have been able to isolate the issue to the following conditions:

      • Collection has enough items to cause a cursor to be returned
      • The first document in the collection has a field `code` with the value of `91` (as an Integer)
      • Using Ruby Driver 2.8.0 and MongoDB Server 3.0 (which according to the documentation is a supported combination)

      We have pinpointed the issue to this PR on the Driver, released on 2.8.0, where the new `SdamErrorDetection` module was introduced, which marks the server as 'unknown' when it detects the server is recovering. This dectection is checking the return code against a list of known 'Recovering Codes', and the issue lies on the extraction of the code itself, which is retrieving the code from the first document returned.

      With this issue, any `find` operation against a collection whose first result contains a field `code` with the value of `91` (or any other of the 'Node Recovering' values) causes the driver to fail to instantiate the Cursor object, as the server returning a Cursor ID causes the Cursor constructor to query for the Server's Features to determine the operation to be used to kill the cursor, which fails, because at this point, the server's settings have been reset. Here's an example of the entire interaction:

      D, [2019-04-26T11:36:12.756155 #14746] DEBUG -- : MONGODB | localhost:27017 | foo.find | SUCCEEDED | 0.116s
      D, [2019-04-26T11:36:12.756334 #14746] DEBUG -- : MONGODB | EVENT: #<ServerDescriptionChanged address=localhost:27017 topology=Single[localhost:27017] prev=#<Mongo::Server:Description:0x47008160147720 config={"ismaster"=>true, "maxBsonObjectSize"=>16777216, "maxMessageSizeBytes"=>48000000, "maxWriteBatchSize"=>1000, "localTime"=>2019-04-26 10:36:12 UTC, "maxWireVersion"=>3, "minWireVersion"=>0, "ok"=>1.0} average_round_trip_time=0.19650294220156989> new=#<Mongo::Server:Description:0x47008160074500 config={} average_round_trip_time=>>
      D, [2019-04-26T11:36:12.756392 #14746] DEBUG -- : MONGODB | Server description for localhost:27017 changed from 'standalone' to 'unknown'.
      D, [2019-04-26T11:36:12.756453 #14746] DEBUG -- : MONGODB | EVENT: #<TopologyChanged prev=Single[localhost:27017] new=Single[localhost:27017]>
      D, [2019-04-26T11:36:12.756479 #14746] DEBUG -- : MONGODB | Topology type 'Single' changed to type 'Single'.
      Traceback (most recent call last):
      	10: from replicate.rb:11:in `<main>'
      	 9: from replicate.rb:11:in `times'
      	 8: from replicate.rb:13:in `block in <main>'
      	 7: from replicate.rb:13:in `to_a'
      	 6: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/collection/view/iterable.rb:40:in `each'
      	 5: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/retryable.rb:43:in `read_with_retry'
      	 4: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/collection/view/iterable.rb:43:in `block in each'
      	 3: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/collection/view/iterable.rb:43:in `new'
      	 2: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/cursor.rb:72:in `initialize'
      	 1: from /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/cursor.rb:277:in `kill_cursors_op_spec'
      /home/jpc/.rvm/gems/ruby-2.5.5/gems/mongo-2.8.0/lib/mongo/server/description.rb:220:in `features': An unknown server's features are not known (ArgumentError)
      

      Whereas the Server logs the following for this query:

      2019-04-26T10:36:12.669+0000 I COMMAND  [conn8] query foo.89583e4e-8775-40da-b6c6-0fdd894b3fa0 query: { $query: {}, $orderby: { _id: 1 } } planSummary: IXSCAN { _id: 1 } cursorid:90984232145 ntoreturn:0 ntoskip:0 nscanned:101 nscannedObjects:101 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:101 reslen:3552 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { R: 1 } } } 0ms
      

      Here you can see the multiple steps: The query being executed, the server being marked as unknown, the Cursor being instantiated, and the query failing due to it trying to retrieve the Server Features, as a Cursor ID is returned.

      The same query ran against a collection which isn't large enough to cause a Cursor ID to be returned logs the following (but runs successfully):

      D, [2019-04-26T11:36:12.219914 #14746] DEBUG -- : MONGODB | localhost:27017 | foo.find | SUCCEEDED | 0.206s
      D, [2019-04-26T11:36:12.220927 #14746] DEBUG -- : MONGODB | EVENT: #<ServerDescriptionChanged address=localhost:27017 topology=Single[localhost:27017] prev=#<Mongo::Server:Description:0x47008160202500 config={"ismaster"=>true, "maxBsonObjectSize"=>16777216, "maxMessageSizeBytes"=>48000000, "maxWriteBatchSize"=>1000, "localTime"=>2019-04-26 10:36:11 UTC, "maxWireVersion"=>3, "minWireVersion"=>0, "ok"=>1.0} average_round_trip_time=0.19425613100196235> new=#<Mongo::Server:Description:0x47008160156200 config={} average_round_trip_time=>>
      D, [2019-04-26T11:36:12.221304 #14746] DEBUG -- : MONGODB | Server description for localhost:27017 changed from 'standalone' to 'unknown'.
      D, [2019-04-26T11:36:12.221937 #14746] DEBUG -- : MONGODB | EVENT: #<TopologyChanged prev=Single[localhost:27017] new=Single[localhost:27017]>
      D, [2019-04-26T11:36:12.222243 #14746] DEBUG -- : MONGODB | Topology type 'Single' changed to type 'Single'.
      101
      D, [2019-04-26T11:36:12.428751 #14746] DEBUG -- : MONGODB | EVENT: #<ServerDescriptionChanged address=localhost:27017 topology=Single[localhost:27017] prev=#<Mongo::Server:Description:0x47008160156200 config={} average_round_trip_time=> new=#<Mongo::Server:Description:0x47008160147720 config={"ismaster"=>true, "maxBsonObjectSize"=>16777216, "maxMessageSizeBytes"=>48000000, "maxWriteBatchSize"=>1000, "localTime"=>2019-04-26 10:36:12 UTC, "maxWireVersion"=>3, "minWireVersion"=>0, "ok"=>1.0} average_round_trip_time=0.19650294220156989>>
      D, [2019-04-26T11:36:12.429077 #14746] DEBUG -- : MONGODB | Server description for localhost:27017 changed from 'unknown' to 'standalone'.
      D, [2019-04-26T11:36:12.429839 #14746] DEBUG -- : MONGODB | EVENT: #<TopologyChanged prev=Single[localhost:27017] new=Single[localhost:27017]>
      D, [2019-04-26T11:36:12.430089 #14746] DEBUG -- : MONGODB | Topology type 'Single' changed to type 'Single'.
      

      Here we can see the server being marked as unknown, and the topology reload protocol kicking in, properly updating its configuration.

      I have attached a small script to replicate the issue, as well as the output we're seeing from its execution.

      In the meanwhile, as a workaround, we have rolled back the driver to 2.7.2, and are prioritizing the upgrade to a newer version of the MongoDB Server.

        1. output.log
          252 kB
          João Carvalho
        2. replicate.rb
          0.3 kB
          João Carvalho

            Assignee:
            oleg.pudeyev@mongodb.com Oleg Pudeyev (Inactive)
            Reporter:
            jcarvalho João Carvalho
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: