[CDRIVER-679] mongoc_client_kill_cursor triggers Server Selection
Created: 18/May/15  Updated: 19/Oct/16  Resolved: 11/Sep/15

Status: Closed
Project: C Driver
Component/s: libmongoc
Affects Version/s: 1.2.0
Fix Version/s: 1.2-rc0

Type: Bug
Priority: Major - P3
Reporter: Hannes Magnusson
Assignee: A. Jesse Jiryu Davis
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on CDRIVER-699 Fail-fast server selection mode Closed
Related
related to CDRIVER-833 mongoc_client_kill_cursor is untested Closed
related to PHPC-304 Failed getmore halts for 30seconds Closed
related to CDRIVER-716 _mongoc_client_sendv falls back to se... Closed

 Description   

1. Execute a query.
2. Execute a getmore; the server fails (socket failure or similar).
3. mongoc_cursor_next() returns no document.
4. Call mongoc_cursor_destroy() to free the cursor.
5. This triggers mongoc_client_kill_cursor().
6. That in turn triggers Server Selection.
7. Server Selection waits 30 seconds.
8. Server Selection fails.

[2015-05-18T20:40:09+00:00]     cursor: TRACE   > ENTRY: mongoc_cursor_destroy():324 
[2015-05-18T20:40:09+00:00]     cursor: TRACE   > ENTRY: _mongoc_cursor_destroy():340
[2015-05-18T20:40:09+00:00]     client: TRACE   > ENTRY: mongoc_client_kill_cursor():1295
[2015-05-18T20:40:09+00:00]    cluster: TRACE   > ENTRY: mongoc_cluster_select():1584
[2015-05-18T20:40:09+00:00]    cluster: TRACE   > ENTRY: _mongoc_cluster_select_by_optype():1436
[2015-05-18T20:40:09+00:00]     mongoc: TRACE   > ENTRY: mongoc_topology_description_select():439
[2015-05-18T20:40:09+00:00]     mongoc: TRACE   >  EXIT: mongoc_topology_description_select():453
[2015-05-18T20:40:10+00:00]     stream: TRACE   > ENTRY: mongoc_stream_writev():143  
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  writev = 0x7f2bf9061780 [58]           
[2015-05-18T20:40:10+00:00]     stream: TRACE   > 00000:  3a 00 00 00 02 00 00 00  00 00 00 00 d4 07 00 00  : . . . . . . .  . . . . . . . . 
[2015-05-18T20:40:10+00:00]     stream: TRACE   > 00010:  04 00 00 00 61 64 6d 69  6e 2e 24 63 6d 64 00 00  . . . . a d m i  n . $ c m d . . 
[2015-05-18T20:40:10+00:00]     stream: TRACE   > 00020:  00 00 00 ff ff ff ff 13  00 00 00 10 69 73 4d 61  . . . . . . . .  . . . . i s M a 
[2015-05-18T20:40:10+00:00]     stream: TRACE   > 00030:  73 74 65 72 00 01 00 00  00 00                    s t e r . . . .  . .
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Setting timeout to: 0              
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  EXIT: mongoc_stream_writev():158  
[2015-05-18T20:40:10+00:00]     buffer: TRACE   > ENTRY: _mongoc_buffer_try_append_from_stream():294
[2015-05-18T20:40:10+00:00]     stream: TRACE   > ENTRY: mongoc_stream_read():260    
[2015-05-18T20:40:10+00:00]     stream: TRACE   > ENTRY: mongoc_stream_readv():220   
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Setting timeout to: 0              
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Reading got: 0 wanted: 0           
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  readv = 0x7ffc20ad1410 [4]        
[2015-05-18T20:40:10+00:00]     stream: TRACE   > 00000:  5a 5a 5a 5a                                       Z Z Z Z 
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  EXIT: mongoc_stream_readv():231   
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  EXIT: mongoc_stream_read():272    
[2015-05-18T20:40:10+00:00]     buffer: TRACE   >  EXIT: _mongoc_buffer_try_append_from_stream():324
[2015-05-18T20:40:10+00:00]     stream: TRACE   > ENTRY: mongoc_stream_failed():75   
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Destroying RSRC#9                  
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  EXIT: mongoc_stream_failed():85   
[2015-05-18T20:40:10+00:00]     mongoc: TRACE   > ENTRY: mongoc_topology_description_select():439
[2015-05-18T20:40:10+00:00]     mongoc: TRACE   >  EXIT: mongoc_topology_description_select():453
[2015-05-18T20:40:10+00:00]     PHONGO: TRACE   > ENTRY: phongo_stream_initiator():1144
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Connecting to '192.168.112.10:2000[mongodb://192.168.112.10:2000]'
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Created: RSRC#12 as '192.168.112.10:2000[mongodb://192.168.112.10:2000]'
[2015-05-18T20:40:10+00:00]     PHONGO: TRACE   >  EXIT: phongo_stream_initiator():1268
[2015-05-18T20:40:10+00:00]     stream: TRACE   > ENTRY: mongoc_stream_failed():75   
[2015-05-18T20:40:10+00:00]     PHONGO: DEBUG   > Destroying RSRC#12                 
[2015-05-18T20:40:10+00:00]     stream: TRACE   >  EXIT: mongoc_stream_failed():85   
[2015-05-18T20:40:10+00:00]     mongoc: TRACE   > ENTRY: mongoc_topology_description_select():439
[2015-05-18T20:40:10+00:00]     mongoc: TRACE   >  EXIT: mongoc_topology_description_select():453
[2015-05-18T20:40:11+00:00]     PHONGO: TRACE   > ENTRY: phongo_stream_initiator():1144
[2015-05-18T20:40:11+00:00]     PHONGO: DEBUG   > Connecting to '192.168.112.10:2000[mongodb://192.168.112.10:2000]'
[2015-05-18T20:40:11+00:00]     PHONGO: DEBUG   > Created: RSRC#13 as '192.168.112.10:2000[mongodb://192.168.112.10:2000]'
[2015-05-18T20:40:11+00:00]     PHONGO: TRACE   >  EXIT: phongo_stream_initiator():1268
[2015-05-18T20:40:11+00:00]     stream: TRACE   > ENTRY: mongoc_stream_failed():75   
[2015-05-18T20:40:11+00:00]     PHONGO: DEBUG   > Destroying RSRC#13                 
[2015-05-18T20:40:11+00:00]     stream: TRACE   >  EXIT: mongoc_stream_failed():85   
[2015-05-18T20:40:11+00:00]     mongoc: TRACE   > ENTRY: mongoc_topology_description_select():439
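
A note on the hex dump in the trace: the 58-byte write appears to be an OP_QUERY carrying an isMaster command on admin.$cmd, i.e. a server check issued during Server Selection rather than the OP_KILLCURSORS itself. Below is a minimal sketch that decodes the leading fields, assuming the standard MongoDB wire protocol layout (four little-endian int32 header fields, then the OP_QUERY flags and a NUL-terminated namespace). The struct and function names are illustrative only and are not part of libmongoc.

```c
#include <stdint.h>

/* First 31 bytes of the 58-byte message shown in the trace:
 * message header (16), OP_QUERY flags (4), namespace cstring (11). */
static const uint8_t traced_msg[31] = {
    0x3a, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0xd4, 0x07, 0x00, 0x00,
    0x04, 0x00, 0x00, 0x00,
    0x61, 0x64, 0x6d, 0x69, 0x6e, 0x2e, 0x24, 0x63, 0x6d, 0x64, 0x00
};

/* Read a little-endian int32 independent of host byte order. */
static int32_t read_le32(const uint8_t *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}

/* Illustrative container for the decoded fields (not a libmongoc type). */
typedef struct {
    int32_t message_length; /* total bytes, including the header */
    int32_t request_id;
    int32_t response_to;
    int32_t op_code;        /* 2004 is OP_QUERY */
    int32_t flags;          /* OP_QUERY flags; 4 is the SlaveOk bit */
    const char *ns;         /* "db.collection", NUL-terminated */
} decoded_query_t;

/* Decode the header, flags, and namespace from an OP_QUERY buffer.
 * The caller must pass at least 20 bytes plus a terminated namespace. */
static decoded_query_t decode_query(const uint8_t *buf)
{
    decoded_query_t q;
    q.message_length = read_le32(buf);
    q.request_id = read_le32(buf + 4);
    q.response_to = read_le32(buf + 8);
    q.op_code = read_le32(buf + 12);
    q.flags = read_le32(buf + 16);
    q.ns = (const char *)(buf + 20);
    return q;
}
```

Calling decode_query(traced_msg) yields a message length of 58, matching the "[58]" in the writev line, op code 2004, and namespace "admin.$cmd".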



 Comments   
Comment by Githook User [ 11/Jan/16 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 kill_cursor shouldn't attempt reconnect

Also fix mongoc_client_kill_cursor, which I broke in the previous commit.
Branch: 1.3.0-dev
https://github.com/mongodb/mongo-c-driver/commit/d805075ffa81cd6f27ac88660a1731bc67740bd7

Comment by Githook User [ 26/Oct/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 kill_cursor shouldn't attempt reconnect

Also fix mongoc_client_kill_cursor, which I broke in the previous commit.
Branch: debian
https://github.com/mongodb/mongo-c-driver/commit/d805075ffa81cd6f27ac88660a1731bc67740bd7

Comment by Githook User [ 11/Sep/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 kill_cursor shouldn't attempt reconnect

Also fix mongoc_client_kill_cursor, which I broke in the previous commit.
Branch: 1.2.0-dev
https://github.com/mongodb/mongo-c-driver/commit/d805075ffa81cd6f27ac88660a1731bc67740bd7

Comment by A. Jesse Jiryu Davis [ 19/Jun/15 ]

I'm assigning this back to me and making it depend on CDRIVER-699. Once we have a fail-fast mode we can use that to try reconnection only once in order to send OP_KILLCURSORS. (Additionally, we need to send OP_KILLCURSORS only to the correct server!)
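
The idea in this comment could be sketched roughly as follows: look up only the server that owns the cursor, make a single attempt, and skip the kill-cursors message if that server is unavailable. This is a self-contained mock, not libmongoc code; every type and function name here (mock_server_t, select_hinted_try_once, kill_cursor_fail_fast) is hypothetical, and the real fail-fast mode from CDRIVER-699 may look quite different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a server entry in the topology. */
typedef struct {
    uint32_t id;
    bool reachable;
} mock_server_t;

/* Hypothetical stand-in for the topology description. */
typedef struct {
    const mock_server_t *servers;
    size_t n_servers;
} mock_topology_t;

typedef enum { KILL_SENT, KILL_SKIPPED } kill_result_t;

/* One pass over the topology, no waiting and no rescans.
 * Returns NULL if the hinted server is unknown or unreachable. */
static const mock_server_t *
select_hinted_try_once(const mock_topology_t *topology, uint32_t hint)
{
    for (size_t i = 0; i < topology->n_servers; i++) {
        if (topology->servers[i].id == hint) {
            return topology->servers[i].reachable ? &topology->servers[i]
                                                  : NULL;
        }
    }
    return NULL;
}

/* Send kill-cursors only to the server that owns the cursor.
 * If that server cannot be selected immediately, give up quietly
 * instead of blocking or picking a different server. */
static kill_result_t
kill_cursor_fail_fast(const mock_topology_t *topology,
                      uint32_t hint,
                      uint32_t *sent_to)
{
    const mock_server_t *server = select_hinted_try_once(topology, hint);
    if (server == NULL) {
        return KILL_SKIPPED;
    }
    *sent_to = server->id; /* a real driver would write OP_KILLCURSORS here */
    return KILL_SENT;
}
```

Skipping is tolerable in this model on the assumption that the server eventually times out idle cursors on its own, so the cost of a missed kill-cursors is bounded while the cost of blocking cursor destruction for 30 seconds is not.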

Comment by Githook User [ 16/Jun/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 disable tests for now
Branch: 1.2.0-dev
https://github.com/mongodb/mongo-c-driver/commit/56f1f92ef185811d2fb2887f9254538975a1f8d9

Comment by A. Jesse Jiryu Davis [ 16/Jun/15 ]

Let's hold off on this for a few days until we can finish the blacklisting spec and implement it here. I think we can fix this bug much more easily once that's done.

Comment by Hannes Magnusson [ 15/Jun/15 ]

Nice catch

That

if (mongoc_cluster_sendv_to_server() ) {} /* implicit else */ mongoc_cluster_sendv(..) 

is not cool.

I also think mongoc_cluster_sendv() should be removed completely and replaced with mongoc_cluster_sendv_to_server().

As for reconnecting on failure: should the driver really try to reconnect to the server just to send the killcursor? How does that interact with SDAM, Server Selection, and the upcoming blacklisting?
We don't retry failed operations, and I don't think killcursor should be an exception to that rule.
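
To make the objection concrete, here is a small mock of the two control flows as I read them from the snippet above. Nothing here is the real libmongoc code: mock_cluster_t, mock_sendv_to_server and mock_sendv are hypothetical stand-ins for mongoc_cluster_sendv_to_server() and mongoc_cluster_sendv(), and the counters exist only so the difference is observable.

```c
#include <stdbool.h>

/* Hypothetical cluster state that records where messages were routed. */
typedef struct {
    bool hinted_server_up;  /* can the cursor's own server be reached? */
    int sends_to_hinted;    /* attempts on the cursor's own server */
    int sends_to_selected;  /* attempts on whatever selection returns */
} mock_cluster_t;

/* Stand-in for a send that targets one specific server. */
static bool mock_sendv_to_server(mock_cluster_t *cluster)
{
    cluster->sends_to_hinted++;
    return cluster->hinted_server_up;
}

/* Stand-in for a send that first runs server selection. In the real
 * driver this is the step that can block until selection times out. */
static bool mock_sendv(mock_cluster_t *cluster)
{
    cluster->sends_to_selected++;
    return true;
}

/* The criticized shape: a failed targeted send falls through to a
 * generic send, which may reach a server that never owned the cursor. */
static bool kill_cursor_with_fallback(mock_cluster_t *cluster)
{
    if (mock_sendv_to_server(cluster)) {
        return true;
    }
    /* implicit else */
    return mock_sendv(cluster);
}

/* The suggested shape: one targeted attempt, no fallback, no retry. */
static bool kill_cursor_hinted_only(mock_cluster_t *cluster)
{
    return mock_sendv_to_server(cluster);
}
```

With the hinted server down, the fallback variant reports success even though the message went elsewhere, while the hinted-only variant fails immediately and leaves sends_to_selected at zero.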

Comment by A. Jesse Jiryu Davis [ 14/Jun/15 ]

I wrote a mock replica set to test OP_KILLCURSORS six ways:

1. Single client, replica set with a primary and 5 secondaries: send the initial query, then destroy the cursor. Kill-cursors is sent to the proper secondary.
2. Pooled client, same test.
3. Single client, replica set with a primary and 5 secondaries: send the initial query, hit a network error on getmore, then destroy the cursor. Kill-cursors is sent to the proper secondary.
4. Pooled client, same test.
5. Same as 3 but with no primary.
6. Same as 5, but pooled.

Here are the tests:

https://github.com/mongodb/mongo-c-driver/blob/dadcb96/tests/test-mongoc-cursor.c#L337-L346
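
For orientation, the six cases listed above form a small matrix over three switches: pooled or single client, whether getmore hits a network error, and whether the replica set has a primary. The table below is only a summary sketch of that list; the struct and field names are made up here and do not come from test-mongoc-cursor.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one kill-cursors test case. */
typedef struct {
    int number;                 /* case number as listed in the comment */
    bool pooled;                /* pooled client instead of single client */
    bool getmore_network_error; /* getmore fails before cursor destroy */
    bool has_primary;           /* replica set has a primary */
} kill_cursors_case_t;

/* In every case the expectation is the same: kill-cursors goes to the
 * secondary that served the original query. */
static const kill_cursors_case_t kill_cursors_cases[] = {
    {1, false, false, true},
    {2, true,  false, true},
    {3, false, true,  true},
    {4, true,  true,  true},
    {5, false, true,  false},
    {6, true,  true,  false},
};

static const size_t n_kill_cursors_cases =
    sizeof kill_cursors_cases / sizeof kill_cursors_cases[0];

/* Count the cases that involve a network error on getmore. */
static size_t count_network_error_cases(void)
{
    size_t count = 0;
    for (size_t i = 0; i < n_kill_cursors_cases; i++) {
        if (kill_cursors_cases[i].getmore_network_error) {
            count++;
        }
    }
    return count;
}
```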

Only 1 and 2 pass, despite bjori's patch, so this bug is still open. I think the problem is here:

https://github.com/mongodb/mongo-c-driver/blob/1.2.0-dev/src/mongoc/mongoc-client.c#L451-L459

The client should try to reconnect in order to send OP_KILLCURSORS, although it shouldn't try for very long.

If it can't reconnect it shouldn't fall back to sending OP_KILLCURSORS to the primary!

Comment by Githook User [ 14/Jun/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 more kill cursors tests

Add (currently failing) test that kill-cursors is sent to
the right secondary if there's no primary.
Branch: 1.2.0-dev
https://github.com/mongodb/mongo-c-driver/commit/dadcb969f81b7bb6df7307cc6bc46290535c1997

Comment by Githook User [ 14/Jun/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 killcursors test single and pooled client
Branch: 1.2.0-dev
https://github.com/mongodb/mongo-c-driver/commit/6b209af833679917892465b610b5e9010c7877c1

Comment by Githook User [ 14/Jun/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: CDRIVER-679 test killcursors with a mock replica set
Branch: 1.2.0-dev
https://github.com/mongodb/mongo-c-driver/commit/6365954bfc09da522b2a792105d1d16c4d3464e5

Comment by Githook User [ 14/Jun/15 ]

Author: A. Jesse Jiryu Davis (ajdavis, jesse@mongodb.com)

Message: Merge branch 'pr-240' into 1.2.0-dev

Comment by Githook User [ 14/Jun/15 ]

Author: Hannes Magnusson (bjori, bjori@php.net)

Message: CDRIVER-679: Make sure we send the kill cursor to the corresponding server
Branch: 1.2.0-dev
https://github.com/mongodb/mongo-c-driver/commit/ff0c0a809fe7979ce65c99f617c259ab98e3c077

Comment by Hannes Magnusson [ 10/Jun/15 ]

https://github.com/mongodb/mongo-c-driver/pull/240

Comment by A. Jesse Jiryu Davis [ 08/Jun/15 ]

CDRIVER-699 may help by giving us a way to say "select a server with the following hint, but don't wait if it's unavailable."
