[CDRIVER-3068] Topology scanner stuck when changing standalone server to replica set Created: 03/Apr/19  Updated: 27/Oct/23  Resolved: 03/Jun/19

Status: Closed
Project: C Driver
Component/s: None
Affects Version/s: 1.14.0
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Sascha Zelzer
Assignee: Unassigned
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-12661 [Server] Advise users when to reconfi... Closed
related to CDRIVER-3107 Warn when removing the last server fr... Closed

 Description   

I am using the C driver (via mongo-cxx-driver) in pooled mode.

First, I create a pool using the URL mongodb://localhost:27017. The server is a standalone MongoDB 4.0.3 server.

Then I destroy that pool and create a new pool using the URL mongodb://localhost:27017/?replicaset=rep. Afterwards, I stop the MongoDB server and start it again as a single-node replica set (with replica set name "rep"). I can successfully connect to the server using the mongo shell (and, e.g., call "rs.initiate()").

However, I cannot connect to the replica set using clients from the new pool in my program. Using APM callbacks, it seems that the topology scanner is never notified of the new configuration:

mongodb://localhost:27017: topology changed (5ca4c4969a183ffe80003b21) Single -> Single
  previous servers:
  Standalone localhost:27017 (master)
  new servers:
  Standalone localhost:27017 (master)
mongodb://localhost:27017: server changed (5ca4c4969a183ffe80003b21): localhost:27017 -> localhost:27017
mongodb://localhost:27017: topology changed (5ca4c4969a183ffe80003b21) Single -> Single
  previous servers:
  Standalone localhost:27017 (master)
  new servers:
  Standalone localhost:27017 (master)
mongodb://localhost:27017: topology closed (5ca4c4969a183ffe80003b21)
mongodb://localhost:27017/?replicaset=rep: topology opening (5ca4c4d99a183ffe80003b23)
mongodb://localhost:27017/?replicaset=rep: topology changed (5ca4c4d99a183ffe80003b23) Unknown -> ReplicaSetNoPrimary
mongodb://localhost:27017/?replicaset=rep: server opening (5ca4c4d99a183ffe80003b23): localhost:27017
mongodb://localhost:27017/?replicaset=rep: server changed (5ca4c4d99a183ffe80003b23): localhost:27017 -> localhost:27017
mongodb://localhost:27017/?replicaset=rep: server closed (5ca4c4d99a183ffe80003b23): localhost:27017
mongodb://localhost:27017/?replicaset=rep: topology changed (5ca4c4d99a183ffe80003b23) ReplicaSetNoPrimary -> ReplicaSetNoPrimary

I am stuck for now. The actual code is part of a larger code base. If there is nothing else I can try, I could create a minimal program that hopefully reproduces this.

Thanks in advance.



 Comments   
Comment by Sascha Zelzer [ 13/Jun/19 ]

Thanks a lot for your analysis and the creation of the two follow-up tickets.

Comment by Jeremy Mikola [ 30/Apr/19 ]

saszel: Based on the list of steps you shared to reproduce the problem, I believe the topology scanner is performing as designed.

The scanner initially disregards localhost:27017 because it is not a member of the "rep" replica set, which leaves the topology in the ReplicaSetNoPrimary state with no candidate servers (see: transition table in the SDAM specification). Any monitoring intervals from that point on are no-ops since there are no remaining servers in the topology. Although SDAM: Other Topology Types notes that clients (i.e. drivers) should emit a warning if the last candidate server would be removed from a topology, I don't believe libmongoc does so. I have opened CDRIVER-3107 to track that improvement.
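The behaviour described above can be illustrated with a small, self-contained simulation. This is only a sketch of the relevant SDAM rule for a ReplicaSetNoPrimary topology, not libmongoc's implementation: the `Topology` class, `on_server_reply`, `monitor_once`, and the `probe` callback are hypothetical names introduced for illustration, and the model deliberately ignores everything else in the transition table (such as promoting the topology once a primary is found).

```python
from dataclasses import dataclass, field

# Server types that report a replica set name in their handshake reply.
RS_MEMBER_TYPES = {"RSPrimary", "RSSecondary", "RSArbiter", "RSOther"}


@dataclass
class Topology:
    """Toy model of a topology created from a URI with a replicaSet option."""
    set_name: str
    servers: set = field(default_factory=set)
    topology_type: str = "ReplicaSetNoPrimary"
    warnings: list = field(default_factory=list)


def on_server_reply(topology, address, server_type, reply_set_name=None):
    """Apply a simplified SDAM rule: drop servers that cannot belong to the set."""
    if address not in topology.servers:
        return  # replies for servers that are no longer monitored are ignored
    wrong_type = server_type in ("Standalone", "Mongos")
    wrong_set = (server_type in RS_MEMBER_TYPES
                 and reply_set_name != topology.set_name)
    if wrong_type or wrong_set:
        topology.servers.discard(address)
        if not topology.servers:
            # The warning that the SDAM specification recommends here.
            topology.warnings.append(
                "last server removed from topology: " + address)


def monitor_once(topology, probe):
    """Run one monitoring pass and return how many servers were probed."""
    probed = 0
    for address in sorted(topology.servers):  # sorted() copies, so removal is safe
        server_type, reply_set_name = probe(address)
        on_server_reply(topology, address, server_type, reply_set_name)
        probed += 1
    return probed


topology = Topology(set_name="rep", servers={"localhost:27017"})

# First pass: the server is still a standalone, so it is removed.
first_pass = monitor_once(topology, lambda address: ("Standalone", None))

# Second pass: the server now runs as a member of "rep", but nothing is
# left in the topology to probe, so the pass is a no-op.
second_pass = monitor_once(topology, lambda address: ("RSPrimary", "rep"))

print(first_pass, second_pass, topology.servers, topology.warnings)
```

In this model the second pass probes zero servers, which corresponds to the absence of any further APM events after the server is closed.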

The correct course of action would be to convert the MongoDB deployment to a replica set before modifying the application. So long as the new replica set primary was also accessible via localhost:27017, the application would still be able to connect to it and execute read/write operations as if it were a standalone. Once the replica set was configured, you could then reconfigure the driver's URI by adding the "replicaSet" option and optionally adding more hosts to the connection string (per replica set connection string examples) and finally restart the application.
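The last step of that sequence, reconfiguring the URI, amounts to a small string change. As a rough sketch, the hypothetical helper `add_replica_set` below (not part of any driver) adds the "replicaSet" option and optional extra hosts to an existing mongodb:// connection string. It assumes the plain `mongodb://` scheme and does no validation beyond that.

```python
def add_replica_set(uri, set_name, extra_hosts=()):
    """Return uri with a replicaSet option and any extra hosts appended."""
    prefix = "mongodb://"
    if not uri.startswith(prefix):
        raise ValueError("expected a mongodb:// connection string")

    rest = uri[len(prefix):]
    hosts_and_database, _, query = rest.partition("?")
    hosts, _, database = hosts_and_database.partition("/")

    # Append only hosts that are not already listed.
    host_list = hosts.split(",")
    for host in extra_hosts:
        if host not in host_list:
            host_list.append(host)

    # Option names are case-insensitive, so drop any existing replicaSet value.
    options = [
        option for option in query.split("&")
        if option and option.split("=", 1)[0].lower() != "replicaset"
    ]
    options.append("replicaSet=" + set_name)

    return prefix + ",".join(host_list) + "/" + database + "?" + "&".join(options)


standalone_uri = "mongodb://localhost:27017"
replica_set_uri = add_replica_set(standalone_uri, "rep")
print(replica_set_uri)
```

The application would keep using `standalone_uri` until the deployment has been converted and initiated, and switch to `replica_set_uri` only on the subsequent restart.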

I realize this was not explicitly stated in the MongoDB manual's tutorial for Converting a standalone to a replica set. I've also opened DOCS-12661 to request an improvement to that document.

Comment by Sascha Zelzer [ 03/Apr/19 ]

Creating two pools is actually not necessary. Here is a simplified list of steps to reproduce this:

  1. Run a standalone MongoDB server
  2. Create a pool using mongodb://localhost:27017/?replicaset=rep (as expected, clients cannot connect using that pool)
  3. Stop the MongoDB server and start it again with the --replSet rep command line option (and initialize it)
  4. Clients from the pool are still not able to connect and APM callbacks do not show any activity for the scanner
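
The condition reached in step 4 could in principle be detected from topology-changed events. The sketch below assumes each event has been reduced to a hypothetical (topology type, server addresses) pair; `is_dead_end` and the `events` list are illustrative names, not driver APIs.

```python
def is_dead_end(topology_type, server_addresses):
    """Return True when a replica set topology has no servers left to monitor."""
    return topology_type == "ReplicaSetNoPrimary" and len(server_addresses) == 0


# New topology descriptions modelled on the events in the description:
# the seed is present at first, then removed because it is a standalone.
events = [
    ("ReplicaSetNoPrimary", ["localhost:27017"]),
    ("ReplicaSetNoPrimary", []),
]

flags = [is_dead_end(topology_type, servers) for topology_type, servers in events]
print(flags)
```

Once the check returns True, no later monitoring pass can recover the pool in this scenario, so the pool has to be recreated after the deployment has been converted.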