[JAVA-3285] Urgent: Java Driver Logging : "Cluster created with settings" only to one ip Created: 08/May/19 Updated: 27/Oct/23 Resolved: 09/May/19 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Session Management |
| Affects Version/s: | 3.4.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Venkata Anil Kumar Gobburi | Assignee: | Ross Lawley |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
MongoDB 3.4.20: a replica set with three nodes, one of which is an arbiter; Tomcat 7 |
||
| Description |
|
The end client is Tomcat running a webapp that uses Spring. Recently we converted a single-instance mongod with IP address 1.1.1.1 (example) to a replica set with IP addresses 1.1.1.1, 2.2.2.2, and 3.3.3.3 (arbiter).

Connection string: mongodb://username:password@1.1.1.1:27017,2.2.2.2:27017,3.3.3.3:27017/database?replicaSet=replicaset
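For context, a minimal sketch of how a 3.x-driver application typically consumes such a URI; the ticket's app wires this through Spring, so the class here is illustrative only:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;

public class ReplicaSetConnect {
    public static void main(String[] args) {
        // The replica-set URI from the description above; the driver discovers
        // all members from these seeds and follows the primary on switchover.
        MongoClientURI uri = new MongoClientURI(
                "mongodb://username:password@1.1.1.1:27017,2.2.2.2:27017,3.3.3.3:27017"
                        + "/database?replicaSet=replicaset");
        System.out.println("Seed list: " + uri.getHosts());
        MongoClient client = new MongoClient(uri);
        client.close();
    }
}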
After updating the connection string on the application server to the standard mongo URI above, the application stalls after a switchover from 1.1.1.1 to 2.2.2.2. We enabled detailed logging for the Mongo driver using the line below in logging.properties:

org.mongodb.driver.level=ALL

The observation is that once the war file is loaded, the "Cluster created with settings" message shows the driver connecting only to 1.1.1.1, no matter what we do. The most interesting (and concerning) aspect is that we removed 1.1.1.1 from all the property files, removed 1.1.1.1 from the replica set (2.2.2.2 was made the PRIMARY), and shut down mongod on 1.1.1.1 to make sure there was no way the driver could know about or discover 1.1.1.1, yet as soon as I start Tomcat I still see the log excerpt below. A second issue is that the application stalls after switching over to 2.2.2.2; we believe that understanding and resolving the logging issue may shed light on the stalling issue.
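For reference, java.util.logging generally needs a handler configured at a matching level before the driver-level line above produces any output; a minimal sketch (only the last line is from this ticket, the handler lines are illustrative):

# Illustrative java.util.logging setup: a console handler at full verbosity
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=ALL
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter
# The line from this ticket: enable all Java driver logging
org.mongodb.driver.level=ALL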
Note: We have a Node.js app that connects to this replica set, and it connects to 2.2.2.2 seamlessly after a switchover from 1.1.1.1. It is only on Tomcat that we face this issue.
May 08, 2019 12:18:33 AM org.apache.catalina.core.StandardEngine startInternal ] |
| Comments |
| Comment by Ross Lawley [ 09/May/19 ] |
|
Hi ubuntuuser01,

Glad you got it sorted. The logging will record all driver usage at the configured driver level, so had you been logging to a file you would have seen the application's driver activity there too. Using MongoDB itself to store the logs increases the volume of logging: as you've seen, the driver also logs the operations performed by the logging appender itself.

Ross |
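A minimal java.util.logging sketch of the file-based setup described above (the file pattern and handler settings are illustrative, not from the ticket):

# Illustrative: capture driver logging in a rotating file instead of MongoDB
handlers=java.util.logging.FileHandler
java.util.logging.FileHandler.pattern=%h/mongo-driver-%g.log
java.util.logging.FileHandler.level=ALL
java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter
org.mongodb.driver.level=ALL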
| Comment by Venkata Anil Kumar Gobburi [ 08/May/19 ] |
|
Hello Ross,

Thank you for the quick response. We were able to resolve the issue. For the sake of everyone else who may face this issue now or in the future, I would like to post the root cause here.

The root cause in our case was a log4j properties file the application was using, which triggered the "Cluster created with settings" message and the heartbeat connections thereafter during Tomcat startup. There were two log4j property files in two different folder locations; I assumed the application was using one of them, but it turned out to be using the second file, located elsewhere, outside the folder we generally use to store all the config files. I had not scrubbed this file (since I did not know of its existence at the time), which is why the logs showed connections being made to 1.1.1.1. All the while we thought it was the application itself and its traffic that created those log messages, which initially led us down the wrong track of investigation.

The reason this issue was difficult to trace is that the team had made all the necessary changes in the code to connect using the mongo URI above, while the source of the issue was outside the source code. Interestingly, these log4j settings were not used by the dev team locally but were present on the servers, which made triaging even more difficult since local setups worked fine during switchover. Systematically narrowing down the scope of the root cause by running different test cases helped identify the issue.

Before:

log4j.appender.MongoDB=org.log4mongo.MongoDbAppender

After (I changed the hostname and the port parameters to include all three nodes in the replica set):

log4j.appender.MongoDB=org.log4mongo.MongoDbAppender

After converting the log4j settings to replica-set settings, below is the new log; a hypothetical fuller sketch of the change follows the excerpt. One can clearly see that "Cluster created with settings" now picks up all three IPs and ports from log4j.appender.MongoDB.hostname and log4j.appender.MongoDB.port during startup.
May 08, 2019 6:36:28 PM org.apache.catalina.startup.TldConfig execute
, {address=2.2.2.2:27017, type=UNKNOWN, state=CONNECTING}, {address=3.3.3.3:27018, type=UNKNOWN, state=CONNECTING}]
, minWireVersion=0, maxWireVersion=5, maxDocumentSize=16777216, roundTripTimeNanos=26462773, setName='replicaset', canonicalAddress=3.3.3.3:27018, hosts=[2.2.2.2:27017, 1.1.1.1:27017], passives=[], arbiters=[3.3.3.3:27018], primary='2.2.2.2:27017', tagSet=TagSet{[]}, electionId=null, setVersion=15, lastWriteDate=Wed May 08 18:36:30 UTC 2019, lastUpdateTimeNanos=4709674695100178}
, {address=2.2.2.2:27017, type=UNKNOWN, state=CONNECTING}, {address=3.3.3.3:27018, type=REPLICA_SET_ARBITER, roundTripTime=26.5 ms, state=CONNECTED}]
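A hypothetical sketch of the kind of change described above, assuming log4mongo's space-delimited hostname/port lists (the property names appear in the log lines; the values and the database/collection settings are illustrative, not the actual configuration):

# Before: the appender pointed only at the original standalone node
log4j.rootLogger=INFO, MongoDB
log4j.appender.MongoDB=org.log4mongo.MongoDbAppender
log4j.appender.MongoDB.databaseName=applog
log4j.appender.MongoDB.collectionName=log
log4j.appender.MongoDB.hostname=1.1.1.1
log4j.appender.MongoDB.port=27017

# After: hostname and port list every replica-set member
# (arbiter 3.3.3.3 runs on 27018, matching the log excerpt above)
log4j.appender.MongoDB.hostname=1.1.1.1 2.2.2.2 3.3.3.3
log4j.appender.MongoDB.port=27017 27017 27018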
Question: Does the logging from com.mongodb.diagnostics.logging.JULLogger represent only the log4j appender's connections, or does it represent real application traffic as well? Is there a way to tell from the debug logs which connections originate from log4j and which come from real application traffic? I want to understand whether there is a general best practice in this area.
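One way to tell the two clients apart on the server side (a sketch, assuming driver 3.4+ and MongoDB 3.4+; the name "tomcat-webapp" is illustrative) is to tag the application's MongoClient with an applicationName:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.MongoClientURI;

public class AppNameTaggedClient {
    public static void main(String[] args) {
        // Tag this client's connections so the application's traffic can be
        // distinguished from the log4j appender's connections on the server.
        MongoClientOptions.Builder options = MongoClientOptions.builder()
                .applicationName("tomcat-webapp");
        MongoClientURI uri = new MongoClientURI(
                "mongodb://username:password@1.1.1.1:27017,2.2.2.2:27017,3.3.3.3:27017"
                        + "/database?replicaSet=replicaset",
                options);
        MongoClient client = new MongoClient(uri);
        System.out.println(client.getMongoClientOptions().getApplicationName());
        client.close();
    }
}

The applicationName is sent in the connection handshake and appears in the server's logs and in db.currentOp(), so connections from an untagged client such as the log4mongo appender can be distinguished there.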
|
| Comment by Ross Lawley [ 08/May/19 ] |
|
Hi ubuntuuser01,

Thank you for the ticket. All versions of the driver are tested against standalone instances, replica sets, and sharded clusters, so it is surprising to hear you are having an issue. If you have removed 1.1.1.1 from your app and configuration and the driver is still trying to connect to it, that points to some cached configuration or war file. Could you please double-check that there are no cached configurations, war files, etc., and add logging to check which URI is being picked up in your application.

Ross |