Details
-
Question
-
Resolution: Incomplete
-
Critical - P2
-
None
-
2.2.0
-
None
-
None
-
Server: CentOS 6.3; Client: CentOS 6.3, PHP Driver, Java Driver
Description
- We are running a MongoDB replica set with one primary, one secondary, and one arbiter.
- We have two applications writing to two separate databases in MongoDB; let's call them DB1 and DB2 for convenience.
- DB1 has 6 collections while DB2 has 107 collections.
- DB1 has been in production for 10 months, DB2 has been in production for 6 months.
- Today, at around 11:49 a.m. we discovered that 5 collections in DB1 and 90 collections in DB2 were absent from the Primary. Everything was intact in the Secondary.
- Examining the MongoDB log, we saw that the number of connections ramped up from about 450 at 11:49 a.m. to about 20,000 at 12:11 p.m., after which the mongod instance started refusing new connections (connection limit reached):
Fri Aug 23 12:11:54 [initandlisten] connection refused because too many open connections: 20000
- We restarted mongod, ran a mongodump against the secondary, and ran a mongorestore against the primary. After this, both databases (DB1 and DB2) started accepting connections and are now working fine.
- We cannot see any "drop" commands in the MongoDB log.
- We can see two instances of PageFaultException during this time:
Fri Aug 23 11:49:58 [conn33572019] PageFaultException thrown
Fri Aug 23 11:56:00 [conn33572019] PageFaultException thrown
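For anyone repeating the log check described above, here is a minimal Python sketch of how the log could be scanned for the three things we looked for. The two event patterns are copied from the log lines quoted in this report; the loose match on "drop" is only an assumption about how a drop would show up in the log, and the function name `scan_log` is hypothetical.

```python
import re
from collections import Counter

# Patterns for the events discussed in this report. The first two mirror
# the quoted log lines; the "drop" pattern is a loose, assumed match.
PATTERNS = {
    "refused": re.compile(
        r"connection refused because too many open connections: (\d+)"
    ),
    "page_fault": re.compile(r"\[(conn\d+)\] PageFaultException thrown"),
    "drop": re.compile(r"drop", re.IGNORECASE),
}


def scan_log(lines):
    """Count matching lines per pattern and collect the matching lines."""
    counts = Counter()
    hits = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                hits.append((name, line.rstrip("\n")))
    return counts, hits


# The three log lines quoted in this report.
SAMPLE = [
    "Fri Aug 23 11:49:58 [conn33572019] PageFaultException thrown",
    "Fri Aug 23 11:56:00 [conn33572019] PageFaultException thrown",
    "Fri Aug 23 12:11:54 [initandlisten] connection refused because "
    "too many open connections: 20000",
]

counts, hits = scan_log(SAMPLE)
for name, line in hits:
    print(name, "|", line)
```

To run it against a real log, the lines of the opened log file can be passed to `scan_log` in place of `SAMPLE`.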
The questions are:
- How did collections get dropped on the primary while remaining present on the secondary?
- Is there any condition under which something like this can happen?
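To catch this kind of divergence earlier, the collection names on each member could be compared periodically. The sketch below shows only the comparison step, assuming the two name lists have already been fetched from each member by whatever means the driver offers; the function name `missing_collections` and the collection names are hypothetical, not our real ones.

```python
def missing_collections(reference, target):
    """Return names present in `reference` but absent from `target`, sorted."""
    return sorted(set(reference) - set(target))


# Hypothetical collection names, for illustration only.
secondary_names = ["users", "orders", "events"]
primary_names = ["users"]

absent_on_primary = missing_collections(secondary_names, primary_names)
print(absent_on_primary)
```

A non-empty result would mean the primary is missing collections that the secondary still has, which is the situation described in this report.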