-
Type: Bug
-
Resolution: Duplicate
-
Priority: Critical - P2
-
None
-
Affects Version/s: 4.0.0, 4.0.1
-
Component/s: None
-
None
-
ALL
-
Context:
We have batchs that get data in db (cursor count > 5M records) and, for each record, compute various data. This computation can be quite long and the delta time between 2 request to getMore can be much more than 10mn (cursorTimeout). So we run our database with cursorTimeoutMillis = 7200000 (2 hours) and everything run well on our current db (mongoDB 3.0.x).
When trying MongoDb 4.0 (currently 4.0.1) we run into an issue : some programs crash with CursorNotFound (all our programs are in python).
On specific tests program we can see that the cursor vanishes 30 to 40mn after opening the cursor
So I tried to reproduce the problem with a small script that first get a few records from a cursor, than wait for some
(long) time and then get the remaing records. While waiting, I'm getting sessions list and serverStatus().metrics
.cursor
cursor total == 1 (our cursor)
Within 5mn, we see sessions 2 appearing (the cache seems to be updated every 5mn)
one of these sessions (which seems to be related to the cursor as we can see in mongodb.log (verbosity=5)) has its lastUse that does not change over time (seems coherent).
About 30mn later we have only one session left. The session related to the cursor is not there anymore. I guess its because of liidTTLIndex on config.system.sessions
{ "v" : 2, "key" : { "lastUse" : 1 }, "name" : "lsidTTLIndex", "ns" : "config.system.sessions", "expireAfterSeconds" : 1800 }
5mn later (session cache refresh ?) serverStatus.metrics.cursor has a total of 0 and if we stop waiting, our for loop
on the cursor ends with a CursorNotFound error.
I guess that if the index had a expireAfterSeconds greater than cursorTimeoutMillis / 1000 (to get seconds), the issue would disappear.
This issue is critical for us. We cannot upgrade until it is solved
- duplicates
-
SERVER-36808 Server closes cursors that are still in use during session cache refresh
- Closed