[SERVER-49974] A strange couple of messages in the logs on the second node Created: 29/Jul/20  Updated: 27/Oct/23  Resolved: 02/Sep/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 4.2.8
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Сергей Грищенюк Assignee: Dmitry Agranat
Resolution: Community Answered Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:
Case:

 Description   

Hello. I have been observing strange pairs of records on one of the second nodes for several days.
She is hidden and the following pair of entries comes to her log every 5 minutes
2020-07-29T14: 26: 35.983 + 0300 I NETWORK [LogicalSessionCacheRefresh] Successfully connected to mong
o-prod01.work-yc.cxm: 27017 (1 connections now open to mongo-prod01.work-yc.cxm: 27017 with a 0 second
timeout)
2020-07-29T14: 26: 36.069 + 0300 I NETWORK [LogicalSessionCacheRefresh] scoped connection to mongo-pro
d01.work-yc.cxm: 27017 not being returned to the pool
2020-07-29T14: 26: 36.069 + 0300 I CONTROL [LogicalSessionCacheRefresh] Failed to refresh session cach
e: Location13111: wrong type for field (expireAfterSeconds) long! = int

In theory, a backup is made from it. And I'm not sure if it will be created without errors. I was unable to find the problem.
Those. I don't even know how I can get to it.

I tried to delete db.system.sessions.dropIndex ("lsidTTLIndex"), but the system does not want to authorize me



 Comments   
Comment by Dmitry Agranat [ 02/Sep/20 ]

Sounds good vetedie90@gmail.com, I will go ahead and close this case. Thanks again for flagging this issue to us.

Comment by Сергей Грищенюк [ 01/Sep/20 ]

I can confidently inform you that this error has disappeared from the logs.

Comment by Dmitry Agranat [ 01/Sep/20 ]

Hi vetedie90@gmail.com, thank you for the update, how the rest of the steps went and if successful, do you still see Failed to refresh session cache messages?

Comment by Сергей Грищенюк [ 31/Aug/20 ]

Hello, Dmitry. 
Step 2.

replicaname:PRIMARY> db.system.sessions.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "lastUse" : 1
                },
                "name" : "lsidTTLIndex",
                "ns" : "config.system.sessions",
                "expireAfterSeconds" : NumberLong(3600)
        },
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "config.system.sessions"
        }
]

Step 4.

replicaname:PRIMARY> db.system.sessions.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "config.system.sessions"
        },
        {
                "v" : 2,
                "key" : {
                        "lastUse" : 1
                },
                "name" : "lsidTTLIndex",
                "ns" : "config.system.sessions",
                "expireAfterSeconds" : 3600
        }
]

Comment by Сергей Грищенюк [ 20/Aug/20 ]

Hello Dmitry.
Unfortunately, we will be able to return to this issue only on August 25th. I will unsubscribe to the task as soon as I get some result.

Comment by Dmitry Agranat [ 19/Aug/20 ]

Hi vetedie90@gmail.com, did you have a chance to review our last comment?

Comment by Dmitry Agranat [ 10/Aug/20 ]

Hi vetedie90@gmail.com, please test this procedure in your testing environment before executing in production.

1. Grant yourself access to the system.sessions collection by replacing <USERNAME> with your username:

    use admin
    db.createRole({
        "role": "dropSessionsIndexRole",
        "privileges":[
            { 
                "resource": { "db": "config", "collection": "system.sessions" },
                "actions": ["find", "dropIndex"]
            }
        ],
        "roles": []
    })
    db.grantRolesToUser("<USERNAME>", ["dropSessionsIndexRole"])

2. Check what value the TTL index lsidTTLIndex on the system.sessions collection is:

    use config
    db.system.sessions.getIndexes()

3. Drop the TTL index lsidTTLIndex:

    db.system.sessions.dropIndex("lsidTTLIndex")

4. Verify that the TTL index lsidTTLIndex is recreated automatically. Note that this can take up to 5 or 10 minutes.

    db.system.sessions.getIndexes()

5. Revoke and drop the dropSessionsIndexRole role replacing <USERNAME> with your username:

    use admin
    db.revokeRolesFromUser("<USERNAME>",["dropSessionsIndexRole"])
    db.dropRole("dropSessionsIndexRole")

Let us know how it goes and what the value was of expiresAfterSeconds in step 2 and 4.
Dima

Comment by Сергей Грищенюк [ 06/Aug/20 ]

Now on both hidden nodes
"localLogicalSessionTimeoutMinutes": 30

Comment by Сергей Грищенюк [ 06/Aug/20 ]

Sorry, was out of touch with this problem.
I am answering the questions.
One of the two hidden participants (from which I sent logs) had this parameter until the problem was discovered. On the second, as it turned out, no. After the parameter was removed from the first, the situation did not change. I can set this parameter to all members of the replica (except for the arbiter, probably).

Comment by Dmitry Agranat [ 02/Aug/20 ]

Hi vetedie90@gmail.com,

Before we'll figure out what permissions you are missing to drop lsidTTLIndex, I'd like to step back and understand how this issue has happened in the first place. Was there any change after which you've noticed this behavior (version upgrade, any parameter change, member role change...). Does this issue only manifest itself on one specific member? Any information provided to help us figure out the root cause of this issue is highly appreciated.

Also, you said:

All non-hidden replica participants (except the arbiter) have the parameter
 
setParameter:
localLogicalSessionTimeoutMinutes: 60

But based on your issue description, this is a hidden member. What is the localLogicalSessionTimeoutMinutes on this hidden member?

Thanks,
Dima

Comment by Сергей Грищенюк [ 29/Jul/20 ]

No, this is standart replica. PSSAHH.

drop lsidTTLIndex:

use config
db.system.sessions.dropIndex("lsidTTLIndex");

In response, I receive a message that my user is not authorized, although he has very high rights.

After adding permissions with the db.grantRolesToUser("admin", [\{role: "readWrite", db: "config", collection: "system.sessions"}]);
in the database admin (where it is located), I still get an authorization refusal.
// code placeholder

 

 

Comment by Dmitry Agranat [ 29/Jul/20 ]

Thanks vetedie90@gmail.com, who exactly are you trying to drop lsidTTLIndex index?

Also, is this a sharded cluster?

Comment by Сергей Грищенюк [ 29/Jul/20 ]

Files from the node where the error goes - attached to the specified path

Comment by Сергей Грищенюк [ 29/Jul/20 ]

All non-hidden replica participants (except the arbiter) have the parameter

setParameter:
localLogicalSessionTimeoutMinutes: 60

Comment by Сергей Грищенюк [ 29/Jul/20 ]

Yes of course. Now I will.

Comment by Dmitry Agranat [ 29/Jul/20 ]

Hi vetedie90@gmail.com,

Would you please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) and upload them to this support uploader location?

Also, please please provide:

  • What value the localLogicalSessionTimeoutMinutes is set to on all nodes?
  • How exactly did you try to drop lsidTTLIndex index?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Thanks,
Dima

Generated at Thu Feb 08 05:21:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.