[SERVER-57277] Arbiter nodes, (set to authenticate session against them), continue to expect authentication even after a resync. Created: 28/May/21  Updated: 27/Oct/23  Resolved: 16/Aug/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.5
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Marco Barbierato Assignee: Sara Golemon
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-5479 Arbiter in authenticated replica set ... Backlog
Sprint: Security 2021-06-28, Security 2021-07-12, Security 2021-07-26, Security 2021-08-09, Security 2021-08-23
Participants:
Case:

 Description   

A technical team typically deploys PSA Replica Sets manually, with authentication enabled and users configured on all the nodes, Arbiter included, following these steps:

   1) Start arbiter as a standalone process.
   2) Create the user.
   3) Bring the node up as a replica set member.

When an Initial Sync is forced on an Arbiter node installed with the above steps, the Arbiter can no longer authenticate sessions against it. However, the Initial Sync does not return the node to a completely clean state: commands such as db.shutdownServer() or rs.status() still fail with the error message:

2021-05-07T11:03:36.269+0000 E QUERY [js] Error: shutdownServer failed: {"ok" : 0,"errmsg" : "command shutdown requires authentication","code" : 13,"codeName" : "Unauthorized"} :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
DB.prototype.shutdownServer@src/mongo/shell/db.js:426:19
@(shell):1:1

On correctly installed Arbiter nodes, the command db.shutdownServer() shuts down the process, and the command rs.status() returns the expected output.
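To tell the two cases apart from a script, one option is to inspect the command reply for the Unauthorized error (code 13) shown in the log above. The following is only a sketch: `requiresAuthentication` is a hypothetical helper name, and the sample reply is copied from the error message in this ticket.

```javascript
// Hypothetical helper: true when a command reply carries the
// Unauthorized error (code 13) seen in the log above.
function requiresAuthentication(reply) {
  return reply.ok === 0 && reply.code === 13;
}

// Sample reply copied from the error message in this ticket.
const sampleReply = {
  ok: 0,
  errmsg: "command shutdown requires authentication",
  code: 13,
  codeName: "Unauthorized",
};
console.log(requiresAuthentication(sampleReply));

// In the mongo shell, against a live Arbiter, one might run:
//   requiresAuthentication(db.adminCommand({ replSetGetStatus: 1 }))
```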

The questions are:

  1. Can the unexpected behaviour of these Arbiter nodes lead to other potential health issues for the Replica Set?
      
  2. What is the correct/best way to revert the configuration of Arbiters that were set to authenticate sessions against them? Is the Initial Sync not enough?

 



 Comments   
Comment by Noah Moss [ 17/Jun/21 ]

sara.golemon- Thank you for this - I've included you in the email response and we'll see where this leads  

Comment by Noah Moss [ 16/Jun/21 ]

hi sara.golemon,

when these administrative commands run on arbiters that are former data bearing members, it's via localhost, correct? And what version are they upgrading from? v4.2?

No worries, thanks for taking it on. I do know they are going from 4.2, yes, but I'm unclear on the localhost question.

Based on our understanding, it would be fine to basically take your last 3 paragraphs and send them to the client (reworded a little by me for context), correct? Would you like to be copied and involved? They may ask for a call to discuss this in more detail, which is common.

In the case of Arbiters, since they are not expected to have local storage, Automation Agent is probably seeing Localhost Auth Bypass as a convenient way to get things to work, but I can open a separate thread with that team to discuss this.

Will this be another Jira ticket you can link to so I can track it?

Comment by Sara Golemon [ 15/Jun/21 ]

Hi, noah.moss! Sorry, I only just got assigned this, but I think I at least partially understand the context.

To be clear about what's going on, when these administrative commands run on arbiters that are former data bearing members, it's via localhost, correct? And what version are they upgrading from? v4.2?

I ask that because it sounds like Automation Agent is (ab)using what we call "Localhost auth bypass" which is /intended/ for initial setup only since you need to start with auth enabled, but with auth enabled you would be otherwise incapable of creating your initial user. Localhost auth bypass is specifically designed to only bypass auth until a user and/or role has been created. Once that happens, the auth bypass shuts off to protect against unwanted actions by other parties. Basically, it's a bootstrapping workaround.

In the case of Arbiters, since they are not expected to have local storage, Automation Agent is probably seeing Localhost Auth Bypass as a convenient way to get things to work, but I can open a separate thread with that team to discuss this.

For the sake of THIS installation, I can confirm that if the `admin.system.users` and `admin.system.roles` collections are empty on startup, then Localhost Auth Bypass will be enabled, and all commands will work without authenticating, but ONLY from 127.0.0.1, ::1, or the unix domain socket. If any documents are created in these collections during runtime, then Localhost Auth Bypass will disable and stay disabled until a restart (with empty privilege collections).
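The rule described above can be summarised as a small toy model. This is only an illustration of the behaviour as stated in this comment, not the server's implementation; the class name `LocalhostBypassModel` and the `"unix-socket"` origin label are made up for the example.

```javascript
// Toy model of the Localhost Auth Bypass rule as described in this comment.
// Not the server's code; names here are hypothetical.
const LOCAL_ORIGINS = new Set(["127.0.0.1", "::1", "unix-socket"]);

class LocalhostBypassModel {
  // usersCount / rolesCount: number of documents found at startup in
  // admin.system.users and admin.system.roles.
  constructor(usersCount, rolesCount) {
    this.bypassEnabled = usersCount === 0 && rolesCount === 0;
  }

  // Creating any user or role at runtime turns the bypass off
  // until the next restart.
  onPrivilegeDocumentCreated() {
    this.bypassEnabled = false;
  }

  // True if an unauthenticated command from `origin` would be allowed.
  allowsUnauthenticated(origin) {
    return this.bypassEnabled && LOCAL_ORIGINS.has(origin);
  }
}

const arbiter = new LocalhostBypassModel(0, 0);
console.log(arbiter.allowsUnauthenticated("127.0.0.1")); // true
console.log(arbiter.allowsUnauthenticated("10.0.0.5"));  // false
arbiter.onPrivilegeDocumentCreated();
console.log(arbiter.allowsUnauthenticated("127.0.0.1")); // false
```

In this model, an Arbiter whose `admin.system.users` still holds documents at startup corresponds to `new LocalhostBypassModel(1, 0)`, which never allows unauthenticated commands, matching the behaviour reported in the description.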

Comment by Marco Barbierato [ 31/May/21 ]

I made further attempts in my lab. It seems that for an Arbiter to go back to answering commands without requiring sessions to authenticate against it, the namespace `admin.system.users` in its local configuration must be empty.
So deleting the local users from the Arbiter appears to be required before following the Initial Sync procedure.

Thus, the steps to follow are:

  1. Shut down the Arbiter node (from the Ops Manager UI, if the node is already under Automation).
  2. If the deployment is managed by Ops Manager with Automation enabled, "suspend" Automation.
  3. Remove the following parameters from the config file of the Arbiter node only:

replication: 
  replSetName: <Replica_Set_Name> 

  4. Access the *mongo shell* on the Arbiter node, and remove all the users from its local namespace `admin.system.users`:

    use admin
    // Authenticate the session if required: db.auth("<USER_NAME>", "<PASSWORD>")
    db.system.users.deleteMany({user: {$gt: "a"}}, {writeConcern: {w: "majority", wtimeout: 100}})
    // Verify that no users remain configured, e.g. using the command "show users"

  5. Restore the Replica Set parameters in the configuration file (see point 3).
  6. Empty the content of the Arbiter node's *dbpath*.
  7. If Automation was suspended at point 2, resume it.
  8. Start the node.

After the above steps, the Arbiter went back to answering all commands as expected.
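Before restarting the node, one way to confirm the precondition is to count the documents in both privilege collections. The sketch below assumes a mongo shell session connected to the Arbiter running as a standalone, with permission to read the `admin` database; `privilegeCollectionsEmpty` is a hypothetical helper name, and whether `countDocuments` is permitted on these system collections in a given deployment is an assumption to verify.

```javascript
// Hypothetical helper for the mongo shell: returns true when both
// admin.system.users and admin.system.roles contain no documents.
// `database` is any shell database object, e.g. the global `db`.
function privilegeCollectionsEmpty(database) {
  const admin = database.getSiblingDB("admin");
  const users = admin.getCollection("system.users").countDocuments({});
  const roles = admin.getCollection("system.roles").countDocuments({});
  return users === 0 && roles === 0;
}

// In the mongo shell, against the standalone Arbiter:
//   privilegeCollectionsEmpty(db)
```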

Could you please confirm where the configuration is stored for an Arbiter that also authenticates sessions against itself?

 
