[SERVER-41029] mongocryptd should not delete existing domain socket if it fails to start TCP socket Created: 06/May/19  Updated: 04/Mar/20  Resolved: 27/Jun/19

Status: Closed
Project: Core Server
Component/s: Security
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kevin Albertson Assignee: Mira Carey
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-45895 mongocryptd creates socket and pid fi... Closed
is related to SERVER-41826 avoid unlinking the unix domain socke... Open
Operating System: ALL
Steps To Reproduce:
  • start mongocryptd
  • verify you can connect with mongo mongodb://%2Ftmp%2Fmongocryptd.sock/
  • cd to another directory (so pid file differs)
  • start mongocryptd again, which will fail with "SocketException: Address already in use"
  • although other mongocryptd is still running, mongo mongodb://%2Ftmp%2Fmongocryptd.sock/ fails with Connection refused
Sprint: Security 2019-05-20, Service Arch 2019-06-17, Service Arch 2019-07-01
Participants:

 Description   

CC mark.benvenuto + jeff.yemin
See repro steps.
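
The repro can be approximated without mongocryptd. The sketch below is an illustration only, not the server's code: `start_listener`, `can_connect`, `simulate_repro`, and the `demo.sock` path are hypothetical names, and it assumes the second instance replaces the socket file and then exits without serving it, which would match the "Connection refused" symptom in the steps.

```python
import os
import socket
import tempfile


def start_listener(path):
    """Listen on a Unix domain socket, replacing whatever is at `path`."""
    if os.path.exists(path):
        # Assumed behavior: the path is taken over even if a live process owns it.
        os.unlink(path)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def can_connect(path):
    """Return True if a client can connect to the socket at `path`."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return True
    except OSError:
        return False
    finally:
        client.close()


def simulate_repro():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.sock")  # hypothetical path
    first = start_listener(path)
    before = can_connect(path)
    # The second instance takes over the path, then exits as if its TCP bind failed.
    second = start_listener(path)
    second.close()
    after = can_connect(path)
    # The first listener is still open here, but nothing on disk points to it.
    first.close()
    os.unlink(path)
    os.rmdir(workdir)
    return before, after
```

Under these assumptions `simulate_repro()` reports that the path was reachable before the second start and unreachable after it, even though the first listener never stopped.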



 Comments   
Comment by Githook User [ 12/Aug/19 ]

Author: Dan Aprahamian (daprahamian) <dan.aprahamian@gmail.com>

Message: NODE: remove connecting on linux socket

Remove connecting to mongocryptd on /tmp/mongocryptd.sock until
SERVER-41029 is resolved
Branch: master
https://github.com/mongodb/libmongocrypt/commit/7e604382b43d2aabcd69512b140016690b1443d8

Comment by Mira Carey [ 27/Jun/19 ]

I'm going to close this out as "Won't Fix", in favor of SERVER-41826.

Comment by Mira Carey [ 19/Jun/19 ]

I've filed SERVER-41826 with a strategy I believe we can use to avoid stealing the domain socket.

Couple of other thoughts:

  • I'm not sure how we're exposing the unix domain socket work in mongocryptd, but have you considered passing --nounixsocket and --bind_ip "./mysock"? That'll let you put the domain socket wherever you'd like. It's still a little error prone, but much less so (and lets servers opened in different directories more easily avoid collisions).
  • I think we don't currently support it, but Linux supports abstract domain sockets (a domain socket whose name has a leading '\0' byte). Those aren't on the file system and go away with process death. I'd have to think a bit about introducing the syntax for those, but I think they're a strictly better solution to your problem. If that sounds interesting to you (and if Linux-only support would still be useful), I can file a ticket to go that route as well.
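
To illustrate the abstract-socket idea from the second bullet, here is a minimal Linux-only sketch. It is not mongocryptd syntax or code: `bind_abstract`, `demo_abstract`, and the socket name are hypothetical. It shows that a second bind to the same abstract name is rejected instead of replacing the first, and that the name becomes free again once the owning socket is closed.

```python
import os
import socket


def bind_abstract(name):
    """Listen on a Linux abstract-namespace socket (the address starts with a NUL byte)."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        srv.bind("\0" + name)  # no file is created on the file system
        srv.listen(1)
    except OSError:
        srv.close()
        raise
    return srv


def demo_abstract():
    name = "demo-abstract-%d" % os.getpid()  # hypothetical name, unique per process
    first = bind_abstract(name)
    try:
        bind_abstract(name).close()
        second_bound = True
    except OSError:
        # The name is owned by `first`; the second bind cannot take it over.
        second_bound = False
    first.close()  # closing the owner releases the name; there is nothing to unlink
    try:
        bind_abstract(name).close()
        rebound_after_close = True
    except OSError:
        rebound_after_close = False
    return second_bound, rebound_after_close
```

Because no socket file exists, there is no stale path to clean up and nothing for a later process to unlink.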
Comment by Kevin Albertson [ 19/Jun/19 ]

I want to avoid stealing the UNIX domain socket to avoid the user experience described in the repro.

Would it also be a problem if the second mongod managed to bind all ports, but then failed for some other reason?

Hmm, I think so. I guess it's just a matter of the first mongod to terminate deleting the UNIX domain socket. Perhaps there's no reasonable way to enforce that the socket file is only deleted if no mongod is bound to it. If that is the case, then perhaps we should close this as "Won't Fix", and that would be more reason for us to choose a sensible user-wide pidfile path. By creating it in the current working directory like we currently do, it's easy to hit this issue by running your application in two different directories.

Comment by Mira Carey [ 18/Jun/19 ]

kevin.albertson, do you actually want what's in this ticket? Or do you want to avoid stealing the unix domain socket from a running process?

A few thoughts:

  • This problem shows up even if a subsequent mongod does start up (i.e. if you have different hosts bound to different ip addresses) because the socket name only includes binary+port.
  • Would it also be a problem if the second mongod managed to bind all ports, but then failed for some other reason? (because it would still override the unix domain socket)

I'm trying to figure out if the narrow problem this ticket describes is actually worth solving, or if you want something more complicated in the "don't overwrite others' unix domain sockets" vein.
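
One possible reading of "don't overwrite others' unix domain sockets" is to probe the path for a live listener before unlinking it. The sketch below is an assumption for illustration, not the strategy filed in SERVER-41826 and not server code; `claim_socket_path` and `demo_claim` are hypothetical names. It is also racy: another process could bind the path between the probe and the unlink.

```python
import errno
import os
import socket
import stat
import tempfile


def claim_socket_path(path):
    """Listen on `path`, unlinking an existing socket file only if nothing answers on it."""
    if os.path.exists(path):
        if not stat.S_ISSOCK(os.stat(path).st_mode):
            raise OSError(errno.ENOTSOCK, "path exists and is not a socket", path)
        probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            probe.connect(path)
        except ConnectionRefusedError:
            stale = True  # file exists, but no process is listening on it
        else:
            stale = False  # a live listener answered
        finally:
            probe.close()
        if not stale:
            raise OSError(errno.EADDRINUSE, "socket path is in use", path)
        os.unlink(path)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def demo_claim():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.sock")  # hypothetical path
    first = claim_socket_path(path)
    try:
        claim_socket_path(path).close()
        second_refused = False
    except OSError as exc:
        second_refused = exc.errno == errno.EADDRINUSE
    first.close()  # the socket file stays behind as a stale entry
    third = claim_socket_path(path)  # stale file is detected and replaced
    reclaimed = True
    third.close()
    os.unlink(path)
    os.rmdir(workdir)
    return second_refused, reclaimed
```

With this check the second start fails with "Address already in use" on the socket path itself and leaves the first listener reachable, while a leftover file from a dead process can still be reclaimed.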

Comment by Mark Benvenuto [ 31/May/19 ]

The unix domain socket is simply being bound before the TCP/IP sockets. This is not a problem specific to mongocryptd. Assigning to service arch.

The code in question is here: https://github.com/mongodb/mongo/blob/933c6ad19c3f19a964c74a5174cbcf11cde0a66e/src/mongo/transport/transport_layer_asio.cpp#L678-L686
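
The ordering issue can be sketched in isolation. The example below is not the linked transport_layer_asio.cpp code; `start_tcp_first` and `demo_bind_order` are hypothetical names. It binds the TCP socket first and touches the socket path only after that succeeds, which is the narrow fix the ticket title asks for. As noted elsewhere in the thread, this ordering alone does not help when the second process binds its TCP port successfully.

```python
import os
import socket
import tempfile


def start_tcp_first(host, port, path):
    """Bind TCP before the Unix domain socket so a TCP failure leaves `path` untouched."""
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        tcp.bind((host, port))
        tcp.listen(1)
    except OSError:
        tcp.close()
        raise  # the socket path has not been touched yet
    unix = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        if os.path.exists(path):
            os.unlink(path)
        unix.bind(path)
        unix.listen(1)
    except OSError:
        unix.close()
        tcp.close()
        raise
    return tcp, unix


def demo_bind_order():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.sock")  # hypothetical path
    tcp, unix = start_tcp_first("127.0.0.1", 0, path)  # port 0: let the OS pick a free port
    port = tcp.getsockname()[1]
    try:
        for sock in start_tcp_first("127.0.0.1", port, path):
            sock.close()
        second_started = True
    except OSError:
        second_started = False  # TCP port is taken, so the path was never unlinked
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        first_still_reachable = True
    except OSError:
        first_still_reachable = False
    finally:
        client.close()
    tcp.close()
    unix.close()
    os.unlink(path)
    os.rmdir(workdir)
    return second_started, first_still_reachable
```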

Generated at Thu Feb 08 04:56:38 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.