[SERVER-32216] With Mongo 3.6(in docker) we hit this error "Failed to unlink socket file /tmp/mongodb-27017.sock Operation not permitted"
Created: 08/Dec/17  Updated: 30/Oct/23  Resolved: 19/Dec/17

Status: Closed
Project: Core Server
Component/s: Networking
Affects Version/s: None
Fix Version/s: 3.6.1, 3.7.1

Type: Bug
Priority: Critical - P2
Reporter: lang qiu
Assignee: Jonathan Reams
Resolution: Fixed
Votes: 0
Labels: SWNA
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-77903 Upgrade from version 4.4.15 to versio... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.6
Steps To Reproduce:

1. Upgrade to the latest 3.6 release.
2. Redeploy the Docker services.

Sprint: Platforms 2017-12-18, Platforms 2018-01-01
Participants:

 Description   

We run mongo in docker. We were using 3.4 before and everything was good.

Yesterday we upgraded to 3.6 and redeployed our Docker services, and then mongo reported the error "Failed to unlink socket file /tmp/mongodb-27017.sock Operation not permitted".

The detailed log:

2017-12-07T13:10:09.637+0000 I CONTROL  [initandlisten] options: { config: "/etc/mongodb.conf", security: { authorization: "enabled" }, storage: { dbPath: "/data/db2", mmapv1: { smallFiles: true } } }
2017-12-07T13:10:09.657+0000 I -        [initandlisten] Detected data files in /data/db2 created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-12-07T13:10:09.657+0000 I STORAGE  [initandlisten]
2017-12-07T13:10:09.657+0000 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-12-07T13:10:09.657+0000 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-12-07T13:10:09.657+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1383M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-12-07T13:10:13.422+0000 E STORAGE  [initandlisten] WiredTiger error (-31802) [1512652213:422674][1:0x7fd7c63e7d40], txn-recover: unsupported WiredTiger file version: this build  only supports major/minor versions up to 1/0,  and the file is version 2/0: WT_ERROR: non-specific WiredTiger error
2017-12-07T13:10:13.422+0000 E STORAGE  [initandlisten] WiredTiger error (0) [1512652213:422729][1:0x7fd7c63e7d40], txn-recover: WiredTiger is unable to read the recovery log.
2017-12-07T13:10:13.422+0000 E STORAGE  [initandlisten] WiredTiger error (0) [1512652213:422736][1:0x7fd7c63e7d40], txn-recover: This may be due to the log files being encrypted, being from an older version or due to corruption on disk
2017-12-07T13:10:13.422+0000 E STORAGE  [initandlisten] WiredTiger error (0) [1512652213:422742][1:0x7fd7c63e7d40], txn-recover: You should confirm that you have opened the database with the correct options including all encryption and compression options
2017-12-07T13:10:13.422+0000 E STORAGE  [initandlisten] WiredTiger error (-31802) [1512652213:422757][1:0x7fd7c63e7d40], txn-recover: Recovery failed: WT_ERROR: non-specific WiredTiger error
2017-12-07T13:10:13.425+0000 I -        [initandlisten] Assertion: 28595:-31802: WT_ERROR: non-specific WiredTiger error src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 276
2017-12-07T13:10:13.425+0000 I STORAGE  [initandlisten] exception in initAndListen: 28595 -31802: WT_ERROR: non-specific WiredTiger error, terminating
2017-12-07T13:10:13.425+0000 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
2017-12-07T13:10:13.425+0000 I NETWORK  [initandlisten] removing socket file: /tmp/mongodb-27017.sock
2017-12-07T13:10:13.425+0000 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
2017-12-07T13:10:13.425+0000 I CONTROL  [initandlisten] now exiting
2017-12-07T13:10:13.425+0000 I CONTROL  [initandlisten] shutting down with code:100



 Comments   
Comment by Ramon Fernandez Marina [ 19/Dec/17 ]

qiulang, I'm going to resolve this ticket since the fix to remove the unix domain socket on clean shutdown has made it into the codebase – thanks for reporting this.

We're planning on publishing a release with this fix soon. Until then you can use one of the workarounds listed in the earlier comments on this ticket. For further support discussions please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience.

Thanks,
Ramón.

Comment by lang qiu [ 16/Dec/17 ]

Hi, as I commented on Dec 11, using --nounixsocket has the same effect as adding a volume for /tmp: no unlink error, only the "this server is bound to localhost" warning, so my other Docker container can't connect to it. I will try --bind_ip=0.0.0.0 as you suggested.

So the only open question now is why I see the /tmp/mongodb-27017.sock error when I start mongo from a fresh Docker container. Is it possible that my Dockerfile caused the problem? I need my mongo image to ship with default data, so I used the Dockerfile suggested here:

https://stackoverflow.com/questions/33558506/how-to-create-a-mongo-docker-image-with-default-collections-and-data

Comment by Jonathan Reams [ 15/Dec/17 ]

The warning about the server being bound to localhost was introduced in 3.6 as a security improvement. You can restore the 3.4 behavior by specifying --bind_ip=0.0.0.0.

If you're starting from a fresh docker container each time, then I don't understand why you're seeing the error about /tmp/mongodb-27017.sock, unless there's a permissions problem or some other problem with /tmp. I also don't know why --nounixsocket didn't resolve your issue, since that should prevent unlinking existing sockets in the first place. This is sounding more like a problem with the docker environment.

Comment by lang qiu [ 15/Dec/17 ]

OK, I found a way to check mongodb-27017.sock: I added a Docker volume for /tmp, i.e. mapped /tmp to a local folder on the host.
The mongo container can start now:
srwx------ 1 mongodb mongodb 0 Dec 15 02:59 mongodb-27017.sock

But unfortunately, the same warning showed up again: "this server is bound to localhost". So my other container failed to connect to it.

yunwei-product-dbdebug_mongo_1 | 2017-12-15T02:59:10.946715113Z 2017-12-15T02:59:10.941+0000 I CONTROL [initandlisten] ** WARNING: This server is bound to localhost.
yunwei-product-dbdebug_mongo_1 | 2017-12-15T02:59:10.946718654Z 2017-12-15T02:59:10.941+0000 I CONTROL [initandlisten] ** Remote systems will be unable to connect to this server.

Comment by lang qiu [ 15/Dec/17 ]

Hi, when I used mongo 3.4 there was indeed a /tmp/mongodb-27017.sock, and mongodb is the owner (see the attachment).

But when I used 3.6, the Docker container kept restarting (with the error I reported), so I wasn't able to check the folder. What else can I do to debug further?

Comment by lang qiu [ 13/Dec/17 ]

Hi, I run mongo in Docker and I did a clean docker-compose build/up, which means there was nothing in /tmp. Also, Docker runs mongod as root, so what other user could it be (I have limited knowledge about running mongo in Docker, though)?

I will double check it tomorrow.

Comment by Jonathan Reams [ 12/Dec/17 ]

qiulang, we believe this problem is being caused by an old mongod's UNIX domain socket being left in /tmp on shutdown, and the socket being owned by a different user than the new mongod. I've just pushed a change that will be in 3.6.1 that ensures these socket files get cleaned up during normal shutdown of the server.
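
As a rough illustration of the behaviour described here (a Python sketch, not the actual mongod change), a listener that unlinks its UNIX domain socket file on clean shutdown leaves nothing behind for the next process to trip over. The helper names and the demo path are hypothetical; the real socket lives at /tmp/mongodb-27017.sock.

```python
import os
import socket
import tempfile


def open_unix_listener(path):
    """Bind a UNIX domain socket at `path`, removing a stale file first.

    os.unlink raises PermissionError ("Operation not permitted") when the
    stale file belongs to another user in a sticky directory such as /tmp,
    which is the failure mode suspected in this ticket.
    """
    if os.path.exists(path):
        os.unlink(path)
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(1)
    return listener


def close_unix_listener(listener, path):
    """Clean shutdown: close the socket, then remove its file."""
    listener.close()
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass  # already gone, nothing to clean up


# Demo in a private temporary directory (hypothetical path).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo-27017.sock")

listener = open_unix_listener(demo_path)
exists_while_running = os.path.exists(demo_path)
close_unix_listener(listener, demo_path)
exists_after_shutdown = os.path.exists(demo_path)
os.rmdir(demo_dir)
```

With the cleanup in place, the socket file exists only while the listener is running, so a later process started by a different user never has to unlink someone else's file.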

I don't understand why --nounixsocket doesn't work around your issue, however, or why downgrading to 3.4 fixes your issue. Could you check whether there is a /tmp/mongodb-27017.sock file on your system and what its permissions are?
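
One possible way to collect the requested information from inside the container is a small Python check like the one below. It is only a sketch: the function name is hypothetical, and it assumes Python is available in the image (running ls -l on the file reports the same details).

```python
import os
import pwd
import stat


def describe_socket_file(path):
    """Return owner, mode and type information for `path`, or None if it is missing."""
    try:
        info = os.lstat(path)
    except FileNotFoundError:
        return None
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # uid without a passwd entry, common in containers
    return {
        "owner": owner,
        "mode": stat.filemode(info.st_mode),
        "is_socket": stat.S_ISSOCK(info.st_mode),
        "owned_by_current_user": info.st_uid == os.geteuid(),
    }


if __name__ == "__main__":
    print(describe_socket_file("/tmp/mongodb-27017.sock"))
```

A result of None means no socket file is present; a result where owned_by_current_user is False would be consistent with the ownership mismatch suspected above.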

Comment by Githook User [ 12/Dec/17 ]

Author: Jonathan Reams <jbreams@mongodb.com> (jbreams)

Message: SERVER-32216 Remove UNIX sockets on clean shutdown

(cherry picked from commit 9dc34426570cc57cfdb4b6f6ea4f31018662082f)
Branch: v3.6
https://github.com/mongodb/mongo/commit/0beac42d0151358516b4c4ed8b8c58a4185bec07

Comment by Githook User [ 12/Dec/17 ]

Author: Jonathan Reams <jbreams@mongodb.com> (jbreams)

Message: SERVER-32216 Remove UNIX sockets on clean shutdown
Branch: master
https://github.com/mongodb/mongo/commit/9dc34426570cc57cfdb4b6f6ea4f31018662082f

Comment by lang qiu [ 11/Dec/17 ]

I tried both --transportLayer=legacy and --nounixsocket; unfortunately, neither worked for me.
Although neither showed the "Failed to unlink socket file /tmp/mongodb-27017.sock Operation not permitted" error, both showed "** WARNING: This server is bound to localhost. ... Remote systems will be unable to connect to this server." And my Node.js container can't connect to mongo: "Mongoose connected error MongoError: failed to connect to server [mongo:27017] on first connect [MongoError: connect ECONNREFUSED 172.18.7.11:27017]"
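
The ECONNREFUSED above is what a client sees when nothing is listening on the address it dials, which fits a server bound only to localhost being contacted on the container's network address. A minimal Python probe for checking reachability from another container might look like this; the function name is hypothetical, and the host name "mongo" and port 27017 are taken from the error message.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (ECONNREFUSED), timeouts and DNS failures.
        return False
```

Calling can_connect("mongo", 27017) from the client container and can_connect("127.0.0.1", 27017) from inside the mongo container would show whether the server is reachable only over loopback.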

I also tried RUN rm /tmp/mongodb-27017.sock as another Jira ticket suggested; it did not work either.

After I switched back to mongo 3.4, the problem went away! I have tried several times to confirm it.
One thing I forgot to mention is that it works fine on my MacBook (both 3.4 and 3.6); I only hit this problem when I deployed 3.6 to the cloud (https://www.aliyun.com/).

BTW the command I use to start mongo is CMD ["mongod", "--config", "/etc/mongodb.conf", "--smallfiles","--auth","--transportLayer=legacy"]

mongodb.conf has only one line:

dbpath = /data/db2

Comment by Jonathan Reams [ 08/Dec/17 ]

3.6 has a new networking implementation, but the old and new implementations should have the same behaviour here. Could you try running the mongod with --transportLayer=legacy to see if the problem reproduces? That should switch mongodb back to using the 3.4 networking code. You can also work around this problem by disabling UNIX sockets if you aren't using them (see https://docs.mongodb.com/manual/reference/configuration-options/#net-unixdomainsocket-options).

Comment by Kaloian Manassiev [ 08/Dec/17 ]

This is also making it problematic to use MongoDB 3.6 on a shared Ubuntu system, because different users create these socket files with different ownership, and it leads to the same error for whoever comes second.
