[SERVER-42603] Recent service file change may cause cyclic dependencies Created: 02/Aug/19  Updated: 29/Oct/23  Resolved: 05/Aug/19

Status: Closed
Project: Core Server
Component/s: Packaging
Affects Version/s: 3.4.22, 4.0.11
Fix Version/s: 3.6.14, 4.0.12, 4.2.0, 3.4.23, 4.3.1

Type: Bug
Priority: Critical - P2
Reporter: Connecting Media
Assignee: Mathew Robinson (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File mongod-4.0.10.service     File mongod-4.0.11.service    
Issue Links:
Duplicate: is duplicated by SERVER-42716 Incorrect dependency on multi-user.ta... (Closed)
Problem/Incident: is caused by SERVER-36043 systemd unit for mongod starts before... (Closed)
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.0, v3.6, v3.4
Steps To Reproduce:
  • Install version 4.0.11 of mongodb-org-server
  • Check the file /lib/systemd/system/mongod.service
  • Note the "After" and "WantedBy" targets.
  • Compare with 4.0.10

 Description   

I have noticed that in version 4.0.11 the systemd service file changed and introduced a cyclic dependency.
It seems the "After" target was changed in this version. Previously it was

After=network.target

and now it is

After=multi-user.target

This is an issue because multi-user.target is also set as the "WantedBy" target:

WantedBy=multi-user.target

This creates a cyclic dependency, and from what I have observed, the order in which systemd breaks the cycle appears to be somewhat random. It prevents all services that depend on mongod.service from starting at all, and also seems to randomly prevent other services from starting.
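
As a rough illustration of the check described above, the following Python sketch flags any unit that a service file lists in both "After" and "WantedBy". The function name ordering_conflicts and the two sample unit texts are assumptions made for this example; the samples contain only the two settings quoted in this ticket, not the full packaged mongod.service. It is a simple lint, not a reimplementation of how systemd resolves dependencies.

```python
import configparser


def ordering_conflicts(unit_text):
    """Return the units named in both After= and WantedBy= of one unit file."""
    parser = configparser.ConfigParser(
        strict=False,          # tolerate repeated keys, which unit files allow
        interpolation=None,    # do not treat '%' specifiers as interpolation
        delimiters=("=",),     # unit files use '=' only
    )
    parser.optionxform = str   # keep the case of key names such as "After"
    parser.read_string(unit_text)
    after = set(parser.get("Unit", "After", fallback="").split())
    wanted_by = set(parser.get("Install", "WantedBy", fallback="").split())
    return after & wanted_by


# Hypothetical, trimmed-down unit text using the settings reported for 4.0.11.
BROKEN = """\
[Unit]
After=multi-user.target

[Install]
WantedBy=multi-user.target
"""

# The same text with the ordering reported for 4.0.10.
PREVIOUS = BROKEN.replace("After=multi-user.target", "After=network.target")

print(ordering_conflicts(BROKEN))
print(ordering_conflicts(PREVIOUS))
```

The first call reports multi-user.target as a conflict, while the second finds none.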

Once the system is up, all services can be started manually, so it "only" affects the startup order. But since this definitely breaks systems, I have given it the priority "Critical - P2".



 Comments   
Comment by Githook User [ 06/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Add After=network.target to service files

(cherry picked from commit edd215fd7979d776be5a9fab6cc8335a29fd96f1)
Branch: v3.4
https://github.com/mongodb/mongo/commit/667d8fae4a1a06ede9af584857e6f1230650b134

Comment by Githook User [ 06/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'username': 'chasinglogic', 'email': 'chasinglogic@gmail.com'}

Message: SERVER-42603 Add After=network.target to service files

(cherry picked from commit edd215fd7979d776be5a9fab6cc8335a29fd96f1)
Branch: v3.6
https://github.com/mongodb/mongo/commit/a64a8387f5e7b8ea329b7ac5bd4d152044609c86

Comment by Githook User [ 06/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Add After=network.target to service files

(cherry picked from commit edd215fd7979d776be5a9fab6cc8335a29fd96f1)
Branch: v4.0
https://github.com/mongodb/mongo/commit/6587dcb2bf3cb9676126b6e06222c04340023eb5

Comment by Githook User [ 06/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Add After=network.target to service files

(cherry picked from commit edd215fd7979d776be5a9fab6cc8335a29fd96f1)
Branch: v4.2
https://github.com/mongodb/mongo/commit/14f4a2d1973b4a7056ceeefc8814c62d0f12c33d

Comment by Githook User [ 06/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'username': 'chasinglogic', 'email': 'chasinglogic@gmail.com'}

Message: SERVER-42603 Add After=network.target to service files
Branch: master
https://github.com/mongodb/mongo/commit/edd215fd7979d776be5a9fab6cc8335a29fd96f1

Comment by Connecting Media [ 06/Aug/19 ]

I also checked MariaDB and PostgreSQL, and both have "After=network.target", which is the correct target to depend on.

Please do some research before messing with system files. Stuff like that breaks systems.

Comment by Connecting Media [ 06/Aug/19 ]

Why was the "After" setting removed completely?
Now MongoDB can be started before networking is available, causing it to fail. This is yet another bug waiting to happen. The correct target is either "network.target" or "remote-fs.target", or maybe even both.

So I'm asking for it to be reopened and fixed properly.

Comment by Githook User [ 05/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'username': 'chasinglogic', 'email': 'chasinglogic@gmail.com'}

Message: SERVER-42603 Remove cyclic dependency in SystemD service files

(cherry picked from commit 18bff834e331f8a6a13aeec4c9cf94a9e9239d75)
Branch: v3.6
https://github.com/mongodb/mongo/commit/a8f80c31ced63a9ab42a146f35b64d5a0b0607eb

Comment by Githook User [ 05/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Remove cyclic dependency in SystemD service files

(cherry picked from commit 18bff834e331f8a6a13aeec4c9cf94a9e9239d75)
Branch: v3.4
https://github.com/mongodb/mongo/commit/60107ec5e7134ce9a106494900c5dbabff5a643e

Comment by Githook User [ 05/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Remove cyclic dependency in SystemD service files

(cherry picked from commit 18bff834e331f8a6a13aeec4c9cf94a9e9239d75)
Branch: v4.0
https://github.com/mongodb/mongo/commit/c57d7cb99e012d87c94e9a6548f0b4cbdc0a4295

Comment by Githook User [ 05/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Remove cyclic dependency in SystemD service files

(cherry picked from commit 18bff834e331f8a6a13aeec4c9cf94a9e9239d75)
Branch: v4.2
https://github.com/mongodb/mongo/commit/b00e09660fe268e292fb39861eea1f40ad0ae7b7

Comment by Githook User [ 05/Aug/19 ]

Author:

{'name': 'Mathew Robinson', 'email': 'chasinglogic@gmail.com', 'username': 'chasinglogic'}

Message: SERVER-42603 Remove cyclic dependency in SystemD service files
Branch: master
https://github.com/mongodb/mongo/commit/18bff834e331f8a6a13aeec4c9cf94a9e9239d75

Comment by Connecting Media [ 05/Aug/19 ]

After digging through the code, I found that this bug was introduced by an attempt to fix SERVER-36043,
which after all isn't an issue but a misconfiguration on the side of the bug reporter.

Comment by Mathew Robinson (Inactive) [ 05/Aug/19 ]

Hey ConnectingMedia,

I was able to reproduce the issue. I saw in the logs what you mean about how systemd will force the cycle to break and usually prevent the dependent service from starting.

I've sent this out for code review. Thanks for the bug report!

Comment by Connecting Media [ 02/Aug/19 ]

Thanks for the reply @Mathew Robinson,

All I have in these service files is

After=mongod.service
Requires=mongod.service

The issue here is that "mongod.service" depends on "multi-user.target" (because of the "After" setting in the "mongod.service" file) and "multi-user.target" depends on "mongod.service" (because of the "WantedBy" setting in the "mongod.service" file). This is a cyclic dependency and not valid. It might be ok (ok as in "it doesn't blow up") if nothing depends on "mongod.service", but as soon as something does, that certainly breaks things.
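
The cycle described above can be sketched as a small ordering graph in Python. This is only a model for illustration, not systemd's actual algorithm: find_cycle and the app.service entry are hypothetical names, and the edge from multi-user.target back to mongod.service assumes the default behaviour documented in systemd.target(5), where a target orders itself after the units it wants unless DefaultDependencies=no is set.

```python
def find_cycle(after):
    """Depth-first search for a cycle in a 'unit starts after these units' graph."""
    visiting = []   # units on the current search path, in order
    done = set()    # units already known to be cycle-free

    def visit(unit):
        if unit in done:
            return None
        if unit in visiting:
            # Reached a unit already on the path: report the loop.
            return visiting[visiting.index(unit):] + [unit]
        visiting.append(unit)
        for dep in after.get(unit, ()):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(unit)
        return None

    for unit in after:
        cycle = visit(unit)
        if cycle:
            return cycle
    return []


# Each key must start after every unit in its list.
broken = {
    "mongod.service": ["multi-user.target"],   # After=multi-user.target
    "multi-user.target": ["mongod.service"],   # implied by WantedBy= (assumed default)
    "app.service": ["mongod.service"],         # hypothetical dependent service
}
# The same graph with mongod.service ordered after network.target instead.
repaired = dict(broken, **{"mongod.service": ["network.target"]})

print(find_cycle(broken))
print(find_cycle(repaired))
```

With the "After=multi-user.target" edge the search returns the loop between mongod.service and multi-user.target; with "After=network.target" it returns an empty list.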

If it helps I can recreate it and show you the logs from the system startup. Though it should be trivial to recreate.

Little note: I probably won't be able to respond before Monday.

Comment by Mathew Robinson (Inactive) [ 02/Aug/19 ]

Hey ConnectingMedia,

Can you send me one of your dependent service files that isn't starting? I don't need any of the Exec*= lines I just need to see how you're configuring everything else.

It appears from the systemd.unit man page (truncated) that what we have is a valid configuration:

Before=, After=
...
Note that this setting is independent of and orthogonal to the
requirement dependencies as configured by Requires=, Wants= or
BindsTo=. It is a common pattern to include a unit name in both the
After= and Requires= options, in which case the unit listed will be
started before the unit that is configured with these options. 
...

So I'd like to repro with one of your dependent services so I can determine if this is a problem with our service configuration and what I can do to fix it, or report an upstream systemd bug. Thanks!

Comment by Danny Hatcher (Inactive) [ 02/Aug/19 ]

Thanks for the report. I'll send it to the appropriate team.
