[SERVER-34374] resmoke.py uses bytestrings for representing pathnames, leading to silently failing to clear the dbpath on Windows Created: 07/Apr/18  Updated: 29/Oct/23  Resolved: 11/May/18

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 3.2.21, 3.4.16, 3.6.6, 4.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Jonathan Abrahams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-34371 Stop ignoring errors when the test fi... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.6, v3.4, v3.2
Sprint: TIG 2018-05-07, TIG 2018-05-21
Participants:
Linked BF Score: 16
Story Points: 2

 Description   

https://bugs.python.org/issue24672 describes an issue in Python where shutil.rmtree() fails to delete files with non-ASCII pathnames when it is given a bytestring (i.e. a str instance in Python 2). The ntpath.py module in Python preserves the type of its argument, so it is sufficient to use a unicode instance instead in order to have Python call the W-suffixed Win32 APIs that return Unicode strings.
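As a rough illustration of the type-preservation point (a sketch only, not the change that was eventually committed), the following shows that os.path helpers return the same string type they are given, along with a small wrapper that converts a bytestring path to text before calling shutil.rmtree(). The names to_text_path and rmtree_text are hypothetical and do not come from resmoke.py; the fallback to "utf-8" when no filesystem encoding is reported is also an assumption.

```python
import os
import shutil
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = type(u"")


def to_text_path(path):
    """Return `path` as a text string, decoding a bytestring if necessary.

    Hypothetical helper: assumes the filesystem encoding (or UTF-8 as a
    fallback) is the right codec for the bytestring.
    """
    if isinstance(path, bytes):
        encoding = sys.getfilesystemencoding() or "utf-8"
        return path.decode(encoding)
    return path


def rmtree_text(path, **kwargs):
    """Hypothetical wrapper: call shutil.rmtree() with a text path only."""
    shutil.rmtree(to_text_path(path), **kwargs)


# Path helpers preserve the type of their argument, so a bytestring prefix
# yields bytestring paths and a text prefix yields text paths.
assert isinstance(os.path.normpath(b"/data/db"), bytes)
assert isinstance(os.path.normpath(u"/data/db"), text_type)
```

Converting at the point of deletion, rather than only at the configuration default, would also cover paths that reach the call from other sources, at the cost of assuming one encoding for every bytestring.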

I've verified on a Windows spawn host that the following patch to config.py addresses this issue. The change to parser.py just does the same for the case where someone specifies --dbpathPrefix when trying to reproduce a failure outside of Evergreen.

diff --git a/buildscripts/resmokelib/config.py b/buildscripts/resmokelib/config.py
index 66753c389d..2f13c2df96 100644
--- a/buildscripts/resmokelib/config.py
+++ b/buildscripts/resmokelib/config.py
@@ -34,7 +34,7 @@ DEFAULT_BENCHMARK_MIN_TIME = datetime.timedelta(seconds=5)
 
 # Default root directory for where resmoke.py puts directories containing data files of mongod's it
 # starts, as well as those started by individual tests.
-DEFAULT_DBPATH_PREFIX = os.path.normpath("/data/db")
+DEFAULT_DBPATH_PREFIX = os.path.normpath(u"/data/db")
 
 # Names below correspond to how they are specified via the command line or in the options YAML file.
 DEFAULTS = {
diff --git a/buildscripts/resmokelib/parser.py b/buildscripts/resmokelib/parser.py
index d9f40da3e9..1353f899fd 100644
--- a/buildscripts/resmokelib/parser.py
+++ b/buildscripts/resmokelib/parser.py
@@ -352,7 +352,7 @@ def update_config_vars(values):  # pylint: disable=too-many-statements
     _config.ARCHIVE_LIMIT_TESTS = config.pop("archive_limit_tests")
     _config.BASE_PORT = int(config.pop("base_port"))
     _config.BUILDLOGGER_URL = config.pop("buildlogger_url")
-    _config.DBPATH_PREFIX = _expand_user(config.pop("dbpath_prefix"))
+    _config.DBPATH_PREFIX = unicode(_expand_user(config.pop("dbpath_prefix")))
     _config.DBTEST_EXECUTABLE = _expand_user(config.pop("dbtest_executable"))
     _config.DRY_RUN = config.pop("dry_run")
     _config.EXCLUDE_WITH_ANY_TAGS = _tags_from_list(config.pop("exclude_with_any_tags"))

However, I'm not sure whether more special handling is necessary on Linux platforms, as the changes from https://github.com/pypa/setuptools/commit/5ad13718686bee04a93b4e86929c1bb170f14a52 suggest we shouldn't use Unicode string literals if sys.getfilesystemencoding() == 'ascii'. We currently set the LANG=C environment variable on all of the Ubuntu 16.04 builders (SERVER-31717, SERVER-33184), so it isn't clear why we'd even be able to create files with non-ASCII pathnames. CC mark.benvenuto

$ LANG=C python -c 'import sys; print(sys.getfilesystemencoding())'
ANSI_X3.4-1968
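For reference, ANSI_X3.4-1968 is an alias Python recognizes for the ascii codec, so a literal comparison against 'ascii' would miss it. A minimal sketch of a more robust check follows; the function names is_ascii_encoding and can_encode are hypothetical, and the strict encode used here is an assumption that may not match how a given Python version actually encodes pathnames.

```python
import codecs
import sys


def is_ascii_encoding(encoding):
    """Return True if `encoding` names the ASCII codec under any alias."""
    return codecs.lookup(encoding).name == "ascii"


def can_encode(name, encoding):
    """Return True if the text string `name` is representable in `encoding`."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Report what the current interpreter would use for pathnames. Some
# Python 2 builds return None here, hence the "ascii" fallback (an assumption).
fs_encoding = sys.getfilesystemencoding() or "ascii"
print("filesystem encoding: %s (ascii: %s)" % (fs_encoding, is_ascii_encoding(fs_encoding)))
```

A check like can_encode(u"caf\u00e9", fs_encoding) could be used to decide whether a non-ASCII pathname is even creatable under the builder's locale.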



 Comments   
Comment by Githook User [ 18/Jun/18 ]

Author:

{'username': 'hptabster', 'name': 'Jonathan Abrahams', 'email': 'jonathan@mongodb.com'}

Message: SERVER-34374 Wrap shutil.rmtree() in resmoke to handle path names which may contain non-ASCII characters

(cherry picked from commit d6837a12b3586b0738dcd4214951a1d6f6b1415e)
Branch: v3.2
https://github.com/mongodb/mongo/commit/f012132dafbbcc80460b1ae2dbdf0a638838b10e

Comment by Githook User [ 18/Jun/18 ]

Author:

{'username': 'hptabster', 'name': 'Jonathan Abrahams', 'email': 'jonathan@mongodb.com'}

Message: SERVER-34374 Wrap shutil.rmtree() in resmoke to handle path names which may contain non-ASCII characters

(cherry picked from commit d6837a12b3586b0738dcd4214951a1d6f6b1415e)
Branch: v3.4
https://github.com/mongodb/mongo/commit/835b1d7d103755758ca18c496555aaad8aa87273

Comment by Githook User [ 18/Jun/18 ]

Author:

{'username': 'hptabster', 'name': 'Jonathan Abrahams', 'email': 'jonathan@mongodb.com'}

Message: SERVER-34374 Wrap shutil.rmtree() in resmoke to handle path names which may contain non-ASCII characters

(cherry picked from commit d6837a12b3586b0738dcd4214951a1d6f6b1415e)
Branch: v3.6
https://github.com/mongodb/mongo/commit/8fd9976bb4b24c4a9fc52fc556325b4a1446aea5

Comment by Githook User [ 11/May/18 ]

Author:

{'name': 'Jonathan Abrahams', 'email': 'jonathan@mongodb.com', 'username': 'hptabster'}

Message: SERVER-34374 Wrap shutil.rmtree() in resmoke to handle path names which may contain non-ASCII characters
Branch: master
https://github.com/mongodb/mongo/commit/d6837a12b3586b0738dcd4214951a1d6f6b1415e

Generated at Thu Feb 08 04:36:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.