[SERVER-72867] normalize all text file lines to be EOL-terminated Created: 15/Jan/23  Updated: 29/Oct/23  Resolved: 01/May/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Billy Donahue Assignee: Billy Donahue
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-76620 query_golden_* jstests should ensure ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Service Arch 2023-04-17, Service Arch 2023-05-01, Service Arch 2023-05-15
Participants:

 Description   

We've accumulated a large number of text files over time for which the last line is unterminated. This is something we lint against, and it's a source of churn. We can make a single commit to bulk-fix everything in the repo. Of course, files that are recognized as "binary" by git should be left alone. The appropriate repair can be generated by:

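#!/bin/bash
# Print each nonempty file whose last line is unterminated.
# `read` returns nonzero when it hits EOF before seeing a newline.
find_missing_eol_filter() {
    while IFS= read -r f ; do [ -s "$f" ] && { tail -n 1 "$f" | read -r _ || echo "$f" ; } ; done
}
# Append a newline to each file named on stdin.
fix_missing_eol_filter() {
    while IFS= read -r f ; do echo >> "$f" ; done
}
# List tracked files that git stores as LF text, excluding third-party code.
generate_text_files_list() {
    git ls-files --eol | grep -v src/third_party | grep '^i/lf' | cut -c 40-
}
generate_text_files_list | find_missing_eol_filter | fix_missing_eol_filter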

https://gist.github.com/BillyDonahue/976c19b4de7c06a49f57a81ae7642bd4

(draft) https://github.com/10gen/mongo/pull/9819
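
As a cross-check before running the bulk fix, a dry-run variant can report the affected files without modifying them. The following is a minimal sketch, not the script from the gist: the function names lacks_final_eol and report_missing_eol are hypothetical, and it inspects the last byte with `tail -c 1` rather than relying on `read`. It skips empty files, which have no last line to terminate.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the file is nonempty and its last byte
# is not a newline. Command substitution strips a trailing newline, so the
# captured text is empty exactly when the last byte is '\n'.
lacks_final_eol() {
    [ -s "$1" ] && [ -n "$(tail -c 1 -- "$1")" ]
}

# Reads paths on stdin, one per line, and prints those lacking a final EOL.
# Nothing is modified.
report_missing_eol() {
    while IFS= read -r f ; do
        if lacks_final_eol "$f" ; then echo "$f" ; fi
    done
}
```

With the generate_text_files_list function from the script above, `generate_text_files_list | report_missing_eol` would list the candidates for review.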



 Comments   
Comment by Billy Donahue [ 01/May/23 ]

Restoring the Fixed close status, not "Works As Designed".

Comment by Billy Donahue [ 01/May/23 ]

Works as designed. SERVER-76620 will fix the bug in golden tests.

Comment by Billy Donahue [ 01/May/23 ]

Ok. I'm glad this worked itself out. I know there's a revert of my change pending and it should be halted in favor of just getting your SERVER-76620 in as a fix.

Comment by David Percy [ 01/May/23 ]

The golden suite overrides print(), because it wants to write to both stdout and the test output. It attempts to imitate the builtin print(), which is implemented here.

What's not obvious from that function is that the builtin print() appends an extra newline, but only when the argument doesn't already end in a newline. So print('a') acts the same as print('a\n'), and no extra newline is added to print('a\n'), print('a\n\n') etc. I assume this is happening in the log helper.
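
The rule described above can be restated as a small shell function. This is only an illustrative sketch of the observed behavior, not the mongo shell's actual implementation; print_like is a hypothetical name.

```shell
#!/bin/bash
# A literal newline, kept in a variable so the pattern below stays portable.
nl='
'

# Hypothetical imitation of the described print() rule: emit the argument,
# then add a newline only if the argument does not already end in one.
print_like() {
    case "$1" in
        *"$nl") printf '%s' "$1" ;;
        *)      printf '%s\n' "$1" ;;
    esac
}
```

Under this rule, `print_like 'a'` and `print_like "a$nl"` write the same two bytes, while an argument ending in two newlines is written unchanged.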

I created SERVER-76620 and linked it to the BF. While the builtin print() behavior is surprising, the golden suite should do a better job of matching its behavior.

Comment by Billy Donahue [ 01/May/23 ]

I'm afraid I don't understand how these files ever lacked terminating newlines if they're produced by a series of print statements.
printjson is just a print statement of a formatted string.

$ echo "print('hello'); print('world');" | build/install/bin/mongo
//other stuff....
hello
world
bye

This shows a newline is inserted after each print call.

Maybe the golden file is monkey-patching print in such a way that it strips the trailing newline?
david.percy@mongodb.com

Comment by Billy Donahue [ 27/Apr/23 ]

Yeah I think the golden files are meant to be text, so they need newlines.

If tests are generating text files that don't end with newlines then they should be fixed. I'd say that even if the purpose of the text file is for comparison against a golden expected output.

If this is too hard to do then maybe the comparison logic should count two files to be effectively identical if they only differ in the terminating newline. But really the test should be fixed.
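
The fallback suggested above, treating two files as identical when they differ only in the terminating newline, could look roughly like the following. This is a sketch under stated assumptions, not the golden test framework's comparison logic: the names normalized and same_modulo_final_eol are hypothetical, and only a single missing final newline is forgiven.

```shell
#!/bin/bash
# Hypothetical helper: write the file to stdout, adding one newline if the
# file is nonempty and its last byte is not already a newline.
normalized() {
    cat -- "$1"
    if [ -s "$1" ] && [ -n "$(tail -c 1 -- "$1")" ]; then echo; fi
}

# Succeeds if the two files are byte-identical after normalization.
same_modulo_final_eol() {
    _tmp=$(mktemp) || return 2
    normalized "$2" > "$_tmp"
    normalized "$1" | cmp -s - "$_tmp"
    _rc=$?
    rm -f "$_tmp"
    return "$_rc"
}
```

A file ending in two newlines still compares unequal to one ending in a single newline, so blank trailing lines remain significant.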

The script I used was careful to only modify files that are in git as 'lf' files. If these golden files are meant to be binary they can be marked that way in git, but that's fragile I think.
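
For illustration, the 'lf' selection can also be written against the tab that separates the attribute columns from the path in `git ls-files --eol` output, instead of a fixed character offset. This is a sketch, not the script that was run: lf_text_paths is a hypothetical name, and the tab-separated layout is an assumption about git's output format.

```shell
#!/bin/bash
# Hypothetical filter: reads `git ls-files --eol` output on stdin and prints
# the paths that git stores in the index with LF line endings, skipping
# anything under src/third_party/.
# Assumes the path follows the first tab on each line.
lf_text_paths() {
    awk -F'\t' '$1 ~ /^i\/lf / && $2 !~ /^src\/third_party\// { print $2 }'
}
```

Files that git classifies as binary show `i/-text` in the first column and are therefore never selected. A possible invocation is `git ls-files --eol | lf_text_paths`.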

Comment by Githook User [ 26/Apr/23 ]

Author: Billy Donahue (email: billy.donahue@mongodb.com, username: BillyDonahue)

Message: SERVER-72867 eol-terminate all nonempty text files
Branch: master
https://github.com/mongodb/mongo/commit/49395d449c33b1201fb57f4a4f2aca8278c36a5b

Generated at Thu Feb 08 06:23:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.