- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Build
- Fully Compatible
If you run the following command, it will print the 20 largest files in our git history:
$ git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort --numeric-sort --key=2 |
    cut -c 1-12,41- |
    $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest |
    tail -n 20
3ea705c12893   15MiB buildscripts/build/eslint
96f965112d62   16MiB test.pgd
89e7c6caebce   16MiB jstests/query_golden/libs/data/ce_accuracy_test.data
482f0bd92dbb   17MiB f
bba33c8a932a   19MiB src/third_party/wiredtiger/tools/wtstats/template/fixtures/big.fixture.json
62b209ac01bf   24MiB src/third_party/thriftbin/thrift-0.17.0/lib/cpp/.libs/libthrift.a
39d226614d75   27MiB src/third_party/thriftbin/thrift-0.17.0/lib/cpp/test/.libs/libtestgencpp.a
bc23712cd92e   40MiB sys-perf-investigation/q16-metal-log.txt
d69991ac4373   40MiB sys-perf-investigation/q16-log.txt
c05b732ce6b8   42MiB dump_bazel.802602.core
59b8d12834a4   43MiB jstests/query_golden/libs/data/ce_accuracy_test.data
ce6532a16176   50MiB docker/docker-compose-linux-aarch64
2f6e95244565   53MiB src/third_party/thriftbin/thrift-0.17.0/compiler/cpp/thrift
1288a917fff4   61MiB dump_db_pipeline_tes.2732455.core
3c5825ac63a9   61MiB dump_db_pipeline_tes.2719951.core
31b4c9b561b8   85MiB dump_python3.22527.core
c2e4a6c09611   98MiB dump_python3.15756.core
aa52a216f4fc  100MiB srv/rs0-0/journal/WiredTigerPreplog.0000000001
e9d8747d44e3  100MiB srv/rs0-0/journal/WiredTigerLog.0000000036
728283a9ae3c  161MiB shell/msvc/ipch/mongo-56e29e4d/mongo-1b36a34c.ipch
Several of these are core dumps that were presumably committed by accident. They were later deleted (e.g., here), but deleting a file only removes it from the current version; it does not remove it from history. As a result, every full clone of the mongo repo has to carry those files around in .git forever. We should add a pattern for core dumps to our .gitignore to prevent this from happening again.
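As a rough sketch of what such a pattern could look like, the script below tries one candidate in a throwaway repository. The pattern `dump_*.core` is an assumption derived from the core dump file names in the listing above, not necessarily the pattern that should be committed. The `check_path` helper and the `src/example.cpp` control path are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: try a candidate core-dump ignore pattern in a scratch repository.
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init --quiet

# Assumed pattern, based on names like dump_python3.22527.core in the listing.
# A pattern without a slash matches at any directory level.
printf '%s\n' 'dump_*.core' > .gitignore

# Hypothetical helper: reports whether git would ignore the given path.
# git check-ignore exits 0 when the path matches an ignore pattern.
check_path() {
    if git check-ignore --quiet "$1"; then
        echo "ignored"
    else
        echo "not ignored"
    fi
}

# The paths do not need to exist; check-ignore only evaluates the names.
for path in dump_python3.22527.core dump_bazel.802602.core src/example.cpp; do
    printf '%-32s %s\n' "$path" "$(check_path "$path")"
done
```

Note that an ignore rule only prevents new accidental commits. It does not shrink the existing history, so the blobs listed above remain in every full clone.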