[SERVER-17010] Reduce file handle usage in File based Sorter Created: 22/Jan/15  Updated: 22/Dec/20  Resolved: 31/Oct/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.6.7, 3.2.1
Fix Version/s: 3.4.22, 3.6.12, 4.0.7, 4.1.5

Type: Bug Priority: Major - P3
Reporter: Mark Benvenuto Assignee: Dianna Hohensee (Inactive)
Resolution: Done Votes: 7
Labels: nyc
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Duplicate
is duplicated by SERVER-39597 Reduce file handle usage in File base... Closed
is duplicated by SERVER-31652 Initialization synchronization of win... Closed
Gantt Dependency
has to be done before SERVER-37293 Refactor Sorter so that DocumentSourc... Closed
Related
related to SERVER-16991 errno:24 Too many open files on Windows Closed
is related to SERVER-38764 External sorter should use 64-bit int... Closed
is related to SERVER-24020 Increase open file limit on Windows Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0, v3.6, v3.4
Sprint: Storage NYC 2018-09-24, Storage NYC 2018-10-08, Storage NYC 2018-10-22, Storage NYC 2018-11-05
Participants:
Case:
Linked BF Score: 90

 Description   

Per dupuisla, we need to reduce the number of file handles that we use to sort large amounts of data, to avoid file handle exhaustion on various platforms.

See SERVER-14572:

I spent some time on sorter.cpp, and it is clear that this code needs review. The FileIterator consumes far too many file descriptors. There is no upper limit, and increasing the number of handles is just a poor fix.
Why not merge all of this into one file and use "seek" to move between the different blocks instead of creating thousands of files? In my case, something like 2000 of these temp files are created, so 2048 will be on the low side.
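
The single-file idea suggested above can be illustrated with a minimal sketch. This is not the MongoDB implementation: the function names, the fixed 8-byte integer record format, and the file layout are all assumptions chosen for brevity. It shows each sorted block being appended to one file, with an (offset, length) pair recorded so that a block can later be found again with seek.

```python
import struct


def spill_runs_to_single_file(runs, path):
    """Append each sorted run to one file and return (offset, length) per run.

    Hypothetical helper: records are 8-byte little-endian signed integers.
    """
    ranges = []
    with open(path, "wb") as f:
        for run in runs:
            start = f.tell()
            for value in sorted(run):
                f.write(struct.pack("<q", value))
            ranges.append((start, f.tell() - start))
    return ranges


def read_run(path, offset, length):
    """Seek to one run inside the shared file and decode its records."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    return [value for (value,) in struct.iter_unpack("<q", data)]
```

With this layout the number of open file handles stays at one no matter how many blocks are spilled; only the in-memory list of ranges grows.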



 Comments   
Comment by Githook User [ 11/Jun/19 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@10gen.com', 'username': 'DiannaHohensee'}

Message: SERVER-17010 each Sorter instance spills to a single file rather than a new file per spill to disk

SERVER-38764 External sorter should use 64-bit integers for file offsets

(cherry picked from commit 48d999c08304b6ede2a9d1f9d9db974b59fe97e2)
Branch: v3.4
https://github.com/mongodb/mongo/commit/c1c761dd865308a15ff75748bf111d0a3ce366d6

Comment by Githook User [ 12/Mar/19 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@10gen.com', 'username': 'DiannaHohensee'}

Message: SERVER-17010 each Sorter instance spills to a single file rather than a new file per spill to disk

(cherry picked from commit 2be7f2677a40a863f336d2964f456c9d87ddc838)

SERVER-38764 External sorter should use 64-bit integers for file offsets

(cherry picked from commit 9dafb7a3e3bafa463ab5951189b670965995dada)
Branch: v3.6
https://github.com/mongodb/mongo/commit/48d999c08304b6ede2a9d1f9d9db974b59fe97e2

Comment by Githook User [ 25/Feb/19 ]

Author:

{'name': 'Eric Milkie', 'username': 'milkie', 'email': 'milkie@10gen.com'}

Message: SERVER-17010 additional fixes (didn't get merged with cherry-pick)
Branch: v4.0
https://github.com/mongodb/mongo/commit/f906e0b21da0c0fb365687e4e04723a5af8d5002

Comment by Githook User [ 25/Feb/19 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@10gen.com', 'username': 'DiannaHohensee'}

Message: SERVER-17010 each Sorter instance spills to a single file rather than a new file per spill to disk

(cherry picked from commit 2be7f2677a40a863f336d2964f456c9d87ddc838)
Branch: v4.0
https://github.com/mongodb/mongo/commit/9491e6fb0ec2b38dd758d8acfe0e0aa11aa7bad9

Comment by Githook User [ 31/Oct/18 ]

Author:

{'name': 'Dianna Hohensee', 'email': 'dianna.hohensee@10gen.com', 'username': 'DiannaHohensee'}

Message: SERVER-17010 each Sorter instance spills to a single file rather than a new file per spill to disk
Branch: master
https://github.com/mongodb/mongo/commit/2be7f2677a40a863f336d2964f456c9d87ddc838

Comment by mars911 [ 23/Oct/17 ]

How to solve this problem?

Comment by Mathias Stearn [ 14/Jul/17 ]

It should work and be fairly easy to do. It is just a matter of changing the SortedFileWriter and sorter::FileIterator to work over ranges within a single file rather than having their own files, and having the few direct users of SortedFileWriter pass in the common file handle they want to use.
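
As a rough illustration of the approach described in this comment, the sketch below has a writer and an iterator operate over byte ranges of one shared file handle that the caller passes in. It is a Python stand-in, not the actual SortedFileWriter or sorter::FileIterator code: RangeWriter, iterate_range, sort_with_single_spill_file, and the fixed-size record format are all hypothetical.

```python
import heapq
import struct
import tempfile

RECORD = struct.Struct("<q")  # assumption: fixed-size 8-byte integer records


class RangeWriter:
    """Hypothetical writer that appends one sorted run to a shared file."""

    def __init__(self, shared_file):
        self._file = shared_file
        self._file.seek(0, 2)  # start at the current end of the shared file
        self._start = self._file.tell()

    def add(self, value):
        self._file.write(RECORD.pack(value))

    def done(self):
        """Return the (start, end) byte range this run occupies."""
        return self._start, self._file.tell()


def iterate_range(shared_file, start, end):
    """Yield records from [start, end).

    The handle is shared with other iterators, so seek before every read.
    """
    pos = start
    while pos < end:
        shared_file.seek(pos)
        (value,) = RECORD.unpack(shared_file.read(RECORD.size))
        pos += RECORD.size
        yield value


def sort_with_single_spill_file(values, run_size):
    """Sort values by spilling sorted runs to one file, then merging them."""
    with tempfile.TemporaryFile() as shared:
        ranges = []
        for i in range(0, len(values), run_size):
            writer = RangeWriter(shared)
            for v in sorted(values[i:i + run_size]):
                writer.add(v)
            ranges.append(writer.done())
        iterators = [iterate_range(shared, s, e) for s, e in ranges]
        return list(heapq.merge(*iterators))
```

For example, sort_with_single_spill_file([5, 3, 9, 1, 7, 2, 8], run_size=3) spills three runs into one temporary file and merges them through a single handle. A real implementation would likely buffer reads per range rather than seeking for every record.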

Comment by Ian Whalen (Inactive) [ 14/Jul/17 ]

redbeard0531 What are your thoughts on how to approach this, and how complex would it be?

Comment by Alexander Gorrod [ 13/Jul/17 ]

This has been encountered by at least one customer recently - I've moved it into the needs triage state to trigger a review of its priority, design a path forward, and schedule the work.
