[DRIVERS-2518] Add rationale for why the filter parameter is required for countDocuments Created: 14/Dec/22 Updated: 26/Jun/23 |
|
| Status: | Backlog |
| Project: | Drivers |
| Component/s: | CRUD |
| Fix Version/s: | None |
| Type: | Task | Priority: | Unknown |
| Reporter: | Shane Harvey | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Driver Changes: | Not Needed | ||||||||
| Description |
|
When the countDocuments API was added I believe the intention was to make the filter parameter required, that is coll.countDocuments() is not allowed, it has to be coll.countDocuments({}). However I can't find any such rationale in the spec itself. We should add rationale for why the filter parameter is required for countDocuments and perhaps even add a spec test that calling countDocuments without a filter will raise an error. |
| Comments |
| Comment by Shane Harvey [ 19/Dec/22 ] | |
|
Thanks for the deep dive jmikola@mongodb.com. In that case let's not add a test for this behavior since it would be a breaking change in some drivers. Let's just make the rationale clear in the spec and let existing drivers stay as they are. | |
| Comment by Jeremy Mikola [ 15/Dec/22 ] | |
|
Writing this was a fun trip down memory lane. I'll address the comments above in the order they were written. Historical evidence for filter being a required parameter for all CRUD APIs
I was basing that on the method declaration:
I was able to dig up some old comments supporting the historical justification I alluded to above. This comment from SPEC-76 explains how we ended up requiring explicit filter arguments:
And this comment from SPEC-86 provides some more context on how to interpret the API declarations:
This is also why all spec tests for find, count, and distinct operations specify a filter argument, even if it's just an empty document – not just in the CRUD spec but in derivative specs such as retryable reads as well. There is one reference in the CRUD spec where collection.find({}) is called without an argument but it's in the definition for "Iterable" so I wouldn't consider that authoritative (and it was likely just intended to be a concise reference to the operation name). The filter parameter for count-related APIs
By design, estimateDocumentCount() cannot take a filter parameter because it relies on the count command's behavior to return an estimated count from collection metadata in the absence of a query filter. If there was an actual rationale for requiring filter on countDocuments() independent of maintaining consistency with existing CRUD APIs, I could find no record of it. There was a patch in the original code review titled "Make it clear that filter is a required argument" but no comments leading up to it. There likewise isn't any mention about this in the scope doc.
No argument there. Having said that...As Shane pointed out, Python and probably most drivers do not require filter parameters for various CRUD methods. We can say this is because we priortize BC over spec compliance, but I doubt any driver has bothered to change that in a major version bump. Likewise, drivers written after the CRUD spec don't seem to require filter either (e.g. Rust). In fact, Rust's count_documents() method also has an optional filter and it looks like Node does as well. In flagrant disregard for the CRUD spec, I historically allowed optional filters for all read-only CRUD methods in PHPLIB – and didn't consider any reason not to continue doing so when implementing countDocuments(). I won't object if someone wants to add a rationale to the spec, but I don't think it serves much purpose if we're missing rationale for other methods like find() and distinct() – and I'm not sure what else we might say for those at this point other than "This is what Eliot wanted in 2014". And even if we do that, it doesn't change the fact that at least a handful of drivers allow it to be optional anyway. | |
| Comment by Shane Harvey [ 14/Dec/22 ] | |
|
AFAICT the spec doesn't say that find requires a filter, python supports coll.find() for example as do other drivers. | |
| Comment by Jeremy Mikola [ 14/Dec/22 ] | |
|
I'm not going to jump through the git history but I imagine this is related to find() also requiring a filter document – and the justification for that goes likely back to the original draft of the CRUD spec (so the only rationale might be in an old Google Doc comment thread). IMO, it only makes sense to do so for write methods where a default filter could make it possible to fat-finger a deleteMany or updateMany. I've never agreed with the find() method requiring a filter parameter and PHP has historically never required it (in the interest of BC). |