[CXX-1001] mongocxx::cursor.begin() increments the iterator Created: 21/Aug/16  Updated: 10/May/17  Resolved: 15/Nov/16

Status: Closed
Project: C++ Driver
Component/s: API
Affects Version/s: 3.0.1
Fix Version/s: 3.1.0

Type: Bug Priority: Major - P3
Reporter: George Thompson Assignee: David Golden
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 10
MS VS 2015
MongoDB 3.3.5
MongoDB C++11 Driver 3.0.1
Using WiredTiger


Issue Links:
Duplicate
is duplicated by CXX-1337 Tailable cursor don't resume Closed
Related
related to CXX-1100 Having more than one iterator per cur... Closed
is related to CDRIVER-1886 Expose mongoc_cursor_t 'sent' field Closed

 Description   

Simple program to create and print out two records:

main.cpp

#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/types.hpp>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>
#include <iostream>
 
int main()
{
	mongocxx::instance inst{};
	mongocxx::client conn{ mongocxx::uri{} };
 
	auto db = conn["testcursor"];
 
	db["names"].drop();
 
	bsoncxx::document::value document = bsoncxx::builder::stream::document{} << "name" << "George" << bsoncxx::builder::stream::finalize;	
	db["names"].insert_one(std::move(document));
	document = bsoncxx::builder::stream::document{} << "name" << "Mark" << bsoncxx::builder::stream::finalize;
	db["names"].insert_one(std::move(document));
 
	auto cursor = db["names"].find({});
 
	for (auto&& doc : cursor) {
		std::cout << bsoncxx::to_json(doc) << std::endl;
	}
	return 0;
}

Output (as expected):

{
    "_id" : {
        "$oid" : "57b8b1209853f297f00005c1"
    },
    "name" : "George"
}
{
    "_id" : {
        "$oid" : "57b8b1209853f297f00005c2"
    },
    "name" : "Mark"
}

Now, if I insert into the code cursor.begin(); right after the find():

	auto cursor = db["names"].find({});
	
        cursor.begin();
    
	for (auto&& doc : cursor) {
		std::cout << bsoncxx::to_json(doc) << std::endl;
	}

Then my output is (unexpectedly missing the "George" entry):

{
    "_id" : {
        "$oid" : "57b8b1c99853f26e68007472"
    },
    "name" : "Mark"
}

My objective was to test if the cursor was empty (there is no .empty()) so I tried to use:

if ( cursor.begin() == cursor.end() ) but that seemingly advances the iterator?



 Comments   
Comment by Githook User [ 29/Nov/16 ]

Author:

{u'username': u'xdg', u'name': u'David Golden', u'email': u'xdg@xdg.me'}

Message: CXX-1001 CXX-1100 Fix mongocxx cursor iteration

CXX-1001 reported the repeated calls to mongocxx::cursor::begin() would
increment the cursor and lose documents. This is fixed by only
incrementing the cursor during construction one time (required to send the
query/command to the server).

CXX-1100 reported that incrementing an iterator would leave all other
iterators on that cursor with dangling pointers. While this is
technically legal, if undefined behavior, for input iterators, the team
decided to have all iterators from the same cursor move in lockstep;
this is also legal and prevents undefined behaviors, which is more
user-friendly. This is achieved by moving the "current document" view
into the private cursor implementation rather than keeping copies in
individual iterators.

The changes above required some additional coordination to track cursors
sent and dead/finished. Strangely, we couldn't depend on libmongoc
functions for this due to what appear to be some inconsistencies in
libmongoc's handling of "traditional" cursors and "command cursors".
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/e2c086fe158c9a21f1d168ec13800b1a04e2211f

Comment by Githook User [ 29/Nov/16 ]

Author:

{u'username': u'xdg', u'name': u'David Golden', u'email': u'xdg@xdg.me'}

Message: CXX-1001 CXX-1100 Fix mongocxx cursor iteration

CXX-1001 reported the repeated calls to mongocxx::cursor::begin() would
increment the cursor and lose documents. This is fixed by only
incrementing the cursor during construction one time (required to send the
query/command to the server).

CXX-1100 reported that incrementing an iterator would leave all other
iterators on that cursor with dangling pointers. While this is
technically legal, if undefined behavior, for input iterators, the team
decided to have all iterators from the same cursor move in lockstep;
this is also legal and prevents undefined behaviors, which is more
user-friendly. This is achieved by moving the "current document" view
into the private cursor implementation rather than keeping copies in
individual iterators.

The changes above required some additional coordination to track cursors
sent and dead/finished. Strangely, we couldn't depend on libmongoc
functions for this due to what appear to be some inconsistencies in
libmongoc's handling of "traditional" cursors and "command cursors".
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/e2c086fe158c9a21f1d168ec13800b1a04e2211f

Comment by Githook User [ 15/Nov/16 ]

Author:

{u'username': u'xdg', u'name': u'David Golden', u'email': u'xdg@xdg.me'}

Message: CXX-1001 CXX-1100 Fix mongocxx cursor iteration

CXX-1001 reported the repeated calls to mongocxx::cursor::begin() would
increment the cursor and lose documents. This is fixed by only
incrementing the cursor during construction one time (required to send the
query/command to the server).

CXX-1100 reported that incrementing an iterator would leave all other
iterators on that cursor with dangling pointers. While this is
technically legal, if undefined behavior, for input iterators, the team
decided to have all iterators from the same cursor move in lockstep;
this is also legal and prevents undefined behaviors, which is more
user-friendly. This is achieved by moving the "current document" view
into the private cursor implementation rather than keeping copies in
individual iterators.

The changes above required some additional coordination to track cursors
sent and dead/finished. Strangely, we couldn't depend on libmongoc
functions for this due to what appear to be some inconsistencies in
libmongoc's handling of "traditional" cursors and "command cursors".
Branch: 3.1-dev
https://github.com/mongodb/mongo-cxx-driver/commit/e2c086fe158c9a21f1d168ec13800b1a04e2211f

Comment by Githook User [ 15/Nov/16 ]

Author:

{u'username': u'xdg', u'name': u'David Golden', u'email': u'xdg@xdg.me'}

Message: CXX-1001 CXX-1100 Fix mongocxx cursor iteration

CXX-1001 reported the repeated calls to mongocxx::cursor::begin() would
increment the cursor and lose documents. This is fixed by only
incrementing the cursor during construction one time (required to send the
query/command to the server).

CXX-1100 reported that incrementing an iterator would leave all other
iterators on that cursor with dangling pointers. While this is
technically legal, if undefined behavior, for input iterators, the team
decided to have all iterators from the same cursor move in lockstep;
this is also legal and prevents undefined behaviors, which is more
user-friendly. This is achieved by moving the "current document" view
into the private cursor implementation rather than keeping copies in
individual iterators.

The changes above required some additional coordination to track cursors
sent and dead/finished. Strangely, we couldn't depend on libmongoc
functions for this due to what appear to be some inconsistencies in
libmongoc's handling of "traditional" cursors and "command cursors".
Branch: 3.1-dev
https://github.com/mongodb/mongo-cxx-driver/commit/e2c086fe158c9a21f1d168ec13800b1a04e2211f

Comment by David Golden [ 09/Nov/16 ]

https://mongodbcr.appspot.com/107340001/

Comment by David Golden [ 24/Oct/16 ]

Solving this problem must be done in concert with solving CXX-1100.

Comment by Mira Carey [ 22/Aug/16 ]

I don't think it matters much that our cursor's return input iterators instead of forward iterators. Anyone who's trying to enforce the concept is going to care about more than just the presence of begin or end and should care about the iterator tag type.

I do think we should cache the last result we read from the libmongoc cursor. If we keep around the last result we read off that cursor, we can make the first call to begin() run the query (so you can tell if there are any results), then successive calls without an iterator advancement return whatever the last result was. That makes our iterators at least InputIterators, and leaves our cursors as non-Container things that happen to have begin and end.

Comment by George Thompson [ 22/Aug/16 ]

Thanks rassi.

Comment by J Rassi [ 22/Aug/16 ]

therefore, you could consider buffering the results of the query before checking if the result set is empty, as an alternative workaround:

mongocxx::cursor cursor = ...;
std::vector<bsoncxx::document::value> results;
std::transform(cursor.begin(), cursor.end(), std::back_inserter(results),
               [](bsoncxx::document::view v) { return bsoncxx::document::value(v); });
if (results.empty()) {
    // Inform user the result set is empty.
}
for (auto&& result : results) {
    // Do something with "result".
}

Or, just use a bool:

mongocxx::cursor cursor = ...;
bool hasResults = false;
for (auto&& result : cursor) {
    hasResults = true;
    // Do something with "result".
}
if (!hasResults) {
    // Inform user the result set is empty.
}

Comment by David Golden [ 22/Aug/16 ]

I believe you can put $max and $min directly into the filter, but I admit I haven't tried it.

Comment by George Thompson [ 22/Aug/16 ]

The problem with the .count() workaround is the lack of meta modifiers in mongocxx::options::count which I have available with mongocxx::options::find::modifiers. Specifically in my actual code, I am using $max and $min in my find() which I am unsure I can use with mongocxx::collection::count. Unless mongocxx::options::find::modifiers is a convenience wrapper and I could alternatively implement the modifiers in my bsoncxx::document::view_or_value count filter?

Comment by George Thompson [ 22/Aug/16 ]

Thank you.

Comment by David Golden [ 22/Aug/16 ]

For your workaround, I'd suggest issuing a count with a limit of 1 to test if you'll get any documents. That at least minimizes the expense. Then issue your find command if there are. There's a race condition, but it's no different than if you issued a find instead of a count and it found no documents either.

Comment by George Thompson [ 22/Aug/16 ]

Thanks david.golden. I am always hesitant to mark questions as a bug. When I stepped into the mongoDB code, it did appear to increment which isn't (I believe) standard behavior for an STL input iterator (or any iterator). Is the only workaround to re-issue the query? I am currently in development so I can do that (expensive) temporary workaround for now.

Comment by David Golden [ 22/Aug/16 ]

I agree it's unusual. The iterator is an input iterator and begin really only makes sense for a forward iterator. But "rewinding" a cursor means re-issuing a query to MongoDB, which is expensive.

I'm going to mark this as a bug and update the title and we'll work on a solution for your use case.

Thank you for reporting this!

Comment by George Thompson [ 22/Aug/16 ]

Thanks david.golden. But that means:

auto cursor = db["names"].find( document{} << "name" << "Mark" << finalize);
 
	if (cursor.begin() == cursor.end())
		std::cout << "No documents found." << std::endl;
 
	for (auto&& doc : cursor) {
		std::cout << bsoncxx::to_json(doc) << std::endl;
	}

fails. I.e., there will be no output since the cursor has incremented. Should I decrement the cursor in case begin() and end() are not equal? Is this not unique behavior for a cursor?

Comment by David Golden [ 22/Aug/16 ]

The iterator is a view on the cursor rather than being independent, so cursor.begin() currently gets a view on the first document and (behind the scenes) the cursor gets incremented to be ready for the next one. However, cursor.begin() will still equal cursor.end() for an empty cursor.

Generated at Wed Feb 07 22:01:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.