[CXX-1659] cursor::begin() is invalid after using std::distance
Created: 01/Oct/18  Updated: 27/Oct/23  Resolved: 02/Oct/18

Status: Closed
Project: C++ Driver
Component/s: None
Affects Version/s: 3.3.1
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Romain LEGUAY [X]
Assignee: Unassigned
Resolution: Works as Designed
Votes: 0
Labels: cursor
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

macOS X 10.13.6, Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin17.7.0
Thread model: posix



 Description   

I am trying to understand how the iterators of mongo-cxx-driver work.

I create a document using bsoncxx::builder::basic::make_document and bsoncxx::builder::basic::kvp like this:

 

mongocxx::instance inst{};

// Parse the date of birth into a time_point.
std::tm tm{};  // zero-initialize: std::get_time only sets the fields it parses
std::stringstream ss("1922-03-28");
ss >> std::get_time(&tm, "%Y-%m-%d");
auto tp = std::chrono::system_clock::from_time_t(std::mktime(&tm));

auto doc = make_document(kvp("name", "John doe"),
                         kvp("dob", bsoncxx::types::b_date(tp)),
                         kvp("id", "abcde"),
                         kvp("sex", "M"));

auto client = mongocxx::client(mongocxx::uri());
auto collection = client["patientdb_test"]["patient"];
collection.insert_one(doc.view());

delay(2000); // Wait for two seconds.

 

After that, I search all documents:

 

auto cursor = collection.find({});

 

Finally, I check whether the cursor is empty and use std::distance to count the documents found:

if (cursor.begin() != cursor.end())
{
    auto it = cursor.begin();
    if (std::distance(it, cursor.end()) == 1)
    {
        if (cursor.begin() != cursor.end()) // <--- Segmentation fault
        {
        }
    }
}

I don't understand why the begin iterator changes after the call to std::distance.

Is this a bug, or am I using the cursor incorrectly?



 Comments   
Comment by Romain LEGUAY [X] [ 02/Oct/18 ]

Thank you for this great answer.

For now, I will use collection::count in my unit tests.

 

Comment by Kevin Albertson [ 01/Oct/18 ]

Also, if you only need to retrieve the count of documents in a collection (without iterating over individual documents later), it's much more efficient to use collection::count.

collection::count specifically will be deprecated in the 3.4.0 release in favor of two new methods: collection::count_documents and collection::estimated_document_count. But upgrading will be straightforward.

Comment by Kevin Albertson [ 01/Oct/18 ]

Right, you'll need to copy. Dereferencing the cursor iterator gives back a bsoncxx::document::view. This is a non-owning view of the bson document. You can copy this into an owning bsoncxx::document::value. The following example shows how to do that (using explicit types and an ordinary for loop):

    std::vector<bsoncxx::document::value> docs;
    for (mongocxx::cursor::iterator iter = cursor.begin();
         iter != cursor.end();
         iter++) {
        bsoncxx::document::view doc_view = *iter;
        // Copy the document view into a document value in the vector.
        docs.emplace_back(doc_view);
    }
    std::cout << "count is " << docs.size() << std::endl;

The docs page on Working with BSON gives a bit more information on the view and value types. Hope that helps!

I agree that cursor.current() might be a more logical sounding name. But having the standard begin/end functions allows us to use cursors like any STL iterator. For example, cursors can be used in range loops:

for (bsoncxx::document::view doc : collection.find(query)) {
     std::cout << bsoncxx::to_json(doc) << std::endl;
}
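To make the begin/end behaviour described in this thread concrete, here is a minimal sketch of a single-pass range. The toy_cursor class and the sum_remaining helper are hypothetical names invented for illustration; this is not how mongocxx::cursor is implemented, only a small model of the documented behaviour where begin() resumes at the next remaining result.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cursor. NOT the mongocxx implementation.
class toy_cursor {
public:
    explicit toy_cursor(std::vector<int> items) : items_(std::move(items)) {}

    class iterator {
    public:
        explicit iterator(toy_cursor* owner) : owner_(owner) {}

        int operator*() const { return owner_->items_[owner_->pos_]; }

        // Advancing any iterator advances the shared position in the cursor.
        iterator& operator++() {
            ++owner_->pos_;
            return *this;
        }

        bool operator!=(const iterator& other) const {
            return at_end() != other.at_end();
        }

    private:
        bool at_end() const {
            return owner_ == nullptr || owner_->pos_ >= owner_->items_.size();
        }

        toy_cursor* owner_;  // nullptr marks the end iterator
    };

    // begin() does not rewind: it points at the next unread item.
    iterator begin() { return iterator(this); }
    iterator end() { return iterator(nullptr); }

private:
    std::vector<int> items_;
    std::size_t pos_ = 0;
};

// A range-based for loop only needs begin(), end(), !=, ++ and *.
int sum_remaining(toy_cursor& cursor) {
    int total = 0;
    for (int value : cursor) {
        total += value;
    }
    return total;
}
```

Calling sum_remaining twice on the same toy_cursor shows the point: the first call sees every item, while the second sees none because the first loop already consumed them.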

Comment by Romain LEGUAY [X] [ 01/Oct/18 ]

Thank you for your answer,

I found the error: I was using the Catch2 test framework with

CHECK(cursor.begin() != cursor.end())

which lets the test continue even when the check fails, and that is how I got the segmentation fault.
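The underlying rule is that an end iterator must never be dereferenced, so the code has to stop when the emptiness check fails (in Catch2, REQUIRE aborts the test case on failure, whereas CHECK records it and continues). The sketch below shows the guard using std::istream_iterator as a stand-in for a single-pass cursor iterator; try_first is a hypothetical helper name, and no mongocxx types are involved.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Writes the next token to `out` and returns true, or returns false and
// leaves `out` untouched when the single-pass range is already exhausted.
bool try_first(std::istream& in, std::string& out) {
    std::istream_iterator<std::string> it(in), end;
    if (it == end) {
        // Bail out here: dereferencing an end iterator is undefined behavior.
        return false;
    }
    out = *it;
    return true;
}
```

The same shape applies to a test: return or abort as soon as begin() equals end(), rather than carrying on to code that assumes a document is available.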

To put all the bsoncxx::document::value objects into a std::vector, I suppose we need to use std::copy or a similar algorithm?

Do you think cursor.current() would be a better name than cursor.begin()?

 

Comment by Kevin Albertson [ 01/Oct/18 ]

Hi Athius,

Once results are returned in a cursor, subsequent calls to begin point to the next remaining result. Documented here. So once you iterate an iterator returned from cursor.begin(), subsequent calls to cursor.begin() return an iterator at the point where you left off.

The call you have to std::distance iterates until the end. If you need to retrieve the count before iterating over the documents, you could buffer them in a std::vector or similar.
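The same effect can be reproduced with the standard library alone. The sketch below uses std::istream_iterator, another single-pass input iterator, as an analogy for the cursor iterator; count_remaining and is_exhausted are hypothetical helper names, and nothing here calls into mongocxx.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Counts the tokens left in the stream by walking a single-pass iterator
// to the end. The count is obtained by consuming everything that is counted.
std::size_t count_remaining(std::istream& in) {
    std::istream_iterator<std::string> first(in), last;
    return static_cast<std::size_t>(std::distance(first, last));
}

// Returns true when a freshly created "begin" iterator already equals end.
// Note: if the stream is not exhausted, creating the iterator reads one token.
bool is_exhausted(std::istream& in) {
    return std::istream_iterator<std::string>(in) ==
           std::istream_iterator<std::string>();
}
```

For a stream holding three tokens, count_remaining reports 3, after which is_exhausted is true and a second count_remaining reports 0. With input iterators, std::distance can only count by advancing, so the range is used up once the count is known.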

However, the segmentation fault seems odd to me. That call to cursor.begin() should return the end iterator, and I've checked with my own test. Can you provide a compilable example?
