[JAVA-2418] cursor.setBatchSize() seems not working Created: 22/Dec/16  Updated: 27/Oct/23  Resolved: 28/Dec/16

Status: Closed
Project: Java Driver
Component/s: Async
Affects Version/s: 3.4.1
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Alireza Mohamadi [X] Assignee: Unassigned
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 10 x64


Attachments: PNG File Capture.PNG    

 Description   

Hi. Part of my code:

As you can see, I limit the cursor batch size to 1, but it returns 4 elements after I call cursor.next().



 Comments   
Comment by Jeffrey Yemin [ 28/Dec/16 ]

Thanks for the code sample. I now see what the problem is. The batch size set on the AsyncBatchCursor doesn't apply to the first batch of documents, only to the subsequent batches. To apply to the first batch, set the batch size on the FindIterable instead, e.g.:

FindIterable<Document> it = posts.find(...).batchSize(1);
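
The behaviour described above can be illustrated with a small stand-alone model. This is only a sketch and does not use the driver: ToyBatchCursor, its constructor parameters and the default of 101 documents per batch are assumptions made for illustration. It shows why a batch size set after the cursor already exists cannot shrink the first batch, while a batch size supplied with the find does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: this is NOT the MongoDB driver. All names are hypothetical.
final class ToyBatchCursor {
    private final List<Integer> results;   // stand-in for the matching documents
    private final int serverDefault;       // assumed default batch size
    private int batchSize;                 // 0 means "use serverDefault" in this model
    private int position = 0;
    private List<Integer> firstBatch;

    ToyBatchCursor(List<Integer> results, int findBatchSize, int serverDefault) {
        this.results = results;
        this.serverDefault = serverDefault;
        this.batchSize = findBatchSize;
        // The first batch is fetched while the cursor is being opened,
        // so only the batch size given to the find applies to it.
        this.firstBatch = take();
    }

    // Affects only batches fetched after this call.
    void setBatchSize(int batchSize) {
        this.batchSize = batchSize;
    }

    // Returns the already-fetched first batch once, then fetches new batches.
    List<Integer> next() {
        if (firstBatch != null) {
            List<Integer> batch = firstBatch;
            firstBatch = null;
            return batch;
        }
        return take();
    }

    private List<Integer> take() {
        int n = batchSize > 0 ? batchSize : serverDefault;
        int end = Math.min(results.size(), position + n);
        List<Integer> batch = new ArrayList<>(results.subList(position, end));
        position = end;
        return batch;
    }

    public static void main(String[] args) {
        List<Integer> docs = Arrays.asList(1, 2, 3, 4);

        // Batch size set on the cursor after it was opened: too late for the first batch.
        ToyBatchCursor late = new ToyBatchCursor(docs, 0, 101);
        late.setBatchSize(1);
        System.out.println(late.next().size());   // prints 4

        // Batch size supplied with the find: the first batch honours it.
        ToyBatchCursor early = new ToyBatchCursor(docs, 1, 101);
        System.out.println(early.next().size());  // prints 1
    }
}
```

In the model, the "late" case mirrors the report (four documents returned despite setBatchSize(1)), and the "early" case mirrors the suggested batchSize(1) on the FindIterable.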

Comment by Alireza Mohamadi [X] [ 22/Dec/16 ]

public Document nextPost (int userID) throws SuspendExecution, Throwable
	{
		AbstractMap.SimpleEntry<AsyncBatchCursor<Document>, Long> entry = bachCursorMap.get(userID);
		if (entry == null  ||
				entry.getKey().isClosed())
		
		{
			entry = new FiberAsync<AbstractMap.SimpleEntry<AsyncBatchCursor<Document>, Long>,
					Throwable>()
			{
				
				@Override
				protected void requestAsync ()
				{
					FindIterable<Document> it = posts.find(
							or(not(exists("orders.visitList")),
							   elemMatch("orders.visitList",
							             or(ne("userID", userID),
							                and(gt("amount", 0),
							                    lt("time", System.currentTimeMillis() - TimeUnit.DAYS.toMillis(1)),
							                    lt("errorCount", 10))))));
					
					it.batchCursor(
							(r, t) ->
							{
								if (t != null)
								{
									asyncFailed(t);
								}
								else
								{
									asyncCompleted(new AbstractMap
											.SimpleEntry<>(r, System.currentTimeMillis()
									));
								}
							});
					
				}
			}.run();
			bachCursorMap.put(userID, entry);
		}
		entry = bachCursorMap.get(userID);
		final AsyncBatchCursor<Document> cursor = (AsyncBatchCursor<Document>) entry.getKey();
		cursor.setBatchSize(1);
		List<Document> list = new FiberAsync<List<Document>, Throwable>()
		{
			@Override
			protected void requestAsync ()
			{
				cursor.next(
						(r, t) ->
						{
							if (t != null)
							{
								asyncFailed(t);
							}
							else
							{
								asyncCompleted(r);
							}
						});
				
			}
		}.run();
		// Guard against a missing or empty batch before taking the first document
		if (list == null || list.isEmpty())
		{
			return null;
		}
		return list.get(0);
	}

FiberAsync.execute() is Quasar's way of doing async operations inside a Fiber. You may ignore this part and simply treat these as normal API calls. Many thanks.

Comment by Jeffrey Yemin [ 22/Dec/16 ]

I also can't reproduce this on 3.2. Can you provide a full code sample that reproduces the issue?

Comment by Alireza Mohamadi [X] [ 22/Dec/16 ]

Hi Jeff
I'm still developing my server application, so I'm using a standalone configuration. I expect to keep a standalone configuration in deployment as well, because I'll be deploying on a powerful server and using lightweight threads to handle HTTP requests.
Right now I'm using MongoDB v3.2.11, but since I'm still developing my app, I can upgrade to 3.4 if needed.
Many thanks to the MongoDB support team for the glorious response rate.

Comment by Jeffrey Yemin [ 22/Dec/16 ]

Hi Alireza,

I understand what you're trying to do now. Thanks for clarifying.

I'm not able to reproduce your results with a MongoDB 3.4 standalone server. Please let us know what version of MongoDB you're running against and what configuration (standalone, replica set, sharded cluster).

Thanks,
Jeff

Comment by Alireza Mohamadi [X] [ 22/Dec/16 ]

Thanks for the reply. If I can't get one result at a time, I can't be sure the results still satisfy the query conditions. For example, in the above code I get 4 results. The user requests one, and I send the first of the four. Five minutes later the user comes back and requests another result. I should return the second result, but it may have changed in the DB in the meantime. That's why I want the driver to return exactly one result from MongoDB: once the server has already sent you a document, you can't be sure whether it has since changed on the DBMS side. But if I keep the cursor open over time and fetch one result each time, I can be sure the results still fit my request.
Sorry for the bad explanation.
EDIT> And I definitely don't want just the first result, and I can't use the forEach style. This use case can only be satisfied by using the cursor directly.

Comment by Jeffrey Yemin [ 22/Dec/16 ]

The batch size controls the size of the batches returned by the server. It doesn't affect the total number of results. Perhaps what you need is limit rather than batch size? If you want a single result, you can use com.mongodb.async.client.MongoIterable#first as a short cut.
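
The difference between the two settings can be sketched without the driver. The class and method names below (BatchVsLimit, inBatches, withLimit) are hypothetical and exist only for this illustration: batching splits the same result set into chunks, whereas a limit truncates it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: models the two settings on a plain list, not on a real cursor.
final class BatchVsLimit {

    // Batch size: every result is still delivered, just in chunks of at most batchSize.
    static <T> List<List<T>> inBatches(List<T> results, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < results.size(); i += batchSize) {
            int end = Math.min(results.size(), i + batchSize);
            batches.add(new ArrayList<>(results.subList(i, end)));
        }
        return batches;
    }

    // Limit: the result set itself is cut down to at most limit elements.
    static <T> List<T> withLimit(List<T> results, int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must not be negative");
        }
        return new ArrayList<>(results.subList(0, Math.min(limit, results.size())));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a", "b", "c", "d");
        System.out.println(inBatches(docs, 1)); // prints [[a], [b], [c], [d]]
        System.out.println(withLimit(docs, 1)); // prints [a]
    }
}
```

With a batch size of 1, all four results still arrive, one per batch; with a limit of 1, only a single result is ever returned.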
