[SERVER-32605] Aggregation on secondary server is not using the same index as the primary Created: 09/Jan/18  Updated: 21/Mar/18  Resolved: 16/Feb/18

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: 3.6.1
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Travis Brown
Assignee: Mark Agarunov
Resolution: Incomplete
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: explain_primary_without_new_index.txt, explain_secondary_with_new_index.txt, explain_secondary_without_new_index.txt, fast.js, indexDetails.png, primary_without_new_index.log, secondary_with_new_index.log, secondary_without_new_index.log, slow.js
Operating System: ALL
Steps To Reproduce:

1. Upgrade to 3.6.1
2. Run the aggregation query with the read preference set to primary
3. Run the same query with the read preference set to secondaryPreferred

Participants:

Description

On our development server, I upgraded from 3.6.0 to 3.6.1, and one of our aggregation queries started taking about 30s. I ran the profiler on both the primary and the secondary and noticed that the secondary was using a different index. The primary uses the correct index; the secondary does not. I could work around the problem by adding an additional index, but this is not ideal. The profiling results from the primary (fast.js) and the secondary (slow.js) are attached.



Comments
Comment by Kelsey Schubert [ 16/Feb/18 ]

Hi travis@bryx.com,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Kelsey

Comment by Mark Agarunov [ 01/Feb/18 ]

Hello travis@bryx.com,

We still need additional information to diagnose the problem. If this is still an issue for you, could you please provide the output of db.collection.getIndexes() for the affected collection from all affected nodes?

Thanks,
Mark

Comment by Mark Agarunov [ 18/Jan/18 ]

Hello travis@bryx.com,

Thank you for providing this information. Looking over this, it appears that some indexes may exist on one node but not the other, possibly due to rolling index builds. To confirm whether this is the case, could you please provide the output of db.collection.getIndexes() for the affected collection from all affected nodes?

Thanks,
Mark
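
Once the index listings from each node are in hand, the comparison requested above can be sketched as follows. This is a minimal illustration, not part of the original ticket: it assumes each listing is an array shaped like the output of getIndexes() (documents with at least `name` and `key` fields), and the function names `indexSignature` and `diffIndexes` are hypothetical helpers. Indexes are matched on their key pattern only, so differences in options such as `unique` or `partialFilterExpression` would not be detected.

```javascript
// Build a comparable signature from an index's key pattern.
// JSON.stringify preserves field order for ordinary field names, which matters
// because { a: 1, b: 1 } and { b: 1, a: 1 } are different compound indexes.
function indexSignature(index) {
  return JSON.stringify(index.key);
}

// Report index names that exist on one node but not the other.
// Both arguments are assumed to look like getIndexes() output:
//   [ { v: 2, key: { _id: 1 }, name: "_id_", ns: "..." }, ... ]
function diffIndexes(primaryIndexes, secondaryIndexes) {
  const primaryKeys = new Set(primaryIndexes.map(indexSignature));
  const secondaryKeys = new Set(secondaryIndexes.map(indexSignature));
  return {
    onlyOnPrimary: primaryIndexes
      .filter((idx) => !secondaryKeys.has(indexSignature(idx)))
      .map((idx) => idx.name),
    onlyOnSecondary: secondaryIndexes
      .filter((idx) => !primaryKeys.has(indexSignature(idx)))
      .map((idx) => idx.name),
  };
}
```

If both lists in the result are empty, the nodes have the same key patterns and a rolling index build is unlikely to explain the differing plans.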

Comment by Travis Brown [ 10/Jan/18 ]

Hi Mark,

Please see the attached files. They include the mongod logs with profiling enabled and the output of the explain functions.

Travis

Comment by Mark Agarunov [ 10/Jan/18 ]

Hello travis@bryx.com,

Thank you for the report. To get a better idea of what may be causing this, could you please provide the output of explain() as well as the log files from mongod from both the primary and secondary nodes? This should give us some insight into this behavior.

Thanks,
Mark
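
As a rough aid for comparing the explain() output from the two nodes, the sketch below pulls the index names out of the winning plan. It is an illustration under stated assumptions rather than anything from the ticket. It assumes an unsharded deployment, and it assumes the explain document either carries `queryPlanner.winningPlan` at the top level (a plain find) or nests it under `stages[i].$cursor.queryPlanner` (an aggregation). The helper names `collectIndexNames` and `winningPlanIndexes` are hypothetical.

```javascript
// Walk a winning-plan tree and collect the indexName of every IXSCAN stage.
// Child plans are assumed to hang off `inputStage` (single child) or
// `inputStages` (multiple children, e.g. under an OR stage).
function collectIndexNames(plan, found = []) {
  if (!plan || typeof plan !== "object") return found;
  if (plan.stage === "IXSCAN" && plan.indexName) found.push(plan.indexName);
  if (plan.inputStage) collectIndexNames(plan.inputStage, found);
  (plan.inputStages || []).forEach((child) => collectIndexNames(child, found));
  return found;
}

// Return the index names used by the winning plan(s) in an explain document.
// An empty array means no IXSCAN stage was found (for example, a COLLSCAN).
function winningPlanIndexes(explainOutput) {
  const planners = [];
  if (explainOutput.queryPlanner) planners.push(explainOutput.queryPlanner);
  for (const stage of explainOutput.stages || []) {
    if (stage.$cursor && stage.$cursor.queryPlanner) {
      planners.push(stage.$cursor.queryPlanner);
    }
  }
  return planners.flatMap((planner) => collectIndexNames(planner.winningPlan));
}
```

Running `winningPlanIndexes` on the explain output saved from each node and comparing the two arrays shows at a glance whether the primary and the secondary selected different indexes.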
