[SERVER-46518] Legacy MR shardedfinish can target a former primary Created: 02/Mar/20  Updated: 10/Mar/20  Resolved: 10/Mar/20

Status: Closed
Project: Core Server
Component/s: MapReduce, Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Kevin Pulo
Assignee: Nicholas Zolnierz
Resolution: Won't Fix
Votes: 0
Labels: qopt-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Sprint: Query 2020-04-06
Participants:
Linked BF Score: 39

 Description   

This causes the overall MR to fail with "not master and slaveOk=false" when trying to establish cursors on the shards. The cursors are supposed to be opened against the primary, but if there has been an election, the node the query is sent to might now be a secondary. It's not clear to me whether the MR should just read from the secondary, or whether this is just another failure mode of the legacy MR system (i.e. it's simply not robust to elections at any point during its runtime).
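To make the failure mode concrete, here is a minimal toy sketch in Python. It is not MongoDB code: `Node`, `ReplicaSet`, `NotPrimaryError`, and both `open_cursor_*` helpers are hypothetical names invented for illustration, and only the error string comes from this ticket. It models a caller that caches the primary, an election that happens afterwards, and the two options raised above (fail on the stale target, or read with secondary reads allowed), plus a retargeting variant as one possible way to tolerate the election.

```python
class NotPrimaryError(Exception):
    """Raised by a toy node asked for a primary-only read while it is a secondary."""


class Node:
    def __init__(self, name, is_primary):
        self.name = name
        self.is_primary = is_primary

    def open_cursor(self, secondary_ok=False):
        # Same error text as quoted in the description; everything else is a toy model.
        if not self.is_primary and not secondary_ok:
            raise NotPrimaryError("not master and slaveOk=false")
        return f"cursor@{self.name}"


class ReplicaSet:
    def __init__(self, nodes):
        self.nodes = nodes

    def current_primary(self):
        return next(n for n in self.nodes if n.is_primary)

    def step_down(self, new_primary_name):
        # Simulate an election: exactly one node ends up primary.
        for n in self.nodes:
            n.is_primary = (n.name == new_primary_name)


def open_cursor_with_stale_target(cached_primary):
    # Uses whichever node was primary when the operation started.
    return cached_primary.open_cursor()


def open_cursor_with_retarget(rs, cached_primary, max_attempts=2):
    # Hypothetical alternative: on a not-primary error, look up the primary again.
    target = cached_primary
    for _ in range(max_attempts):
        try:
            return target.open_cursor()
        except NotPrimaryError:
            target = rs.current_primary()
    raise NotPrimaryError("no primary found after retargeting")


if __name__ == "__main__":
    rs = ReplicaSet([Node("a", True), Node("b", False)])
    cached = rs.current_primary()   # node "a" is primary when the MR starts
    rs.step_down("b")               # election: "a" becomes a secondary
    try:
        open_cursor_with_stale_target(cached)
    except NotPrimaryError as exc:
        print("stale target failed:", exc)
    print("retargeted:", open_cursor_with_retarget(rs, cached))
```

The sketch only shows why a target chosen before an election can reject a primary-only read afterwards; it makes no claim about how the legacy MR code or its agg-based replacement actually selects targets.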



 Comments   
Comment by Nicholas Zolnierz [ 10/Mar/20 ]

Closing as won't fix since the new implementation using agg fixes the issue on master/v4.4, and we don't plan to backport for legacy MR.

Comment by Nicholas Zolnierz [ 06/Mar/20 ]

OK, thanks ali.mir. I'm going to link the BF to your ticket instead and flag this one for re-triage. We may end up closing as won't fix, given it's only relevant for pre-v4.4 MR.

Comment by Ali Mir [ 06/Mar/20 ]

nicholas.zolnierz Yes, SERVER-46300 should fix this bug. The fix is currently in progress.

Comment by Nicholas Zolnierz [ 06/Mar/20 ]

Although the linked BF shows different symptoms, I think the fix for SERVER-46300 will indirectly fix this bug as well. ali.mir do you agree?

P.S. This test is only relevant for 4.4 and will soon be deleted, so any fixes should really only be on the 4.4 branch. Also, legacy mapReduce has never worked well in the presence of step-downs, so I don't think there's any novel issue being exposed here.

Comment by David Storch [ 04/Mar/20 ]

charlie.swanson whoops! That's correct.

Comment by Charlie Swanson [ 03/Mar/20 ]

david.storch I assume you didn't mean to assign this to yourself?

Comment by David Storch [ 03/Mar/20 ]

Sending to the query optimization team because the associated test failure happened in a multiVersion test recently added as part of the mapReduce-in-agg refactor.

Generated at Thu Feb 08 05:11:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.