[SERVER-28107] Augment command result with operationTime in mongos Created: 24/Feb/17 Updated: 05/Apr/17 Resolved: 22/Mar/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.5.5 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Misha Tyulenev | Assignee: | Misha Tyulenev |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 2017-03-27 |
| Participants: |
| Description |
|
While the operationTime is returned by mongod, it is lost in translation in mongos. There are several passthrough sharding tests that fail (currently write ops), but other command results may be affected as well (for 3.6 mongos will be executing). Hence the task can be split into 4 parts (a minimal sketch of the tracker idea follows the list):

1. Augment the txn with a decoration that keeps the current operation's time.
2. Create an instance of ShardingTaskExecutor in the sharding_initialization.cpp makeTaskExecutor method. Alternative design (rejected):
3. Access the tracker when the results from the shards arrive at mongos:
4. The commandOp results are finalized in
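Below is a minimal, self-contained sketch of the tracker idea in part 1, not the actual server code: the LogicalTime stand-in, the class name, and the method names are all illustrative. The real change would attach such a tracker to the operation as a decoration and take the maximum operationTime across shard replies.

```cpp
// Hypothetical sketch of part 1: a tracker object "decorating" the client's
// operation on mongos that records the highest operationTime seen across all
// shard responses for that operation.
#include <algorithm>
#include <cstdint>
#include <mutex>

// Stand-in for the server's logical/operation time type.
struct LogicalTime {
    std::uint64_t secs = 0;
    std::uint32_t inc = 0;
    bool operator<(const LogicalTime& other) const {
        return secs != other.secs ? secs < other.secs : inc < other.inc;
    }
};

// Attached when the operation starts on mongos, updated as each shard reply
// arrives, and read when the final command result is assembled for the client.
class OperationTimeTracker {
public:
    void updateOperationTime(LogicalTime t) {
        std::lock_guard<std::mutex> lk(_mutex);
        _maxTime = std::max(_maxTime, t);
    }
    LogicalTime getMaxOperationTime() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _maxTime;
    }

private:
    mutable std::mutex _mutex;
    LogicalTime _maxTime;  // max operationTime across all shard replies
};
```

In this sketch, each shard reply handler would call updateOperationTime() with the time parsed from that reply, and the final mongos response would be augmented with getMaxOperationTime().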
|
| Comments |
| Comment by Githook User [ 22/Mar/17 ] |
|
Author: Misha Tyulenev (mikety) <misha@mongodb.com>
Message: |
| Comment by Kaloian Manassiev [ 14/Mar/17 ] |
|
My only comment would be to not call the argument you capture in the lambda opCtx, so that it is not confused with the operation context of the thread that is running the lambda; maybe call it clientCommandOpCtx instead. Another thing that could be considered is to make the OperationTimeTracker decoration a shared pointer and capture only that in the lambda. That way, even if the original client operation returns and destroys the operation context, the task executor's thread will not access an invalid pointer. |
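A hedged illustration of that suggestion, reusing the types from the sketch in the description; the function and struct names here are hypothetical, not the server's API. The point is that the callback captures a shared_ptr to the tracker rather than anything tied to the client's operation context:

```cpp
// Sketch only: the lambda captures a shared_ptr to the tracker itself, so
// even if the client's OperationContext is destroyed before the task
// executor's thread runs the callback, the tracker stays alive.
// Reuses LogicalTime and OperationTimeTracker from the earlier sketch.
#include <memory>

struct ShardReply {
    LogicalTime operationTime;  // parsed from the shard's response
};

void scheduleRemoteCommand(/* executor, request, ... */
                           std::shared_ptr<OperationTimeTracker> tracker) {
    // Only the shared_ptr is captured, never a raw pointer to the
    // client's operation context.
    auto onReply = [tracker](const ShardReply& reply) {
        tracker->updateOperationTime(reply.operationTime);
    };
    // ... hand `onReply` to the task executor ...
    (void)onReply;
}
```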
| Comment by Githook User [ 10/Mar/17 ] |
|
Author: Jack Mulrow (jsmulrow) <jack.mulrow@mongodb.com>
Message: |
| Comment by Misha Tyulenev [ 09/Mar/17 ] |
|
We can follow up and remove operationTime processing from the hooks once the project to introduce scatter/gather shard processing (esha.maharishi is working on it) is committed. But until then I feel the hooks approach is acceptable, as it allows us to make progress faster. |
| Comment by Samantha Ritter (Inactive) [ 09/Mar/17 ] |
|
I also don't feel a hook is the right place for this. The existing hooks are meant to let you interject things into the NetworkInterfaceASIO state machine, which is otherwise a black box. For example, metadata is parsed in the middle of the state machine, so if you needed access to that particular point in the connection, you couldn't get it without the provided hook. The task you are describing involves parsing the command reply, which happens at the end of the state machine, not in the middle. In other words, you do not need to be inside the black box of NIA to do this work. I understand why a hook that runs right before NIA returns your response would be a convenient place to put this, but logically it feels like a mismatch. I also think the performance counter-argument is worth investigating.

If we/you do decide to go with the hooks approach anyway, I feel very strongly against renaming the hook to "EgressHook"; that name doesn't mean anything to me. The hooks' names should tell the user the point at which they are interrupting the NIA state machine (for example, EgressMetadataHook implies that we are hooking into the machine at the point where we receive and process metadata). |
| Comment by Kaloian Manassiev [ 09/Mar/17 ] |
|
renctan - if we don't do it through the hooks, we risk someone at some point introducing a new network command execution path that doesn't do the cluster time aggregation, causing incorrect values to be returned to the client. By doing it in the hooks, at the time the command response is generated, we capture both the beginning and the end of the request, and nobody has to think about it. The argument about picking up the time of the config server is not really a problem: if we read some chunk metadata with a later cluster time, chances are that whichever shard receives a read with that cluster time will also need that up-to-date metadata as well. |
| Comment by Misha Tyulenev [ 09/Mar/17 ] |
|
Per offline discussion, the choice between hooks and ARM/parallel is a choice between simplicity and performance, as the hooks approach may introduce extra wait on the client. However, hooks offer a uniform approach that will take care of all the code paths. The consensus is to go with the proposed hook approach. |
| Comment by Randolph Tan [ 09/Mar/17 ] |
|
Not sure the hooks are the right place to do it. One argument against it is that it's indiscriminate: it will pick up other responses that are not directly related to the current operation. For example, a chunk manager refresh causes it to store the operation time from the config server while executing a write command against the shards. I feel the operation time can be better handled by the component aggregating the results, like the AsyncResultsMerger/ParallelSortClusteredCursor or whatever the replacement will be called. If we are going for the hook approach, maybe we should just modify the existing method's signature to accept txn and rename it to readReply, instead of creating a new one, since the new OP_COMMAND protocol will no longer have a separate metadata section. |
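For illustration, a hedged sketch of the renamed hook shape being suggested here; the class and method names are stand-ins, not the actual server API:

```cpp
// Illustrative only: an egress hook whose reply-reading method is renamed to
// readReply and takes txn, as suggested above. None of these names are the
// real server interface.
struct OperationContext;  // stand-in for the server's txn/opCtx type
struct BSONObj {};        // stand-in for a BSON document

class EgressCommandHook {
public:
    virtual ~EgressCommandHook() = default;

    // Stamp outgoing request metadata (role unchanged).
    virtual void writeRequestMetadata(OperationContext* txn,
                                      BSONObj* request) = 0;

    // Renamed from a metadata-only reader: receives the whole command reply,
    // since the newer protocol has no separate metadata section. An
    // implementation could parse operationTime here and update the tracker
    // decorating txn.
    virtual void readReply(OperationContext* txn, const BSONObj& reply) = 0;
};
```

The trade-off debated in this thread is whether such a hook, which sees every egress reply, or the result-aggregation layer is the right owner of this logic.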
| Comment by Misha Tyulenev [ 09/Mar/17 ] |
|
kaloian.manassiev renctan samantha.ritter could you please review the proposal and submit feedback via whichever channel is convenient for you? |