[SERVER-10144] check if our semantics on getlasterror w:<n> are consistent on a series of write operations Created: 09/Jul/13  Updated: 10/Dec/14  Resolved: 22/Apr/14

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.4.4
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dwight Merriman Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-10037 Optimize updates when document is unc... Closed
Operating System: ALL
Participants:

 Description   

This is partly a QA question and partly a design question; if the behavior is found to be wrong, then it would be a bug.

Some users send a series of writes (w1, w2, w3, w4, w5, ..., wn) on one connection to a single replica set and then, after the series, call

{ getLastError:1, w:3 }

for example. The user's assumption is that once this is acknowledged, all writes through wn have propagated, including earlier writes like w5.

However, suppose the write wn was a "no-op", such as

db.foo.remove( { _id : "DNE" } ) // assume there is no "DNE" in the collection...

or

db.foo.update( { _id : "DNE" } , ... )

In these cases wn is not placed in the oplog on the primary.

A couple of questions:

(1) If after wn I call getLastError with w=3 (or majority, etc.), am I assured that w2 and the others have propagated?

Based on a simple test (below), it appears that the answer is yes. So perhaps all is fine.

(2) What about sharded? (2a) What is the contract for sharded? (2b) And what happens, and is it correct?

x:PRIMARY> // this is a one member replica set
x:PRIMARY> db.foo.insert({})
x:PRIMARY> db.foo.remove({x:3333})  // this does not appear in the oplog -- perhaps that is a good thing
x:PRIMARY> db.getLastError(2)       // hangs forever, as one would want, indicating question  #1 behaves "nicely"



 Comments   
Comment by Eric Milkie [ 22/Apr/14 ]

Rassi's excellent description for questions 1 and 2 is correct.

Comment by J Rassi [ 17/Apr/14 ]

For (2): no. An err: null GLE response from mongos guarantees that the most recent operation on that connection (and only that operation) was propagated to the requested number of replicas on each shard targeted (i.e. affected) by that operation. No information is revealed about any previous operations on the connection. Note that mongos provides this contract by keeping track of the optime associated with the most recent operation on each affected shard.

See the following script for an example.

st = new ShardingTest({shards: {rs0: {nodes: 1}, rs1: {nodes: 2}}, verbose: 1});
st.stopBalancer();
db = st.getDB("test");
db.adminCommand({enableSharding: "test"});
db.adminCommand({movePrimary: "test", to: "test-rs0"});
db.adminCommand({shardCollection: "test.foo", key: {_id: 1}});
db.adminCommand({split: "test.foo", middle: {_id: 0}});
db.adminCommand({moveChunk: "test.foo", find: {_id: 0}, to: "test-rs1"});
db.foo.insert({_id: -1});
db.foo.insert({_id: 0});
db.foo.remove({_id: 0});
db.getLastError(2); // does not hang, since the most recent operation targeted the nodes=2 shard (i.e., it did not target the nodes=1 shard)

Your original assertion is false (that is, the "assumption" that receiving an err: null GLE response guarantees that earlier writes on the connection have been propagated). GLE reveals no information about whether the earlier writes on the connection have even succeeded; thus, no information can be revealed about whether they've propagated.
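
The mongos contract described in this comment can be illustrated with a small toy model in plain JavaScript. This is only a sketch under stated assumptions, not mongos source code: the names Shard, MongosConnection, lastOpTargets, and caughtUp are hypothetical, optimes are plain counters, and every member of a shard is assumed to apply a write immediately. The point it shows is that only the shards targeted by the most recent operation are consulted.

```javascript
// Toy model of the mongos GLE contract described above; NOT mongos source code.
// All class and method names here are hypothetical.

class Shard {
  constructor(name, nodes) {
    this.name = name;
    this.optime = 0; // latest optime on this shard's primary (a simple counter)
    this.replicatedTo = new Array(nodes).fill(0); // optime applied by each member
  }
  // Apply one write; assumption: every member applies it immediately.
  write() {
    this.optime += 1;
    this.replicatedTo.fill(this.optime);
    return this.optime;
  }
  // Number of members that have applied at least `optime`.
  caughtUp(optime) {
    return this.replicatedTo.filter((t) => t >= optime).length;
  }
}

class MongosConnection {
  constructor() {
    // shard -> optime, recorded for the most recent operation only
    this.lastOpTargets = new Map();
  }
  // Run one write against the shards it targets; earlier operations are forgotten.
  write(shards) {
    this.lastOpTargets = new Map(shards.map((s) => [s, s.write()]));
  }
  // "ok" if every shard targeted by the last operation has w caught-up members;
  // "timeout" stands in for blocking until wtimeout expires.
  getLastError(w) {
    for (const [shard, optime] of this.lastOpTargets) {
      if (shard.caughtUp(optime) < w) return "timeout";
    }
    return "ok";
  }
}

// Mirrors the ShardingTest script: rs0 has 1 node, rs1 has 2 nodes.
const rs0 = new Shard("test-rs0", 1);
const rs1 = new Shard("test-rs1", 2);
const mongosConn = new MongosConnection();
mongosConn.write([rs0]); // insert {_id: -1} lands on rs0
mongosConn.write([rs1]); // insert {_id: 0} lands on rs1
mongosConn.write([rs1]); // remove {_id: 0} lands on rs1
console.log(mongosConn.getLastError(2)); // "ok": only rs1 is checked
```

In this model, a w:2 request would never be satisfied if the last operation had targeted rs0, since that shard has a single member; the earlier insert on rs0 simply plays no part in the answer.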

Comment by J Rassi [ 09/Jul/13 ]

For (1): yes if w_n "succeeded" (e.g. a delete with an invalid query predicate like {_id:{$invalid:1}} does not succeed), otherwise no. If the "last operation" on the connection did not fail with an error (w_n in your example), the GLE command looks up the optime for the most recent write on the connection (w_5 in your example, assuming nothing past w_5 generated an oplog entry) and returns err: null only if the requested subset of replicas have caught up to that optime within the timeout. However, if the "last operation" on the connection did fail with an error, then GLE returns immediately to notify the user, without waiting for replicas.
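
The decision logic described in this comment can be sketched as a toy model in plain JavaScript. This is a minimal illustration under stated assumptions, not MongoDB source code: ReplicaSetModel, Connection, lastOptime, and the "timeout" return value (standing in for blocking until the timeout expires) are all hypothetical, and optimes are modeled as oplog positions.

```javascript
// Toy model of replica-set GLE behavior as described above; NOT MongoDB source code.
// All class and method names here are hypothetical.

class ReplicaSetModel {
  constructor(memberCount) {
    this.oplog = []; // primary's oplog; an entry's optime is its 1-based position
    this.applied = new Array(memberCount).fill(0); // member 0 is the primary
  }
  // Append an oplog entry on the primary and return its optime.
  append(entry) {
    this.oplog.push(entry);
    this.applied[0] = this.oplog.length;
    return this.oplog.length;
  }
  // Let one secondary catch up to the primary.
  replicateTo(member) {
    this.applied[member] = this.oplog.length;
  }
  // Number of members that have applied at least `optime`.
  countCaughtUp(optime) {
    return this.applied.filter((t) => t >= optime).length;
  }
}

class Connection {
  constructor(rs) {
    this.rs = rs;
    this.lastOptime = 0; // optime of the most recent write that reached the oplog
    this.lastError = null; // error from the most recent operation, if any
  }
  // `changed: false` models a no-op write; `error` models a failed operation.
  write({ changed = true, error = null } = {}) {
    this.lastError = error;
    if (error === null && changed) {
      this.lastOptime = this.rs.append({ op: "w" });
    }
    // A no-op leaves lastOptime pointing at the earlier write.
  }
  getLastError(w) {
    if (this.lastError !== null) return "error"; // returns without waiting for replicas
    return this.rs.countCaughtUp(this.lastOptime) >= w ? "ok" : "timeout";
  }
}

const rs = new ReplicaSetModel(3);
const conn = new Connection(rs);
conn.write(); // like w_5: generates an oplog entry
conn.write({ changed: false }); // like w_n: a no-op, no oplog entry
console.log(conn.getLastError(3)); // "timeout": secondaries have not applied w_5
rs.replicateTo(1);
rs.replicateTo(2);
console.log(conn.getLastError(3)); // "ok": all three members have w_5
conn.write({ error: "invalid query predicate" });
console.log(conn.getLastError(3)); // "error": reported immediately
```

The no-op case works in this model because the connection keeps the optime of the last write that actually produced an oplog entry, so waiting on it covers every earlier entry in the oplog as well.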
