[SERVER-17032] DistributedLock pinger does not appear to check skew after startup Created: 23/Jan/15  Updated: 27/Oct/15  Resolved: 27/Oct/15

Status: Closed
Project: Core Server
Component/s: Concurrency, Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Spencer Jackson
Assignee: Andy Schwerin
Resolution: Won't Fix
Votes: 0
Labels: 28qa, distributed-lock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Steps To Reproduce:

1) Set up a sharded cluster, with one mongos, two mongods, and two config servers on a machine, and another config server in a VM. Ensure that the VM's clock is synced to the host machine, preferably using NTP.

Run:
mongos> sh.addShard( "192.168.56.102:27021" )
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> sh.addShard( "192.168.56.102:27022" )
{ "shardAdded" : "shard0001", "ok" : 1 }
mongos> sh.enableSharding("database")
{ "ok" : 1 }
mongos> sh.shardCollection("database.col", {data: 1})
{ "collectionsharded" : "database.col", "ok" : 1 }

mongos> use database
switched to db database
mongos> for (var i = 0; i < 25000; i++) { db.col.insert(

{data: i, payload: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}

)}
WriteResult({ "nInserted" : 1 })

2) Set the VM's clock to the distant future
$ sudo date 012204052020
Wed Jan 22 4:05:00 EST 2020

3) Split the collection, expecting a failure due to clock skew
mongos> sh.splitFind( "database.col", { "data": "27544" } )
{ "ok" : 1 }
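
Before running step 3, it may help to confirm that the skew from step 2 is actually visible. The sketch below is a hypothetical helper, not part of MongoDB: it takes one clock reading per config server and returns the largest difference between any two of them. The readings shown are made-up values; in a real run they could come from the localTime field that isMaster returns on each config server. The comparison ignores network latency, so it is only useful for skew on the order of seconds or more.

```javascript
// Hypothetical helper (not a MongoDB API): largest difference, in
// milliseconds, between any two of the supplied clock readings.
function maxClockSkewMillis(localTimes) {
    if (localTimes.length === 0) {
        throw new Error("need at least one clock reading");
    }
    var millis = localTimes.map(function (t) { return t.getTime(); });
    return Math.max.apply(null, millis) - Math.min.apply(null, millis);
}

// Illustrative readings only: two config servers on the host and one in
// the VM whose clock was moved to 2020.
var readings = [
    new Date("2015-01-23T15:00:00Z"),
    new Date("2015-01-23T15:00:01Z"),
    new Date("2020-01-22T09:05:00Z")
];
var skew = maxClockSkewMillis(readings);
```

A skew of several years, as in this reproduction, should be far beyond whatever limit the distributed lock code tolerates.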
Participants:

 Description   

tryAcquire calls distLockPinger.got unless a distLockPinger thread already exists for the DistributedLock. distLockPinger.got checks whether the cluster is experiencing clock skew: if it is, creating the distLockPinger fails; otherwise it succeeds. The distLockPinger appears to be expected to live forever, based on the comment for killPinger, which states "For use in testing, ping thread should run indefinitely in practice." Once it is running, the pinger does not check for clock skew again.

As a result, commands like sh.splitFind run successfully even when the cluster is experiencing excessive clock skew.
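
The behavior described above can be illustrated with a toy model. This is only a sketch: PingerRegistry, isClockSkewed, and skewChecks are hypothetical names, the model keys pingers by lock name as a simplifying assumption, and none of it mirrors the actual C++ implementation. It shows only the control flow in question, where skew is checked when a pinger is created and never afterwards.

```javascript
// Toy model of the reported behavior. All names are hypothetical.
function PingerRegistry(isClockSkewed) {
    this.isClockSkewed = isClockSkewed; // callback: true when skew is excessive
    this.pingers = {};                  // lock name -> true once a pinger runs
    this.skewChecks = 0;                // number of skew checks performed
}

// Skew is checked only when no pinger exists yet for the lock.
PingerRegistry.prototype.tryAcquire = function (lockName) {
    if (!this.pingers[lockName]) {
        this.skewChecks++;
        if (this.isClockSkewed()) {
            return false; // pinger creation fails, so the acquire fails
        }
        this.pingers[lockName] = true; // the pinger now lives indefinitely
    }
    return true; // existing pinger: no further skew check
};

var skewed = false;
var registry = new PingerRegistry(function () { return skewed; });

var beforeSkew = registry.tryAcquire("database.col"); // pinger created, clocks agree
skewed = true;                                        // a clock jumps ahead
var afterSkew = registry.tryAcquire("database.col");  // no check, still succeeds

// A process with no pinger yet would have caught the skew.
var freshRegistry = new PingerRegistry(function () { return skewed; });
var freshAttempt = freshRegistry.tryAcquire("database.col");
```

In this model the second acquire succeeds with only one skew check ever performed, while a fresh registry rejects the same request, which matches the difference between an already-running mongos and a newly started one in the reproduction.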



 Comments   
Comment by Andy Schwerin [ 27/Oct/15 ]

This goes away with SERVER-1448 because the CSRS distributed lock protocol does not require the clocks to be well synchronized among all config replica set nodes.
