  1. WiredTiger
  2. WT-12225

Fix RNG weakness around MongoDB $sample stage

    • Storage Engines
    • 8
    • 2024-01-09 - I Grew Tired, StorEng - 2024-01-23, 2024-02-06 tapioooooooooooooca, 2024-02-20_A_near-death_puffin, 2024-03-05 - Claronald
    • v7.3, v7.2, v7.0, v6.0, v5.0

      The change in WT-11532 has caused some fallout in MongoDB, tracked in BF-30947 and BF-30957. The failures in both BFs show the randomness failing, mainly on Windows machines.

      [js_test:read_and_write_distribution] uncaught exception: Error: command failed: {
      [js_test:read_and_write_distribution] 	"ok" : 0,
      [js_test:read_and_write_distribution] 	"errmsg" : "Failed to find split points that partition the data into 10 chunks with roughly equal number of documents using the shard key being analyzed :: caused by :: Error on remote shard EC2AMAZ-TQG8U1N:20043 :: caused by :: Executor error during getMore :: caused by :: $sample stage could not find a non-duplicate document after 100 while using a random cursor. This is likely a sporadic failure, please try again.",
      [js_test:read_and_write_distribution] 	"code" : 28799,
      [js_test:read_and_write_distribution] 	"codeName" : "Location28799",
      [js_test:read_and_write_distribution] 	"$clusterTime" : {
      [js_test:read_and_write_distribution] 		"clusterTime" : Timestamp(1700789508, 108),
      [js_test:read_and_write_distribution] 		"signature" : {
      [js_test:read_and_write_distribution] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      [js_test:read_and_write_distribution] 			"keyId" : NumberLong(0)
      [js_test:read_and_write_distribution] 		}
      [js_test:read_and_write_distribution] 	},
      [js_test:read_and_write_distribution] 	"operationTime" : Timestamp(1700789508, 108)
      [js_test:read_and_write_distribution] } with original command request: {
      [js_test:read_and_write_distribution] 	"analyzeShardKey" : "testDb.sampledCollSharded",
      [js_test:read_and_write_distribution] 	"key" : {
      [js_test:read_and_write_distribution] 		"x" : "hashed"
      [js_test:read_and_write_distribution] 	},
      [js_test:read_and_write_distribution] 	"lsid" : {
      [js_test:read_and_write_distribution] 		"id" : UUID("edded01f-9760-42ad-967f-567f31011449")
      [js_test:read_and_write_distribution] 	},
      [js_test:read_and_write_distribution] 	"$clusterTime" : {
      [js_test:read_and_write_distribution] 		"clusterTime" : Timestamp(1700789508, 29),
      [js_test:read_and_write_distribution] 		"signature" : {
      [js_test:read_and_write_distribution] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      [js_test:read_and_write_distribution] 			"keyId" : NumberLong(0)
      [js_test:read_and_write_distribution] 		}
      [js_test:read_and_write_distribution] 	}
      [js_test:read_and_write_distribution] } on connection: connection to EC2AMAZ-TQG8U1N:20049 :
      [js_test:read_and_write_distribution] _getErrorWithCode@src/mongo/shell/utils.js:24:13
      [js_test:read_and_write_distribution] doassert@src/mongo/shell/assert.js:18:14
      [js_test:read_and_write_distribution] _assertCommandWorked@src/mongo/shell/assert.js:748:25
      [js_test:read_and_write_distribution] assert.commandWorked@src/mongo/shell/assert.js:842:16
      [js_test:read_and_write_distribution] waitForSampledQueries/<@jstests\sharding\analyze_shard_key\read_and_write_distribution.js:420:22
      [js_test:read_and_write_distribution] assert.soon@src/mongo/shell/assert.js:364:21
      [js_test:read_and_write_distribution] waitForSampledQueries@jstests\sharding\analyze_shard_key\read_and_write_distribution.js:417:12
      [js_test:read_and_write_distribution] runTest@jstests\sharding\analyze_shard_key\read_and_write_distribution.js:510:11
      [js_test:read_and_write_distribution] @jstests\sharding\analyze_shard_key\read_and_write_distribution.js:608:12
      

      The aim of this ticket is to investigate why the fallout is happening within WiredTiger and to find a solution once the problem has been root-caused.
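
      For readers unfamiliar with the failure mode, the sketch below simulates it in Python. It is not the server's or WiredTiger's code. The function names, the retry limit constant (taken from the "after 100" in the error message), and the deliberately weak generator are all assumptions made for illustration. It shows how a sampler that retries on duplicates gives up once the underlying random source keeps returning positions it has already seen.

```python
import random

# Assumed retry limit, mirroring the "after 100" in the error message above.
MAX_DUPLICATE_ATTEMPTS = 100


def sample_without_duplicates(keys, n, next_index):
    """Draw n distinct keys, retrying whenever a draw repeats a seen key.

    next_index is a hypothetical stand-in for a random cursor: given the
    collection size, it returns an index into keys. Raises RuntimeError
    after MAX_DUPLICATE_ATTEMPTS consecutive duplicate draws.
    """
    seen = set()
    picked = []
    consecutive_duplicates = 0
    while len(picked) < n:
        key = keys[next_index(len(keys))]
        if key in seen:
            consecutive_duplicates += 1
            if consecutive_duplicates >= MAX_DUPLICATE_ATTEMPTS:
                raise RuntimeError(
                    "could not find a non-duplicate document after "
                    f"{MAX_DUPLICATE_ATTEMPTS} attempts"
                )
            continue
        consecutive_duplicates = 0
        seen.add(key)
        picked.append(key)
    return picked


def healthy_rng(seed):
    """Index source backed by Python's standard generator."""
    rng = random.Random(seed)
    return lambda size: rng.randrange(size)


def weak_rng(period):
    """Deliberately degenerate index source that cycles through only
    `period` distinct values (an illustrative assumption, not the actual
    weakness investigated in this ticket)."""
    state = {"i": 0}

    def next_index(size):
        state["i"] = (state["i"] + 1) % period
        return state["i"] % size

    return next_index
```

      With `healthy_rng`, drawing 10 keys from 10,000 succeeds almost immediately. With `weak_rng(4)`, only four distinct positions are ever returned, so a request for 10 keys runs out of fresh documents and raises after 100 consecutive duplicates, which is the shape of the error in the log above.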

        1. test_table99.py
          3 kB
        2. test_random.c
          2 kB

            Assignee:
            jie.chen@mongodb.com Jie Chen
            Reporter:
            jie.chen@mongodb.com Jie Chen
