[SERVER-13289] Concurrent background index builds cause index corruption. Created: 19/Mar/14  Updated: 10/Dec/14  Resolved: 20/Mar/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.4.6
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Sherry Ger
Assignee: Unassigned
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongo.log    
Issue Links:
Related
Operating System: ALL
Steps To Reproduce:

1. Create a large dataset.
2. Bring up multiple shells.
3. In each shell, start an index build with background:1. For example:

db.myCollection.ensureIndex({a:1, d:-1}, {background:1}) // in window 1
db.myCollection.ensureIndex({e:1, c:-1}, {background:1}) // in window 2
db.myCollection.ensureIndex({a:-1, b:1}, {background:1}) // in window 3

4. The index creation in window 3 resulted in the following error.

> db.myCollection.ensureIndex({a:-1, b:1}, {background:1})
{
	"err" : "missing Extra",
	"code" : 14045,
	"n" : 0,
	"connectionId" : 2,
	"ok" : 1
}

5. A subsequent validate returned the following.

> db.myCollection.validate(true)
{
	"ns" : "test.myCollection",
	"firstExtent" : "5:191ea000 ns:test.myCollection",
	"lastExtent" : "4:13c94000 ns:test.myCollection",
	"extentCount" : 18,
	"extents" : [
		{
			"loc" : "5:191ea000",
			"xnext" : "0:f000",
			"xprev" : "null",
			"nsdiag" : "test.myCollection",
			"size" : 8192,
			"firstRecord" : "5:191ea0b0",
			"lastRecord" : "5:191ebf70"
		},
		{
			"loc" : "0:f000",
			"xnext" : "0:17000",
			"xprev" : "5:191ea000",
			"nsdiag" : "test.myCollection",
			"size" : 32768,
			"firstRecord" : "0:f0b0",
			"lastRecord" : "0:16f70"
		},
		{
			"loc" : "0:17000",
			"xnext" : "0:5b000",
			"xprev" : "0:f000",
			"nsdiag" : "test.myCollection",
			"size" : 131072,
			"firstRecord" : "0:170b0",
			"lastRecord" : "0:36f70"
		},
		{
			"loc" : "0:5b000",
			"xnext" : "0:16b000",
			"xprev" : "0:17000",
			"nsdiag" : "test.myCollection",
			"size" : 524288,
			"firstRecord" : "0:5b0b0",
			"lastRecord" : "0:daf70"
		},
		{
			"loc" : "0:16b000",
			"xnext" : "0:5ab000",
			"xprev" : "0:5b000",
			"nsdiag" : "test.myCollection",
			"size" : 2097152,
			"firstRecord" : "0:16b0b0",
			"lastRecord" : "0:36af70"
		},
		{
			"loc" : "0:5ab000",
			"xnext" : "0:16ab000",
			"xprev" : "0:16b000",
			"nsdiag" : "test.myCollection",
			"size" : 8388608,
			"firstRecord" : "0:5ab0b0",
			"lastRecord" : "0:daaf70"
		},
		{
			"loc" : "0:16ab000",
			"xnext" : "0:2178000",
			"xprev" : "0:5ab000",
			"nsdiag" : "test.myCollection",
			"size" : 11325440,
			"firstRecord" : "0:16ab0b0",
			"lastRecord" : "0:2177f70"
		},
		{
			"loc" : "0:2178000",
			"xnext" : "1:2000",
			"xprev" : "0:16ab000",
			"nsdiag" : "test.myCollection",
			"size" : 15290368,
			"firstRecord" : "0:21780b0",
			"lastRecord" : "0:300cf50"
		},
		{
			"loc" : "1:2000",
			"xnext" : "1:13b2000",
			"xprev" : "0:2178000",
			"nsdiag" : "test.myCollection",
			"size" : 20643840,
			"firstRecord" : "1:20b0",
			"lastRecord" : "1:13b1f90"
		},
		{
			"loc" : "1:13b2000",
			"xnext" : "1:3eae000",
			"xprev" : "1:2000",
			"nsdiag" : "test.myCollection",
			"size" : 27869184,
			"firstRecord" : "1:13b20b0",
			"lastRecord" : "1:2e45f90"
		},
		{
			"loc" : "1:3eae000",
			"xnext" : "2:2000",
			"xprev" : "1:13b2000",
			"nsdiag" : "test.myCollection",
			"size" : 37625856,
			"firstRecord" : "1:3eae0b0",
			"lastRecord" : "1:628ff90"
		},
		{
			"loc" : "2:2000",
			"xnext" : "2:469a000",
			"xprev" : "1:3eae000",
			"nsdiag" : "test.myCollection",
			"size" : 50798592,
			"firstRecord" : "2:20b0",
			"lastRecord" : "2:3073f90"
		},
		{
			"loc" : "2:469a000",
			"xnext" : "2:a5e8000",
			"xprev" : "2:2000",
			"nsdiag" : "test.myCollection",
			"size" : 68579328,
			"firstRecord" : "2:469a0b0",
			"lastRecord" : "2:8800f90"
		},
		{
			"loc" : "2:a5e8000",
			"xnext" : "3:2861000",
			"xprev" : "2:469a000",
			"nsdiag" : "test.myCollection",
			"size" : 92581888,
			"firstRecord" : "2:a5e80b0",
			"lastRecord" : "2:fe32f50"
		},
		{
			"loc" : "3:2861000",
			"xnext" : "3:d614000",
			"xprev" : "2:a5e8000",
			"nsdiag" : "test.myCollection",
			"size" : 124985344,
			"firstRecord" : "3:28610b0",
			"lastRecord" : "3:9f92f50"
		},
		{
			"loc" : "3:d614000",
			"xnext" : "4:2000",
			"xprev" : "3:2861000",
			"nsdiag" : "test.myCollection",
			"size" : 168730624,
			"firstRecord" : "3:d6140b0",
			"lastRecord" : "3:176fdf50"
		},
		{
			"loc" : "4:2000",
			"xnext" : "4:13c94000",
			"xprev" : "3:d614000",
			"nsdiag" : "test.myCollection",
			"size" : 227786752,
			"firstRecord" : "4:20b0",
			"lastRecord" : "4:d93df50"
		},
		{
			"loc" : "4:13c94000",
			"xnext" : "null",
			"xprev" : "4:2000",
			"nsdiag" : "test.myCollection",
			"size" : 307515392,
			"firstRecord" : "4:13c940b0",
			"lastRecord" : "4:19e6df30"
		}
	],
	"datasize" : 800000080,
	"nrecords" : NumberLong("4379697861059674116"),
	"lastExtentSize" : 0,
	"padding" : 1,
	"firstExtentDetails" : {
		"loc" : "5:191ea000",
		"xnext" : "0:f000",
		"xprev" : "null",
		"nsdiag" : "test.myCollection",
		"size" : 8192,
		"firstRecord" : "5:191ea0b0",
		"lastRecord" : "5:191ebf70"
	},
	"lastExtentDetails" : {
		"loc" : "4:13c94000",
		"xnext" : "null",
		"xprev" : "4:2000",
		"nsdiag" : "test.myCollection",
		"size" : 307515392,
		"firstRecord" : "4:13c940b0",
		"lastRecord" : "4:19e6df30"
	},
	"objectsFound" : 10000000,
	"invalidObjects" : 0,
	"bytesWithHeaders" : 960000080,
	"bytesWithoutHeaders" : 800000080,
	"deletedCount" : 13,
	"deletedSize" : 204911440,
	"nIndexes" : 16896,
	"valid" : false,
	"errors" : [
		"exception during index validate idxn 4"
	],
	"advice" : "ns corrupt, requires repair",
	"ok" : 1
}
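The validate output above is long, and the signs of corruption are spread across it: "valid" is false, "errors" names index 4, and "nIndexes" is 16896, far above MongoDB's limit of 64 indexes per collection. A minimal sketch of a helper that collects these signals from a validate result follows. The function name summarizeValidate and the constant MAX_INDEXES are hypothetical and are not part of the shell. The nrecords field is left out because the shell returns it as a NumberLong here.

```javascript
// Hypothetical helper, not a shell built-in: lists suspicious fields
// found in the document returned by db.collection.validate(true).
// Assumption: 64 is the per-collection index limit.
var MAX_INDEXES = 64;

function summarizeValidate(result) {
  var problems = [];
  // validate sets valid:false when any check fails.
  if (result.valid === false) {
    problems.push("valid is false");
  }
  // Copy each reported error string.
  (result.errors || []).forEach(function (e) {
    problems.push("error: " + e);
  });
  // An index count above the limit suggests damaged collection metadata.
  if (typeof result.nIndexes === "number" && result.nIndexes > MAX_INDEXES) {
    problems.push("nIndexes " + result.nIndexes +
                  " exceeds limit of " + MAX_INDEXES);
  }
  return problems;
}
```

In a shell session this could be called as summarizeValidate(db.myCollection.validate(true)); an empty array means none of these three checks fired, not that the collection is known to be healthy.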

The log file is attached to the ticket. Note that there were also multiple attempts to drop another index on the same collection; these did not appear to cause any issues. The error in step 4 appeared only after the third index creation was started.


 Description   

Running multiple concurrent ensureIndex calls with the background option set to true caused index corruption.
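
The ticket does not say how the large dataset in step 1 was created. A minimal sketch of one way to populate a collection with the fields used by the three index specs (a, b, c, d, e) follows. The function names makeDoc and populate and the field values are assumptions made for illustration; only the field names come from the ticket.

```javascript
// Hypothetical document shape: the ticket gives only the field names
// (a, b, c, d, e); the values below are arbitrary.
function makeDoc(i) {
  return { a: i % 1000, b: i % 100, c: i % 10, d: i, e: i * 2 };
}

// Insert n documents into a collection handle such as db.myCollection.
// Uses only the shell's collection.insert(doc) call.
function populate(coll, n) {
  for (var i = 0; i < n; i++) {
    coll.insert(makeDoc(i));
  }
}
```

In the shell this could be run as populate(db.myCollection, 10000000), matching the objectsFound count in the validate output, before starting the background index builds in separate shells.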



 Comments   
Comment by Eliot Horowitz (Inactive) [ 20/Mar/14 ]

Manifestation of SERVER-12990
