[SERVER-7044] 32-bit Windows 2.0 fails in Buildbot running test.exe -- QueryOptimizerTests::QueryPlanSetTests::Delete aborted after 30 minutes Created: 15/Sep/12  Updated: 10/Dec/14  Resolved: 26/Mar/13

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: 2.0.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tad Marshall Assignee: Tad Marshall
Resolution: Cannot Reproduce Votes: 0
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

32-bit Windows in Buildbot: "scons --release --release smoke smokeJs"


Operating System: Windows
Participants:

 Description   

The Buildbot run of test.exe in 32-bit Windows has been intermittently failing for at least a month and a half. We need to figure out why and make it pass every time.

The most recent failure:

http://buildbot.mongodb.org/builders/V2.0%20Windows%2032-bit/builds/351/steps/test/logs/stdio

c:\Python27\Scripts\..\python.exe buildscripts/smoke.py test
cwd [C:\10gen\buildslaves\mongo\Windows_32bit_V2.0\mongo]
can't get processlist: [Error 3] The system cannot find the path specified
num procs:1
Sat Sep 15 09:04:02 
Sat Sep 15 09:04:02 warning: 32-bit servers don't have journaling enabled by default. Please use --journal if you want durability.
Sat Sep 15 09:04:02 
Sat Sep 15 09:04:02 [initandlisten] MongoDB starting : pid=4212 port=32000 dbpath=/data/db/sconsTests 32-bit host=ip-0A420969
Sat Sep 15 09:04:02 [initandlisten] 
Sat Sep 15 09:04:02 [initandlisten] ** NOTE: when using MongoDB 32 bit, you are limited to about 2 gigabytes of data
Sat Sep 15 09:04:02 [initandlisten] **       see http://blog.mongodb.org/post/137788967/32-bit-limitations
Sat Sep 15 09:04:02 [initandlisten] **       with --journal, the limit is lower
Sat Sep 15 09:04:02 [initandlisten] 
Sat Sep 15 09:04:02 [initandlisten] db version v2.0.8-rc0-pre-, pdfile version 4.5
Sat Sep 15 09:04:02 [initandlisten] git version: 72cc3481b9e9ec791365b8ba6c183945e8093147
Sat Sep 15 09:04:02 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=0, build=6002, platform=2, service_pack='Service Pack 2') BOOST_LIB_VERSION=1_42
Sat Sep 15 09:04:02 [initandlisten] options: { dbpath: "/data/db/sconsTests/", port: 32000 }
Sat Sep 15 09:04:02 [initandlisten] waiting for connections on port 32000
Sat Sep 15 09:04:02 [websvr] admin web console waiting for connections on port 33000
Sat Sep 15 09:04:03 [initandlisten] connection accepted from 127.0.0.1:54580 #1
Sat Sep 15 09:04:03 [conn1] end connection 127.0.0.1:54580
Sat Sep 15 09:04:03 [testsuite] git version: 72cc3481b9e9ec791365b8ba6c183945e8093147
Sat Sep 15 09:04:03 [testsuite] build info: windows sys.getwindowsversion(major=6, minor=0, build=6002, platform=2, service_pack='Service Pack 2') BOOST_LIB_VERSION=1_42
Sat Sep 15 09:04:03 [testsuite] 32bit
Sat Sep 15 09:04:03 [testsuite] random seed: 1347714243
Sat Sep 15 09:04:03 [testsuite] ****************
Sat Sep 15 09:04:03 [testsuite] running with journaling enabled to test that. dbtests will do this occasionally even if --dur is not specified.
Sat Sep 15 09:04:03 [testsuite] ****************
Sat Sep 15 09:04:03 [testsuite] journal dir=/tmp/unittest/journal
Sat Sep 15 09:04:03 [testsuite] recover : no journal files present, no recovery needed
starting test : test.exe 
Sat Sep 15 09:04:03 [testsuite] going to run suite: background_job
Sat Sep 15 09:04:05 [testsuite] going to run suite: balancer_policy
Sat Sep 15 09:04:05 [testsuite] going to run suite: basic
warning: remove or comment out this line by starting it with '#', skipping now : 	    this = false  
Sat Sep 15 09:04:18 [testsuite] going to run suite: btree
Sat Sep 15 09:04:18 [testsuite] info DFM::findAll(): extent 0:3000 was empty, skipping ahead. ns:unittests.system.indexes
Sat Sep 15 09:04:18 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.btreetests top: { opid: 158, active: true, waitingForLock: false, secs_running: 0, op: "update", ns: "unittests.btreetests", query: { b: 1.0 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:04:18 [testsuite] going to run suite: btree1
Sat Sep 15 09:04:18 [testsuite] info DFM::findAll(): extent 0:3000 was empty, skipping ahead. ns:unittests.system.indexes
Sat Sep 15 09:04:18 [testsuite] info DFM::findAll(): extent 0:3000 was empty, skipping ahead. ns:unittests.system.indexes
Sat Sep 15 09:04:18 [testsuite] info DFM::findAll(): extent 0:3000 was empty, skipping ahead. ns:unittests.system.indexes
Sat Sep 15 09:04:18 [testsuite] info DFM::findAll(): extent 0:3000 was empty, skipping ahead. ns:unittests.system.indexes
Sat Sep 15 09:04:18 [testsuite] going to run suite: client
Sat Sep 15 09:04:18 [testsuite] going to run suite: commands
Sat Sep 15 09:04:18 [testsuite] going to run suite: cursor
Sat Sep 15 09:04:18 [testsuite] going to run suite: directclient
Sat Sep 15 09:04:18 [testsuite] going to run suite: histogram
Sat Sep 15 09:04:18 [testsuite] going to run suite: js
{ "a" : NumberLong("-4991432399172683043") }
BinData(0,"YXNkYXMAYXNkYXNk")
BinData(0,"YXNkYXMAYXNkYXNk")
Sat Sep 15 09:04:19 [testsuite] going to run suite: jsobj
Sat Sep 15 09:04:23 [testsuite] going to run suite: json
Sat Sep 15 09:04:23 [testsuite] going to run suite: matcher
normal: 1 all: 4
Sat Sep 15 09:04:23 [testsuite] going to run suite: mmap
Sat Sep 15 09:04:27 [testsuite] going to run suite: namespace
Sat Sep 15 09:04:27 [testsuite] couldn't make room for new record (len: 4628) in capped ns unittests.NamespaceDetailsTests
  Extent 0 (capExtent)
    magic: 41424344 extent->ns: unittests.NamespaceDetailsTests
    fr: null lr: null extent->len: 4096
  Extent 1
    magic: 41424344 extent->ns: unittests.NamespaceDetailsTests
    fr: null lr: null extent->len: 4096
insert: couldn't alloc space for object ns:unittests.NamespaceDetailsTests capped:1
Sat Sep 15 09:04:27 [testsuite] couldn't make room for new record (len: 4444) in capped ns unittests.NamespaceDetailsTests
  Extent 0
    magic: 41424344 extent->ns: unittests.NamespaceDetailsTests
    fr: null lr: null extent->len: 4096
  Extent 1 (capExtent)
    magic: 41424344 extent->ns: unittests.NamespaceDetailsTests
    fr: null lr: null extent->len: 4096
insert: couldn't alloc space for object ns:unittests.NamespaceDetailsTests capped:1
Sat Sep 15 09:04:27 [testsuite] going to run suite: pdfile
Sat Sep 15 09:04:27 [testsuite] dropDatabase unittest_ex
Sat Sep 15 09:04:29 [testsuite] removeJournalFiles
Sat Sep 15 09:04:29 [testsuite] dropDatabase unittest_ex
Sat Sep 15 09:04:32 [testsuite] FlushViewOfFile for /tmp/unittest/unittest_ex.3 succeeded after 6 attempts taking 3312 ms
Sat Sep 15 09:04:32 [testsuite] removeJournalFiles
Sat Sep 15 09:04:32 [testsuite] going to run suite: perf
stats test                              rps------  time-- cmts  jrnMB	wrDFMB	cIWLk	early	prpLgB  wrToJ	wrToDF	rmpPrVw
stats dummy                              75772583  3000ms 
stats                          new/old:      1.01         git: "72cc3481b9e9ec791365b8ba6c183945e8093147"
stats checksum                                 31  2006ms 
stats                          new/old:      1.03
compress round trip 414.94MB/sec
stats compress                                  4  4246ms 
stats                          new/old:      1.33
test  compress seems slow n:17 ops/sec but expect greater than:1000000
stats thread-local-storage               36621950  3000ms 
stats                          new/old:      1.00
stats malloc                              9423650  4000ms 
stats                          new/old:      1.00
stats Timer                                996000  1000ms 
stats                          new/old:      1.01
stats rlock                              15462800   500ms 
stats                          new/old:      1.00
stats wlock                              15902100   500ms 
stats                          new/old:      1.00
stats mutex                              17497800   500ms 
stats                          new/old:      1.01
stats simplemutex                        24223600   500ms 
stats                          new/old:      1.00
stats spinlock                           24299400   500ms 
stats                          new/old:      1.01
stats curTimeMillis64                     1967600   500ms 
stats                          new/old:      1.01
      avg timer granularity: 1.00ms 
Sat Sep 15 09:05:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
stats Key-woequal                        17463983  3000ms 
stats                          new/old:      1.00
stats BufBuilder                          9509016  3000ms 
stats                          new/old:      0.98
stats StackBufBuilder                    35009316  3000ms 
stats                          new/old:      1.02
stats BSONIter                            4450900  3000ms 
stats                          new/old:      1.00
stats BSONGetFields1By1                   2058383  3000ms 
stats                          new/old:      1.00
stats BSONGetFields                       3559200  3000ms 
stats                          new/old:      1.01
stats insert-duplicate-_ids                 13427  5001ms 47	0.39	0.03	0	0	12	156	1	1
stats                          new/old:      1.00
stats insert-simple                         39820  5000ms 181	7.11	16.68	136	136	427	774	710	19
stats                          new/old:      1.03
stats findOne_by_id                         97190  5001ms 49	0.04	0.09	0	0	1	4	3	0
stats                          new/old:      1.02
stats random-inserts                        20170  5000ms 118	36.64	72.38	80	80	373	1743	562	30
stats                          new/old:      1.03
stats random-inserts-with-more-indexes      12512  5019ms 108	41.19	82.56	72	72	347	1958	479	31
stats                          new/old:      1.02
stats random-upserts                        12540  5000ms 49	7.73	15.49	7	7	39	437	243	7
stats                          new/old:      1.04
stats random-upserts-inc                    15436  5001ms 45	0.84	0.48	0	0	3	162	218	3
stats                          new/old:      1.02
stats random-upserts-with-more-indexes       8596  5002ms 58	16.20	36.33	19	19	65	801	190	12
stats                          new/old:      1.02
stats random-upserts-with-more-indexes-inc      8798  5001ms 39	16.64	39.83	0	0	7	834	133	3
stats                          new/old:      1.00
stats insert-big                              540  1296ms 7	19.19	132.13	3	3	123	807	80	13
stats                          new/old:      1.03
Sat Sep 15 09:06:04 [testsuite] going to run suite: query
Sat Sep 15 09:06:05 [testsuite] getMore: cursorid not found unittests.querytests.TailableDelete 82711574821390
Sat Sep 15 09:06:05 [testsuite] assertion 13051 tailable cursor requested on non capped collection ns:unittests.querytests.TailCappedOnly query:{}
Sat Sep 15 09:06:05 [testsuite] info DFM::findAll(): extent 0:3000 was empty, skipping ahead. ns:unittests.system.indexes
Sat Sep 15 09:06:05 [testsuite] dropDatabase unittests
Sat Sep 15 09:06:08 [testsuite] FlushViewOfFile for /tmp/unittest/perftest.2 succeeded after 2 attempts taking 1723 ms
Sat Sep 15 09:06:12 [testsuite] removeJournalFiles
Sat Sep 15 09:06:12 [testsuite] dropDatabase unittests
Sat Sep 15 09:06:12 [testsuite] removeJournalFiles
Sat Sep 15 09:06:12 [testsuite] getMore: cursorid not found unittests.querytests.tailablecappedrace 91542027588934
HelperTest  slow:265000 fast:19000
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161483, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 98 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161532, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 196 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161583, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 298 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161635, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 402 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161686, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 504 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161735, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 602 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161786, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 704 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161837, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 806 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: unittests.querytests.helpertestbyid top: { opid: 1161888, active: true, waitingForLock: false, secs_running: 0, op: "remove", ns: "unittests.querytests.helpertestbyid", query: { _id: 908 }, client: "0.0.0.0:0", desc: "testsuite", numYields: 0 }
Sat Sep 15 09:06:13 [testsuite] going to run suite: queryoptimizer
Sat Sep 15 09:06:13 [testsuite] querymissingns starts
Sat Sep 15 09:06:13 [testsuite] end QueryMissingNs
Sat Sep 15 09:08:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 1 minutes.
Sat Sep 15 09:09:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 2 minutes.
Sat Sep 15 09:10:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
Sat Sep 15 09:10:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 3 minutes.
Sat Sep 15 09:11:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 4 minutes.
Sat Sep 15 09:12:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 5 minutes.
Sat Sep 15 09:13:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 6 minutes.
Sat Sep 15 09:14:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 7 minutes.
Sat Sep 15 09:15:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
Sat Sep 15 09:15:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 8 minutes.
Sat Sep 15 09:16:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 9 minutes.
Sat Sep 15 09:16:14 [testsuite] couldn't get lazy rwlock
Sat Sep 15 09:17:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 10 minutes.
Sat Sep 15 09:18:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 11 minutes.
Sat Sep 15 09:19:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 12 minutes.
Sat Sep 15 09:20:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
Sat Sep 15 09:20:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 13 minutes.
Sat Sep 15 09:21:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 14 minutes.
Sat Sep 15 09:22:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 15 minutes.
Sat Sep 15 09:23:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 16 minutes.
Sat Sep 15 09:24:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 17 minutes.
Sat Sep 15 09:25:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
Sat Sep 15 09:25:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 18 minutes.
Sat Sep 15 09:26:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 19 minutes.
Sat Sep 15 09:27:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 20 minutes.
Sat Sep 15 09:28:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 21 minutes.
Sat Sep 15 09:29:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 22 minutes.
Sat Sep 15 09:30:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
Sat Sep 15 09:30:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 23 minutes.
Sat Sep 15 09:31:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 24 minutes.
Sat Sep 15 09:32:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 25 minutes.
Sat Sep 15 09:33:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 26 minutes.
Sat Sep 15 09:34:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 27 minutes.
Sat Sep 15 09:35:02 [clientcursormon] mem (MB) res:15 virt:61 mapped:0
Sat Sep 15 09:35:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 28 minutes.
Sat Sep 15 09:36:03 [TestWatchDog] warning: class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 29 minutes.
Sat Sep 15 09:37:03 [TestWatchDog] class QueryOptimizerTests::QueryPlanSetTests::Delete has been running for more than 30 minutes. aborting.
['test']
running C:\10gen\buildslaves\mongo\Windows_32bit_V2.0\mongo\mongod.exe --port 32000 --dbpath /data/db/sconsTests/
 *******************************************
         Test : test.exe ...
['C:\\10gen\\buildslaves\\mongo\\Windows_32bit_V2.0\\mongo\\test.exe']
                1984092.99994ms
test C:\10gen\buildslaves\mongo\Windows_32bit_V2.0\mongo\test.exe exited with status 3
0 tests succeeded
The following tests failed (with exit code):
C:\10gen\buildslaves\mongo\Windows_32bit_V2.0\mongo\test.exe	3
scons: building terminated because of errors.
Traceback (most recent call last):
  File "buildscripts/smoke.py", line 521, in <module>
    main()
  File "buildscripts/smoke.py", line 517, in main
    report()
  File "buildscripts/smoke.py", line 374, in report
    raise Exception("Test failures")
Exception: Test failures
scons: *** [smoke] Error 1
program finished with exit code 2
elapsedTime=2573.437000

The failure in run #304 on July 27, 2012 is the same thing:

http://buildbot.mongodb.org/builders/V2.0%20Windows%2032-bit/builds/304/steps/test/logs/stdio

Clue: in the logs that I've looked at, we have "randomly" enabled journaling in the failure cases, but not in the cases where the test passes.

Sat Sep 15 09:04:03 [testsuite] random seed: 1347714243
Sat Sep 15 09:04:03 [testsuite] ****************
Sat Sep 15 09:04:03 [testsuite] running with journaling enabled to test that. dbtests will do this occasionally even if --dur is not specified.
Sat Sep 15 09:04:03 [testsuite] ****************


Generated at Thu Feb 08 03:13:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.