[SERVER-10093] initial_sync4.js failing on Windows 32-bit nightly Created: 03/Jul/13  Updated: 10/Dec/14  Resolved: 21/Aug/13

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Matt Kangas Assignee: Matt Dannenberg
Resolution: Done Votes: 0
Labels: buildbot
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

buildbot: Windows 32-bit


Operating System: Windows
Participants:

 Description   

Failing differently than SERVER-9419?

Nightly Windows 32-bit Build #1320 on Wed Jul 3

It appears that something similar has been occurring since Build 1312 on Sun Jun 30 (rev 665e6ec21) but not before Build 1308 on Wed Jun 26 (rev 3d6e8936938e)

6. Start updating documents on primary
2013-07-03 11:50:58 EDT	
 m31000| Wed Jul 03 15:50:46.984 [conn5] end connection 10.151.17.233:51215 (4 connections now open)
 m31002| Wed Jul 03 15:50:46.984 [rsSync] oplog sync 3 of 3
 m31000| Wed Jul 03 15:50:47.280 [initandlisten] connection accepted from 10.151.17.233:51221 #7 (5 connections now open)
 m31000| Wed Jul 03 15:50:47.733 [conn1] update d.c query: { _id: 48492.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:203732 203ms
 m31000| Wed Jul 03 15:50:47.967 [conn7] command admin.$cmd command: { replSetUpdatePosition: 1, handshake: { handshake: ObjectId('51d448522784a9e70e6b06b7'), member: 2, config: { _id: 2, host: "EC2AMAZ-EXWX6MN:31002" } } } ntoreturn:1 keyUpdates:0  reslen:37 233ms
 m31000| Wed Jul 03 15:50:47.967 [conn7] end connection 10.151.17.233:51221 (4 connections now open)
 m31000| Wed Jul 03 15:50:48.169 [conn1] update d.c query: { _id: 47989.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:203201 203ms
 m31000| Wed Jul 03 15:50:48.388 [initandlisten] connection accepted from 10.151.17.233:51222 #8 (5 connections now open)
 m31000| Wed Jul 03 15:50:48.591 [conn1] update d.c query: { _id: 47518.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:406588 203ms
 m31000| Wed Jul 03 15:50:49.012 [conn1] update d.c query: { _id: 47047.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:203186 203ms
 m31000| Wed Jul 03 15:50:49.402 [slaveTracking] build index local.slaves { _id: 1 }
 m31000| Wed Jul 03 15:50:49.402 [slaveTracking] build index done.  scanned 0 total records. 0 secs
 m31000| Wed Jul 03 15:50:49.854 [conn1] update d.c query: { _id: 46109.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:203266 203ms
 m31000| Wed Jul 03 15:50:50.275 [conn1] update d.c query: { _id: 45642.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:203320 203ms
 m31000| Wed Jul 03 15:50:50.712 [conn1] update d.c query: { _id: 45141.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:203354 203ms
 m31000| Wed Jul 03 15:50:51.134 [conn1] update d.c query: { _id: 44679.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:406513 203ms
 m31000| Wed Jul 03 15:50:51.555 [conn1] update d.c query: { _id: 44220.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:203298 203ms
 m31000| Wed Jul 03 15:50:52.397 [conn1] update d.c query: { _id: 43293.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:155 203ms
 m31000| Wed Jul 03 15:50:52.818 [conn1] update d.c query: { _id: 42829.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:203346 203ms
 m31000| Wed Jul 03 15:50:54.098 [conn1] update d.c query: { _id: 41411.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:203364 203ms
 m31000| Wed Jul 03 15:50:54.347 [conn3] end connection 10.151.17.233:51196 (4 connections now open)
 m31000| Wed Jul 03 15:50:54.519 [conn1] update d.c query: { _id: 40946.0 } update: { $set: { a: 1.0 } } idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:203298 203ms
 m31000| Wed Jul 03 15:50:54.737 [initandlisten] connection accepted from 10.151.17.233:51223 #9 (5 connections now open)
 m31000| Wed Jul 03 15:50:54.940 [conn1] update d.c query: { _id: 40479.0 } update: { $set: { a.b: 1.0 } } idhack:1 nmoved:1 nupdated:1 keyUpdates:0 locks(micros) w:203591 203ms
 m31000| Wed Jul 03 15:50:55.377 [conn1] insert d.c ninserted:1 keyUpdates:0 locks(micros) w:203145 203ms
 m31000| Wed Jul 03 15:50:55.829 [conn1] insert d.c ninserted:1 keyUpdates:0 locks(micros) w:405998 203ms
 m31000| Wed Jul 03 15:50:56.266 [conn1] insert d.c ninserted:1 keyUpdates:0 locks(micros) w:406482 203ms
 m31000| Wed Jul 03 15:50:57.140 [conn1] insert d.c ninserted:1 keyUpdates:0 locks(micros) w:203336 203ms
2013-07-03 11:50:59 EDT	
 m31000| Wed Jul 03 15:50:58.481 [conn1] insert d.c ninserted:1 keyUpdates:0 locks(micros) w:406111 203ms
 m31000| Wed Jul 03 15:50:58.934 [conn1] insert d.c ninserted:1 keyUpdates:0 locks(micros) w:406407 203ms
 m31000| Wed Jul 03 15:50:59.776 [rsHealthPoll] DBClientCursor::init call() failed
 m31000| Wed Jul 03 15:50:59.776 [rsHealthPoll] replSet info EC2AMAZ-EXWX6MN:31002 is down (or slow to respond): 
 m31000| Wed Jul 03 15:50:59.776 [rsHealthPoll] replSet member EC2AMAZ-EXWX6MN:31002 is now in state DOWN
 m31000| Wed Jul 03 15:50:59.776 [rsMgr] can't see a majority of the set, relinquishing primary
 m31000| Wed Jul 03 15:50:59.807 [rsMgr] replSet relinquishing primary state
 m31000| Wed Jul 03 15:50:59.807 [rsMgr] replSet SECONDARY
 m31000| Wed Jul 03 15:50:59.807 [rsMgr] replSet closing client sockets after relinquishing primary
 m31000| Wed Jul 03 15:50:59.807 [conn4] end connection 10.151.17.233:51214 (4 connections now open)
 m31000| Wed Jul 03 15:50:59.823 [conn8] end connection 10.151.17.233:51222 (4 connections now open)
 m31002| Wed Jul 03 15:50:59.823 [rsBackgroundSync] DBClientCursor::init call() failed
 m31002| Wed Jul 03 15:50:59.823 [conn2] command admin.$cmd command: { replSetHeartbeat: "jstests_initsync4", v: 2, pv: 1, checkEmpty: false, from: "EC2AMAZ-EXWX6MN:31000" } ntoreturn:1 keyUpdates:0  reslen:203 10580ms
 m31000| Wed Jul 03 15:50:59.823 [conn1] end connection 127.0.0.1:51180 (4 connections now open)
Wed Jul 03 15:50:59.823 Socket say send() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:31000
Wed Jul 03 15:50:59.823 Error: socket exception [SEND_ERROR] for 127.0.0.1:31000 at src/mongo/shell/collection.js:203
failed to load: D:\slave\Windows_32bit_Nightly\mongo\jstests\replsets\initial_sync4.js



 Comments   
Comment by Matt Kangas [ 21/Aug/13 ]

This appears to have gone away (or been resolved), but was replaced by sync_passive2.js failures. SERVER-10341

Comment by Matt Kangas [ 06/Aug/13 ]

Again:

Windows 32-bit Build #6078 (Aug 6)
http://buildlogs.mongodb.org/Windows%2032-bit/builds/6078/test/recent%20failures/initial_sync4.js

assert failed : tried 200 times, giving up on undefined
Error: Printing Stack Trace
    at printStackTrace (src/mongo/shell/utils.js:37:15)
    at doassert (src/mongo/shell/assert.js:6:5)
    at assert (src/mongo/shell/assert.js:14:5)
    at wait (jstests/replsets/rslib.js:14:9)
    at D:\slave\Windows_32bit\mongo\jstests\replsets\initial_sync4.js:65:1
Tue Aug 06 14:47:24.148 assert failed : tried 200 times, giving up on undefined at src/mongo/shell/assert.js:7

Comment by Matt Kangas [ 03/Aug/13 ]

Either initial_sync4.js or initial_sync2.js has been failing on the Windows 32-bit builder for a long time now.

The last green build on this host (#5992) occurred on Jul 18 19:26, rev 83d3c8966697. It's been broken for more than two weeks.

Latest failure:

Windows 32-bit Build #6063 Aug 2
http://buildbot.mongodb.org/builders/Windows%2032-bit/builds/6063/steps/test_9/logs/stdio

assert failed : tried 200 times, giving up on undefined
Error: Printing Stack Trace
    at printStackTrace (src/mongo/shell/utils.js:37:15)
    at doassert (src/mongo/shell/assert.js:6:5)
    at assert (src/mongo/shell/assert.js:14:5)
    at wait (jstests/replsets/rslib.js:14:9)
    at D:\slave\Windows_32bit\mongo\jstests\replsets\initial_sync4.js:65:1
Sat Aug 03 03:49:25.764 assert failed : tried 200 times, giving up on undefined at src/mongo/shell/assert.js:7
failed to load: D:\slave\Windows_32bit\mongo\jstests\replsets\initial_sync4.js

Comment by Tad Marshall [ 28/Jul/13 ]

Test has been failing consistently since commits 27c4e7fbd2ef6eeb04dccd1bcdecdb21b00522d1, 96c541693cc5b6954a5286788f7ef88f5c680bc6 and 90526f8ca838873e40cda563e77c9f4ba71936e5 .
Last time this test passed on Windows 32-bit was the previous commit, 9bf70757db09b3a7166440c508bf798d50bd212a .

Comment by Eric Milkie [ 15/Jul/13 ]

Resolved by 6486b4035c5ac52679eb3e1a034c925ccdd20deb

Comment by Eric Milkie [ 12/Jul/13 ]

I think this is due to the extreme IO slowness on this particular builder. We are going to try to move to another Amazon instance.

Generated at Thu Feb 08 03:22:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.