[SERVER-19310] indexbg_interrupts.js Created: 01/Dec/14  Updated: 05/Feb/16  Resolved: 07/Jul/15

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 3.1.6

Type: Bug Priority: Major - P3
Reporter: Ian Whalen (Inactive) Assignee: Matt Dannenberg
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: RPL 6 07/17/15
Participants:

 Description   

LOGS

TASK HISTORY

 m31001| 2014-11-30T07:24:38.569+0000 I REPL     [ReplicationExecutor] syncing from: WIN-LPG54MQP00S:31000
 m31000| 2014-11-30T07:24:38.569+0000 I NETWORK  [initandlisten] connection accepted from 10.169.136.164:59604 #9 (5 connections now open)
ReplSetTest awaitReplication: checking secondaries against timestamp Timestamp(1417332276, 1)
ReplSetTest awaitReplication: checking secondary #1: WIN-LPG54MQP00S:31001
ReplSetTest awaitReplication: timestamp for secondary #1, WIN-LPG54MQP00S:31001, is Timestamp(1417332220, 1) but latest is Timestamp(1417332276, 1)
2014-11-30 02:25:17 EST	
 m31001| 2014-11-30T07:25:17.988+0000 I QUERY    [conn1] command local.$cmd command: count { count: "oplog.rs", query: {}, fields: {} } planSummary: COUNT keyUpdates:0  reslen:44 39415ms
ReplSetTest awaitReplication: last oplog entry (of 374508) for secondary #1, WIN-LPG54MQP00S:31001, is {  "ts" : Timestamp(1417332220, 1),  "h" : NumberLong("-6465147432734327563"),  "v" : 2,  "op" : "i",  "ns" : "bgIndexSec.system.indexes",  "o" : {  "ns" : "bgIndexSec.jstests_feh",  "key" : {  "i" : 1 },  "name" : "i_1",  "background" : true } }
ReplSetTest awaitReplication: secondary #1, WIN-LPG54MQP00S:31001, is NOT synced
assert.soon failed, msg:awaiting replication
Error: assert.soon failed, msg:awaiting replication
    at Error (<anonymous>)
    at doassert (src/mongo/shell/assert.js:11:14)
    at Function.assert.soon (src/mongo/shell/assert.js:189:13)
    at ReplSetTest.awaitReplication (src/mongo/shell/replsettest.js:516:12)
    at C:\data\mci\shell\src\jstests\noPassthroughWithMongod\indexbg_interrupts.js:90:14



 Comments   
Comment by Githook User [ 07/Jul/15 ]

Author:

{u'username': u'dannenberg', u'name': u'matt dannenberg', u'email': u'matt.dannenberg@10gen.com'}

Message: SERVER-19310 extend the timeout on awaitReplication in indexbg_interrupts.js
Branch: master
https://github.com/mongodb/mongo/commit/a81827ad36beeca1badf99ad7a43a05d53b44593

Comment by Matt Dannenberg [ 07/Jul/15 ]

Extending the timeout seemed to work (ran three times across all builders here). I'm going to push that sometime this afternoon assuming there are no objections.

Comment by Matt Dannenberg [ 06/Jul/15 ]

This failure seems to span a lot of builders and always fails in the convertToCapped portion of the test. I believe convertToCapped takes longer than the other index cancelling ops and that extending the awaitReplication timeout may be a sufficient fix for this. I will try a patch build with a 60 second timeout.

Comment by Ian Whalen (Inactive) [ 06/Jul/15 ]

mattd@10gen.com any thoughts on this?

Comment by J Rassi [ 09/Jun/15 ]

Re-opening ticket; this test is continuing to fail intermittently (same manifestation).

Failure from May 6 on SSL RHEL 5.5 64-bit:

https://evergreen.mongodb.com/task/mongodb_mongo_master_rhel55_d2ab9818beee1416c18e8cae818bc392f94d440f_15_05_06_21_26_09_noPassthroughWithMongod_WT_rhel55
http://buildlogs.mongodb.org/MCI_rhel55/builds/1076374/test/noPassthroughWithMongod_WT_0/indexbg_interrupts.js

ReplSetTest awaitReplication: last oplog entry (of 374491) for secondary #1, build4.nj1.10gen.cc:31001, is {  "ts" : Timestamp(1430965583, 4705),  "h" : NumberLong("-3989623832682242469"),  "v" : 2,  "op" : "i",  "ns" : "bgIndexSec.system.indexes",  "o" : {  "ns" : "bgIndexSec.jstests_feh",  "key" : {  "i" : 1 },  "name" : "i_1",  "background" : true } }
ReplSetTest awaitReplication: secondary #1, build4.nj1.10gen.cc:31001, is NOT synced
assert.soon failed, msg:awaiting replication
Error: assert.soon failed, msg:awaiting replication
    at Error (<anonymous>)
    at doassert (src/mongo/shell/assert.js:11:14)
    at Function.assert.soon (src/mongo/shell/assert.js:189:13)
    at ReplSetTest.awaitReplication (src/mongo/shell/replsettest.js:515:12)
    at /srv/10gen/mci-exec/mci/shell/src/jstests/noPassthroughWithMongod/indexbg_interrupts.js:90:14
2015-05-06T22:32:25.429-0400 E QUERY    Error: assert.soon failed, msg:awaiting replication
    at Error (<anonymous>)
    at doassert (src/mongo/shell/assert.js:11:14)
    at Function.assert.soon (src/mongo/shell/assert.js:189:13)
    at ReplSetTest.awaitReplication (src/mongo/shell/replsettest.js:515:12)
    at /srv/10gen/mci-exec/mci/shell/src/jstests/noPassthroughWithMongod/indexbg_interrupts.js:90:14 at src/mongo/shell/assert.js:13
failed to load: /srv/10gen/mci-exec/mci/shell/src/jstests/noPassthroughWithMongod/indexbg_interrupts.js

Note also that there have been a couple of failures in the past month on Linux 64-bit LSM, but I believe that the LSM buildvariant is not yet considered stable.

Comment by Matt Dannenberg [ 19/Dec/14 ]

That particular failure doesn't seem to have happened since... possibly close as gone away?

Comment by Ian Whalen (Inactive) [ 19/Dec/14 ]

mattd@10gen.com is there anything to do here?

Comment by Matt Dannenberg [ 11/Dec/14 ]

two failures since, but for a different reason:

 m31000| 2014-12-10T19:54:40.003-0500 I -        [conn1]   Index Build (background): 93100/100000 93%
 m31000| 2014-12-10T19:54:40.199-0500 I INDEX    [conn1] build index done.  scanned 100000 total records. 2 secs
 m31000| 2014-12-10T19:54:40.200-0500 I QUERY    [conn1] command bgIndexSec.$cmd command: createIndexes { createIndexes: "jstests_feh", indexes: [ { key: { i: 1.0 }, name: "i_1", background: true } ] } keyUpdates:0  reslen:113 2849ms
2014-12-10T19:54:40.203-0500 I NETWORK  trying reconnect to 127.0.0.1:31001 (127.0.0.1) failed
2014-12-10T19:54:40.204-0500 W NETWORK  Failed to connect to 127.0.0.1:31001, reason: errno:61 Connection refused
2014-12-10T19:54:40.204-0500 I NETWORK  reconnect 127.0.0.1:31001 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:31001 (127.0.0.1), connection attempt failed
2014-12-10T19:54:40.207-0500 I QUERY    Error: socket exception [CONNECT_ERROR] for 127.0.0.1:31001 (127.0.0.1) failed
    at DBQuery._exec (src/mongo/shell/query.js:83:36)
    at DBQuery.hasNext (src/mongo/shell/query.js:240:10)
    at DBCollection.findOne (src/mongo/shell/collection.js:185:19)
    at DB.runCommand (src/mongo/shell/db.js:58:41)
    at DBCollection._dbCommand (src/mongo/shell/collection.js:103:21)
    at DBCollection._getIndexesCommand (src/mongo/shell/collection.js:996:20)
    at DBCollection.getIndexes (src/mongo/shell/collection.js:1021:20)
    at /data/mci/shell/src/jstests/noPassthroughWithMongod/indexbg_interrupts.js:81:56
    at Function.assert.soon (src/mongo/shell/assert.js:183:17)
    at /data/mci/shell/src/jstests/noPassthroughWithMongod/indexbg_interrupts.js:80:12 at src/mongo/shell/collection.js:57
failed to load: /data/mci/shell/src/jstests/noPassthroughWithMongod/indexbg_interrupts.js

http://buildlogs.mongodb.org/MCI_osx-108/builds/419427/test/noPassthroughWithMongod_WT_0/indexbg_interrupts.js
http://buildlogs.mongodb.org/MCI_ubuntu1404-debug-asan/builds/397679/test/noPassthroughWithMongod_WT_0/indexbg_interrupts.js

Generated at Thu Feb 08 03:50:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.