[SERVER-7806] wtimeout with tagsets never returns Created: 30/Nov/12  Updated: 02/Dec/12  Resolved: 01/Dec/12

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.2.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Hannes Magnusson Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File 2.2.0.txt     File tags_timeout.js    
Operating System: Linux
Steps To Reproduce:
  • Configure couple of nodes in a replicaset with couple of tags
  • Configure a GLE mode using those tags
  • Issue a write
  • (optionally verify the write got replicated)
  • Call GLE with the tag and a timeout
Participants:

 Description   

When passing in a wtimeout to GLE with tag name, the query never returns - even though it got replicated just fine.

RS:PRIMARY> rs.status()
{
	"set" : "RS",
	"date" : ISODate("2012-11-30T01:22:15Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 0,
			"name" : "primary.rs.local:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 4322154,
			"optime" : Timestamp(1354238383000, 1),
			"optimeDate" : ISODate("2012-11-30T01:19:43Z"),
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "secondary.rs.local:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 12060,
			"optime" : Timestamp(1354238383000, 1),
			"optimeDate" : ISODate("2012-11-30T01:19:43Z"),
			"lastHeartbeat" : ISODate("2012-11-30T01:22:13Z"),
			"pingMs" : 0
		},
		{
			"_id" : 2,
			"name" : "tertiary.rs.local:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 1319,
			"lastHeartbeat" : ISODate("2012-11-30T01:22:14Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}
RS:PRIMARY> rs.conf()
{
	"_id" : "RS",
	"version" : 2,
	"members" : [
		{
			"_id" : 0,
			"host" : "primary.rs.local:27017",
			"priority" : 20,
			"tags" : {
				"dc" : "ny",
				"important" : "A"
			}
		},
		{
			"_id" : 1,
			"host" : "secondary.rs.local:27017",
			"priority" : 10,
			"tags" : {
				"dc" : "sf",
				"important" : "C"
			}
		},
		{
			"_id" : 2,
			"host" : "tertiary.rs.local:27017",
			"arbiterOnly" : true
		}
	],
	"settings" : {
		"getLastErrorModes" : {
			"default" : {
				"important" : 2,
				"dc" : 2
			}
		}
	}
}
RS:PRIMARY> use blogs
switched to db blogs
RS:PRIMARY> db.articles.insert({"title": "hello world"});
RS:PRIMARY> rs.status()
{
	"set" : "RS",
	"date" : ISODate("2012-11-30T01:23:01Z"),
	"myState" : 1,
	"members" : [
		{
			"_id" : 0,
			"name" : "primary.rs.local:27017",
			"health" : 1,
			"state" : 1,
			"stateStr" : "PRIMARY",
			"uptime" : 4322200,
			"optime" : Timestamp(1354238561000, 1),
			"optimeDate" : ISODate("2012-11-30T01:22:41Z"),
			"self" : true
		},
		{
			"_id" : 1,
			"name" : "secondary.rs.local:27017",
			"health" : 1,
			"state" : 2,
			"stateStr" : "SECONDARY",
			"uptime" : 12106,
			"optime" : Timestamp(1354238561000, 1),
			"optimeDate" : ISODate("2012-11-30T01:22:41Z"),
			"lastHeartbeat" : ISODate("2012-11-30T01:22:59Z"),
			"pingMs" : 0
		},
		{
			"_id" : 2,
			"name" : "tertiary.rs.local:27017",
			"health" : 1,
			"state" : 7,
			"stateStr" : "ARBITER",
			"uptime" : 1365,
			"lastHeartbeat" : ISODate("2012-11-30T01:23:00Z"),
			"pingMs" : 0
		}
	],
	"ok" : 1
}
RS:PRIMARY> db.getLastError("default", 500);
timeout
RS:PRIMARY> db.getLastErrorObj("default", 500);
{
	"n" : 0,
	"lastOp" : Timestamp(1354238561000, 1),
	"connectionId" : 62860,
	"wtimeout" : true,
	"waited" : 500,
	"err" : "timeout",
	"ok" : 1
}



 Comments   
Comment by Hannes Magnusson [ 02/Dec/12 ]

bjori@anden:~/Sources/mongo/mongo ((r2.2.0)) $ python buildscripts/smoke.py --mode=files jstests/tags_timeout.js &> 2.2.0.txt
Failed test run on 2.2.0 (see attachment)

Comment by Daniel Pasette (Inactive) [ 02/Dec/12 ]

I can't repro on 2.2.0, at least with the script I attached. Can you repro with the script or is there something different in the script?

Comment by Hannes Magnusson [ 02/Dec/12 ]

I have confirmed this bug does exists in 2.2.0.
Upgrading to 2.2.2 solved the problem.

Comment by Hannes Magnusson [ 01/Dec/12 ]

Thats why I ran rs.status() to verify the secondary had it replicated already before I ran GLE.
It doesn't matter how high I set the wtimeout, even seconds, it always says it times out - even though the condition of the writeconcern has clearly been full filled within that timeframe

Comment by Daniel Pasette (Inactive) [ 01/Dec/12 ]

Can't reproduce this. The timeout you set is probably too low. Attaching a jstest.

Generated at Thu Feb 08 03:15:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.