[GODRIVER-2787] Arbiter connection permanently broken after initiating the replica set Created: 30/Mar/23  Updated: 28/Oct/23  Resolved: 13/Jun/23

Status: Closed
Project: Go Driver
Component/s: Connections, Server Selection
Affects Version/s: None
Fix Version/s: 1.11.7

Type: Bug Priority: Unknown
Reporter: Aleksander Polak Assignee: Matt Dale
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
is caused by GODRIVER-2829 Requesting an immediate heartbeat che... Backlog
is caused by GODRIVER-2828 SDAM error handling doesn't ignore er... Closed
Documentation Changes: Not Needed
Documentation Changes Summary:

1. What would you like to communicate to the user about this feature?
2. Would you like the user to see examples of the syntax and/or executable code and its output?
3. Which versions of the driver/connector does this apply to?


 Description   

Summary

In my scenario, I have an arbiter from which I collect metrics using the Percona MongoDB exporter (https://github.com/percona/mongodb_exporter). The exporter uses the Go driver to establish a direct connection to the arbiter and execute commands like getDiagnosticData and replSetGetStatus every 10 seconds.

The exporter connects to the arbiter as soon as it starts and reuses the same connection throughout its entire life. When the mongod instance and the MongoDB exporter start at the same time (which is typically the case as they reside in the same Kubernetes pod), the connection becomes permanently broken. The following error is returned by the Go driver on every attempt to execute any command:

server selection error: context canceled, current topology: { Type: Single, Servers: [{ Addr: mongo-arbiter:27017, Type: Unknown, Last error: (NotPrimaryOrSecondary) node is not in primary or recovering state }, ] }

Restarting only the MongoDB exporter or disabling the global connection pool (via the --no-mongodb.global-conn-pool flag) fixes the problem.

This is particularly easy to reproduce when the exporter connects to the soon-to-be arbiter instance before the replica set is initiated. In such a case, the exporter is initially able to execute commands but starts failing once the replica set is initiated.

MongoDB driver version: v1.10.3 (the issue doesn't reproduce with v1.3.2 used by older versions of the MongoDB exporter)

MongoDB server version: v4.4 or later

Topology: replica set or sharded cluster + arbiter

How to Reproduce

Reproducibility: 100%

1. Create the following prometheus.yml file:

global:
  scrape_interval: 10s
  scrape_timeout: 9s
 
scrape_configs:
  - job_name: mongo-0
    static_configs:
      - targets:
          - exporter-0:9216
  - job_name: mongo-1
    static_configs:
      - targets:
          - exporter-1:9216
  - job_name: mongo-arbiter
    static_configs:
      - targets:
          - exporter-arbiter:9216

2. Create the following docker-compose.yml file:

version: '3.8'
 
services:
  mongo-0:
    container_name: mongo-0
    image: mongo:4.4
    networks:
      - mongo
    entrypoint: ["/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs"]
  mongo-1:
    container_name: mongo-1
    image: mongo:4.4
    networks:
      - mongo
    entrypoint: ["/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs"]
  mongo-arbiter:
    container_name: mongo-arbiter
    image: mongo:4.4
    networks:
      - mongo
    entrypoint: ["/usr/bin/mongod", "--bind_ip_all", "--replSet", "rs"]
  exporter-0:
    container_name: exporter-0
    image: percona/mongodb_exporter:0.37.0
    networks:
      - mongo
    command: ["--mongodb.uri=mongodb://mongo-0:27017/", "--mongodb.direct-connect", "--collector.diagnosticdata", "--mongodb.global-conn-pool", "--log.level=debug"]
  exporter-1:
    container_name: exporter-1
    image: percona/mongodb_exporter:0.37.0
    networks:
      - mongo
    command: ["--mongodb.uri=mongodb://mongo-1:27017/", "--mongodb.direct-connect", "--collector.diagnosticdata", "--mongodb.global-conn-pool", "--log.level=debug"]
  exporter-arbiter:
    container_name: exporter-arbiter
    image: percona/mongodb_exporter:0.37.0
    networks:
      - mongo
    command: ["--mongodb.uri=mongodb://mongo-arbiter:27017/", "--mongodb.direct-connect", "--collector.diagnosticdata", "--mongodb.global-conn-pool", "--log.level=debug"]
  victoriametrics:
    container_name: victoriametrics
    image: victoriametrics/victoria-metrics:latest
    networks:
      - mongo
    ports:
      - 8428:8428
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command: ["-promscrape.config", "/etc/prometheus/prometheus.yml"]
 
networks:
  mongo:
    driver: bridge

3. Start the containers in the background:

docker compose up -d

4. Monitor the exporter-arbiter container logs for 30-40 seconds:

docker logs -f exporter-arbiter

Note that the getDiagnosticData command result is emitted every 10 seconds.

5. Initiate the replica set:

cat <<EOF | docker exec -i mongo-0 mongo
rs.initiate({
    "_id": "rs",
    "members": [
        {"_id": 0, "host": "mongo-0:27017"},
        {"_id": 1, "host": "mongo-1:27017"},
        {"_id": 2, "host": "mongo-arbiter:27017", "arbiterOnly": true}
    ]
});
EOF

6. Monitor the exporter-arbiter container logs for another 30-40 seconds.

docker logs -f exporter-arbiter

Note that the following entries are now emitted instead of the command results:

time="2023-03-30T18:49:31Z" level=warning msg="cannot load topology labels: cannot check if the instance is an arbiter: server selection error: context canceled, current topology: { Type: Single, Servers: [{ Addr: mongo-arbiter:27017, Type: Unknown, Last error: (NotPrimaryOrSecondary) node is not in primary or recovering state }, ] }"
time="2023-03-30T18:49:31Z" level=error msg="Registry - Cannot get node type to check if this is a mongos : server selection error: context canceled, current topology: { Type: Single, Servers: [{ Addr: mongo-arbiter:27017, Type: Unknown, Last error: (NotPrimaryOrSecondary) node is not in primary or recovering state }, ] }"

7. Restart the exporter-arbiter container:

docker restart exporter-arbiter

8. Monitor the exporter-arbiter container logs for another 30-40 seconds:

docker logs -f exporter-arbiter

Note that the command results are emitted again.

9. Stop and remove the containers:

docker compose down

Additional Background

Use the --no-mongodb.global-conn-pool flag instead of the --mongodb.global-conn-pool flag in the example above to disable the global connection pool. The issue doesn't reproduce when a new connection is used each time.

I forked the v0.11.2 branch of the MongoDB exporter, which uses Go driver v1.3.2 and doesn't have this issue. After upgrading the driver to v1.11.3, the issue started to appear.



 Comments   
Comment by Aleksander Polak [ 13/Jun/23 ]

matt.dale@mongodb.com Thanks! I can confirm the issue doesn't reproduce with Go driver v1.11.7. I checked it against an older v2.35.0 branch of the MongoDB exporter, which doesn't include some of the latest changes that I believe were introduced as a workaround for GODRIVER-2828. Everything works as expected. Feel free to close this issue.

Comment by Matt Dale [ 12/Jun/23 ]

aleksander.polak@tietoevry.com thanks for the additional info! You're correct that the topology change is related to the problem (specifically the topology version change, which is considered during error handling). The proposed fix in GODRIVER-2828 was released with Go driver v1.11.7. The mongodb_exporter tool was recently updated to use v1.11.7 in the main branch, but there hasn't been a release with the updated Go driver yet.

My local testing seems to show that mongodb_exporter built from the latest main revision (9fb40e2) resolves the previous issue. Are you able to build the latest version of mongodb_exporter? If so, can you confirm if it resolves your issue?

Comment by Aleksander Polak [ 09/May/23 ]

Note that the problem only seems to occur if you connect to the mongod instance:

  • before initiating the replica set
  • right after restarting the instance

If you connect after the instance assumes the arbiter role, operations like client.ListDatabaseNames() still fail with the "(NotPrimaryOrSecondary) node is not in primary or recovering state" error, but without any observable delays.
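The server error string embeds the error code name in parentheses, as seen in the logs above. A small helper can pull it out for logging or filtering; this is purely an illustrative sketch over the error message text, and real code should prefer the driver's structured error types instead of string matching:

```go
package main

import (
	"fmt"
	"regexp"
)

// codeRe matches the leading "(CodeName)" prefix that MongoDB server error
// strings carry, e.g. "(NotPrimaryOrSecondary) node is not in primary or
// recovering state".
var codeRe = regexp.MustCompile(`^\(([A-Za-z]+)\)`)

// codeName returns the server error code name embedded in an error message,
// or "" if the message has no such prefix. Illustrative helper only.
func codeName(msg string) string {
	m := codeRe.FindStringSubmatch(msg)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	fmt.Println(codeName("(NotPrimaryOrSecondary) node is not in primary or recovering state")) // NotPrimaryOrSecondary
}
```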

I suspect that the problem is linked to a change in topology reported by the mongod instance (from standalone to replica set) during the lifetime of the connection.

Comment by Aleksander Polak [ 08/May/23 ]

Hi matt.dale@mongodb.com, thank you for the update! Based on the information you provided, I can give you some more details.

The MongoDB exporter seems to have a bug: it tries to parse the X-Prometheus-Scrape-Timeout-Seconds header value as an integer, but VictoriaMetrics passes a float instead:
https://github.com/percona/mongodb_exporter/blob/14c8bd00d7bc2daacb6d0b5325e67d9e1e1522bd/exporter/exporter.go#L262-L266

Therefore, the timeout in this scenario is always 10s, regardless of the scrape timeout, which was set to 9s. This explains why we see "context canceled": VictoriaMetrics disconnects from the MongoDB exporter before the timeout expires, causing the server to cancel the request. If the scrape timeout is extended to 20s, the error changes to "context deadline exceeded". Either way, the "server selection error" is directly caused by the request context being canceled or its deadline being exceeded, not by the server selection timeout managed by the driver (30s by default).

That said, there is certainly a problem with the driver. To isolate it, I wrote this program that makes the same sequence of driver calls as the MongoDB exporter:

package main
 
import (
	"context"
	"flag"
	"log"
	"time"
 
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)
 
var (
	uri      = flag.String("uri", "", "")
	interval = flag.Int("interval", 5, "")
	timeout  = flag.Int("timeout", 60, "")
 
	globalClient *mongo.Client
)
 
func runCommand(ctx context.Context, client *mongo.Client, database string, command bson.D) {
	log.Printf("run %s", command)
	var result bson.M
	err := client.Database(database).RunCommand(ctx, command).Decode(&result)
	if err != nil {
		log.Printf("error: %s", err)
	} else {
		log.Printf("ok")
	}
}
 
func listDatabaseNames(ctx context.Context, client *mongo.Client) {
	log.Print("list database names")
	_, err := client.ListDatabaseNames(ctx, bson.D{}, options.ListDatabases())
	if err != nil {
		log.Printf("error: %s", err)
	} else {
		log.Printf("ok")
	}
}
 
func queryCollection(ctx context.Context, client *mongo.Client, database string, collection string) {
	log.Printf("query %s, database: %s", collection, database)
	var result bson.M
	err := client.Database(database).Collection(collection).FindOne(ctx, bson.D{}).Decode(&result)
	if err != nil {
		log.Printf("error: %s", err)
	} else {
		log.Printf("ok")
	}
}
 
func connect(uri string) (*mongo.Client, error) {
	if globalClient == nil {
		client, err := mongo.Connect(context.TODO(), options.Client().ApplyURI(uri).SetDirect(true))
		if err != nil {
			return nil, err
		}
 
		globalClient = client
	}
 
	return globalClient, nil
}
 
func main() {
	flag.Parse()
 
	for {
		client, err := connect(*uri)
 
		if err != nil {
			log.Printf("can't connect to MongoDB: %s", err)
		} else {
			log.Print("begin iteration")
 
			ctx, cancel := context.WithTimeout(context.Background(), time.Duration(*timeout)*time.Second)

			listDatabaseNames(ctx, client)

			runCommand(ctx, client, "admin", bson.D{
				{Key: "getCmdLineOpts", Value: 1},
				{Key: "recordStats", Value: 1},
			})
			runCommand(ctx, client, "admin", bson.D{{Key: "replSetGetConfig", Value: 1}})
			runCommand(ctx, client, "admin", bson.D{{Key: "isMaster", Value: 1}})

			queryCollection(ctx, client, "config", "version")
			queryCollection(ctx, client, "admin", "system.version")

			runCommand(ctx, client, "admin", bson.D{{Key: "replSetGetConfig", Value: 1}})
			runCommand(ctx, client, "admin", bson.D{{Key: "getDiagnosticData", Value: 1}})

			// Release the context at the end of each iteration; a defer
			// here would accumulate until main returns, leaking contexts.
			cancel()

			log.Print("end iteration")
		}
 
		time.Sleep(time.Duration(*interval) * time.Second)
	}
}

When I use Go driver v1.3.7, after initiating the replica set I get:

2023/05/08 06:07:13 begin iteration
2023/05/08 06:07:13 list database names
2023/05/08 06:07:13 error: (NotPrimaryOrSecondary) node is not in primary or recovering state
2023/05/08 06:07:13 run [{getCmdLineOpts %!s(int=1)} {recordStats %!s(int=1)}]
2023/05/08 06:07:13 ok
2023/05/08 06:07:13 run [{replSetGetConfig %!s(int=1)}]
2023/05/08 06:07:13 ok
2023/05/08 06:07:13 run [{isMaster %!s(int=1)}]
2023/05/08 06:07:13 ok
2023/05/08 06:07:13 query version, database: config
2023/05/08 06:07:14 error: (NotPrimaryOrSecondary) node is not in primary or recovering state
2023/05/08 06:07:14 query system.version, database: admin
2023/05/08 06:07:15 error: (NotPrimaryOrSecondary) node is not in primary or recovering state
2023/05/08 06:07:15 run [{replSetGetConfig %!s(int=1)}]
2023/05/08 06:07:15 ok
2023/05/08 06:07:15 run [{getDiagnosticData %!s(int=1)}]
2023/05/08 06:07:15 ok
2023/05/08 06:07:15 end iteration

When I use Go driver v1.4.0 or later (all the way up to v1.11.6), I get:

2023/05/08 06:10:42 begin iteration
2023/05/08 06:10:42 list database names
2023/05/08 06:10:48 error: (NotPrimaryOrSecondary) node is not in primary or recovering state
2023/05/08 06:10:48 run [{getCmdLineOpts %!s(int=1)} {recordStats %!s(int=1)}]
2023/05/08 06:10:58 ok
2023/05/08 06:10:58 run [{replSetGetConfig %!s(int=1)}]
2023/05/08 06:10:58 ok
2023/05/08 06:10:58 run [{isMaster %!s(int=1)}]
2023/05/08 06:10:58 ok
2023/05/08 06:10:58 query version, database: config
2023/05/08 06:11:08 error: (NotPrimaryOrSecondary) node is not in primary or recovering state
2023/05/08 06:11:08 query system.version, database: admin
2023/05/08 06:11:28 error: (NotPrimaryOrSecondary) node is not in primary or recovering state
2023/05/08 06:11:28 run [{replSetGetConfig %!s(int=1)}]
2023/05/08 06:11:38 ok
2023/05/08 06:11:38 run [{getDiagnosticData %!s(int=1)}]
2023/05/08 06:11:38 ok
2023/05/08 06:11:38 end iteration

Note the timestamps: the duration of a single iteration went up from 2s to almost a minute. In particular, the operations that cause the "(NotPrimaryOrSecondary) node is not in primary or recovering state" error are now significantly slower (6-20s), and the operations immediately following them also take longer (10s). Re-connecting fixes the problem.

Comment by Matt Dale [ 03/May/23 ]

Hey aleksander.polak@tietoevry.com, I've been able to reproduce this and have a theory about the root cause of the issue. One of the major changes to the Go Driver since v1.3.x is what it does when it encounters specific errors. In this case, the mongodb_exporter is trying to list all non-system collections (see here), which is not a valid operation on an arbiter, so it always returns an error like

(NotPrimaryOrSecondary) node is not in primary or recovering state

The Go Driver should just pass that error back to the caller, but due to what seems to be a bug in the new error handling logic, the Go Driver interprets that error as an indication that the database node should not be queried right now. The Go Driver should recover from that, but another bug prevents an immediate database status check, delaying it by 10 seconds and causing a persistent timeout.

TLDR: The problem is a combination of mongodb_exporter running operations that will never succeed on an arbiter and bugs in the Go Driver error handling logic that cause it to delay operations to the arbiter and then not recover as quickly as it should.
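Given that some commands consistently succeed on the arbiter in the logs above while collection-level operations never do, one hypothetical workaround on the exporter side would be to skip commands known to fail on arbiters. The allowlist below is inferred only from the "ok" results in this ticket's reproduction logs; a real implementation would consult the server documentation:

```go
package main

import "fmt"

// arbiterSafeCommands lists commands observed to succeed on the arbiter in
// this ticket's reproduction logs. Hypothetical allowlist for illustration.
var arbiterSafeCommands = map[string]bool{
	"getCmdLineOpts":    true,
	"replSetGetConfig":  true,
	"isMaster":          true,
	"getDiagnosticData": true,
	"replSetGetStatus":  true,
}

// arbiterSafe reports whether a command is expected to succeed on an arbiter.
func arbiterSafe(cmd string) bool { return arbiterSafeCommands[cmd] }

func main() {
	fmt.Println(arbiterSafe("getDiagnosticData")) // true
	fmt.Println(arbiterSafe("listDatabases"))     // false
}
```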

I will continue investigating and see if I can validate my bug theories.

Comment by Matt Dale [ 10/Apr/23 ]

Hey aleksander.polak@tietoevry.com thanks for the ticket and all the details! We're looking into it and will get back to you as soon as we have some information or questions.

Generated at Thu Feb 08 08:39:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.