[SERVER-57026] Regular expression is invalid UTF-8 while enabling mongo backwards-incompatible 4.0 features Created: 18/May/21  Updated: 11/Apr/23  Resolved: 16/Jun/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: sandip Divekar Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-75942 Check that database name is valid UTF... Open
Participants:

 Description   

I have upgraded my MongoDB QA environment from v3.6 to v4.0.23. I am running a sharded cluster with 14 mongos instances, 3 config servers, and 3 shard servers. All three components have been upgraded successfully to v4.0.23.

Now I am trying to execute the last step of the v3.6 to v4.0.23 upgrade, which is enabling backwards-incompatible 4.0 features. But I get the error below, even after retrying the command and restarting the mongos, config, and shard servers.

mongos> db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } )
{
        "ok" : 0,
        "errmsg" : "Regular expression is invalid UTF-8",
        "code" : 5108300,
        "codeName" : "Location5108300",
        "operationTime" : Timestamp(1621314272, 3),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1621314272, 3),
                "signature" : {
                        "hash" : BinData(0,"F/Ukgpx8acf5fBuOtKfw0gyz3L4="),
                        "keyId" : NumberLong("6934901981374840833")
                }
        }
}
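
The error text suggests that some string involved is not well-formed UTF-8, rather than that the command itself is wrong. As an illustration only (this is not the server's validation code), a minimal JavaScript sketch of what "invalid UTF-8" means at the byte level follows. The helper name isValidUtf8 and the sample byte arrays are hypothetical, and the sketch assumes a runtime that provides TextDecoder, such as a recent Node.js.

```javascript
// Hypothetical helper: returns true when `bytes` decodes as well-formed UTF-8.
// With { fatal: true } the decoder throws on malformed input instead of
// silently substituting the replacement character U+FFFD.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

// "test" encoded as UTF-8.
const wellFormed = Uint8Array.from([0x74, 0x65, 0x73, 0x74]);
// 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte.
const malformed = Uint8Array.from([0xc3, 0x28]);

console.log(isValidUtf8(wellFormed)); // true
console.log(isValidUtf8(malformed));  // false
```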

 Comments   
Comment by Edwin Zhou [ 16/Jun/21 ]

Hi sandipdivekar1@gmail.com,

We're thrilled to hear that you were able to resolve your problem! It's strange that your config database has _id fields containing regular expressions. Thank you for following up with your solution.

Best,
Edwin

Comment by sandip Divekar [ 16/Jun/21 ]

Hi Edwin Zhou,

Greetings. We have resolved this "Regular expression is invalid UTF-8" issue in both QA and production. We found that our config server data was inconsistent: the config.databases collection contained invalid database names in both QA and production. We removed some of these invalid names using the mongos shell and the rest using a GUI tool, because some of the names were difficult to remove from the mongos shell. After removing the invalid names from the config.databases collection, I was able to run the setFeatureCompatibilityVersion command successfully in production.

I am not sure when or how these invalid database names were added to the config.databases collection. If you have any idea, please share it.

Thanks for your help and guidance on this issue. You can mark this as closed.

Invalid names:

{ "_id" : "�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�\u0001\u0003\b", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "��\u0006�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "��\u0005�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "��", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�!\u0004�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�)\u0004�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�-", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�B\u0002�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�u\u0007�\u001", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "��\u0007�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "��\u000b�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�D\u000b�\", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�C\u0006�", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "Л\u0005,", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�\u0001", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�\u0007�\u001", "partitioned" : false, "primary" : "aws2-1" }
{ "_id" : "�\t", "partitioned" : false, "primary" : "aws2-1" }
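
For readers who want to screen a list of database names for entries like these, a minimal JavaScript sketch follows. It is not the procedure used in this ticket. The helper name isSuspiciousDbName is hypothetical, the sample documents only imitate the shape of the entries above (the "inventory" entry is invented as a healthy example), and the sketch assumes that invalid bytes have already been rendered as the replacement character U+FFFD by whatever tool printed the names.

```javascript
// Hypothetical helper: flags names that contain the Unicode replacement
// character (U+FFFD) or an ASCII control character (U+0000-U+001F, U+007F).
function isSuspiciousDbName(name) {
  return /[\uFFFD\u0000-\u001F\u007F]/.test(name);
}

// Sample documents shaped like config.databases entries.
const sampleDocs = [
  { _id: "\uFFFD\u0001\u0003\b", partitioned: false, primary: "aws2-1" },
  { _id: "\uFFFD-", partitioned: false, primary: "aws2-1" },
  { _id: "inventory", partitioned: true, primary: "aws2-1" }, // invented valid name
];

// Keep only the documents whose _id looks corrupted.
const suspicious = sampleDocs.filter((doc) => isSuspiciousDbName(doc._id));

// JSON.stringify escapes control characters so they are visible in the output.
suspicious.forEach((doc) => console.log(JSON.stringify(doc._id)));
```

The filter only reports candidates. Whether and how to remove an entry from a config database is a separate decision and should be made with a backup in hand.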

 

Comment by Edwin Zhou [ 14/Jun/21 ]

Hi sandipdivekar1@gmail.com,

Would you please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) and upload them to this support uploader location?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Best,
Edwin

Comment by sandip Divekar [ 13/Jun/21 ]

Hi @Edwin Zhou 

Today I ran the same command on our production cluster, but it fails with the same error. I had managed to get it working in the QA environment using the steps below.
1. Took a backup of the config database.

2. Stopped all mongos instances and client applications.

3. Stopped all 3 config servers.

4. Created new data directories for the config server data.

5. Formed a new config replica set by restoring the config data.

6. Then executed the command db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } ). This time the command worked.

7. But when I started the mongos instances and client applications and checked the mongos.log file, I saw a few error lines:

2021-06-13T04:38:30.811+0000 I NETWORK [Uptime reporter] Marking host $HOSTNAME:27019 as failed :: caused by :: NetworkInterfaceExceededTimeLimit: timed out
2021-06-13T04:38:30.811+0000 I SHARDING [Uptime reporter] Operation timed out :: caused by :: NetworkInterfaceExceededTimeLimit: timed out
2021-06-13T04:38:30.811+0000 W SHARDING [Uptime reporter] failed to refresh mongos settings :: caused by :: NetworkInterfaceExceededTimeLimit: Failed to refresh the balancer settings :: caused by :: timed out
2021-06-13T04:38:30.811+0000 I CONNPOOL [ShardRegistry] Ending connection to host $HOSTNAME:27019 due to bad connection status; 2 connections to that host remain open
2021-06-13T04:38:38.373+0000 I NETWORK [shard registry reload] Marking host $HOSTNAME:27019 as failed :: caused by :: NetworkInterfaceExceededTimeLimit: timed out
2021-06-13T04:38:38.373+0000 I SHARDING [shard registry reload] Operation timed out :: caused by :: NetworkInterfaceExceededTimeLimit: timed out
2021-06-13T04:38:38.373+0000 I CONNPOOL [ShardRegistry] Ending connection to host $HOSTNAME:27019 due to bad connection status; 2 connections to that host remain open
2021-06-13T04:38:38.373+0000 I SHARDING [shard registry reload] Periodic reload of shard registry failed :: caused by :: NetworkInterfaceExceededTimeLimit: could not get updated shard list from config server :: caused by :: timed out; will retry after 30s
2021-06-13T04:38:38.373+0000 I CONNPOOL [ShardRegistry] Ending idle connection to host $HOSTNAME:27019 because the pool meets constraints; 1 connections to that host remain open

8. To resolve this, I retried the backup and restore steps 2-3 times and somehow the problem went away. After that we did not see any more error lines in the QA environment.

9. But today in production I ran db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } ), which failed with "Regular expression is invalid UTF-8".

10. So we applied the backup-and-restore workaround in production and were able to run db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } ) successfully.

11. But afterwards we saw error lines in the production mongos.log file similar to those in QA. I retried the backup and restore 2-3 times with different options, but the lines were still present in the log file. We restarted all MongoDB components, including the shards, config servers, and mongos instances, and the error lines are still present in mongos.log.

12. I am not sure how we can resolve this "invalid UTF-8" issue. Also, with the workaround we are not able to get rid of the error lines in the mongos.log file. I think mongos is not able to read the shards from the new config replica set where the config data was restored.

Comment by Edwin Zhou [ 09/Jun/21 ]

Hi sandipdivekar1@gmail.com

We still need additional information to diagnose the problem. If this is still an issue for you, would you please assess my reproduction and point out any discrepancy with your steps leading to this behavior?

Best,
Edwin

Comment by Edwin Zhou [ 20/May/21 ]

Hi sandipdivekar1@gmail.com,

This is an unusual error that we haven't previously encountered. I was successful in running the command that you put in your description:

db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } )

I tried a few invalid commands:

db.adminCommand( { setFeatureCompatibilityVersion: "4.1" } )
db.adminCommand( { setFeatureCompatibilityVersion: “4.0” } ) // note the quotation marks here

but these only produced the following messages, respectively:

{
	"ok" : 0,
	"errmsg" : "Invalid command argument. Expected '4.0' or '3.6', found 4.1 in: { setFeatureCompatibilityVersion: \"4.1\", lsid: { id: UUID(\"240112a5-6d1b-433a-a0e0-fca41d3c17f6\") }, $clusterTime: { clusterTime: Timestamp(1621438558, 1), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: \"admin\" }. See http://dochub.mongodb.org/core/4.0-feature-compatibility.",
	"code" : 2,
	"codeName" : "BadValue",
	"operationTime" : Timestamp(1621438558, 1),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1621438558, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

2021-05-20T10:09:41.256-0400 E QUERY    [js] SyntaxError: illegal character @(shell):1:51 
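
The second invalid command fails because of typographic (curly) quotation marks, which the shell does not accept as string delimiters. A minimal JavaScript sketch for spotting such characters in a command before pasting it into the shell follows. The helper name findTypographicQuotes is hypothetical and is not part of the mongo shell.

```javascript
// Hypothetical helper: reports the zero-based position of every curly quote
// (U+2018, U+2019, U+201C, U+201D) found in a command string.
function findTypographicQuotes(command) {
  const hits = [];
  const pattern = /[\u2018\u2019\u201C\u201D]/g;
  let match;
  while ((match = pattern.exec(command)) !== null) {
    hits.push({ index: match.index, char: match[0] });
  }
  return hits;
}

// The command as pasted with curly quotes, and as typed with straight quotes.
const pasted = "db.adminCommand( { setFeatureCompatibilityVersion: \u201C4.0\u201D } )";
const typed = 'db.adminCommand( { setFeatureCompatibilityVersion: "4.0" } )';

console.log(findTypographicQuotes(pasted)); // two hits, at indexes 51 and 55
console.log(findTypographicQuotes(typed));  // no hits
```

In this sketch the first hit is at zero-based index 51, which appears to line up with the 1:51 position reported in the SyntaxError above.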

Is there anything different or missing from the steps I tried?

Best,
Edwin

Comment by sandip Divekar [ 19/May/21 ]

Hi Edwin Zhou,

We are using Red Hat EL6 and EL7, and mongos v4.0.23.

Comment by Edwin Zhou [ 18/May/21 ]

Hi sandipdivekar1@gmail.com,

What is the operating system you're using? Can you also provide the version of the shell you're running this on using mongo --version?

Best,
Edwin

Comment by sandip Divekar [ 18/May/21 ]

@Tim Fogarty 
Thanks for moving this ticket to the correct project.

Comment by Tim Fogarty [ 18/May/21 ]

Hi sandipdivekar1@gmail.com, the TOOLS Jira project is for bug reports related to mongoimport/mongodump/etc, not the server itself. I will move this ticket to the SERVER project for you so the correct team can take a look at your issue.

Generated at Thu Feb 08 05:40:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.