[SERVER-9628] text indexing picks up strings in arrays ignoring remaining dotted path
Created: 09/May/13  Updated: 11/Jul/16  Resolved: 12/Oct/13

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: 2.4.0, 2.4.1, 2.4.2, 2.4.3
Fix Version/s: 2.5.3

Type: Bug
Priority: Minor - P4
Reporter: Paul Pedersen
Assignee: J Rassi
Resolution: Done
Votes: 0
Labels: query
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Gantt Dependency
has to be done before SERVER-10906 Support for legacy text index format ... Closed
Operating System: ALL
Participants:

 Description   

> use test2
switched to db test2
> db.tweets.save( {_id:1,comments:["this is important",{b:"this is unimportant"},"green",{b:"blue"}]} );
> db.tweets.ensureIndex({ "comments.b" : "text" });
> db.tweets.runCommand("text",{ search : "unimportant" });
{
	"queryDebugString" : "unimport||||||",
	"language" : "english",
	"results" : [
		{
			"score" : 1,
			"obj" : {
				"_id" : 1,
				"comments" : [
					"this is important",
					{
						"b" : "this is unimportant"
					},
					"green",
					{
						"b" : "blue"
					}
				]
			}
		}
	],
	"stats" : {
		"nscanned" : 1,
		"nscannedObjects" : 0,
		"n" : 1,
		"nfound" : 1,
		"timeMicros" : 122
	},
	"ok" : 1
}
> db.tweets.runCommand("text",{ search : "important" });
{
	"queryDebugString" : "import||||||",
	"language" : "english",
	"results" : [
		{
			"score" : 1,
			"obj" : {
				"_id" : 1,
				"comments" : [
					"this is important",
					{
						"b" : "this is unimportant"
					},
					"green",
					{
						"b" : "blue"
					}
				]
			}
		}
	],
	"stats" : {
		"nscanned" : 1,
		"nscannedObjects" : 0,
		"n" : 1,
		"nfound" : 1,
		"timeMicros" : 89
	},
	"ok" : 1
}



 Comments   
Comment by J Rassi [ 12/Oct/13 ]

Fixed by bf0f29709b19565245be370aa3f8c46f0332de91 (SERVER-9390).

Comment by Paul Pedersen [ 13/May/13 ]

The issue arises from this piece of code in FTSSpec::scoreDocument:

FTSSpec.cpp

                else if ( e.type() == Array ) {
                    BSONObjIterator j( e.Obj() );
                    while ( j.more() ) {
                        BSONElement x = j.next();
 
                        if ( leftOverName[0] && x.isABSONObj() )
                            x = x.Obj().getFieldDotted( leftOverName );
 
                        if ( x.type() == String )
                            _scoreString( tools, x.valuestr(), term_freqs, weight );
                    }
                }

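The problem is that the two conditions are not nested. When leftOverName is non-empty (part of the dotted path remains) and the array element x is a bare string rather than an object, the first if is skipped, but the second if still scores x. Strings sitting directly in the array are therefore indexed even though they do not match the remaining path. I suggest the following (I'd be happy to have a more compact solution!):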

FTSSpec.cpp

                else if ( e.type() == Array ) {
                    BSONObjIterator j( e.Obj() );
                    while ( j.more() ) {
                        BSONElement x = j.next();
 
                        if ( leftOverName[0] ) {
                            if ( x.isABSONObj() ) {
                                x = x.Obj().getFieldDotted( leftOverName );
                                if ( x.type() == String )
                                    _scoreString( tools, x.valuestr(), term_freqs, weight );
                            }
                        }
                        else if ( x.type() == String ) {
                            _scoreString( tools, x.valuestr(), term_freqs, weight );
                        }
                    }
                }
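
To make the difference between the two versions easier to see, here is a minimal standalone sketch (C++17) that models the array loop without any BSON types. Everything in it is a hypothetical stand-in rather than server code: Element, SubDoc, scoredOriginal, scoredFixed and sampleComments are names invented for illustration, sub-documents are flat string maps, and leftOverName is treated as a single field name instead of a full dotted path. Instead of calling _scoreString, each function just returns the strings it would have scored.

```cpp
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for BSON values: an array element is either a bare
// string or a flat sub-document mapping field names to string values.
using SubDoc = std::map<std::string, std::string>;
using Element = std::variant<std::string, SubDoc>;

// Models the original loop: the path lookup and the string check are two
// independent ifs, so a bare string is scored even when a path remains.
std::vector<std::string> scoredOriginal(const std::vector<Element>& arr,
                                        const std::string& leftOverName) {
    std::vector<std::string> scored;
    for (const Element& x : arr) {
        // Non-null only if the element itself is a string.
        const std::string* s = std::get_if<std::string>(&x);
        if (!leftOverName.empty() && std::holds_alternative<SubDoc>(x)) {
            const SubDoc& doc = std::get<SubDoc>(x);
            auto it = doc.find(leftOverName);
            s = (it != doc.end()) ? &it->second : nullptr;
        }
        if (s) scored.push_back(*s);
    }
    return scored;
}

// Models the suggested fix: when a path remains, only sub-documents are
// considered; bare strings are scored only when no path remains.
std::vector<std::string> scoredFixed(const std::vector<Element>& arr,
                                     const std::string& leftOverName) {
    std::vector<std::string> scored;
    for (const Element& x : arr) {
        if (!leftOverName.empty()) {
            if (const SubDoc* doc = std::get_if<SubDoc>(&x)) {
                auto it = doc->find(leftOverName);
                if (it != doc->end()) scored.push_back(it->second);
            }
        } else if (const std::string* s = std::get_if<std::string>(&x)) {
            scored.push_back(*s);
        }
    }
    return scored;
}

// The "comments" array from the reproduction in the description.
std::vector<Element> sampleComments() {
    return {
        std::string("this is important"),
        SubDoc{{"b", "this is unimportant"}},
        std::string("green"),
        SubDoc{{"b", "blue"}},
    };
}
```

With leftOverName set to "b", scoredOriginal(sampleComments(), "b") returns all four strings, including "this is important" and "green", which mirrors the unexpected match on "important" in the description. scoredFixed(sampleComments(), "b") returns only "this is unimportant" and "blue". With an empty leftOverName both functions return the same bare strings.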

Generated at Thu Feb 08 03:20:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.