[SERVER-17520] Add support for pluggable FTS tokenizers Created: 09/Mar/15  Updated: 18/Sep/15  Resolved: 30/Mar/15

Status: Closed
Project: Core Server
Component/s: Text Search
Affects Version/s: None
Fix Version/s: 3.1.1

Type: Task Priority: Major - P3
Reporter: Mark Benvenuto Assignee: Mark Benvenuto
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-10842 Implement an interface for tokenizer Closed
Tested
Backwards Compatibility: Fully Compatible
Sprint: Platform 1 04/03/15
Participants:

 Description   

To support third-party tokenizers, the server needs an abstract interface for full-text search (FTS) document tokenization.

  1. Create an abstract interface
  2. Move all code except the legacy V1 code to use the new interface
  3. Create an implementation for our

class FtsTokenizer {
public:
    virtual ~FtsTokenizer() = default;
    virtual void reset(const char* document) = 0;  // Start processing a new document
    virtual bool moveNext() = 0;                   // Move to the next token
    virtual StringData& getStem() = 0;             // Return the stemmed form of the current token
};
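
To illustrate how a third-party tokenizer might plug into an interface of this shape, here is a minimal sketch. It is not the code from the linked commits. The names SimpleTokenizer and tokenize are hypothetical, std::string stands in for the server's StringData so the example compiles on its own, getStem is made const, and lowercasing is only a placeholder for real stemming.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Interface from the description; std::string replaces StringData (assumption).
class FtsTokenizer {
public:
    virtual ~FtsTokenizer() = default;
    virtual void reset(const char* document) = 0;     // Start a new document
    virtual bool moveNext() = 0;                      // Move to the next token
    virtual const std::string& getStem() const = 0;   // Stem of the current token
};

// Hypothetical tokenizer: splits on non-alphanumeric ASCII characters and
// lowercases each token. Lowercasing stands in for a real stemmer.
class SimpleTokenizer : public FtsTokenizer {
public:
    void reset(const char* document) override {
        assert(document != nullptr);
        _doc = document;
        _pos = 0;
        _stem.clear();
    }

    bool moveNext() override {
        // Skip any delimiters before the next token.
        while (_pos < _doc.size() && !isWordChar(_doc[_pos])) {
            ++_pos;
        }
        if (_pos == _doc.size()) {
            return false;  // No tokens left in this document.
        }
        // Collect the token, lowercasing as we go.
        _stem.clear();
        while (_pos < _doc.size() && isWordChar(_doc[_pos])) {
            unsigned char c = static_cast<unsigned char>(_doc[_pos]);
            _stem.push_back(static_cast<char>(std::tolower(c)));
            ++_pos;
        }
        return true;
    }

    const std::string& getStem() const override { return _stem; }

private:
    static bool isWordChar(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    }

    std::string _doc;
    std::size_t _pos = 0;
    std::string _stem;
};

// Callers depend only on the abstract interface, so any tokenizer can be used.
std::vector<std::string> tokenize(FtsTokenizer& tokenizer, const char* document) {
    std::vector<std::string> stems;
    tokenizer.reset(document);
    while (tokenizer.moveNext()) {
        stems.push_back(tokenizer.getStem());
    }
    return stems;
}
```

Because tokenize takes an FtsTokenizer reference, swapping in a different tokenizer requires no change to the calling code, which is the point of step 1 above.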



 Comments   
Comment by Githook User [ 01/Apr/15 ]

Author: Mark Benvenuto (markbenvenuto) <mark.benvenuto@mongodb.com>

Message: SERVER-17520: Add support for FTS Tokenizer stop word filtering
Branch: master
https://github.com/mongodb/mongo/commit/937b2bdc5b85095734a9cc08fccc9a8586e871cd

Comment by Githook User [ 01/Apr/15 ]

Author: Mark Benvenuto (markbenvenuto) <mark.benvenuto@mongodb.com>

Message: SERVER-17520: Add support for pluggable FTS tokenizers
Branch: master
https://github.com/mongodb/mongo/commit/72598f750d732c08c98f5f578bf1335acd78e10e

Comment by Mark Benvenuto [ 30/Mar/15 ]

https://github.com/mongodb/mongo/commit/0bed4262dac849788e6571dc404d5d261b9e1c8c
