Extract/Replace keywords in sentences.
This module can be used to replace keywords in sentences or extract keywords from sentences.
Installation
$ pip install flashtext
Usage
- Extract keywords
>>> from flashtext.keyword import KeywordProcessor
>>> keyword_processor = KeywordProcessor()
>>> keyword_processor.add_keyword('Big Apple', 'New York')
>>> keyword_processor.add_keyword('Bay Area')
>>> keywords_found = keyword_processor.extract_keywords('I love Big Apple and Bay Area.')
>>> keywords_found
['New York', 'Bay Area']
- Replace keywords
>>> keyword_processor.add_keyword('New Delhi', 'NCR region')
>>> new_sentence = keyword_processor.replace_keywords('I love Big Apple and new delhi.')
>>> new_sentence
'I love New York and NCR region.'
- Case Sensitive example
>>> from flashtext.keyword import KeywordProcessor
>>> keyword_processor = KeywordProcessor(case_sensitive=True)
>>> keyword_processor.add_keyword('Big Apple', 'New York')
>>> keyword_processor.add_keyword('Bay Area')
>>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.')
>>> keywords_found
['Bay Area']
- No clean name for Keywords
>>> from flashtext.keyword import KeywordProcessor
>>> keyword_processor = KeywordProcessor()
>>> keyword_processor.add_keyword('Big Apple')
>>> keyword_processor.add_keyword('Bay Area')
>>> keywords_found = keyword_processor.extract_keywords('I love big Apple and Bay Area.')
>>> keywords_found
['Big Apple', 'Bay Area']
API doc
Documentation can be found at FlashText Read the Docs.
Test
$ git clone https://github.com/vi3k6i5/flashtext
$ cd flashtext
$ pip install pytest
$ python setup.py test
Why not Regex?
FlashText uses a custom algorithm based on the Aho-Corasick algorithm and a trie dictionary.
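To make the trie dictionary idea concrete, here is a minimal sketch of case-insensitive keyword extraction over a trie built from nested dicts. It is an illustration only, not flashtext's actual implementation: the class name `MiniKeywordProcessor` and the `_END` marker are hypothetical, and the sketch omits Aho-Corasick failure links entirely.

```python
# Hypothetical sketch: MiniKeywordProcessor is NOT the flashtext implementation.
_END = "_end_"  # multi-character marker key; can never collide with a single character


class MiniKeywordProcessor:
    def __init__(self):
        self.trie = {}  # nested dicts: one level per character

    def add_keyword(self, keyword, clean_name=None):
        node = self.trie
        for ch in keyword.lower():
            node = node.setdefault(ch, {})
        # Store the clean name (or the keyword itself) at the end of the path.
        node[_END] = clean_name or keyword

    def extract_keywords(self, sentence):
        text = sentence.lower()
        found = []
        i, n = 0, len(text)
        while i < n:
            # Only start a match at a word boundary.
            if i > 0 and text[i - 1].isalnum():
                i += 1
                continue
            node, j, longest = self.trie, i, None
            # Walk the trie as far as the text allows, remembering the
            # longest keyword that also ends on a word boundary.
            while j < n and text[j] in node:
                node = node[text[j]]
                j += 1
                if _END in node and (j == n or not text[j].isalnum()):
                    longest = (node[_END], j)
            if longest:
                found.append(longest[0])
                i = longest[1]  # jump past the matched keyword
            else:
                i += 1
        return found


kp = MiniKeywordProcessor()
kp.add_keyword('Big Apple', 'New York')
kp.add_keyword('Bay Area')
print(kp.extract_keywords('I love Big Apple and Bay Area.'))
# ['New York', 'Bay Area']
```

Each lookup follows one trie path character by character instead of testing every keyword in turn, which is the property this kind of approach relies on as the keyword list grows.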
Doing the same keyword search and replacement with regex takes far longer:
Docs count | # Keywords | Regex | flashtext
---|---|---|---
1.5 million | 2K | 16 hours | Not measured
2.5 million | 10K | 15 days | 15 mins
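For comparison, a regex-based version of the same task might look like the sketch below, using only Python's standard `re` module. The `keyword_map` dict and both function names are hypothetical, chosen for illustration; this is not necessarily the setup behind the timings above. Because every keyword becomes one more branch of a single alternation, the pattern gets larger as keywords are added, which is a plausible reason this style slows down with big keyword lists.

```python
import re

# Hypothetical keyword map for illustration: lowercase keyword -> clean name.
keyword_map = {"big apple": "New York", "new delhi": "NCR region"}

# One alternation of all escaped keywords, longest first so that longer
# keywords win over their prefixes; \b keeps matches on word boundaries.
pattern = re.compile(
    r"\b(?:"
    + "|".join(re.escape(k) for k in sorted(keyword_map, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def replace_keywords_regex(sentence):
    # Swap each matched keyword for its clean name.
    return pattern.sub(lambda m: keyword_map[m.group(0).lower()], sentence)


def extract_keywords_regex(sentence):
    # Return the clean names of all keywords found, in order of appearance.
    return [keyword_map[m.group(0).lower()] for m in pattern.finditer(sentence)]


print(replace_keywords_regex('I love Big Apple and new delhi.'))
# I love New York and NCR region.
```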
The idea for this library came from the following StackOverflow question.
Contribute
Issue Tracker: https://github.com/vi3k6i5/flashtext/issues
Source Code: https://github.com/vi3k6i5/flashtext/
License
The project is licensed under the MIT license.