Terrier IR Python API
Pyterrier
Terrier Python API
Installation
pip install python-terrier
Windows
Linux
Colab notebooks
Indexing
Indexing TREC formatted collections
index_path = "/home/alex/Documents/index"
path = "/home/alex/Downloads/books/doc-text.trec"
index_path = createTRECIndex(index_path, path)
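For readers unsure what a TREC formatted collection looks like, the following sketch writes a tiny one to disk. It assumes the common layout in which each document is wrapped in DOC tags with DOCNO and TEXT elements inside. The helper name write_trec_file and the sample documents are hypothetical and are not part of this package.

```python
import tempfile
from pathlib import Path


def write_trec_file(docs, path):
    """Write (docno, text) pairs to `path` in a TREC-style tagged layout.

    Hypothetical helper for illustration only; not part of python-terrier.
    """
    with open(path, "w", encoding="utf-8") as out:
        for docno, text in docs:
            out.write("<DOC>\n")
            out.write(f"<DOCNO>{docno}</DOCNO>\n")
            out.write(f"<TEXT>\n{text}\n</TEXT>\n")
            out.write("</DOC>\n")


# Two made-up documents, written to a temporary directory.
docs = [
    ("1", "He ran out of money"),
    ("2", "The waves were crashing on the shore"),
]
trec_path = Path(tempfile.mkdtemp()) / "doc-text.trec"
write_trec_file(docs, trec_path)

# Read the file back so its layout can be inspected.
trec_content = trec_path.read_text(encoding="utf-8")
print(trec_content)
```

A file produced this way should be usable as the path argument shown above, assuming the indexer is left at its default tag settings.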
Indexing text files
Indexing a pandas dataframe
First, let's create an example dataframe:
import pandas as pd

df = pd.DataFrame({
    'docno': ['1', '2', '3'],
    'url': ['url1', 'url2', 'url3'],
    'text': ['He ran out of money, so he had to stop playing',
             'The waves were crashing on the shore; it was a',
             'The body may perhaps compensates for the loss'],
})
The dataframe can then be indexed in several ways:
index = createDFIndex(index_path, df["text"])
index = createDFIndex(index_path, df["text"], df["docno"])
index = createDFIndex(index_path, df["text"], df["docno"], df["url"])
index = createDFIndex(index_path, df["text"], df)
index = createDFIndex(index_path, df["text"], docno=["1","2","3"])
meta_fields={"docno":["1","2","3"],"url":["url1", "url2", "url3"]}
index = createDFIndex(index_path, df["text"], **meta_fields)
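The last two forms above rely on ordinary Python keyword arguments: unpacking a dict with ** passes each key as a named argument. The sketch below illustrates only that mechanism. The function collect_metadata is a hypothetical stand-in that returns whatever keyword arguments it receives; it is not createDFIndex and makes no claim about how the package handles metadata internally.

```python
def collect_metadata(text, *args, **kwargs):
    """Hypothetical stand-in, NOT the package's createDFIndex.

    It only returns the keyword arguments it was called with, so the two
    call styles can be compared.
    """
    return dict(kwargs)


text = ["doc one", "doc two", "doc three"]
meta_fields = {"docno": ["1", "2", "3"], "url": ["url1", "url2", "url3"]}

# Metadata columns written out as explicit keyword arguments.
explicit = collect_metadata(
    text, docno=["1", "2", "3"], url=["url1", "url2", "url3"]
)

# The same columns supplied by unpacking a dict.
unpacked = collect_metadata(text, **meta_fields)

# Both calls hand the function identical keyword arguments.
same = explicit == unpacked
print(same)
```

Building the dict first is convenient when the set of metadata fields is only known at run time.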
Retrieval
Evaluation