An app to output n-grams from a column in an Excel spreadsheet
Project description
The Excel Ngrams Project
A project to analyse a column of text in an Excel document and return a CSV file with the most common n-grams from that text. The output file is written to the same directory as the input file.
You can choose the maximum n-gram length and the maximum number of results (rows) returned. The app defaults to looking for a column named 'Keyword', but any column name can be passed in as an argument.
The column of terms to analyse must be the longest (or only) column in the document. Otherwise NaN is added as a placeholder in the column's final cells, which will cause errors.
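The NaN padding described above can be reproduced in isolation. The sketch below assumes the spreadsheet is loaded into a pandas DataFrame, which the description does not state explicitly; the 'Volume' column and its values are made up for illustration. It shows how a shorter column ends up with NaN in its final cell, and one possible way to strip those placeholders yourself before analysis. It is not the package's own code.

```python
import pandas as pd

# Two columns of unequal length, as they might appear in a spreadsheet.
# 'Volume' is a hypothetical second column, longer than 'Keyword'.
df = pd.DataFrame(
    {
        "Keyword": pd.Series(["red shoes", "blue shoes"]),
        "Volume": pd.Series([100, 80, 60]),
    }
)

# The shorter 'Keyword' column is padded with NaN in its final cell.
n_missing = int(df["Keyword"].isna().sum())

# Dropping missing values leaves only the real text entries.
terms = df["Keyword"].dropna().tolist()
```

If the text column cannot be made the longest one, copying it into its own sheet or file before running the app avoids the padding altogether.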
Words are tokenised with spaCy and n-grams are generated with NLTK.
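To make the counting step concrete, here is a minimal sketch that uses only the Python standard library. A lowercased whitespace split stands in for spaCy tokenisation, and a zip-based sliding window stands in for NLTK's n-gram helper, so results on real text may differ from the app's. The function names `ngrams` and `top_ngrams` are hypothetical and are not the package's API.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a list of tokens."""
    return list(zip(*(tokens[i:] for i in range(n))))


def top_ngrams(texts, n, max_results):
    """Count n-grams across all texts and return the most common ones."""
    counts = Counter()
    for text in texts:
        # Lowercased whitespace split: a crude stand-in for spaCy tokenisation.
        tokens = text.lower().split()
        counts.update(ngrams(tokens, n))
    return [(" ".join(gram), count) for gram, count in counts.most_common(max_results)]


# Example rows, as they might appear in a 'Keyword' column.
keywords = ["cheap red shoes", "red shoes for men", "cheap red boots"]
bigrams = top_ngrams(keywords, n=2, max_results=2)
```

Here `n` plays the role of the n-gram length and `max_results` the role of the row limit. In this sample, "cheap red" and "red shoes" each occur twice, so they are the two bigrams returned.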
Installation
To install the Excel Ngrams Project, run this command in your terminal:
$ pip install excel-ngrams
Download files
File details
Details for the file excel-ngrams-0.2.0.tar.gz.
File metadata
- Download URL: excel-ngrams-0.2.0.tar.gz
- Size: 6.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.4 CPython/3.9.1 Linux/5.4.0-1039-azure
File hashes
Algorithm | Hash digest
---|---
SHA256 | e63257f8b938b62c296f87c7b1896c32d0a8d2d4793dd318a0d12930ab5942a8
MD5 | 04d2be58dc607ce1b9f5b325402c3f4f
BLAKE2b-256 | 42bf48062a143dbce53dc9ccc4d17a0135d0e10bd9f93de71d0dc3e521a3e6b8
File details
Details for the file excel_ngrams-0.2.0-py3-none-any.whl.
File metadata
- Download URL: excel_ngrams-0.2.0-py3-none-any.whl
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.4 CPython/3.9.1 Linux/5.4.0-1039-azure
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6e9e0f5c62bc75739c2f0bd621dbd3d081fb6ab7867cc8495e73ae1347a6cd2b
MD5 | c0193f7fbc833a1be27c5527e1f95e2e
BLAKE2b-256 | 305c28f0437cb0dc1c60fba5031cca7a04aab62f7596dbc682929efd423630cb
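The published digests can be used to check a downloaded file. The following is a minimal sketch using the standard library's `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the path you pass in is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("excel-ngrams-0.2.0.tar.gz", "e63257f8b938b62c296f87c7b1896c32d0a8d2d4793dd318a0d12930ab5942a8")` should return `True` for an intact copy of the source distribution saved under that name in the current directory.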