Fast and fully local NLP file organizer that organizes files based on their content.
Project description
Connor is a file organizer written in Python. It uses the sentence-transformers framework for the main organization process, applying natural language processing to sort computer files by their textual content. It is fast and runs fully locally.
Installation
Before installing Connor, check that Python and pip are installed on your computer:
python --version
If a message displaying Python's version appears, Python is correctly installed. If an error message appears, go to the Python website to download it.
After installing Python, use pip to install Connor by typing the following command:
pip install connor-nlp
If something doesn't work or you run into problems, head to the official GitHub repository for detailed instructions or open an issue.
Features
Connor works locally on your computer using a pre-trained NLP model, sentence-transformers/paraphrase-MiniLM-L6-v2, to understand the meaning of the data and calculate the cosine similarity between files. The folders are named using topic modeling with the Latent Dirichlet Allocation (LDA) technique.
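As a rough illustration of similarity-based grouping, the sketch below computes cosine similarity between embedding vectors and groups files greedily against a threshold. It is not Connor's actual implementation: the function names (`embed_texts`, `cosine_similarity`, `group_by_similarity`), the greedy strategy, and the threshold value are all assumptions made for this example. Only the model name comes from the description above.

```python
import math
from typing import List, Sequence


def embed_texts(texts: Sequence[str]) -> List[List[float]]:
    """Embed texts with the model named above (downloaded on first use)."""
    # Imported lazily so the rest of the sketch runs without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/paraphrase-MiniLM-L6-v2")
    return model.encode(list(texts)).tolist()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_similarity(
    vectors: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[List[int]]:
    """Greedy grouping (an assumed strategy): each vector joins the first
    group whose first member is at least `threshold` similar to it."""
    groups: List[List[int]] = []
    for index, vector in enumerate(vectors):
        for group in groups:
            if cosine_similarity(vectors[group[0]], vector) >= threshold:
                group.append(index)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([index])
    return groups
```

With real files, `group_by_similarity(embed_texts(contents), threshold=0.5)` would return lists of indices into `contents`, one list per candidate folder.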
File Organization
- Organize files within a selected folder.
- Organize text-based files (.docx, .txt, .pdf, etc.) using NLP.
- Create a separate folder named ‘Miscellaneous’ for dissimilar files and for files that are unprocessable based on their extension.
- Provide a summary (tree structure) of the organization process upon completion.
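The tree summary mentioned in the list above could look something like the output of the following sketch. The `render_tree` helper and its exact layout are hypothetical, and Connor's own summary format may differ.

```python
from pathlib import Path
from typing import List


def render_tree(root: Path, prefix: str = "") -> List[str]:
    """Return the lines of a text tree for everything below `root`."""
    lines: List[str] = []
    # Folders first, then files, each group in case-insensitive name order.
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name.lower()))
    for position, entry in enumerate(entries):
        last = position == len(entries) - 1
        lines.append(f"{prefix}{'└── ' if last else '├── '}{entry.name}")
        if entry.is_dir():
            # Keep the vertical bar only while siblings remain below.
            lines.extend(render_tree(entry, prefix + ("    " if last else "│   ")))
    return lines
```

Calling `print("\n".join(render_tree(Path("organized_folder"))))` prints the structure of a folder; the path used here is a placeholder.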
Customization Options
- Similarity Threshold: You can choose a similarity percentage threshold for grouping similar files.
- Reading Word Limit: You can set a limit on the number of words read from file content.
- Folder Name Word Limit: You can specify a maximum number of words allowed in the created folder names.
- Default Parameters: You can modify these three parameters and save them for future sessions.
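One way to picture these three parameters and their saved defaults is a small settings object persisted as JSON. Everything in this sketch is an assumption for illustration: the `OrganizerSettings` class, its field names, the default values, and the JSON file format are not taken from Connor itself.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Union


@dataclass
class OrganizerSettings:
    # Hypothetical defaults; Connor's real defaults are not stated here.
    similarity_threshold: float = 50.0  # percent
    reading_word_limit: int = 200       # words read per file
    folder_name_word_limit: int = 3     # words per folder name


def save_settings(settings: OrganizerSettings, path: Union[str, Path]) -> None:
    """Write the settings to a JSON file so later sessions can reuse them."""
    Path(path).write_text(json.dumps(asdict(settings), indent=2), encoding="utf-8")


def load_settings(path: Union[str, Path]) -> OrganizerSettings:
    """Load saved settings, falling back to the defaults if no file exists."""
    file = Path(path)
    if not file.exists():
        return OrganizerSettings()
    return OrganizerSettings(**json.loads(file.read_text(encoding="utf-8")))


def limit_words(text: str, max_words: int) -> str:
    """Keep only the first `max_words` whitespace-separated words."""
    return " ".join(text.split()[:max_words])
```

The same `limit_words` helper could serve both word limits: applied to file content before embedding, and to a generated folder name before the folder is created.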
Building From Source
Building from source is useful if you want to use features that are currently in development. To build Connor locally from source, read the instructions here.
Dependencies
Package | Version |
---|---|
docx | >=0.2.4 |
nltk | >=3.9.1 |
numpy | >=2.1.1 |
odfpy | >=1.4.1 |
openpyxl | >=3.1.5 |
PyPDF2 | >=3.0.1 |
scikit_learn | >=1.5.2 |
sentence_transformers | >=3.1.1 |
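To see which of these dependencies are present in the current environment, a short check with the standard library's `importlib.metadata` may help. This is a sketch, not part of Connor: the `installed_versions` helper is hypothetical, and it assumes the names in the table are the distribution names as registered with pip. It reports installed versions without comparing them against the minimums.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Minimum versions copied from the table above.
MINIMUM_VERSIONS = {
    "docx": "0.2.4",
    "nltk": "3.9.1",
    "numpy": "2.1.1",
    "odfpy": "1.4.1",
    "openpyxl": "3.1.5",
    "PyPDF2": "3.0.1",
    "scikit_learn": "1.5.2",
    "sentence_transformers": "3.1.1",
}


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if missing."""
    report: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, found in installed_versions(MINIMUM_VERSIONS).items():
        print(f"{name}: installed={found}, required>={MINIMUM_VERSIONS[name]}")
```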
Hashes for connor_nlp-0.1.5-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | bd33aa408995e307fa6b1bf2772621dcb111e03f1869b50e11f07780ef3206b7 |
MD5 | ed8d5ee693aef5f76983c4a9cce97236 |
BLAKE2b-256 | e4ab9872e130dd65bcf66153d3d15a42c8a2e52ac5f0452f49a46013027e2186 |
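A downloaded wheel can be checked against these digests with Python's `hashlib`. The sketch below is illustrative only: the `file_digests` helper is hypothetical, and it assumes that BLAKE2b-256 means BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path
from typing import Dict, Union


def file_digests(path: Union[str, Path], chunk_size: int = 65536) -> Dict[str, str]:
    """Compute the SHA256, MD5 and BLAKE2b-256 hex digests of a file."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        # Assumption: BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with Path(path).open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

For example, `file_digests("connor_nlp-0.1.5-py3-none-any.whl")["SHA256"]` should equal the SHA256 value listed in the table if the download is intact.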