Fast and fully local NLP file organizer that organizes files based on their content.

Connor

Connor is a fast, fully local file organizer written in Python. It uses natural language processing, built on the sentence-transformers framework, to organize computer files based on their textual content.

Installation

Before installing Connor, check that Python and pip are installed on your computer:

python --version

If a message displaying Python's version appears, Python is correctly installed. If an error message appears instead, go to the Python website to download it.
After installing Python, you can use pip to install Connor. Type the following command:

pip install connor-nlp

If something doesn't work or you run into problems, head to the official GitHub repository for detailed instructions or to open an issue.


Features

Connor works locally on your computer, using a pre-trained NLP model, sentence-transformers/paraphrase-MiniLM-L6-v2, to understand the meaning of the data and calculate cosine similarity between files. The files are organized into groups, and the corresponding folders are appropriately named using topic modeling through the Latent Dirichlet Allocation (LDA) technique. Subsequently, the files are moved to their respective folders.
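
The grouping step described above can be illustrated with a minimal sketch. It is not Connor's actual implementation: the function names, the greedy "compare to the first member of each group" strategy, and the toy 3-dimensional vectors are all assumptions made for illustration. In practice the vectors would be sentence embeddings produced by the model, and the threshold would come from the user's settings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def group_by_similarity(embeddings, threshold=0.5):
    """Greedy grouping (an assumed strategy, not necessarily Connor's):
    a file joins the first group whose first member is similar enough,
    otherwise it starts a new group."""
    groups = []
    for name, vector in embeddings.items():
        for group in groups:
            representative = embeddings[group[0]]
            if cosine_similarity(vector, representative) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy vectors standing in for real sentence embeddings (hypothetical data).
embeddings = {
    "invoice_march.pdf": [0.9, 0.1, 0.0],
    "invoice_april.pdf": [0.8, 0.2, 0.1],
    "recipe.txt": [0.0, 0.1, 0.9],
}
groups = group_by_similarity(embeddings, threshold=0.8)
print(groups)
```

With these toy vectors the two invoices end up in one group and the recipe in its own. Lowering the threshold merges more files into fewer groups; raising it has the opposite effect.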


File Organization Summary

  1. Organize files within a selected folder or manually uploaded files (uploading files is only supported in the GUI).
  2. Organize text-based files (.docx, .txt, .pdf, etc.) using NLP.
  3. Create a separate folder named "Miscellaneous" for files that are dissimilar to the others or that cannot be processed because of their extension.
  4. Provide a summary (tree structure) of the organization process upon completion.
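
The move-and-summarize steps in the list above could look roughly like the following sketch. The helper names (`move_into_folders`, `render_tree`), the `plan` dictionary, and the tree layout are hypothetical and only illustrate the idea; Connor's own output format may differ. The example runs inside a temporary directory so that no real files are touched.

```python
import shutil
import tempfile
from pathlib import Path


def move_into_folders(root, groups, misc_name="Miscellaneous"):
    """Move grouped files into named folders under root.
    Files that belong to no group go into the misc_name folder."""
    root = Path(root)
    grouped = {name for files in groups.values() for name in files}
    leftovers = [p.name for p in root.iterdir()
                 if p.is_file() and p.name not in grouped]
    plan = dict(groups)
    if leftovers:
        plan[misc_name] = plan.get(misc_name, []) + sorted(leftovers)
    for folder, files in plan.items():
        target = root / folder
        target.mkdir(exist_ok=True)
        for name in files:
            shutil.move(str(root / name), str(target / name))
    return plan


def render_tree(plan):
    """Return a simple indented tree of folders and the files moved into them."""
    lines = []
    for folder in sorted(plan):
        lines.append(f"{folder}/")
        for name in sorted(plan[folder]):
            lines.append(f"    {name}")
    return "\n".join(lines)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder files (hypothetical names) to organize.
    for name in ("invoice_march.txt", "invoice_april.txt", "photo.png"):
        (Path(tmp) / name).write_text("placeholder")
    plan = move_into_folders(
        tmp, {"invoices": ["invoice_march.txt", "invoice_april.txt"]}
    )
    summary = render_tree(plan)
    moved_ok = (Path(tmp) / "Miscellaneous" / "photo.png").exists()

print(summary)
```

Returning the plan from the move step keeps the summary independent of the file system, which makes the tree easy to print or log after the files have been moved.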

Customization Options

  1. Similarity Threshold: You can choose the similarity percentage threshold used to group similar files.
  2. Reading Word Limit: You can set a limit on the number of words read from each file's content.
  3. Folder Name Word Limit: You can specify the maximum number of words allowed in the names of created folders.
  4. Default Parameters: You can modify these three parameters and save them for future sessions.
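
As an illustration of how the three parameters above might fit together, here is a small sketch. Everything in it is an assumption: the `Settings` class, its field names and default values, the JSON file name, and the helper functions are hypothetical and do not describe how Connor actually stores its defaults.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Settings:
    # Hypothetical field names and default values, for illustration only.
    similarity_threshold: float = 50.0  # percent
    reading_word_limit: int = 200
    folder_name_word_limit: int = 3


def limit_words(text, max_words):
    """Keep only the first max_words whitespace-separated words of text."""
    return " ".join(text.split()[:max_words])


def save_settings(settings, path):
    """Persist the settings as JSON so a later session can reuse them."""
    Path(path).write_text(json.dumps(asdict(settings), indent=2))


def load_settings(path):
    """Load settings previously written by save_settings."""
    return Settings(**json.loads(Path(path).read_text()))


settings = Settings(similarity_threshold=60.0,
                    reading_word_limit=5,
                    folder_name_word_limit=2)

# The reading limit caps how much of a file's text is analyzed.
excerpt = limit_words("The quarterly invoice for March is attached below",
                      settings.reading_word_limit)
# The folder name limit caps how many topic words form a folder name.
folder_name = limit_words("invoice payment march total",
                          settings.folder_name_word_limit)

with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "connor_settings.json"  # hypothetical file name
    save_settings(settings, config_path)
    restored = load_settings(config_path)

print(excerpt)
print(folder_name)
```

A lower reading word limit generally trades some accuracy for speed, since less text per file has to be embedded.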

Building From Source

Building from source is useful if you want to use features that are still in development. To build Connor locally from source, read the instructions here.


Dependencies

docx >=0.2.4
nltk >=3.9.1
numpy >=2.1.1
odfpy >=1.4.1
openpyxl >=3.1.5
PyPDF2 >=3.0.1
scikit_learn >=1.5.2
sentence_transformers >=3.1.1
