Application categorization tool using rule-based and AI methods
AppCategorizer
A Python library that categorizes software applications automatically using rule-based and Artificial Intelligence (AI) methods.
About
AppCategorizer is a Python package that takes an application name as input and returns its most suitable category. It fetches application data from six sources: Snapcraft, Flathub, Apple Store, GOG, Itch.io, and MyAbandonware. The collected data is then processed with Artificial Intelligence (AI) and Natural Language Processing (NLP) techniques to determine and assign the category.
The project is entirely written in Python.
Features
AppCategorizer offers a robust set of features to streamline the application categorization process:
- Multi-source Data Fetching: Gathers application information from six sources (Snapcraft, Flathub, Apple Store, GOG, Itch.io, and MyAbandonware), giving the categorizer a broad dataset to work from.
- Intelligent Tag Normalization: Cleans and standardizes diverse tags obtained from various data sources, ensuring consistent and high-quality input for the categorization process.
- AI-Powered Categorization: Utilizes Natural Language Processing (NLP) techniques to intelligently analyze application data and assign the most appropriate category.
- Command Line Interface (CLI): Provides a simple and intuitive CLI for quick, on-the-fly application categorization, making it easy to use directly from the terminal.
- Python API: Offers programmatic access, allowing seamless integration into other Python projects, scripts, and automated workflows.
Installation
You can install AppCategorizer directly using pip:
pip install AppCategorizer
Quick Start
Command Line Interface (CLI)
Use the AppCategorizer command directly in your terminal for quick categorization:
# For single-word application names:
AppCategorizer facebook
# Expected Output: social media
# For multi-word application names (enclose in quotes):
AppCategorizer "Google Chrome"
# Expected Output: Web browser
Batch Mode
You can also use the batch mode to categorize multiple applications at once:
AppCategorizer batch input.txt output.csv
Using the Library in Python Code
You can use the AppCategorizer library in your Python code as follows:
from AppCategorizer import process_app, batch_process, load_model
# Single application categorization
classifier = load_model()
app_name = "Firefox"
app_name, main_cat, ai_cat, sub_cats = process_app(app_name, classifier)
print(f"Application: {app_name}")
print(f"Rule-Based Category: {main_cat}")
print(f"AI Category: {ai_cat}")
print(f"Sub-Categories: {', '.join(sub_cats)}")
# Batch categorization
batch_process("input.txt", "output.csv")
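When categorizing several applications from Python, it can help to load the classifier once and reuse it for every call. The helper below is a minimal sketch, not part of AppCategorizer: `categorize_many` is a hypothetical name, and it takes the processing function as an argument so it only relies on the documented `process_app(app_name, classifier)` signature and four-element return value. Whether `process_app` raises on lookup failures is an assumption, so errors are collected rather than propagated.

```python
def categorize_many(names, process, classifier):
    """Call process(name, classifier) for each name and collect the results.

    `process` is expected to behave like AppCategorizer's process_app and
    return (app_name, rule_based_category, ai_category, sub_categories).
    """
    results, failures = {}, {}
    for name in names:
        try:
            _, main_cat, ai_cat, sub_cats = process(name, classifier)
        except Exception as exc:  # assumption: lookups may fail per application
            failures[name] = str(exc)
            continue
        results[name] = {
            "rule_based": main_cat,
            "ai": ai_cat,
            "sub_categories": list(sub_cats),
        }
    return results, failures


# Intended usage with the library (not executed here):
#   from AppCategorizer import process_app, load_model
#   results, failures = categorize_many(["Firefox", "GIMP"], process_app, load_model())
```

Passing the function in also makes the helper easy to exercise with a stub, without network access or a loaded model.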
How it Works
AppCategorizer operates by first fetching relevant application data from a diverse set of online repositories, which includes Snapcraft, Flathub, Apple Store, GOG, Itch.io, and MyAbandonware. Once this raw data is collected, it undergoes an intelligent tag normalization process designed to clean and standardize various tags, ensuring uniformity and reliability. Finally, the normalized data is fed into an Artificial Intelligence model that employs Natural Language Processing (NLP) techniques to accurately analyze the information and assign the most suitable category to the software application.
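The tag normalization step described above can be illustrated with a small sketch. This is not the library's implementation: the function names, the synonym table, and the per-source dictionary layout are all assumptions made for illustration. It lowercases tags, replaces punctuation with spaces, maps a few synonyms to one canonical form, and counts how many sources report each tag.

```python
import re
from collections import Counter

# Hypothetical synonym table; the mapping AppCategorizer actually uses is not documented here.
SYNONYMS = {
    "web browsers": "web browser",
    "browsers": "web browser",
    "browser": "web browser",
    "games": "game",
    "social networking": "social media",
}


def normalize_tag(tag: str) -> str:
    """Lowercase a tag, turn punctuation into spaces, and apply the synonym table."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", tag.lower()).strip()
    return SYNONYMS.get(cleaned, cleaned)


def normalize_tags(tags_by_source: dict) -> Counter:
    """Count how many sources report each normalized tag."""
    counts = Counter()
    for tags in tags_by_source.values():
        # A set keeps one source from counting the same tag twice.
        unique = {normalize_tag(t) for t in tags}
        unique.discard("")  # drop tags that were only punctuation or whitespace
        counts.update(unique)
    return counts


# Illustrative input only; these are not real responses from the listed sources.
raw = {
    "snapcraft": ["Web-Browsers", "Internet"],
    "flathub": ["browser", "  INTERNET "],
    "gog": ["---"],
}
counts = normalize_tags(raw)
```

Counting per source rather than per occurrence means a tag confirmed by several repositories outweighs one repeated within a single listing.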
API Documentation
process_app(app_name, classifier)
Categorizes a single application using the provided classifier.
Parameters:
- app_name: The name of the application to categorize
- classifier: The classifier to use for categorization
Returns: Tuple containing the application name, rule-based category, AI category, and sub-categories
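Because `process_app` returns a plain tuple, callers may find it convenient to give the four fields names. The wrapper below is a sketch under that assumption: `CategoryResult` and its methods are hypothetical helpers, not part of AppCategorizer, and the sample values are illustrative rather than real output.

```python
from typing import NamedTuple, Sequence


class CategoryResult(NamedTuple):
    """Named view over the 4-tuple documented for process_app."""

    app_name: str
    rule_based: str
    ai: str
    sub_categories: tuple

    @classmethod
    def from_tuple(cls, result: Sequence) -> "CategoryResult":
        # Unpack in the documented order: name, rule-based, AI, sub-categories.
        name, main_cat, ai_cat, sub_cats = result
        return cls(name, main_cat, ai_cat, tuple(sub_cats or ()))

    def agrees(self) -> bool:
        """True when the rule-based and AI categories match, ignoring case."""
        left = (self.rule_based or "").strip().lower()
        right = (self.ai or "").strip().lower()
        return left == right


# Illustrative values only, standing in for a real process_app result.
example = CategoryResult.from_tuple(("Firefox", "Web browser", "web browser", ["Internet"]))
```

Comparing the two categories this way gives a quick signal for results worth reviewing by hand when the rule-based and AI methods disagree.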
batch_process(input_file, output_file)
Categorizes a batch of applications from the input file and writes the results to the output file.
Parameters:
- input_file: File containing list of application names
- output_file: Desired output file name
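The input file format is not specified here. The sketch below assumes one application name per line, which is a guess that should be checked against the library before relying on it. `write_app_list` is a hypothetical helper that trims names, drops blanks and case-insensitive duplicates, and writes the list to disk.

```python
from pathlib import Path


def write_app_list(names, path):
    """Write unique, non-empty application names to `path`, one per line.

    Assumption: batch_process reads one application name per line.
    Returns the list of names actually written.
    """
    seen = set()
    kept = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    Path(path).write_text("".join(f"{n}\n" for n in kept), encoding="utf-8")
    return kept


# Intended usage with the library (not executed here):
#   write_app_list(["Firefox", "Google Chrome", "firefox"], "input.txt")
#   batch_process("input.txt", "output.csv")
```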
Contributing
We welcome contributions to AppCategorizer! If you have suggestions for improvements, new features, or bug fixes, please feel free to contact Zain Ramzan.
Download files
File details
Details for the file appcategorizer-1.0.0.tar.gz.
File metadata
- Download URL: appcategorizer-1.0.0.tar.gz
- Upload date:
- Size: 16.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5a3b67ae6915e4686ac10cf7dce106ad0342cfdefdcd867e9c42970d2aba3dd7 |
| MD5 | 48d37fb1a214dd1e8456fa028a2e3a0b |
| BLAKE2b-256 | c7cfa5356390176e4542422dc4525e988ccd987517df8521355966fb3c5ace92 |
File details
Details for the file appcategorizer-1.0.0-py3-none-any.whl.
File metadata
- Download URL: appcategorizer-1.0.0-py3-none-any.whl
- Upload date:
- Size: 21.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9584d9ee8c3e10cc840cda4fcf622f27e07aa52bbb9bb8917c85eb6bcb946173 |
| MD5 | 026b843f090a703b21c3b3c80a9e3a98 |
| BLAKE2b-256 | 4b313be1ade972c10488ca6becc3c0c19ba9a6da29b6b720bcec8e85d93cfc43 |