Application categorization tool using rule-based and AI methods

AppCategorizer

A Python library that automatically categorizes software applications using rule-based matching and Artificial Intelligence.

About

AppCategorizer is a Python package that takes an application name as input and returns its most suitable category. It fetches application data from multiple sources, including Snapcraft, Flathub, Apple Store, GOG, Itch.io, and MyAbandonware, then processes that data with Artificial Intelligence (AI) and Natural Language Processing (NLP) techniques to assign the category.

The project is entirely written in Python.

Features

AppCategorizer offers a robust set of features to streamline the application categorization process:

  • Multi-source Data Fetching: Gathers application information from more than five sources (including Snapcraft, Flathub, Apple Store, GOG, Itch.io, and MyAbandonware), giving the categorizer a broad dataset to work from.
  • Intelligent Tag Normalization: Cleans and standardizes diverse tags obtained from various data sources, ensuring consistent and high-quality input for the categorization process.
  • AI-Powered Categorization: Utilizes Natural Language Processing (NLP) techniques to intelligently analyze application data and assign the most appropriate category.
  • Command Line Interface (CLI): Provides a simple and intuitive CLI for quick, on-the-fly application categorization, making it easy to use directly from the terminal.
  • Python API: Offers programmatic access, allowing seamless integration into other Python projects, scripts, and automated workflows.

Installation

You can install AppCategorizer directly using pip:

pip install AppCategorizer

Quick Start

Command Line Interface (CLI)

Use the AppCategorizer command directly in your terminal for quick categorization:

# For single-word application names:
AppCategorizer facebook
# Expected Output: social media

# For multi-word application names (enclose in quotes):
AppCategorizer "Google Chrome"
# Expected Output: Web browser

Batch Mode

You can also use the batch mode to categorize multiple applications at once:

AppCategorizer batch input.txt output.csv

Using the Library in Python Code

You can use the AppCategorizer library in your Python code as follows:

from AppCategorizer import process_app, batch_process, load_model

# Single application categorization
classifier = load_model()
app_name = "Firefox"
app_name, main_cat, ai_cat, sub_cats = process_app(app_name, classifier)
print(f"Application: {app_name}")
print(f"Rule-Based Category: {main_cat}")
print(f"AI Category: {ai_cat}")
print(f"Sub-Categories: {', '.join(sub_cats)}")

# Batch categorization
batch_process("input.txt", "output.csv")

How it Works

AppCategorizer works in three stages. First, it fetches application data from online repositories, including Snapcraft, Flathub, Apple Store, GOG, Itch.io, and MyAbandonware. Second, it normalizes the collected tags, cleaning and standardizing them so that tags from different sources are comparable. Finally, the normalized data is passed to an AI model that uses Natural Language Processing (NLP) techniques to assign a category. As the Python example above shows, process_app reports a rule-based category alongside the AI category, plus a list of sub-categories.
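To make the tag normalization stage more concrete, here is a minimal sketch of the general idea. It is not AppCategorizer's actual implementation: the SYNONYMS table, normalize_tag, and most_common_category are hypothetical names invented for this illustration, and the sample tags are made up.

```python
from __future__ import annotations

import re
from collections import Counter

# Hypothetical synonym table; the library's real mapping is not documented here.
SYNONYMS = {
    "web browsers": "web browser",
    "games": "game",
}


def normalize_tag(tag: str) -> str:
    """Lowercase a tag, turn separators into spaces, and apply the synonym table."""
    cleaned = re.sub(r"[_\-/]+", " ", tag.strip().lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return SYNONYMS.get(cleaned, cleaned)


def most_common_category(tags_by_source: dict[str, list[str]]) -> str | None:
    """Pick the normalized tag reported by the most sources (ties break alphabetically)."""
    counts: Counter[str] = Counter()
    for tags in tags_by_source.values():
        # A set ensures each source votes at most once per normalized tag.
        counts.update({normalize_tag(t) for t in tags if t.strip()})
    if not counts:
        return None
    best_tag, _ = min(counts.items(), key=lambda item: (-item[1], item[0]))
    return best_tag


# Made-up tags standing in for data fetched from three sources.
raw_tags = {
    "Snapcraft": ["Web-Browsers", "Network"],
    "Flathub": ["web_browser", "Internet"],
    "GOG": [],
}
category = most_common_category(raw_tags)
print(category)
```

A simple vote like this is closer in spirit to a rule-based category than to the AI category; the NLP classifier stage is not reproduced here.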

API Documentation

process_app(app_name, classifier)

Categorizes a single application using the provided classifier.
Parameters:

  • app_name: The name of the application to categorize
  • classifier: The classifier to use for categorization, as returned by load_model()

Returns: Tuple containing the application name, rule-based category, AI category, and sub-categories
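Because the result is a plain tuple, callers have to remember the field order. One possible convenience, sketched below, is to wrap it in a named record. AppCategory and to_record are hypothetical helpers that are not part of the library, and the sample tuple contains made-up values so the sketch runs without AppCategorizer installed.

```python
from typing import NamedTuple, Sequence, Tuple


class AppCategory(NamedTuple):
    """Named view over the 4-tuple described above (hypothetical helper)."""
    app_name: str
    rule_based_category: str
    ai_category: str
    sub_categories: Tuple[str, ...]


def to_record(result: Sequence) -> AppCategory:
    """Convert (name, rule-based category, AI category, sub-categories) to AppCategory."""
    if len(result) != 4:
        raise ValueError(f"expected 4 fields, got {len(result)}")
    app_name, main_cat, ai_cat, sub_cats = result
    return AppCategory(app_name, main_cat, ai_cat, tuple(sub_cats))


# Stand-in tuple with made-up values, not real library output.
sample = ("Firefox", "web browser", "web browser", ["internet", "network"])
record = to_record(sample)
print(record.app_name, record.ai_category, record.sub_categories)
```

With the library installed, the same helper could presumably be applied directly to the return value, for example to_record(process_app("Firefox", load_model())).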

batch_process(input_file, output_file)

Categorizes a batch of applications from the input file and writes the results to the output file.
Parameters:

  • input_file: File containing list of application names
  • output_file: Desired output file name
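The file formats are not spelled out above, so the following sketch rests on two assumptions: that the input file holds one application name per line, and that the output is ordinary CSV whose column layout is unknown. write_app_list and read_results are hypothetical helpers, and the CSV row written in the demo is a made-up stand-in rather than real batch_process output.

```python
import csv
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_app_list(names: Iterable[str], path: str) -> int:
    """Write unique, non-empty names one per line (assumed input format)."""
    seen_keys = set()
    kept = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.casefold()  # treat "Firefox" and "firefox" as duplicates
        if cleaned and key not in seen_keys:
            seen_keys.add(key)
            kept.append(cleaned)
    Path(path).write_text("".join(f"{n}\n" for n in kept), encoding="utf-8")
    return len(kept)


def read_results(path: str) -> List[List[str]]:
    """Read a CSV file into rows of strings without assuming any column names."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [row for row in csv.reader(handle) if row]


with tempfile.TemporaryDirectory() as tmp:
    input_path = str(Path(tmp) / "input.txt")
    count = write_app_list(["Firefox", "Google Chrome", "firefox ", ""], input_path)
    lines = Path(input_path).read_text(encoding="utf-8").splitlines()

    # Stand-in for a file produced by batch_process; the row is made up.
    output_path = str(Path(tmp) / "output.csv")
    Path(output_path).write_text("Firefox,web browser\n", encoding="utf-8")
    rows = read_results(output_path)

print(count, lines, rows)
```

In real use, the step between the two helpers would be the documented call batch_process("input.txt", "output.csv").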

Contributing

We welcome contributions to AppCategorizer! If you have suggestions for improvements, new features, or bug fixes, please contact Zain Ramzan.
