Create your private AI model with no training data or GPUs 🤖🚀.
Artifex
Create Task-Specific SLMs • No training data needed • No GPU needed • CPU Inference & Fine-Tuning
Artifex is a Python library for:
- Using pre-trained task-specific Small Language Models on CPU
- Fine-tuning them on CPU without any training data, using only your instructions for the task at hand.
How is it possible?
Artifex generates synthetic training data on-the-fly based on your instructions, and uses this data to fine-tune Small Language Models for your specific task. This approach allows you to create effective models without the need for large labeled datasets.
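The idea can be illustrated with a toy sketch. This is not Artifex's implementation: the template-based generator below stands in for whatever generator the library uses, and the token-count classifier stands in for a fine-tuned Small Language Model. All names (`INSTRUCTIONS`, `TEMPLATES`, `generate_synthetic_examples`, `train_keyword_classifier`, `classify`) are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical task instructions: class name -> seed phrases describing it.
INSTRUCTIONS = {
    "sports": ["football match", "tennis tournament", "marathon training"],
    "technology": ["new smartphone", "software update", "AI chip"],
}

# Hypothetical sentence templates used to turn phrases into training texts.
TEMPLATES = [
    "What do you think about the {phrase}?",
    "Tell me more about the {phrase}.",
    "I just read an article on the {phrase}.",
]


def generate_synthetic_examples(instructions):
    """Build (text, label) pairs from instructions alone (toy stand-in for a generator model)."""
    examples = []
    for label, phrases in instructions.items():
        for template, phrase in product(TEMPLATES, phrases):
            examples.append((template.format(phrase=phrase), label))
    return examples


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [tok.strip(".,?!").lower() for tok in text.split()]


def train_keyword_classifier(examples):
    """Count how often each token appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(tokenize(text))
    return counts


def classify(counts, text):
    """Pick the label whose training tokens overlap most with the input."""
    tokens = tokenize(text)
    scores = {label: sum(c[tok] for tok in tokens) for label, c in counts.items()}
    return max(scores, key=scores.get)


synthetic = generate_synthetic_examples(INSTRUCTIONS)  # no hand-labeled data involved
model = train_keyword_classifier(synthetic)
print(classify(model, "Is the software update worth installing?"))  # technology
```

The point of the sketch is the data flow: the only human input is the task description, and every training example is derived from it.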
At this time, we support 10 tasks. Every model runs and can be fine-tuned on CPU, and every task except Text Classification ships with a default model that works out-of-the-box.
| Task | Description | Default Model | Size | Code Examples |
|---|---|---|---|---|
| Text Classification | Classifies text into user-defined categories. | No default model — must be trained | 0.1B params, 470MB | Examples |
| Guardrail | Flags unsafe, harmful, or off-topic messages. | tanaos/tanaos-guardrail-v1 | 0.1B params, 500MB | Examples |
| Intent Classification | Classifies user messages into predefined intent categories. | tanaos/tanaos-intent-classifier-v1 | 0.1B params, 500MB | Examples |
| Reranker | Ranks a list of items or search results based on relevance to a query. | cross-encoder/mmarco-mMiniLMv2-L12-H384-v1 | 0.1B params, 470MB | Examples |
| Sentiment Analysis | Determines the sentiment (positive, negative, neutral) of a given text. | tanaos/tanaos-sentiment-analysis-v1 | 0.1B params, 470MB | Examples |
| Emotion Detection | Identifies the emotion expressed in a given text. | tanaos/tanaos-emotion-detection-v1 | 0.1B params, 470MB | Examples |
| Named Entity Recognition | Detects and classifies named entities in text (e.g., persons, organizations, locations). | tanaos/tanaos-NER-v1 | 0.1B params, 500MB | Examples |
| Text Anonymization | Removes personally identifiable information (PII) from text. | tanaos/tanaos-text-anonymizer-v1 | 0.1B params, 500MB | Examples |
| Spam Detection | Identifies whether a message is spam or not. | tanaos/tanaos-spam-detection-v1 | 0.1B params, 500MB | Examples |
| Topic Classification | Classifies text into predefined topics. | tanaos/tanaos-topic-classification-v1 | 0.1B params, 500MB | Examples |
For each model, Artifex provides three easy-to-use APIs:
- Inference API to use a default, pre-trained Small Language Model to perform that task out-of-the-box locally on CPU.
- Fine-tune API to fine-tune the default model based on your requirements, without any training data and on CPU. The fine-tuned model is generated on your machine and is yours to keep.
- Load API to load your fine-tuned model locally on CPU, and use it for inference or further fine-tuning.
We will be adding more tasks soon, based on user feedback. Want Artifex to perform a specific task? Suggest one or vote one up.
Use Cases & Tutorials
- Cut your chatbot costs and latency by 40% by using a small, self-hosted Guardrail model.
- Analyze your users' sentiment without sending their data to third-party servers.
- Anonymize user data locally and stay GDPR-compliant.
Quick Start
Install Artifex with:
pip install artifex
Text Classification model
Create & use a custom Text Classification model
Train your own text classification model, use it locally on CPU and keep it forever:
from artifex import Artifex
model_output_path = "./output_model/"
text_classification = Artifex().text_classification
text_classification.train(
    domain="chatbot conversations",
    classes={
        "politics": "Messages related to political topics and discussions.",
        "sports": "Messages related to sports events and activities.",
        "technology": "Messages about technology, gadgets, and software.",
        "entertainment": "Messages about movies, music, and other entertainment forms.",
        "health": "Messages related to health, wellness, and medical topics.",
    },
    output_path=model_output_path
)
text_classification.load(model_output_path)
print(text_classification("What do you think about the latest AI advancements?"))
# >>> [{'label': 'technology', 'score': 0.9913}]
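In practice you may want to treat low-confidence predictions as "no match". The helper below is a minimal sketch that works on the list-of-dicts output format shown above. The function name `top_label`, the `0.8` threshold, and the `"unknown"` fallback are illustrative assumptions, not part of Artifex.

```python
def top_label(predictions, min_score=0.8, fallback="unknown"):
    """Return the best label, or `fallback` when the model is not confident enough."""
    if not predictions:
        return fallback
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else fallback


# Sample inputs in the output format shown above (the second one is made up).
print(top_label([{"label": "technology", "score": 0.9913}]))  # technology
print(top_label([{"label": "sports", "score": 0.41}]))        # unknown
```

Tune the threshold on real messages from your own domain rather than relying on the value used here.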
Guardrail model
Use the default Guardrail model
Use Artifex's default guardrail model, which is trained to flag unsafe or harmful messages out-of-the-box:
from artifex import Artifex
guardrail = Artifex().guardrail
print(guardrail("How do I make a bomb?"))
# >>> [{'label': 'unsafe', 'score': 0.9976}]
Learn more about the default guardrail model and what it considers safe vs unsafe on our Guardrail HF model page.
Create & use a custom Guardrail model
Need more control over what is considered safe vs unsafe? Fine-tune your own guardrail model, use it locally on CPU and keep it forever:
from artifex import Artifex
guardrail = Artifex().guardrail
model_output_path = "./output_model/"
guardrail.train(
    unsafe_content=[
        "Discussing a competitor's products or services.",
        "Sharing our employees' personal information.",
        "Providing instructions for illegal activities.",
    ],
    output_path=model_output_path
)
guardrail.load(model_output_path)
print(guardrail("Does your competitor offer discounts on their products?"))
# >>> [{'label': 'unsafe', 'score': 0.9970}]
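A guardrail is typically placed in front of a more expensive chatbot call. The sketch below shows one way to wire that up, assuming only that the guardrail is a callable returning the `[{'label': ..., 'score': ...}]` format shown above. `guarded_reply`, `fake_guardrail`, and `fake_chatbot` are hypothetical names, the stand-ins replace a real model so the example runs on its own, and the `"safe"` label used by the stand-in is an assumption.

```python
from typing import Callable, Dict, List

Prediction = List[Dict[str, object]]


def guarded_reply(
    message: str,
    guardrail: Callable[[str], Prediction],
    answer: Callable[[str], str],
    refusal: str = "Sorry, I can't help with that.",
) -> str:
    """Run the guardrail first; call the answer function only for messages not flagged unsafe."""
    verdict = guardrail(message)[0]
    if verdict["label"] == "unsafe":
        return refusal
    return answer(message)


# Stand-ins so the sketch runs without a trained model.
def fake_guardrail(message: str) -> Prediction:
    label = "unsafe" if "competitor" in message.lower() else "safe"
    return [{"label": label, "score": 0.99}]


def fake_chatbot(message: str) -> str:
    return f"Answering: {message}"


print(guarded_reply("Does your competitor offer discounts?", fake_guardrail, fake_chatbot))
print(guarded_reply("What are your opening hours?", fake_guardrail, fake_chatbot))
```

With a real model, pass the loaded guardrail object in place of `fake_guardrail`.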
Reranker model
Use the default Reranker model
Use Artifex's default reranker model, which is trained to rank items based on relevance out-of-the-box:
from artifex import Artifex
reranker = Artifex().reranker
print(reranker(
    query="Best programming language for data science",
    documents=[
        "Java is a versatile language typically used for building large-scale applications.",
        "Python is widely used for data science due to its simplicity and extensive libraries.",
        "JavaScript is primarily used for web development.",
    ]
))
# >>> [('Python is widely used for data science due to its simplicity and extensive libraries.', 3.8346), ('Java is a versatile language typically used for building large-scale applications.', -0.8301), ('JavaScript is primarily used for web development.', -1.3784)]
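The reranker returns `(document, score)` tuples, so downstream code usually keeps only the best few. The helper below is a minimal sketch of that step. `top_k` is a hypothetical name, and the `0.0` cut-off is an assumption: the scores above look like unbounded relevance scores, so pick a threshold that suits your own data.

```python
def top_k(ranked, k=1, min_score=0.0):
    """Keep at most k documents whose relevance score is at least min_score."""
    kept = [(doc, score) for doc, score in ranked if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in kept[:k]]


# The output printed above, reused as sample input.
ranked = [
    ("Python is widely used for data science due to its simplicity and extensive libraries.", 3.8346),
    ("Java is a versatile language typically used for building large-scale applications.", -0.8301),
    ("JavaScript is primarily used for web development.", -1.3784),
]

# Only the Python document clears the 0.0 threshold, so one item is returned even with k=2.
print(top_k(ranked, k=2))
```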
Create & use a custom Reranker model
Want to fine-tune the Reranker model on a specific domain for better accuracy? Fine-tune your own reranker model, use it locally on CPU and keep it forever:
from artifex import Artifex
reranker = Artifex().reranker
model_output_path = "./output_model/"
reranker.train(
    domain="e-commerce product search",
    output_path=model_output_path
)
reranker.load(model_output_path)
print(reranker(
    query="Laptop with long battery life",
    documents=[
        "A powerful gaming laptop with high-end graphics and performance.",
        "An affordable laptop suitable for basic tasks and web browsing.",
        "This laptop features a battery life of up to 12 hours, perfect for all-day use.",
    ]
))
# >>> [('This laptop features a battery life of up to 12 hours, perfect for all-day use.', 4.7381), ('A powerful gaming laptop with high-end graphics and performance.', -1.8824), ('An affordable laptop suitable for basic tasks and web browsing.', -2.7585)]
Other Tasks
For more details and examples on how to use Artifex for the other available tasks, check out our Documentation.
Contributing
Contributions are welcome! Whether it's a new task module, improvement, or bug fix, we’d love your help. To get started, install the repository locally with:
git clone https://github.com/tanaos/artifex.git
cd artifex
pip install -r requirements.txt
Once you have the code set up, you can start working on any open issue or create a new one. To contribute code, please follow the standard fork --> push --> pull request workflow. All pull requests should be made against the development branch. The maintainers will merge development into master once development is stable.
Before making a contribution, please review the CONTRIBUTING.md and CLA.md, which include important guidelines for contributing to the project.
Not ready to contribute code? You can also help by suggesting a new task or voting up any suggestion.
FAQs
- Why are Guardrail, Intent Classification, Emotion Detection, Sentiment Analysis etc. separate tasks when there is already a Text Classification task?
The Text Classification task is general-purpose: it lets users create custom classification models with categories of their own. Guardrail, Intent Classification, Emotion Detection, Sentiment Analysis etc. are specialized tasks with pre-defined categories and behaviors that are common across applications. They are provided separately for two reasons: first, convenience (users can use these models right away without defining their own categories); second, performance (a specialized model typically performs better than a general Text Classification model trained on the same categories).
Documentation & Support
- Full documentation: https://docs.tanaos.com/artifex
- Get in touch: info@tanaos.com