A Python package to map your own CSV file data using Atlas from Nomic
Project description
This is a very simple way to map your text data with Atlas from Nomic, using the click library.
You need to create a Nomic account to get an API key.
<< Atlas enables you to:
- Store, update and organize multi-million point datasets of unstructured text, images and embeddings.
- Visually interact with your datasets from a web browser.
- Run semantic search and vector operations over your datasets.
Use Atlas to:
- Visualize, interact, collaborate and share large datasets of text and embeddings.
- Collaboratively clean, tag and label your datasets
- Build high-availability apps powered by semantic search
- Understand and debug the latent space of your AI model trains >>
How to use
Installation
To install the necessary dependencies, run the following commands:
python -m venv mymapenv
source mymapenv/bin/activate
pip install --upgrade pip
pip install text2mapviewer
Log in to the Nomic server
Log in to or create your Nomic account:
nomic login
If you already have an account:
nomic login [YOUR_API_TOKEN_NOMIC_HERE]
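The same login can presumably be done from Python. The sketch below assumes the `login(token)` function of the `nomic` client and reads the token from an environment variable. The variable name `NOMIC_API_KEY` and the helper names `read_token` and `login_from_env` are choices made for this illustration, not part of text2mapviewer.

```python
import os


def read_token(env_var: str = "NOMIC_API_KEY") -> str:
    """Return the API token stored in env_var, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Nomic API token first.")
    return token


def login_from_env(env_var: str = "NOMIC_API_KEY") -> None:
    """Log in to Nomic with the token found in the environment."""
    # Imported lazily so read_token() works even without the client installed.
    import nomic

    nomic.login(read_token(env_var))
```

Keeping the token in the environment avoids pasting it into scripts or shell history.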
Examples:
From Nomic, using the text2mapviewer library
from text2mapviewer.examples.map_embedding import project
# Use the project from the text2mapviewer library
print(project)
With the click library
After cloning this repo:
python scr/text2mapviewer/examples/map_embedding_click.py --num_embeddings 10000 --embedding_dim 256
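To picture what the two options presumably control, here is a minimal sketch that builds `num_embeddings` random vectors of length `embedding_dim` using only the standard library. It is an assumption about the example script, not its actual code: the real script most likely builds a NumPy array and then sends it to Atlas. The function name `random_embeddings` is hypothetical.

```python
import random


def random_embeddings(num_embeddings: int, embedding_dim: int, seed: int = 0):
    """Return num_embeddings rows, each with embedding_dim floats in [0, 1)."""
    rng = random.Random(seed)  # seeded so repeated runs give the same points
    return [
        [rng.random() for _ in range(embedding_dim)]
        for _ in range(num_embeddings)
    ]


# A small stand-in for --num_embeddings 10000 --embedding_dim 256.
embeddings = random_embeddings(num_embeddings=100, embedding_dim=8)
print(len(embeddings), len(embeddings[0]))  # number of rows, then row length
```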
The Animation Output
Supported Transformer Models from Hugging Face
This project supports a variety of transformer models, including models from the Hugging Face Model Hub and sentence-transformers. Below are some examples:
- Hugging Face Model: 'prajjwal1/bert-mini'
- Hugging Face Model: 'Sahajtomar/french_semantic' (French version for semantic search embedding)
- Sentence-Transformers Model: 'sentence-transformers/all-MiniLM-L6-v2'
etc.
Please ensure that the model you choose is compatible with the project requirements and adjust the --transformer_model_name option accordingly.
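As a rough illustration of how a model name and a batch size might be used together, the sketch below loads a model through sentence-transformers and encodes texts in batches. `load_encoder` and `embed_texts` are hypothetical helpers, not functions of text2mapviewer. Only `SentenceTransformer(...)` and its `encode(..., batch_size=...)` method come from the sentence-transformers library.

```python
from typing import Optional, Sequence


def load_encoder(model_name: str, cache_dir: Optional[str] = None):
    """Load a model by its Hugging Face name (downloaded on first use)."""
    # Imported lazily because the library is heavy and optional here.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_name, cache_folder=cache_dir)


def embed_texts(encoder, texts: Sequence[str], batch_size: int = 32):
    """Encode texts with any object exposing encode(texts, batch_size=...)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return encoder.encode(list(texts), batch_size=batch_size)
```

For example, `embed_texts(load_encoder('sentence-transformers/all-MiniLM-L6-v2'), ['hello', 'bonjour'])` should return one vector per input text, assuming the model can be downloaded.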
To map your text/csv files
pip install -r requirements.txt
python main.py --transformer-model-name MODEL_NAME --cache_dir CACHE_DIR --batch-size BATCH_SIZE --file-path FILE_PATH
NOTE: for CACHE_DIR, you can set it up like this:
export TRANSFORMERS_CACHE=/path_to_your/transformers_cache
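The same setting can also be applied from Python, and the input file can be read in batches before embedding. The sketch below is an assumption about how such a pipeline could look, not the code of main.py. `configure_cache` and `iter_text_batches` are hypothetical helpers, and the column name `text` is a placeholder for whatever column your CSV uses.

```python
import csv
import os
from typing import Iterator, List


def configure_cache(cache_dir: str) -> str:
    """Create cache_dir and point TRANSFORMERS_CACHE at it.

    Call this before importing transformers so the setting is picked up.
    """
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["TRANSFORMERS_CACHE"] = cache_dir
    return cache_dir


def iter_text_batches(file_path: str, batch_size: int,
                      column: str = "text") -> Iterator[List[str]]:
    """Yield lists of up to batch_size values taken from one CSV column."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    with open(file_path, newline="", encoding="utf-8") as handle:
        batch: List[str] = []
        for row in csv.DictReader(handle):
            batch.append(row[column])
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # final batch, possibly shorter than batch_size
            yield batch
```

Reading the file lazily keeps memory use bounded by the batch size rather than by the size of the CSV.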
Give feedback.
Hashes for text2mapviewer-0.2.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 07239f079beeaf95d9e6697a98322319fa02b075db105a9f2f02a2595ae66f1d |
MD5 | ff4a4d82a9310f1937cd5db233bd1d4b |
BLAKE2b-256 | c13ade12f0db582b3c75ae7815e67c170170b74339e8bcc4eefe2a0656014264 |