

Hugging-Mapper

A lightweight Python tool for easy text similarity scoring using Hugging Face models


Installation

pip install hugging-mapper

Features

  • Easily compare how similar two pieces of text are
  • Customizable model selection at initialization
  • Works with Hugging Face models that create sentence embeddings
  • Batch scoring for lists of sentence pairs
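The batch scoring feature listed above is not demonstrated in the usage section, so the following is only a rough sketch of the idea, not the library's implementation. It assumes similarity is measured with cosine similarity (the source does not name the metric), and `score_pairs`, `cosine_similarity`, and `toy_embed` are hypothetical names introduced for illustration. `toy_embed` is a letter-count stand-in so the sketch runs without downloading a model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def score_pairs(
    pairs: Sequence[Tuple[str, str]],
    embed: Callable[[str], Vector],
) -> List[float]:
    """Embed both sides of every pair and return one score per pair."""
    return [cosine_similarity(embed(left), embed(right)) for left, right in pairs]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model: letter counts a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


scores = score_pairs([("gene", "gene"), ("abc", "xyz")], toy_embed)
```

With a real model, any callable that maps a string to a one-dimensional numeric vector could be passed as `embed`. Whether the output of `embed_text` fits that shape directly is an assumption to verify against the documentation.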

Usage

Embedding text using Hugging Face models

from hugger.mapper import HuggingMapper

# Initialize the mapper. If model_name is omitted, the default is
# 'cambridgeltl/SapBERT-from-PubMedBERT-fulltext'.
mapper = HuggingMapper(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Generate an embedding for a single piece of text
embedding = mapper.embed_text("I hope you'll find this helpful.")

Similarity search over your own data

from hugger.mapper import NodeMapper
import pandas as pd

# demo data
data = pd.DataFrame({
    "id": ["node1", "node2", "node3"], 
    "text": ["Disease", "Gene", "Drug"]
})

# Generate embeddings for the data using the default Hugging Face model
node_mapper = NodeMapper(data)

# Get the most similar entries.
# A threshold of 0 returns all rows, sorted by similarity to the given term.
most_similar = node_mapper.get_similar("protein", threshold=0)

# Get the matching node (ID and metadata) for a term
node_id, metadata = node_mapper.get_match("genetics", threshold=0.7)
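To make the role of `threshold` concrete, here is a minimal sketch of threshold-based ranking and matching over precomputed similarity scores. It is not the library's code: `rank_by_similarity` and `best_match` are hypothetical helpers, the scores are made-up numbers, and two details are assumptions, namely that the comparison is inclusive (`>=`) and that a failed match yields `None`.

```python
from typing import Dict, List, Optional, Tuple


def rank_by_similarity(
    scores: Dict[str, float], threshold: float
) -> List[Tuple[str, float]]:
    """Keep entries scoring at or above the threshold, highest score first."""
    kept = [(node_id, s) for node_id, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


def best_match(
    scores: Dict[str, float], threshold: float
) -> Optional[Tuple[str, float]]:
    """Return the top-ranked entry, or None when nothing clears the threshold."""
    ranked = rank_by_similarity(scores, threshold)
    return ranked[0] if ranked else None


# Illustrative, made-up similarity scores for three nodes
example_scores = {"node1": 0.42, "node2": 0.81, "node3": 0.15}

everything = rank_by_similarity(example_scores, threshold=0.0)
strict = best_match(example_scores, threshold=0.7)
nothing = best_match(example_scores, threshold=0.9)
```

A low threshold is useful for inspecting the full ranking, while a higher one such as 0.7 limits results to confident matches. How `get_match` behaves when no node clears the threshold is not stated here and should be checked in the documentation.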

Documentation

Tutorials and documentation are available on Read the Docs.

License

This project is licensed under the MIT License.
