LangChain integration for modal-gpu-ez: serverless GPU inference with HuggingFace models

langchain-modal-gpu-ez

An integration package that makes modal-gpu-ez usable from LangChain.

Use Modal serverless GPUs and HuggingFace models from LangChain chains, agents, and RAG pipelines in a single line.

Installation

pip install langchain-modal-gpu-ez

Usage

LLM (text generation)

from langchain_modal_gpu_ez import ModalGpuEzLLM

llm = ModalGpuEzLLM(model_id="distilgpt2", gpu="T4")
result = llm.invoke("Once upon a time")
print(result)

Using in a LangChain chain

from langchain_core.prompts import PromptTemplate
from langchain_modal_gpu_ez import ModalGpuEzLLM

llm = ModalGpuEzLLM(model_id="distilgpt2", gpu="T4", max_new_tokens=100)
prompt = PromptTemplate.from_template("Tell me about {topic}")
chain = prompt | llm

result = chain.invoke({"topic": "open source"})

Embeddings

from langchain_modal_gpu_ez import ModalGpuEzEmbeddings

embeddings = ModalGpuEzEmbeddings(
    model_id="BAAI/bge-small-en-v1.5",
    gpu="Local",  # embeddings are fast enough even when run locally
)

vectors = embeddings.embed_documents(["hello world", "test sentence"])
query_vector = embeddings.embed_query("search query")
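
The vectors returned by embed_documents and embed_query are usually compared with cosine similarity. The following is a minimal standard-library sketch, assuming each vector is a plain list of floats (the usual LangChain Embeddings contract). The helper names cosine_similarity and rank_by_similarity are illustrative and are not part of this package.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, vectors):
    """Return indices into `vectors`, ordered from most to least similar."""
    scores = [cosine_similarity(query_vector, v) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
```

With the variables from the example above, rank_by_similarity(query_vector, vectors) would give the order in which the two documents match the query.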

Using with a vector store

from langchain_community.vectorstores import FAISS
from langchain_modal_gpu_ez import ModalGpuEzEmbeddings

embeddings = ModalGpuEzEmbeddings()
vectorstore = FAISS.from_texts(
    ["Python is great", "LangChain is powerful"],
    embeddings,
)

docs = vectorstore.similarity_search("programming language")

GPU selection

GPU      VRAM   Price per hour (USD)
T4       16GB   $0.27
L4       24GB   $0.59
A10G     24GB   $0.54
A100     40GB   $1.64
H100     80GB   $3.89
"Local"  -      Free
"auto"   -      Selected automatically based on model size
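
To make the table concrete, here is a small sketch that estimates the cost of a run and picks the cheapest GPU with enough VRAM. The VRAM and price figures are copied from the table. Everything else is an assumption: the function names are hypothetical, billing is assumed to be proportional to run time, and the selection rule is only one plausible reading of "auto", not the rule the package actually implements.

```python
# (VRAM in GB, USD per hour), copied from the GPU table.
GPUS = {
    "T4": (16, 0.27),
    "L4": (24, 0.59),
    "A10G": (24, 0.54),
    "A100": (40, 1.64),
    "H100": (80, 3.89),
}


def estimate_cost(gpu, seconds):
    """Estimated cost in USD, assuming pro-rata billing by run time."""
    if gpu == "Local":
        return 0.0  # local execution is listed as free
    _vram_gb, price_per_hour = GPUS[gpu]
    return price_per_hour * seconds / 3600


def cheapest_gpu_for(required_vram_gb):
    """Hypothetical selection rule: the cheapest GPU whose VRAM is large enough."""
    candidates = [
        (price, name)
        for name, (vram_gb, price) in GPUS.items()
        if vram_gb >= required_vram_gb
    ]
    if not candidates:
        raise ValueError(f"no listed GPU has {required_vram_gb} GB of VRAM")
    return min(candidates)[1]
```

Under these assumptions, a 10-minute job on a T4 comes to roughly $0.045, and a model needing 20 GB lands on the A10G rather than the L4 because both have 24GB and the A10G is cheaper per hour.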

Environment variables

MODAL_TOKEN_ID=your_modal_token_id
MODAL_TOKEN_SECRET=your_modal_token_secret
HF_TOKEN=your_hf_token  # required for gated models
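
A quick pre-flight check can catch missing credentials before a remote run starts. This is a minimal sketch using only the standard library. The function name missing_env_vars is hypothetical, and treating the two Modal variables as required and HF_TOKEN as optional is an assumption based on the comment above, not on documented behavior of the package.

```python
import os

# Assumed split: Modal credentials are needed for remote GPUs,
# HF_TOKEN only for gated HuggingFace models.
REQUIRED = ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")
OPTIONAL = ("HF_TOKEN",)


def missing_env_vars(env=None, names=REQUIRED):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.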

License

MIT
