LangChain integration for modal-gpu-ez: serverless GPU inference with HuggingFace models
langchain-modal-gpu-ez
An integration package that makes modal-gpu-ez usable from LangChain.
It lets you use Modal serverless GPUs and HuggingFace models in LangChain chains, agents, and RAG pipelines with a one-liner.
Installation
pip install langchain-modal-gpu-ez
Usage
LLM (text generation)
from langchain_modal_gpu_ez import ModalGpuEzLLM
llm = ModalGpuEzLLM(model_id="distilgpt2", gpu="T4")
result = llm.invoke("Once upon a time")
print(result)
Using it in a LangChain chain
from langchain_core.prompts import PromptTemplate
from langchain_modal_gpu_ez import ModalGpuEzLLM
llm = ModalGpuEzLLM(model_id="distilgpt2", gpu="T4", max_new_tokens=100)
prompt = PromptTemplate.from_template("Tell me about {topic}")
chain = prompt | llm
result = chain.invoke({"topic": "open source"})
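To make the `prompt | llm` composition above more concrete, here is a toy sketch of how a pipe operator can chain two steps. The `Step` class and `fake_llm` are hypothetical stand-ins written for this illustration; they are not LangChain or langchain-modal-gpu-ez classes, and no GPU or model is involved.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    """Hypothetical stand-in for a LangChain runnable (not the real class)."""

    func: Callable[[Any], Any]

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Toy prompt step: fills the template from a dict of variables.
prompt = Step(lambda variables: "Tell me about {topic}".format(**variables))

# Toy model step: echoes the prompt instead of calling a GPU-backed model.
fake_llm = Step(lambda text: f"[generated text for: {text}]")

chain = prompt | fake_llm
print(chain.invoke({"topic": "open source"}))
```

In the real package, `ModalGpuEzLLM` would presumably take the place of `fake_llm`, with the formatted prompt passed to the model as input.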
Embeddings
from langchain_modal_gpu_ez import ModalGpuEzEmbeddings
embeddings = ModalGpuEzEmbeddings(
    model_id="BAAI/bge-small-en-v1.5",
    gpu="Local",  # embeddings are fast enough when run locally
)
vectors = embeddings.embed_documents(["hello world", "test sentence"])
query_vector = embeddings.embed_query("search query")
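Once you have document vectors and a query vector, a common next step is to rank the documents by cosine similarity. The sketch below uses made-up 3-dimensional vectors in place of real model output (actual embeddings are much longer), and the helper names `cosine_similarity` and `rank_by_similarity` are illustrative, not part of the package.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], docs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query, doc) for doc in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Made-up vectors standing in for embed_documents / embed_query results.
toy_vectors = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0]]
toy_query = [0.0, 1.0, 0.0]

print(rank_by_similarity(toy_query, toy_vectors))
```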
Using it with a vector store
from langchain_community.vectorstores import FAISS
from langchain_modal_gpu_ez import ModalGpuEzEmbeddings
embeddings = ModalGpuEzEmbeddings()
vectorstore = FAISS.from_texts(
    ["Python is great", "LangChain is powerful"],
    embeddings,
)
docs = vectorstore.similarity_search("programming language")
GPU selection
| GPU | VRAM | Price per hour |
|---|---|---|
| T4 | 16GB | $0.27 |
| L4 | 24GB | $0.59 |
| A10G | 24GB | $0.54 |
| A100 | 40GB | $1.64 |
| H100 | 80GB | $3.89 |
| "Local" | - | Free |
| "auto" | - | Selected automatically based on model size |
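The README does not say how "auto" maps model size to a GPU. As a rough illustration only, the sketch below picks the cheapest GPU from the table whose VRAM fits an estimated model footprint, and estimates the cost of a run. The sizing rule (2 bytes per parameter plus 20% overhead) and the function names are assumptions made for this example, not the package's actual logic.

```python
from typing import Optional

# (VRAM in GB, price in USD per hour), copied from the table above.
GPU_TABLE = {
    "T4": (16, 0.27),
    "L4": (24, 0.59),
    "A10G": (24, 0.54),
    "A100": (40, 1.64),
    "H100": (80, 3.89),
}


def estimate_vram_gb(
    num_params: float, bytes_per_param: int = 2, overhead: float = 1.2
) -> float:
    """Assumed rule of thumb: fp16 weights plus 20% runtime overhead."""
    return num_params * bytes_per_param * overhead / 1e9


def pick_gpu(num_params: float) -> Optional[str]:
    """Return the cheapest GPU that fits the estimate, or None if none does."""
    needed = estimate_vram_gb(num_params)
    candidates = [
        (price, name)
        for name, (vram, price) in GPU_TABLE.items()
        if vram >= needed
    ]
    return min(candidates)[1] if candidates else None


def estimate_cost(gpu: str, seconds: float) -> float:
    """Estimated USD cost of running `gpu` for the given number of seconds."""
    return GPU_TABLE[gpu][1] * seconds / 3600


print(pick_gpu(82e6))             # a small model, roughly distilgpt2-sized
print(pick_gpu(7e9))              # a 7-billion-parameter model
print(estimate_cost("T4", 600))   # ten minutes on a T4
```

Under this heuristic the A10G is preferred over the L4 whenever 24GB is enough, because the table lists it at a lower hourly price.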
Environment variables
MODAL_TOKEN_ID=your_modal_token_id
MODAL_TOKEN_SECRET=your_modal_token_secret
HF_TOKEN=your_hf_token # needed when using gated models
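A small pre-flight check can catch missing credentials before a remote call fails. This sketch assumes that both Modal variables are required for remote GPU runs and that `HF_TOKEN` is optional; the helper name `missing_credentials` is hypothetical and not part of the package.

```python
import os
from typing import List, Mapping

# Assumed to be required for remote (non-"Local") runs.
REQUIRED_VARS = ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Modal credentials found.")
```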
License
MIT
File details
Details for the file langchain_modal_gpu_ez-0.1.0.tar.gz.
File metadata
- Download URL: langchain_modal_gpu_ez-0.1.0.tar.gz
- Upload date:
- Size: 12.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 60f09e907815d31474d20132866129974d60dbc02b76fe81dc647c7e51d18085 |
| MD5 | a4edb6868358f39f2bdf10f28d1f1d4c |
| BLAKE2b-256 | 977b26e42b2d2e94606ba505ff2e1dd76ec7b04c62a85ea6e8a78bc02ffad453 |
File details
Details for the file langchain_modal_gpu_ez-0.1.0-py3-none-any.whl.
File metadata
- Download URL: langchain_modal_gpu_ez-0.1.0-py3-none-any.whl
- Upload date:
- Size: 6.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 79bf998bd47c411a69352dccb216ebfe4969a704d68fe8c178517e3899793b19 |
| MD5 | 65622f058d66dac23dd8a3f8609dec8d |
| BLAKE2b-256 | 5dc48793199de188c6c540fc596e6bde6f09807580a542247d6f52eb6d033083 |
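If you download one of these files manually, you can compare its SHA256 digest against the published value. The sketch below uses only the standard library; the helper names are illustrative, and the check only runs if the source distribution happens to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 of the source distribution, as listed in the file details.
EXPECTED_SDIST_SHA256 = (
    "60f09e907815d31474d20132866129974d60dbc02b76fe81dc647c7e51d18085"
)

sdist = Path("langchain_modal_gpu_ez-0.1.0.tar.gz")
if sdist.exists():
    print(matches_published_hash(sdist, EXPECTED_SDIST_SHA256))
else:
    print("Download the file first, then rerun this check.")
```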