Python bindings for @ggerganov's llama.cpp
Python bindings for llama.cpp
Install
From PyPI
pip install llamacpp
Build from Source
pip install .
Get the model weights
You will need to obtain the weights for LLaMA yourself. There are a few torrents floating around, as well as some Hugging Face repositories (e.g. https://huggingface.co/nyanko7/LLaMA-7B/). Once you have them, copy them into the models folder.
ls ./models
65B 30B 13B 7B tokenizer_checklist.chk tokenizer.model
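Before converting, it can help to confirm that the folder matches the listing above. The following is a minimal sketch and not part of llamacpp: `check_models_dir` is a hypothetical helper, and the expected names are simply those shown in the `ls` output, so adjust them if your download is laid out differently.

```python
from pathlib import Path

# File names taken from the directory listing above (an assumption about your download).
EXPECTED_FILES = ("tokenizer.model", "tokenizer_checklist.chk")


def check_models_dir(models_dir, size="7B"):
    """Return the expected entries that are missing from models_dir.

    Hypothetical helper: it only checks that the tokenizer files and the
    per-size folder exist, not that their contents are valid.
    """
    root = Path(models_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    if not (root / size).is_dir():
        missing.append(size)
    return missing


missing = check_models_dir("./models", size="7B")
if missing:
    print("Missing from ./models:", ", ".join(missing))
else:
    print("./models looks complete for the 7B model")
```

An empty list means the tokenizer files and the chosen size folder are both present.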
Convert the weights to GGML format using llamacpp-convert, then use llamacpp-quantize to quantize them to INT4. For example, for the 7B parameter model, run:
llamacpp-convert ./models/7B/ 1
llamacpp-quantize ./models/7B/
llamacpp-cli
Note that running llamacpp-convert requires torch, sentencepiece and numpy to be installed. These packages are not installed by default when you install llamacpp.
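The conversion steps and the dependency note can be combined into a small script. This is a sketch under stated assumptions, not something llamacpp ships: the helper names are hypothetical, and the commands are copied verbatim from the example above, including the trailing `1` argument, whose meaning is not documented here.

```python
import importlib.util
import subprocess

# Packages that llamacpp-convert needs but that llamacpp does not install.
CONVERT_REQUIREMENTS = ("torch", "sentencepiece", "numpy")


def missing_convert_requirements(names=CONVERT_REQUIREMENTS):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def conversion_commands(model_dir):
    """Build the convert and quantize commands shown above for model_dir."""
    return [
        ["llamacpp-convert", model_dir, "1"],
        ["llamacpp-quantize", model_dir],
    ]


def convert_and_quantize(model_dir):
    """Run conversion, then quantization, stopping at the first failure."""
    missing = missing_convert_requirements()
    if missing:
        raise RuntimeError("Install these packages first: " + ", ".join(missing))
    for command in conversion_commands(model_dir):
        # check=True raises CalledProcessError if a step exits with an error.
        subprocess.run(command, check=True)
```

Calling `convert_and_quantize("./models/7B/")` would then mirror the two commands above, failing early with a readable message if a dependency is absent.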
Command line interface
The package installs the command line entry point llamacpp-cli, which points to llamacpp/cli.py and should provide about the same functionality as the main program in the original C++ repository. There is also an experimental llamacpp-chat that is supposed to bring up a chat interface, but it is not working correctly yet.
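To see which of these entry points actually ended up on your PATH after installation, something like the following could be used. It is only a sketch: `find_entry_points` is a hypothetical helper, and the command names are the ones mentioned in this README.

```python
import shutil

# Entry point names mentioned in this README.
ENTRY_POINTS = (
    "llamacpp-convert",
    "llamacpp-quantize",
    "llamacpp-cli",
    "llamacpp-chat",
)


def find_entry_points(names=ENTRY_POINTS):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_entry_points().items():
    print(f"{name}: {path or 'not found on PATH'}")
```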
API
Documentation is TBD. The long and short of it is that there are two interfaces:
- LlamaInference - a high-level interface that tries to take care of most things for you. The demo script below uses this.
- LlamaContext - a low-level interface to the underlying llama.cpp API. You can use it much as the main example in llama.cpp uses the C API. This is a rough implementation and is currently untested except for compiling successfully.
Demo script
See llamacpp/cli.py for a detailed example. The simplest demo would be something like the following:
import sys
import llamacpp


def progress_callback(progress):
    # Report model loading progress as a percentage
    print("Progress: {:.2f}%".format(progress * 100))
    sys.stdout.flush()


params = llamacpp.InferenceParams.default_with_callback(progress_callback)
params.path_model = './models/7B/ggml-model-q4_0.bin'
model = llamacpp.LlamaInference(params)

# Tokenize the prompt and feed it to the model
prompt = "A llama is a"
prompt_tokens = model.tokenize(prompt, True)
model.update_input(prompt_tokens)

model.ingest_all_pending_input()

model.print_system_info()
# Generate 20 tokens, printing each one as it is sampled
for i in range(20):
    model.eval()
    token = model.sample()
    text = model.token_to_str(token)
    print(text, end="")

# Flush stdout
sys.stdout.flush()

model.print_timings()
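The generation loop in the demo can be wrapped in a reusable function. The sketch below uses only the method names that appear in the demo (`tokenize`, `update_input`, `ingest_all_pending_input`, `eval`, `sample`, `token_to_str`). `generate` is a hypothetical helper, and `EchoModel` is a stand-in written for this example so the loop can run without model weights; it is not the real `LlamaInference` and says nothing about how that class behaves internally.

```python
def generate(model, prompt, n_tokens=20):
    """Feed prompt to model and return n_tokens sampled tokens as one string.

    model is expected to expose the methods used in the demo above.
    """
    prompt_tokens = model.tokenize(prompt, True)
    model.update_input(prompt_tokens)
    model.ingest_all_pending_input()

    pieces = []
    for _ in range(n_tokens):
        model.eval()
        token = model.sample()
        pieces.append(model.token_to_str(token))
    return "".join(pieces)


class EchoModel:
    """Stand-in with the same method names, NOT the real LlamaInference.

    It replays the prompt's words in a cycle so the loop can be exercised.
    """

    def __init__(self):
        self._pending = []
        self._tokens = []
        self._cursor = 0

    def tokenize(self, text, _flag):
        # _flag mirrors the True passed in the demo; it is ignored here.
        return text.split()

    def update_input(self, tokens):
        self._pending.extend(tokens)

    def ingest_all_pending_input(self):
        self._tokens.extend(self._pending)
        self._pending.clear()

    def eval(self):
        pass  # Nothing to compute in the stand-in.

    def sample(self):
        token = self._tokens[self._cursor % len(self._tokens)]
        self._cursor += 1
        return token

    def token_to_str(self, token):
        return token + " "


print(generate(EchoModel(), "A llama is a", n_tokens=6))
```

With a real model, the same function would presumably be called as `generate(llamacpp.LlamaInference(params), "A llama is a")`, with `params` set up as in the demo.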
ToDo
- Investigate using dynamic versions using setuptools-scm (Example: https://github.com/pypa/setuptools_scm/blob/main/scm_hack_build_backend.py)