nanollama32
A compact and efficient implementation of Llama 3.2 in a single file, with minimal dependencies: no transformers library required, even for tokenization.
Overview
nanollama32 provides a lightweight and straightforward implementation of the Llama model. It features:
- Minimal dependencies
- Easy-to-use interface
- Efficient performance suitable for various applications
Installation
To get started, clone this repository and install the necessary packages.
git clone https://github.com/JosefAlbers/nanollama32.git
cd nanollama32
pip install -e .
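To confirm the installation, you can try importing the package (a quick smoke test using the Chat class shown in the next section):
python -c "from nanollama import Chat"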
Usage
Here’s a quick example of how to use nanollama32:
>>> from nanollama import Chat
# Initialize the chat instance
>>> chat = Chat()
# Start a conversation
>>> chat("What's the weather like in Busan?")
# Llama responds with information about the weather
# Follow-up question that builds on the previous context
>>> chat("And how about the temperature?")
# Llama responds with the temperature, remembering the previous context
# Another follow-up, further utilizing context
>>> chat("What should I wear?")
# Llama suggests clothing based on the previous responses
Command-Line Interface
You can also run nanollama32 from the command line:
nlm how to create a new conda env
# Llama responds with ways to create a new conda environment and prompts the user for further follow-up questions
Managing Chat History
- --history: Specify the path to the JSON file where chat history will be saved and/or loaded from. If the file does not exist, a new one will be created.
- --resume: Use this option to resume the conversation from a specific point in the chat history.
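To keep a conversation in its own history file (the file name here is just illustrative):
nlm "how do I activate an env?" --history conda_chat.json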
For example, to resume from a specific entry in history:
nlm "and to delete env?" --resume 20241026053144
You can also specify 0 to resume from the most recent entry:
nlm "and to list envs?" --resume 0
Adding Text from Files
You can include text from external files by using the {...} syntax in your input. For example, to include the contents of a file named langref.rst in your prompt:
nlm how to load weights {langref.rst}
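The same syntax works for any text file. For instance, assuming a file named example.txt in the current directory:
nlm summarize this document {example.txt}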
License
This project is licensed under the MIT License. See the LICENSE file for more details.
Acknowledgements
This project builds upon the MLX implementation and Karpathy's llm.c implementation of the Llama model. Special thanks to the contributors of both projects for their outstanding work and inspiration.
Contributing
Contributions are welcome! Feel free to submit issues or pull requests.