LLM prompt/context preparation utility
contextualize
`contextualize` is a package to quickly retrieve and format file contents for use with LLMs.
Installation
You can install the package using pip:
pip install contextualize
or, to use the CLI globally, with pipx:
pipx install contextualize
Usage (`reference.py`)
Define `FileReference` objects for specified file paths and optional ranges.
- set `range` to a tuple of line numbers to include only a portion of the file, e.g. `range=(1, 10)`
- set `format` to "md" (default) or "xml" to wrap file contents in Markdown code blocks or `<file>` tags
- set `label` to "relative" (default), "name", or "ext" to determine what label is affixed to the enclosing Markdown/XML string
  - "relative" will use the relative path from the current working directory
  - "name" will use the file name only
  - "ext" will use the file extension only

Retrieve wrapped contents from the `output` attribute.
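
To make the three options concrete, here is a minimal standalone sketch of the kind of wrapping described above. It is not the package's code: `wrap_file` is a hypothetical helper, and the inclusive 1-indexed range, the placement of the label on the Markdown fence line, and the `path` attribute on the `<file>` tag are all assumptions made for illustration.

```python
import os
import tempfile
from pathlib import Path

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable


def wrap_file(path, range=None, format="md", label="relative"):
    """Return file contents wrapped for an LLM prompt (illustrative stand-in).

    Parameter names mirror the options described above, which is why they
    shadow the `range` and `format` builtins inside this function.
    """
    path = Path(path)
    lines = path.read_text(encoding="utf-8").splitlines()
    if range is not None:
        start, end = range  # assumption: 1-indexed, inclusive on both ends
        lines = lines[start - 1:end]
    body = "\n".join(lines)

    if label == "relative":
        name = os.path.relpath(path)  # relative to the current working directory
    elif label == "name":
        name = path.name
    elif label == "ext":
        name = path.suffix.lstrip(".")
    else:
        raise ValueError(f"unknown label: {label!r}")

    if format == "md":
        return f"{FENCE}{name}\n{body}\n{FENCE}"  # assumed layout
    if format == "xml":
        return f'<file path="{name}">\n{body}\n</file>'  # attribute name is assumed
    raise ValueError(f"unknown format: {format!r}")


# Demo on a throwaway file: keep lines 1-2 and wrap them in XML.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.py"
    demo.write_text("a = 1\nb = 2\nc = 3\n", encoding="utf-8")
    print(wrap_file(demo, range=(1, 2), format="xml", label="name"))
```

The real `FileReference` exposes the wrapped string through its `output` attribute; check `reference.py` for the exact constructor signature and output layout.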
CLI
A CLI (`cli.py`) is provided to print file contents to the console from the command line.
- `cat`: Prepare and concatenate file references
  - `paths`: Positional arguments for target file(s) or directories
  - `--ignore`: File(s) to ignore (optional)
  - `--format`: Output format (`md` or `xml`, default is `md`)
  - `--label`: Label style (`relative` for relative file path, `name` for file name only, `ext` for file extension only; default is `relative`)
  - `--output`: Output target (`console` (default), `clipboard`)
  - `--output-file`: Output file path (optional, compatible with `--output clipboard`)
- `ls`: List token counts
  - `paths`: Positional arguments for target file(s) or directories
  - `--encoding`: Encoding to use for tokenization, e.g., `cl100k_base` (default), `p50k_base`, `r50k_base`
  - `--model`: Model (e.g., `gpt-3.5-turbo`/`gpt-4` (default), `text-davinci-003`, `code-davinci-002`) to determine which encoding to use for tokenization. Not used if `encoding` is provided.
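
The precedence between `--encoding` and `--model` can be sketched as below. This is only an illustration of the rule stated above (an explicit encoding wins, otherwise the model picks the encoding): `resolve_encoding` and `MODEL_TO_ENCODING` are hypothetical names, and the table covers only the models mentioned in this README.

```python
# Assumed model-to-encoding table, limited to the models named in this README.
MODEL_TO_ENCODING = {
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-davinci-003": "p50k_base",
    "code-davinci-002": "p50k_base",
}

DEFAULT_MODEL = "gpt-4"


def resolve_encoding(encoding=None, model=None):
    """Pick the tokenizer encoding name from the two CLI options."""
    if encoding is not None:
        return encoding  # an explicit encoding wins; the model is ignored
    if model is None:
        model = DEFAULT_MODEL
    try:
        return MODEL_TO_ENCODING[model]
    except KeyError:
        raise ValueError(f"no known encoding for model {model!r}") from None


print(resolve_encoding())                          # cl100k_base
print(resolve_encoding(model="text-davinci-003"))  # p50k_base
print(resolve_encoding(encoding="r50k_base", model="gpt-4"))  # r50k_base
```

A fixed table keeps the sketch self-contained; the CLI itself may delegate this lookup to its tokenizer library instead.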
Examples
- `cat`:
  - `contextualize cat README.md` will print the wrapped contents of `README.md` to the console with default settings (Markdown format, relative path label).
  - `contextualize cat README.md --format xml` will print the wrapped contents of `README.md` to the console with XML format.
  - `contextualize cat contextualize/ dev/ README.md --format xml` will prepare file references for files in the `contextualize/` and `dev/` directories and `README.md`, and print each file's contents (wrapped in corresponding XML tags) to the console.
- `ls`:
  - `contextualize ls README.md` will count and print the number of tokens in `README.md` using the default `cl100k_base` encoding.
  - `contextualize ls contextualize/ --model text-davinci-003` will count and print the number of tokens in each file in the `contextualize/` directory using the `p50k_base` encoding associated with the `text-davinci-003` model, then print the total tokens for all processed files.