
Project description

C Transformers

Python bindings for the Transformer models implemented in C/C++ using the GGML library.

Supported Models

Models              Model Type
GPT-2               gpt2
GPT-J, GPT4All-J    gptj
GPT-NeoX, StableLM  gpt_neox
Dolly V2            dolly-v2
StarCoder           starcoder

More models coming soon.

Installation

pip install ctransformers

Usage

It provides a unified interface for all models:

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))

Run in Google Colab

If you are getting an illegal instruction error, try using lib='avx' or lib='basic':

llm = AutoModelForCausalLM.from_pretrained('/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')

It provides a generator interface for more control:

tokens = llm.tokenize('AI is going to')

for token in llm.generate(tokens):
    print(llm.detokenize(token))

Since the generator interface operates directly on token IDs, it also allows you to use a custom tokenizer.
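
A minimal sketch of that, assuming the separate transformers package and its GPT-2 tokenizer (neither ships with ctransformers; any tokenizer that produces compatible token IDs works):

from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer  # assumed external dependency

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')
tokenizer = AutoTokenizer.from_pretrained('gpt2')  # illustrative tokenizer choice

tokens = tokenizer.encode('AI is going to')  # tokenize outside ctransformers

output = []
for token in llm.generate(tokens):
    output.append(token)

print(tokenizer.decode(output))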

It also provides access to the low-level C API. See the Documentation section below.
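
Short of calling the C functions directly, the documented LLM methods already allow a manual evaluate-and-sample loop. A minimal sketch, using only eval(), sample() and detokenize() as listed in the Documentation section:

tokens = llm.tokenize('AI is going to')

llm.eval(tokens)              # evaluate the prompt tokens
token = llm.sample()          # sample the next token from the resulting logits
print(llm.detokenize(token))  # convert the sampled token back to text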

Hugging Face Hub

It can be used with models hosted on the Hub:

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')

If a model repo has multiple model files (.bin files), specify a model file using:

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml', model_file='ggml-model.bin')

It can also be used with your own models uploaded to the Hub. For a better user experience, upload only one model per repo.

To use it with your own model, add a config.json file to your model repo specifying the model_type:

{
  "model_type": "gpt2"
}

You can also specify additional parameters under task_specific_params.text-generation:

{
  "model_type": "gpt2",
  "task_specific_params": {
    "text-generation": {
      "top_k": 40,
      "top_p": 0.95,
      "temperature": 0.8,
      "repetition_penalty": 1.1,
      "last_n_tokens": 64
    }
  }
}
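
These values presumably serve as generation defaults for the repo. The same parameters can also be passed per call, following the LLM.__call__ signature in the Documentation section below:

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')

print(llm('AI is going to', top_k=40, top_p=0.95, temperature=0.8, max_new_tokens=256))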

See marella/gpt-2-ggml for a minimal example and marella/gpt-2-ggml-example for a full example.

LangChain

LangChain is a framework for developing applications powered by language models. A LangChain LLM object can be created using:

from ctransformers.langchain import CTransformers

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')

print(llm('AI is going to'))

If you are getting an illegal instruction error, try using lib='avx' or lib='basic':

llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')

It can also be used with models hosted on the Hugging Face Hub:

llm = CTransformers(model='marella/gpt-2-ggml')

Additional parameters can be passed using the config parameter:

config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}

llm = CTransformers(model='marella/gpt-2-ggml', config=config)
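
The recognized keys appear to be the generation parameters listed under Documentation below; the values here are purely illustrative:

config = {
    'top_k': 40,
    'top_p': 0.95,
    'temperature': 0.8,
    'repetition_penalty': 1.1,
    'last_n_tokens': 64,
    'max_new_tokens': 256,
    'batch_size': 8,
    'threads': 4,  # illustrative; defaults to auto-detection
}

llm = CTransformers(model='marella/gpt-2-ggml', config=config)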

It can be used with other LangChain modules:

from langchain import PromptTemplate, LLMChain

template = """Question: {question}

Answer:"""

prompt = PromptTemplate(template=template, input_variables=['question'])

llm_chain = LLMChain(prompt=prompt, llm=llm)

print(llm_chain.run('What is AI?'))

Documentation

Parameters

Name                Type   Description                                                        Default
top_k               int    The top-k sampling parameter.                                      40
top_p               float  The top-p sampling parameter.                                      0.95
temperature         float  The temperature parameter.                                         0.8
repetition_penalty  float  The repetition penalty parameter.                                  1.0
last_n_tokens       int    Number of last tokens to use for repetition penalty.               64
seed                int    Seed for sampling tokens.                                          Random
max_new_tokens      int    Maximum number of new tokens to generate.                          256
reset               bool   Whether to reset the model state before evaluating a new prompt.  True
batch_size          int    Batch size for evaluating tokens.                                  8
threads             int    Number of threads to use.                                          Auto

class AutoModelForCausalLM


classmethod AutoModelForCausalLM.from_pretrained

from_pretrained(
    model_path_or_repo_id: 'str',
    model_type: 'Optional[str]' = None,
    model_file: 'Optional[str]' = None,
    config: 'Optional[AutoConfig]' = None,
    lib: 'Optional[str]' = None,
    **kwargs
) → LLM

class LLM

method LLM.__init__

__init__(
    model_path: str,
    model_type: str,
    config: Optional[ctransformers.llm.Config] = None,
    lib: Optional[str] = None
)

property LLM.config

property LLM.model_path

property LLM.model_type

method LLM.detokenize

detokenize(tokens: Union[Sequence[int], int]) → str

method LLM.eval

eval(
    tokens: Sequence[int],
    batch_size: Optional[int] = None,
    threads: Optional[int] = None
) → None

method LLM.generate

generate(
    tokens: Sequence[int],
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None,
    batch_size: Optional[int] = None,
    threads: Optional[int] = None,
    reset: Optional[bool] = None
) → Generator[int, NoneType, NoneType]

method LLM.is_eos_token

is_eos_token(token: int) → bool

method LLM.reset

reset() → None

method LLM.sample

sample(
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None
) → int

method LLM.tokenize

tokenize(text: str) → List[int]

method LLM.__call__

__call__(
    prompt: str,
    max_new_tokens: Optional[int] = None,
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    temperature: Optional[float] = None,
    repetition_penalty: Optional[float] = None,
    last_n_tokens: Optional[int] = None,
    seed: Optional[int] = None,
    batch_size: Optional[int] = None,
    threads: Optional[int] = None,
    reset: Optional[bool] = None
) → str
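
As a note on the reset parameter: its description above suggests that passing reset=False lets a follow-up prompt continue from the existing model state instead of starting fresh. A hedged sketch of that usage:

llm('AI is going to', max_new_tokens=32)

# Assumed behavior per the parameter description: keep the prior state.
llm(' and the next step is', reset=False, max_new_tokens=32)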

License

MIT

Download files

Source Distribution

ctransformers-0.1.0.tar.gz (2.1 MB)

Built Distribution

ctransformers-0.1.0-py3-none-any.whl (2.1 MB)

File details

Details for the file ctransformers-0.1.0.tar.gz.

File metadata

  • Download URL: ctransformers-0.1.0.tar.gz
  • Size: 2.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for ctransformers-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       5bf3f253471edfcfe39db424f5d684b604610d29f1b8247e9f4c5098120eb29f
MD5          65a2562c86143298d80d0dec4a10bbc8
BLAKE2b-256  689c731f9158c2b84a8c32f17ad7e1c8e9195bed92d7d10003677388f41da799


File details

Details for the file ctransformers-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ctransformers-0.1.0-py3-none-any.whl
  • Size: 2.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for ctransformers-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       54e9c3621a27836f2205943bc1f0065e7380157b93fdad389cf92a30992445ec
MD5          aca8a6731a4aeea4d72046a1a263f20c
BLAKE2b-256  12d4fb66cdcf784a59de35be8e63d9f85cb75ca66b911d87516ffb32c13425bb

