
This package is designed to improve an LLM's ability to handle long contexts, enhancing its capacity to address retrieval tasks over effectively unbounded input lengths.

Project description

Infini-Retri Package

https://github.com/MrYxJ/InfiniRetri/

How to use

First, install our package with pip:

pip install infini-retri

Our Method Initialization

It's very convenient: just pass in the model and its tokenizer directly, or simply pass in the model name or path. Note that our method only works with traditional attention-based Transformers, and the model must be loaded with attn_implementation="eager".

from transformers import AutoModelForCausalLM, AutoTokenizer
from infini_retri import InfiniRetri

model_name_or_path = "Qwen/Qwen2.5-0.5B-Instruct"  # or a local path such as "./models/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, attn_implementation="eager")  # attn_implementation must be "eager"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

ir = InfiniRetri(model, tokenizer)
# ir = InfiniRetri(name_or_path=model_name_or_path)

Our Method in Model Inference

Our method is an innovative processing mechanism applied during model inference (used in place of model.generate()) to handle texts that exceed the model's original context-length limit. Specifically, when calling it, there are three input-text parameters: context, question, and prompt, as follows:

  • context: (str) required; the complete contextual content to be processed (not including the question or prompt).
  • question: (str) optional; your question for the LLM — for example, the question text of a QA pair is passed in this parameter. We recommend filling it in for all tasks, as it has a significant impact on the LLM's understanding and its ability to produce the correct answer.
  • prompt: (str) optional; the instruction template that concatenates the context and question sections above. Note in particular that "\n\n" must separate the context and question sections — this boundary is a necessary condition for our method to run correctly. The default prompt template is "Read the book and answer the question. Be very concise in your answer.\n\n{context}\n\nQuestion:\n\n{question}\n\nAnswer:".
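As a quick illustration of the boundary requirement, the default template above can be filled in with plain str.format; the snippet below (plain Python, no package calls) shows that the "\n\n" separators survive formatting and cleanly delimit each section:

```python
# Default prompt template, quoted from the description above.
PROMPT = ("Read the book and answer the question. "
          "Be very concise in your answer."
          "\n\n{context}\n\nQuestion:\n\n{question}\n\nAnswer:")

def build_prompt(context: str, question: str) -> str:
    """Fill the template; "\n\n" marks the boundary of each section."""
    return PROMPT.format(context=context, question=question)

text = build_prompt("Some long book text.", "Who is the hero?")
# Splitting on the blank-line boundary recovers the individual sections.
sections = text.split("\n\n")
print(sections)
```

If the separators were dropped when building a custom prompt, the sections would run together and the method could not locate the context/question boundary.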

In addition, three parameters are provided for tuning across different task types to achieve the best answers, as follows:

  • window_length: (int, default 1024) controls the length of the context window during execution of our method. When setting it, just ensure it is smaller than the maximum context window of the model you are using.
  • topk: (int, default 300) affects the cache capacity during operation and the actual length of context processed throughout inference. In theory, the larger the value, the wider the retrieval range. The optimal value depends on the task at hand and can be adjusted by the user.
  • answer_length: (int, default 8) affects how effectively the correct answer is output; set it based on the expected token length of the correct answer within the context. In theory, the closer this value is to the actual token length of the answer in the context, the better the model answers under our method.
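To build intuition for window_length and topk, here is a rough, self-contained sketch of the underlying ideas — consuming a long token sequence in fixed-size windows and retaining only the highest-scoring positions in a bounded cache. This is plain Python over lists, an illustration of the concept only, not the package's actual implementation:

```python
def sliding_windows(tokens, window_length=1024):
    """Split a token sequence into consecutive windows of at most
    window_length tokens, so a long context is consumed chunk by
    chunk instead of all at once."""
    return [tokens[i:i + window_length]
            for i in range(0, len(tokens), window_length)]

def keep_topk(scores, topk=300):
    """Return the indices of the topk highest-scoring tokens (in
    original order), mimicking a bounded cache that retains only
    the most relevant positions."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:topk])

tokens = list(range(10))  # stand-in for token ids
windows = sliding_windows(tokens, window_length=4)
kept = keep_topk([0.1, 0.9, 0.3, 0.8], topk=2)
print(windows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(kept)     # [1, 3]
```

A larger window_length means fewer, longer chunks (bounded by the model's context limit); a larger topk keeps more positions per step, widening the retrieval range at the cost of a bigger cache.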
# This short passage, extracted from Harry Potter, is just to demonstrate usage of our method.
# Because it is short, it cannot show the advantages of our method on ultra-long texts.
context = """
Harry woke at five o'clock the next 
morning and was too excited and nervous to go back to sleep. He got up and pulled on his jeans because he didn't want to walk into the station in his wizard's robes — he'd change on the train. He checked his Hogwarts list yet again to make sure he had everything he needed, saw that Hedwig was shut safely in her cage, and then paced the room, waiting for the Dursleys to get up. Two hours later, Harry's huge, heavy trunk had been loaded into the Dursleys’ car, Aunt Petunia had talked Dudley into sitting next to Harry, and they had set off. They reached King's Cross at half past ten. Uncle Vernon dumped Harry's trunk onto a cart and wheeled it into the station for him. Harry thought this was strangely kind until Uncle Vernon stopped dead, facing the platforms with a nasty grin on his face.
"""  

question = "Why did Harry decide to wear jeans instead of his wizard's robes to the train station?"

prompt = "Read the book and answer the question. Be very concise in your answer.\n\n{context}\n\nQuestion:\n\n{question}\n\nAnswer:" # Note the "\n\n" section boundaries.

response = ir.generate(context=context, question=question, prompt=prompt)
print("Response:", response)



