
This package is designed to improve an LLM's ability to handle long contexts, enhancing its capacity to address retrieval tasks over effectively unbounded input lengths.

Project description

Infini-Retri Package

https://github.com/MrYxJ/InfiniRetri/

How to use

First, install our package with pip:

pip install infini-retri

Our Method Initialization

It's very convenient: just pass in the model and its tokenizer directly, or simply pass in the model name or path. Note that our method only works with traditional attention-based Transformers, and the model must be loaded with attn_implementation="eager".

from transformers import AutoModelForCausalLM, AutoTokenizer
from infini_retri import InfiniRetri

model_name_or_path = "Qwen/Qwen2.5-0.5B-Instruct"  # or a local path such as "./models/Qwen2.5-0.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, attn_implementation="eager")  # attn_implementation must be "eager"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

ir = InfiniRetri(model, tokenizer)
# ir = InfiniRetri(name_or_path=model_name_or_path)

Our Method in Model Inference

Our method is an innovative processing mechanism applied during model inference (used in place of model.generate()) to handle texts that exceed the model's original context-length limit. Specifically, when calling it, there are three input-text parameters: context, question, and prompt, as follows:

  • context: (str) required; the complete contextual content to be processed (not including the question or prompt).
  • question: (str) optional; your question for the LLM — for example, the question text of a QA pair is passed in this parameter. We recommend filling it in for all tasks, as it has a significant impact on the LLM's understanding and its ability to produce the correct answer.
  • prompt: (str) optional; the instruction template that concatenates the context and question sections above. Note in particular that "\n\n" must separate the context and question sections — this boundary is a necessary condition for our method to run correctly. The default prompt template is "Read the book and answer the question. Be very concise in your answer.\n\n{context}\n\nQuestion:\n\n{question}\n\nAnswer:".
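As a quick illustration of the boundary requirement, the default template above can be filled in with plain str.format; the snippet below (plain Python, no package calls) shows that the "\n\n" separators survive formatting and cleanly delimit each section:

```python
# Default prompt template, quoted from the description above.
PROMPT = ("Read the book and answer the question. "
          "Be very concise in your answer."
          "\n\n{context}\n\nQuestion:\n\n{question}\n\nAnswer:")

def build_prompt(context: str, question: str) -> str:
    """Fill the template; "\n\n" marks the boundary of each section."""
    return PROMPT.format(context=context, question=question)

text = build_prompt("Some long book text.", "Who is the hero?")
# Splitting on the blank-line boundary recovers the individual sections.
sections = text.split("\n\n")
print(sections)
```

If the separators were dropped when building a custom prompt, the sections would run together and the method could not locate the context/question boundary.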

In addition, three parameters are provided for tuning across different task types to achieve the best answers, as follows:

  • window_length: (int, default 1024) controls the length of the context window during execution of our method. When setting it, just ensure it is smaller than the maximum context window of the model you are using.
  • topk: (int, default 300) affects the cache capacity during operation and the actual length of context processed throughout inference. In theory, the larger the value, the wider the retrieval range. The optimal value depends on the task at hand and can be adjusted by the user.
  • answer_length: (int, default 8) affects how effectively the correct answer is output; set it based on the expected token length of the correct answer within the context. In theory, the closer this value is to the actual token length of the answer in the context, the better the model answers under our method.
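To build intuition for window_length and topk, here is a rough, self-contained sketch of the underlying ideas — consuming a long token sequence in fixed-size windows and retaining only the highest-scoring positions in a bounded cache. This is plain Python over lists, an illustration of the concept only, not the package's actual implementation:

```python
def sliding_windows(tokens, window_length=1024):
    """Split a token sequence into consecutive windows of at most
    window_length tokens, so a long context is consumed chunk by
    chunk instead of all at once."""
    return [tokens[i:i + window_length]
            for i in range(0, len(tokens), window_length)]

def keep_topk(scores, topk=300):
    """Return the indices of the topk highest-scoring tokens (in
    original order), mimicking a bounded cache that retains only
    the most relevant positions."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:topk])

tokens = list(range(10))  # stand-in for token ids
windows = sliding_windows(tokens, window_length=4)
kept = keep_topk([0.1, 0.9, 0.3, 0.8], topk=2)
print(windows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(kept)     # [1, 3]
```

A larger window_length means fewer, longer chunks (bounded by the model's context limit); a larger topk keeps more positions per step, widening the retrieval range at the cost of a bigger cache.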
# This short passage, extracted from Harry Potter, is just to demonstrate usage of our method.
# Because it is short, it cannot show the advantages of our method on ultra-long texts.
context = """
Harry woke at five o'clock the next 
morning and was too excited and nervous to go back to sleep. He got up and pulled on his jeans because he didn't want to walk into the station in his wizard's robes — he'd change on the train. He checked his Hogwarts list yet again to make sure he had everything he needed, saw that Hedwig was shut safely in her cage, and then paced the room, waiting for the Dursleys to get up. Two hours later, Harry's huge, heavy trunk had been loaded into the Dursleys’ car, Aunt Petunia had talked Dudley into sitting next to Harry, and they had set off. They reached King's Cross at half past ten. Uncle Vernon dumped Harry's trunk onto a cart and wheeled it into the station for him. Harry thought this was strangely kind until Uncle Vernon stopped dead, facing the platforms with a nasty grin on his face.
"""  

question = "Why did Harry decide to wear jeans instead of his wizard's robes to the train station?"

prompt = "Read the book and answer the question. Be very concise in your answer.\n\n{context}\n\nQuestion:\n\n{question}\n\nAnswer:" # Note the "\n\n" section boundaries.

response = ir.generate(context=context, question=question, prompt=prompt)
print("Response:", response)



