Evo Researcher

Overview

This project aims to iterate on the research process of Autonolas AI Mechs, which is geared towards making informed predictions.

It contains two primary features:

An information research function
An information grading function

Additionally, the Mech's predict capability has been ported into a function that can be run independently from this repo.

Below is a high-level explanation of how each of them is implemented.

Research Function

The research function takes a question, such as "Will Twitter implement a new misinformation policy before the 2024 elections?", and then:

1. Generate n web search queries.
2. Re-rank the queries using an LLM call, then select the most relevant ones.
3. Search the web for each query, using Tavily.
4. Scrape and sanitize the content of each result's website.
5. Use Langchain's RecursiveCharacterTextSplitter to split the content of all pages into chunks.
6. Create embeddings of all chunks, and store the source of each as metadata.
7. Iterate over the queries selected in step 2 and, for each one, run a vector search for the most relevant chunk embeddings.
8. Aggregate the chunks from the previous steps and prepare a report.
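
The chunking, embedding and retrieval steps can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation: the fixed-window splitter stands in for Langchain's RecursiveCharacterTextSplitter, the word-count vectors stand in for real embeddings, and every function name and URL below is hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (simplified stand-in splitter)."""
    if not text:
        return []
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text):
    """Toy embedding: lowercase word counts (assumption, not a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(pages):
    """pages maps a source URL to its sanitized text; the source is kept as metadata."""
    index = []
    for source, text in pages.items():
        for chunk in split_into_chunks(text):
            index.append({"text": chunk, "source": source, "vector": embed(chunk)})
    return index


def search(index, query, top_k=2):
    """Return the top_k index entries most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vector, entry["vector"]), reverse=True)
    return ranked[:top_k]


def gather_chunks(index, queries, top_k=2):
    """Collect the best chunks for every query, dropping duplicates and keeping order."""
    seen, gathered = set(), []
    for query in queries:
        for entry in search(index, query, top_k):
            key = (entry["source"], entry["text"])
            if key not in seen:
                seen.add(key)
                gathered.append(entry)
    return gathered


# Hypothetical pages, as they might look after scraping and sanitizing.
pages = {
    "https://example.com/policy": "Twitter announced a draft misinformation policy for the election season.",
    "https://example.com/sports": "The local football team won the championship after a long season.",
}
index = build_index(pages)
results = gather_chunks(index, ["Twitter misinformation policy"], top_k=1)
print(results[0]["source"])  # prints https://example.com/policy
```

Keeping the source next to each chunk is what lets the final report attribute every excerpt to the page it came from.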

Grading Function

For the implementation of this function, the information quality criteria were selected from https://guides.lib.unc.edu/evaluating-info/evaluate, ignoring usability and intended audience.

Upon receiving a question like "Will Twitter implement a new misinformation policy before the 2024 elections?" and information, it will:

Create an evaluation plan
Perform the evaluation of the information according to the plan from the previous step
Extract the scores from the evaluation
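
As an illustration of the last step, the sketch below pulls scores out of a free-text evaluation. It assumes the evaluation contains lines of the form "Criterion: 7/10"; that format, the criterion names and the function name are assumptions made for this example, not the project's actual output format.

```python
import re

# Matches lines such as "Accuracy: 8.5/10" (hypothetical evaluation format).
SCORE_LINE = re.compile(
    r"^\s*(?P<criterion>[A-Za-z][A-Za-z \-]*?)\s*:\s*(?P<score>\d+(?:\.\d+)?)\s*/\s*10\b",
    re.MULTILINE,
)


def extract_scores(evaluation: str) -> dict[str, float]:
    """Return a mapping of lowercase criterion name to numeric score."""
    return {
        match.group("criterion").strip().lower(): float(match.group("score"))
        for match in SCORE_LINE.finditer(evaluation)
    }


evaluation = """Plan step 1: check who published the information.
Authority: 7/10
Accuracy: 8.5/10
Objectivity: 6 / 10
Overall the information is fairly recent."""

scores = extract_scores(evaluation)
print(scores)
```

Lines that do not follow the assumed pattern, such as the plan step and the closing remark, are ignored.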

Predict

Ported implementation from: https://github.com/valory-xyz/mech/blob/main/tools/prediction_request_embedding/prediction_sentence_embedding.py

Installation

poetry install
poetry shell

Run

Research

With Evo:

poetry run python ./evo_researcher/main.py research "Will Twitter implement a new misinformation policy before the 2024 elections?" evo

With Autonolas:

poetry run python ./evo_researcher/main.py research "Will Twitter implement a new misinformation policy before the 2024 elections?" autonolas

Predict

poetry run python ./evo_researcher/main.py predict "Will Twitter implement a new misinformation policy before the 2024 elections?" ./outputs/myinfopath

Evaluate

poetry run python ./evo_researcher/main.py evaluate "Will Twitter implement a new misinformation policy before the 2024 elections?" ./outputs/myinfopath

Test

Run all questions

pytest

Run specific questions

Use pytest's -k flag and a string matcher. Example:

pytest -k "Twitter"

Example results

Evo Reports: https://hackmd.io/mrCRBJyiTi-aO1gSFPjoDg
Evo Information excerpts: https://hackmd.io/VQGJgHD1SImZrDR7puauJA?both
Autonolas Information excerpts: https://hackmd.io/5qt_0HkvQyuGZ2r0RqJtoQ

Ideas for future improvement

For the researcher:

Use LLM re-ranking, as Cursor does, to optimize context space and reduce noise
Use self-consistency: generate several reports and compare them to choose the best one, or even merge their information
Plan research using more complex techniques like tree of thoughts
Implement a research loop, where research is performed and then evaluated. If the evaluation scores are below a certain threshold, iterate again to gather missing information or different sources.
Perform web searches with different topic or category focuses, as Tavily does. For example, some questions benefit more from "social media focused" research: gathering information from Twitter threads and blog articles. Others benefit more from prioritizing scientific papers, institutional statements, and so on.
Identify strong claims and perform sub-searches to verify them. This is the basis of AI-powered fact-checkers like https://fullfact.org/
Evaluate source credibility
Further iterate over chunking and vector-search strategies
Use HyDE (Hypothetical Document Embeddings)
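
The research loop idea from the list above could look roughly like the following sketch. The research and evaluate callables, the threshold, the round limit and the feedback format are all assumptions chosen for illustration; nothing here exists in the project yet.

```python
def research_until_good(question, research, evaluate, threshold=7.0, max_rounds=3):
    """Repeat research until every score reaches the threshold or rounds run out.

    research(question, feedback) returns a report string.
    evaluate(question, report) returns a dict of criterion name to score.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback = None
    for round_number in range(1, max_rounds + 1):
        report = research(question, feedback)
        scores = evaluate(question, report)
        weak = {name: score for name, score in scores.items() if score < threshold}
        if not weak:
            break
        # Tell the next round which criteria fell short.
        feedback = "Improve: " + ", ".join(sorted(weak))
    return report, scores, round_number


# Stub functions standing in for the real research and grading calls.
def fake_research(question, feedback):
    return "draft" if feedback is None else "draft + " + feedback


def fake_evaluate(question, report):
    return {"accuracy": 8.0, "currency": 5.0 if report == "draft" else 9.0}


report, scores, rounds = research_until_good("Q?", fake_research, fake_evaluate)
print(rounds, report)  # prints 2 draft + Improve: currency
```

Passing the two steps in as callables keeps the loop independent of how research and grading are actually performed.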

For the information evaluator/grader:

Use self-consistency to generate several scores and choose the most repeated ones.
Enhance the evaluation and reduce its biases through the implementation of more advanced techniques, like the ones described here https://arxiv.org/pdf/2307.03025.pdf and here https://arxiv.org/pdf/2305.17926.pdf
Further evaluate biases towards writing style and length, among others described here: https://arxiv.org/pdf/2308.02575.pdf, and mitigate them
Evaluate using different evaluation criteria
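
The self-consistency idea for the grader can be sketched as a per-criterion majority vote over several grading runs. The function name, the criterion names and the tie-breaking rule (take the lowest tied score, as a conservative choice) are assumptions made for this example.

```python
from collections import Counter


def most_common_scores(runs):
    """runs is a list of dicts (criterion -> score); return the most frequent score per criterion."""
    combined = {}
    for criterion in sorted({name for run in runs for name in run}):
        votes = Counter(run[criterion] for run in runs if criterion in run)
        top_count = max(votes.values())
        # Break ties deterministically by taking the lowest tied score.
        combined[criterion] = min(score for score, count in votes.items() if count == top_count)
    return combined


# Three hypothetical grading runs over the same information.
runs = [
    {"accuracy": 8, "authority": 6},
    {"accuracy": 8, "authority": 7},
    {"accuracy": 7, "authority": 7},
]
print(most_common_scores(runs))  # prints {'accuracy': 8, 'authority': 7}
```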

Release history

0.1.10: Jan 30, 2024
0.1.9: Jan 30, 2024
0.1.8: Jan 14, 2024
0.1.7: Jan 12, 2024 (this version)
0.1.6: Jan 12, 2024
0.1.5: Jan 12, 2024
0.1.4: Jan 12, 2024
0.1.3: Jan 11, 2024
0.1.2: Jan 11, 2024
0.1.1: Jan 11, 2024
0.1.0: Jan 10, 2024
