smart-thinking-llm

To set up the virtual environment with poetry, run:

curl -sSL https://install.python-poetry.org | python3 - --version=1.8.5
export PATH="$HOME/.local/bin:$PATH"
poetry shell

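To download the data, request the DVC access credentials from @vasgreg on Telegram and add them to your environment (the commands below read them from DVC_ACCESS_KEY_ID and DVC_SECRET_ACCESS_KEY).

Then configure the DVC remote: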

dvc remote modify smart_thinking_llm --local access_key_id $DVC_ACCESS_KEY_ID
dvc remote modify smart_thinking_llm --local secret_access_key $DVC_SECRET_ACCESS_KEY
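
The two commands above expand shell variables, so an unset variable would silently write an empty value into the local DVC config. A minimal sketch of a pre-flight check follows; `missing_dvc_credentials` is a hypothetical helper written for illustration and is not part of smart-thinking-llm or DVC.

```python
import os
from collections.abc import Mapping

# Variable names taken from the `dvc remote modify` commands above.
REQUIRED_VARS = ("DVC_ACCESS_KEY_ID", "DVC_SECRET_ACCESS_KEY")


def missing_dvc_credentials(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_dvc_credentials()
    if missing:
        raise SystemExit(f"Set these variables first: {', '.join(missing)}")
    print("DVC credentials found in the environment.")
```

Running the script before configuring the remote reports which variable, if any, still needs to be exported.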

Then download the data with:

dvc pull data/raw_data.zip.dvc
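
The pulled file is a zip archive, and the example further down reads files from under data/raw_data/. A minimal sketch of unpacking it with the standard library follows. It assumes the archive lands at data/raw_data.zip and should be extracted into data/; the archive's internal layout is not documented here, so check the extracted paths. `extract_archive` is a hypothetical helper, not part of the package.

```python
import zipfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Extract every member of `archive` into `dest` and return the member names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()


if __name__ == "__main__":
    # Assumed locations; adjust if your checkout differs.
    names = extract_archive(Path("data/raw_data.zip"), Path("data"))
    print(f"Extracted {len(names)} entries")
```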

How to: creating and comparing graphs

Install the dependencies with poetry (as above) or from the requirements.txt file.

Next, download the aliases and the dataset itself from the Wikidata5M dataset page. From there, download "Transductive split" and "Entity & relation aliases".

Unpack the archives. You will need the files wikidata5m_transductive_train.txt, wikidata5m_entity.txt and wikidata5m_relation.txt.
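
Since loading these files takes several minutes, it may be worth confirming that all three are in place before constructing anything. The sketch below uses the directory layout that appears in the usage example in this README (data/raw_data/wikidata5m_alias/ and data/raw_data/wikidata5m_transductive/); `missing_files` is a hypothetical helper for illustration only.

```python
from pathlib import Path

# Relative paths as used in the usage example in this README.
REQUIRED_FILES = (
    "wikidata5m_alias/wikidata5m_entity.txt",
    "wikidata5m_alias/wikidata5m_relation.txt",
    "wikidata5m_transductive/wikidata5m_transductive_train.txt",
)


def missing_files(raw_data_dir: Path) -> list[Path]:
    """Return the required files that do not exist under `raw_data_dir`."""
    return [
        raw_data_dir / rel
        for rel in REQUIRED_FILES
        if not (raw_data_dir / rel).is_file()
    ]


if __name__ == "__main__":
    for path in missing_files(Path("data/raw_data")):
        print(f"Missing: {path}")
```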

After that, you can start using the functionality:

import os
from pathlib import Path

import openai

from smart_thinking_llm.tools.graph_creation import GraphCreator

# Initialization ~3-4 minutes
graph_creator = GraphCreator(
    entity_aliases_filepath=Path("data/raw_data/wikidata5m_alias/wikidata5m_entity.txt"),
    relation_aliases_filepath=Path("data/raw_data/wikidata5m_alias/wikidata5m_relation.txt"),
    dataset_filepath=Path("data/raw_data/wikidata5m_transductive/wikidata5m_transductive_train.txt"),
    triplets_prompt_filepath=Path("smart_thinking_llm/prompts/generate_triplets_prompt.txt"),
    openai_client=openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY")),
    triplets_model="gpt-4.1-mini-2025-04-14",
    norm_lev_threshold=0.8,
)

question = "What is the top-level Internet domain for the country where Miyankuh-e Gharbi is located?"
# first model part
# ...
# ...
answer = "Miyankuh-e Gharbi is located in Iran. The Internet country-code top-level domain for Iran is .ir."

# Ground truth from dataset
ground_truth_answer_path = "Q6884371-P17-Q794-P78-Q41774"
ground_truth_graph = graph_creator.get_graph_from_path(ground_truth_answer_path)

# Graph from model answer
graph = graph_creator(answer)

print("*"*50, "Model answer", "*"*50)
print(graph)
print("*"*50, "Ground truth", "*"*50)
print(ground_truth_graph)
print("*"*50, "Comparison", "*"*50)
print(graph.compare_to(ground_truth_graph))

================================================================
[2025-07-17 15:30:16,736: DEBUG WikiDataset] Start parsing entities aliases file...
[2025-07-17 15:30:29,799: DEBUG WikiDataset] Start parsing relation aliases file...
[2025-07-17 15:30:31,816: DEBUG WikiDataset] Start parsing dataset file...
[2025-07-17 15:30:32,496: WARNING WikiDataset] Error using mmap, falling back to standard processing: Do not use mmap
Processing chunk 1 of dataset: 100%|███████████████████████████████████████████████████████████████████████████████████████| 20614279/20614279 [01:41<00:00, 202859.48it/s]
[2025-07-17 15:32:15,236: DEBUG WikiDataset] Start creating entity2entity graph...
Creating entity2entity graph: 100%|████████████████████████████████████████████████████████████████████████████████████████| 20599278/20599278 [01:01<00:00, 336052.14it/s]
[2025-07-17 15:33:21,930: DEBUG WikiDataset] Dataset creation done!
************************************************** Model answer **************************************************
[Miyankuh-e Gharbi (Q6884371)]
└── located in the administrative territorial entity (P131): [Persian State of Iran (Q794)]
    └── top-level Internet domain (P78): [.sch.ir (Q41774)]

************************************************** Ground truth **************************************************
[Miyankuh-e Gharbi (Q6884371)]
└── country (P17): [Persian State of Iran (Q794)]
    └── top-level Internet domain (P78): [.sch.ir (Q41774)]

************************************************** Comparison **************************************************
1.0
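
The ground-truth path in the example, "Q6884371-P17-Q794-P78-Q41774", alternates entity IDs (Q...) and relation IDs (P...), and the printed ground-truth graph shows it as two chained edges. The sketch below splits such a string into (head, relation, tail) triplets to make that structure explicit. `parse_path` is a hypothetical helper written for illustration; it is not part of smart-thinking-llm, and `GraphCreator.get_graph_from_path` may work differently internally.

```python
def parse_path(path: str) -> list[tuple[str, str, str]]:
    """Split an 'entity-relation-entity-...' path into chained triplets."""
    ids = path.split("-")
    # A valid path has an odd number of IDs: entity, then (relation, entity) pairs.
    if len(ids) < 3 or len(ids) % 2 == 0:
        raise ValueError(f"Expected an odd number of IDs (at least 3), got {len(ids)}")

    triplets = []
    for i in range(0, len(ids) - 2, 2):
        head, relation, tail = ids[i : i + 3]
        if not (head.startswith("Q") and relation.startswith("P") and tail.startswith("Q")):
            raise ValueError(f"Unexpected ID order near position {i}: {head}-{relation}-{tail}")
        triplets.append((head, relation, tail))
    return triplets


if __name__ == "__main__":
    for triplet in parse_path("Q6884371-P17-Q794-P78-Q41774"):
        print(triplet)
```

For the path above this yields two triplets, with Q794 serving as the tail of the first and the head of the second.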
