EKET: Educational Knowledge Extraction Toolkit

EKET

EKET is a package that loads documents (.xls, .xlsx, .pdf, .csv, .md, YouTube links, .pptx, .txt, .json) and converts them into meaningful chunks that later user instructions can work with.

From these chunks, EKET can generate quizzes, answers with explanations, and answers to topic-based questions.
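
To make "converting documents into chunks" more concrete, here is a minimal sketch that splits plain text into overlapping word windows. It is not EKET's actual chunking logic; the function name `chunk_words` and the `size` and `overlap` parameters are assumptions made for this illustration.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words; consecutive chunks share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighboring chunks, which tends to help retrieval later on.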

The Folder Structure

EKET/                         # Root directory of the project
├── EKET/                     # Main Python package (importable as EKET)
│   ├── __init__.py           # Initializes EKET as a package
│   ├── answer.py             # Answering the created quiz questions
│   ├── clean.py              # Functions for cleaning input/output data
│   ├── create.py             # Utilities for creating or generating quiz
│   ├── ingest.py             # Handles document ingestion (loading and chunking files)
│   ├── query.py              # Answering the specific question from the user
│   ├── utils.py              # Helper functions used across multiple modules
│   ├── data_ingest/          # Subpackage for modular ingestion logic
│   ├── evaluate_quiz/        # Subpackage to evaluate quiz answers, scoring, feedback
│   ├── generate_answer/      # Subpackage to generate answers
│   └── generate_quiz/        # Subpackage to create quiz questions from source material
├── example_usage/            # Example scripts or notebooks showing how to use the package
├── tests/                    # Unit and integration tests for all modules
├── README.md                 # Project description, usage instructions, and documentation
├── requirements.txt          # Requirements
└── setup.py                  # Setup script for packaging and installing the project

Installation

Step 0: Set Environment

First, set your Gemini API key and the embedding model as environment variables (GEMINI_API_KEY below stands for your own key):

setx TUTOR_API_KEY "GEMINI_API_KEY"
setx EMBEDDING_MODEL "gemini-embedding-001"
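
Note that `setx` only affects terminals opened afterwards, so a quick check from Python can confirm the variables are visible. The sketch below is not part of EKET; the helper name `missing_env_vars` is an assumption for illustration, and only the two variable names come from the commands above.

```python
import os

# Variable names taken from the setx commands above.
REQUIRED_VARS = ("TUTOR_API_KEY", "EMBEDDING_MODEL")

def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

If the script reports missing variables right after running `setx`, open a new terminal and try again.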

Step 1: Windows Installation Note (C++ Build Tools Required)

Some dependencies of EKET (such as PyMuPDF) include native C/C++ extensions. On Windows, if a precompiled wheel is not available for your Python version, pip may attempt to build the package from source.

If you encounter a compiler-related build error during installation, install the Visual Studio C++ Build Tools: https://visualstudio.microsoft.com/visual-cpp-build-tools/

During installation, make sure to select:

  • C++ build tools
  • MSVC v143 (or latest)
  • Windows 10/11 SDK

Step 3: HTML Rendering

HTML rendering requires Playwright and its browser binaries:

pip install playwright
playwright install

Step 4: Pip Install

After installing the prerequisites above, restart your terminal and run:

pip install EKET
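
To confirm that the install succeeded, you can query the installed distribution's version with the standard library. This is a generic sketch, not an EKET feature; the helper name `installed_version` is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

if __name__ == "__main__":
    found = installed_version("EKET")
    print("EKET version:", found if found else "not installed")
```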

How to Use

  • During debugging and testing, run the example scripts from the current working directory. To ingest a file:
python example_usage/ingest.py --file "path/to/file.pdf" # any supported file type listed above
  • To ingest a YouTube video:
python example_usage/ingest.py --youtube "Youtube-URL"   # YouTube URL
  • Running example_usage/ingest.py creates two folders in the current directory: chroma, the vector database for the chunks, and saved_data, which contains the context_language.json data.

  • To generate a quiz (multiple-choice and open-ended), run example_usage/generate_quiz.py. Its behavior depends on what was run before it:

    • If you run example_usage/generate_quiz.py directly, it creates questions from all the documents you provided.
    • If you run example_usage/generate_quiz.py after example_usage/query_answer.py, the questions will be much more relevant to the question you asked.
  • To run example_usage/query_answer.py, use the following command in the terminal:

python example_usage/query_answer.py --query "Your question here..."
  • Lastly, example_usage/evaluation.py evaluates the generated quiz after the user has solved it. It tells the user how well they did and explains each question for better understanding.
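
The choice between the two quiz modes comes down to which JSON file is handed to the quiz generator. The sketch below picks the question-focused file when it exists and falls back to the full context otherwise. The helper name `pick_quiz_input` and the idea that both files sit in one folder are assumptions for illustration; only the file names context_question_answer.json and context_language.json come from this README.

```python
from pathlib import Path

def pick_quiz_input(data_dir):
    """Prefer the question-focused file; otherwise fall back to the full context file."""
    data_dir = Path(data_dir)
    focused = data_dir / "context_question_answer.json"  # written after a query was answered
    full = data_dir / "context_language.json"            # written by the ingestion step
    if focused.is_file():
        return focused
    if full.is_file():
        return full
    raise FileNotFoundError(f"no context file found in {data_dir}; run the ingestion step first")
```

The returned path can then be passed as the input of example_usage/generate_quiz.py.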

Example Usage Scenario

The flowchart below outlines the EKET workflow:

[Figure: FlowChart]

  1. Ingest the documents you want to study. For files:
python example_usage/ingest.py --file "/path/to/file"

For YouTube:

python example_usage/ingest.py --youtube YOUTUBE-URL-LINK

This yields the context_language.json file. The other operations (e.g. answering questions, generating a quiz) depend on this file.

  2. Do you want to ask a specific question? If yes:
python example_usage/query_answer.py --query "Your Question" --input "Your context_language.json" --output "Output Folder"

This yields context_question_answer.json. If this file is created, quiz generation (the next step) will be based on this question.

  3. Create the quiz:
python example_usage/generate_quiz.py --input "Path to JSON file (context_question_answer.json or context_language.json)" --output "Output folder to save the quiz"

This yields generated_quiz.json.

  4. Evaluate the solved quiz:
python example_usage/evaluation.py --input "generated_quiz.json file path" --output "Output folder where evaluation results will be stored"

This yields evaluation_results.json, which is created from generated_quiz.json and the user's answers.

  • Optional: summarize the ingested content:
python example_usage/summarize.py --input "Your context_language.json file" --output "Output Folder"

This yields combined_summary.json, which contains a summary of context_language.json.
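
The scenario above can also be assembled programmatically. The sketch below only builds the command lines for ingestion, the optional question step, and quiz generation, using the script names and flags shown in this README. The helper names `build_pipeline` and `run_pipeline`, and the default location saved_data/context_language.json, are assumptions for illustration. The evaluation step is left out because it needs the user's answers, and how those are supplied is not described here.

```python
import subprocess
import sys
from pathlib import Path

def build_pipeline(source, out_dir, question=None, youtube=False,
                   context="saved_data/context_language.json"):
    """Return one argv list per step; `context` is an assumed path to the ingestion output."""
    py = sys.executable
    out = Path(out_dir)
    steps = [[py, "example_usage/ingest.py", "--youtube" if youtube else "--file", source]]
    quiz_input = context
    if question:
        steps.append([py, "example_usage/query_answer.py", "--query", question,
                      "--input", context, "--output", str(out)])
        # The quiz should now be built from the question-focused file.
        quiz_input = str(out / "context_question_answer.json")
    steps.append([py, "example_usage/generate_quiz.py",
                  "--input", quiz_input, "--output", str(out)])
    return steps

def run_pipeline(steps):
    """Run the steps in order and stop at the first failing command."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

Calling `run_pipeline(build_pipeline("notes.pdf", "out", question="What is a chunk?"))` would run the three steps in order; the file name and question here are placeholders.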

The example_usage folder contains examples of the output files. Every script returns a .json file, and you can further manipulate these files through the attributes we have assigned to them.
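
Because the attribute names inside the output files are not listed here, a first step before manipulating a file is to look at its top-level structure. The sketch below is a generic helper and makes no assumption about EKET's schema; the name `describe_json` is hypothetical.

```python
import json
from pathlib import Path

def describe_json(path):
    """Map each top-level key to the type name of its value (or describe a non-object root)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    return {"<root>": type(data).__name__}
```

For example, `describe_json("generated_quiz.json")` would show which attributes the quiz file exposes before you write code against them.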

License

This project is licensed under the GNU General Public License v3.0 or later (GPL-3.0-or-later). See the LICENSE file for details.

Download files

Source Distribution: eket-0.1.2.tar.gz (37.6 kB)

Built Distribution: eket-0.1.2-py3-none-any.whl (48.2 kB)

Provenance

The following attestation bundles were made for eket-0.1.2.tar.gz:

Publisher: workflow.yml on emirkizilcim0/uyms-eket