An intelligent literature review tool that uses AI-powered embeddings to find the most relevant research papers based on your research interests.

SmartReview

SmartReview is an AI-powered literature review tool that uses OpenAI text embeddings to rank a large corpus of research papers by how closely they match a free-text description of your research interests.

Features

🔍 Semantic ranking – embed every paper (title + abstract) and your interest statement, then rank by cosine similarity.
📊 Flexible top-K selection – choose a fixed K or derive it automatically (e.g. top 20% by similarity score).
💾 Multiple export formats – CSV, Excel (.xlsx), and BibTeX (.bib).
🗄️ Embedding cache – save / reload embeddings with pickle so you don't re-call the API on every run.
🔑 Safe API-key handling – reads OPENAI_API_KEY from the environment (or a .env file) and raises a clear error if it is missing.
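
The automatic top-K option could work along the following lines. This is a minimal sketch rather than SmartReview's own code: `derive_k` is a hypothetical helper, it reads "top 20%" as a fraction of the ranked papers, and it assumes the similarity list is already sorted in descending order as `(idx, score)` tuples.

```python
import math


def derive_k(similarities, fraction=0.2, minimum=1):
    """Hypothetical helper: number of papers that make up the top `fraction`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    if not similarities:
        return 0
    # Round up, and never return fewer than `minimum` for a non-empty list.
    return max(minimum, math.ceil(len(similarities) * fraction))


# Toy similarity list, sorted by descending score as (idx, score) tuples.
similarities = [(7, 0.91), (2, 0.84), (5, 0.62), (0, 0.40), (3, 0.18)]
k = derive_k(similarities, fraction=0.2)
top_k = similarities[:k]
```

The resulting `k` can then be passed wherever a fixed K would otherwise be used.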

Installation

pip install smartreview

For development / editable installs:

git clone https://github.com/geonextgis/smartreview.git
cd smartreview
pip install -e .

Quick Start

1 – Set your OpenAI API key

# Option A: environment variable
export OPENAI_API_KEY="sk-..."

# Option B: .env file (recommended)
echo 'OPENAI_API_KEY=sk-...' > .env
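
If you want to check the key yourself before creating a client, a guard like the one below may help. It is only a sketch of the "clear error if missing" behaviour described in the feature list: `require_api_key` is a hypothetical name, not a SmartReview function, and the error wording is an assumption.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Hypothetical helper: return the key or fail with a readable message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise ValueError(
            f"{name} is not set. Export it in your shell or add it to a .env file."
        )
    return key


# Passing a plain dict keeps the example independent of your real environment.
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```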

2 – Generate embeddings and find top papers

from dotenv import load_dotenv
import pandas as pd
from smartreview import (
    create_openai_client, get_embedding,
    calculate_cosine_similarity, get_top_k_papers,
    create_top_k_dataframe, save_top_k_papers,
    generate_bibtex_file, save_embeddings, load_embeddings,
)

load_dotenv()  # reads OPENAI_API_KEY from .env

# 1. Load your Web of Science export
#    (pandas needs the xlrd package to read legacy .xls files)
data = pd.read_excel("data/papers.xls")
summary = {i: (row["Article Title"], row["Abstract"]) for i, row in data.iterrows()}

# 2. Create OpenAI client
client = create_openai_client()  # raises ValueError if key is missing

# 3. Embed all papers
paper_embeddings = {}
for idx, (title, abstract) in summary.items():
    # Some records have no abstract; fall back to the title alone.
    text = title + " " + (str(abstract) if pd.notna(abstract) else "")
    paper_embeddings[idx] = get_embedding(text, client=client)

# 4. Embed your research interest
interest_text = "Machine learning for crop yield prediction using remote sensing data."
interest_embedding = get_embedding(interest_text, client=client)

# 5. Save embeddings (avoids re-calling the API next time)
save_embeddings(paper_embeddings, interest_embedding, interest_text)

# 6. Rank papers
similarities = calculate_cosine_similarity(interest_embedding, paper_embeddings)
top_k = get_top_k_papers(similarities, k=100)

# 7. Export
df = create_top_k_dataframe(top_k, data, summary)
save_top_k_papers(df, output_dir="data", k=100)
generate_bibtex_file(df, output_dir="data", k=100)
print("Done! Check the data/ folder for your results.")
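
Because step 3 makes one API call per paper, an interrupted run can be costly to repeat. One way to make the loop resumable is sketched below. It is an illustration under stated assumptions, not part of SmartReview: `embed_missing` is a hypothetical helper, and `embed_fn` stands in for a call such as `get_embedding(text, client=client)`.

```python
def embed_missing(texts, cache, embed_fn):
    """Hypothetical helper: embed only the papers that are not in `cache` yet.

    texts    -- dict mapping paper index to the text to embed
    cache    -- dict mapping paper index to an existing embedding (updated in place)
    embed_fn -- callable that turns one string into an embedding
    """
    new_count = 0
    for idx, text in texts.items():
        if idx in cache:
            continue  # already embedded in an earlier run
        cache[idx] = embed_fn(text)
        new_count += 1
    return new_count


# Stand-in embedder so the example runs without an API key.
def fake_embed(text):
    return [float(len(text))]


texts = {0: "crop yield", 1: "remote sensing", 2: "deep learning"}
cache = {0: [10.0]}  # paper 0 was embedded previously
added = embed_missing(texts, cache, fake_embed)
```

Saving the cache after each batch of new embeddings would limit how much work is lost if the run stops partway.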

3 – Re-use cached embeddings

from dotenv import load_dotenv
from smartreview import load_embeddings, calculate_cosine_similarity, get_top_k_papers

load_dotenv()
paper_embeddings, interest_embedding, interest_text = load_embeddings()
similarities = calculate_cosine_similarity(interest_embedding, paper_embeddings)
top_k = get_top_k_papers(similarities, k=50)

API Reference

OpenAI helpers (`smartreview.embeddings`)

Function	Description
`create_openai_client(api_key=None)`	Return an `openai.OpenAI` client; reads `OPENAI_API_KEY` from env if `api_key` is omitted.
`get_embedding(text, client=None, model="text-embedding-3-large")`	Embed a single string and return a NumPy array.
`get_embeddings_batch(texts, client=None, ...)`	Embed a list of strings with optional progress logging.

Similarity (`smartreview.smartreview`)

Function	Description
`calculate_cosine_similarity(query_emb, paper_emb_dict)`	Return a list of `(idx, score)` tuples sorted by descending similarity.
`get_top_k_papers(similarities, k=100)`	Slice the top-K entries from a similarity list.
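
To make the ranking step concrete, here is a small pure-Python sketch of cosine-similarity ranking. It is not the package's implementation (the requirements list scikit-learn for that job); `cosine` and `rank_by_similarity` are illustrative names, and the toy vectors are made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_emb, paper_emb_dict):
    """Illustrative ranking: (idx, score) tuples, highest score first."""
    scores = [(idx, cosine(query_emb, emb)) for idx, emb in paper_emb_dict.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-D embeddings; real embeddings have many more dimensions.
papers = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
ranked = rank_by_similarity([1.0, 0.0], papers)
top_2 = ranked[:2]  # same idea as slicing the top-K entries
```

Paper 0 points in the same direction as the query and ranks first, paper 2 shares one component and comes second, and paper 1 is orthogonal and comes last.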

DataFrame & Export

Function	Description
`create_top_k_dataframe(top_k, data, summary)`	Build a ranked `pd.DataFrame` from top-K results.
`save_top_k_papers(df, output_dir, k)`	Write CSV + Excel files; returns a dict of file paths.
`print_top_k_summary(df, k, show_rows)`	Pretty-print a summary table.
`generate_bibtex_file(df, output_dir, k)`	Write a `.bib` file; returns a dict with path and entry count.
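
To give a sense of what a `.bib` export involves, the sketch below formats one record as a BibTeX `@article` entry. It is not SmartReview's implementation: `to_bibtex_entry` and the `title`, `authors`, `journal`, and `year` keys are assumptions made for illustration, and the column names in a real export may differ.

```python
def to_bibtex_entry(key, record):
    """Hypothetical helper: format one paper record as a BibTeX @article entry."""
    fields = {
        "title": record.get("title"),
        "author": " and ".join(record.get("authors", [])),
        "journal": record.get("journal"),
        "year": record.get("year"),
    }
    lines = [f"@article{{{key},"]
    for name, value in fields.items():
        if value:  # skip missing or empty fields
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder record, invented purely for the example.
record = {
    "title": "An example paper title",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "year": 2020,
}
entry = to_bibtex_entry("doe2020example", record)
print(entry)
```

Fields that are missing from the record, such as `journal` here, are simply left out of the entry.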

Embedding Persistence

Function	Description
`save_embeddings(paper_emb, interest_emb, interest_text, output_dir)`	Pickle embeddings to `output_dir`.
`load_embeddings(output_dir)`	Load and return `(paper_emb, interest_emb, interest_text)`.
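
The persistence helpers are described as pickling three objects and returning them again on load. A rough standard-library equivalent might look like this; `save_bundle`, `load_bundle`, and the `embeddings.pkl` file name are all assumptions for illustration, since the file layout SmartReview actually uses is not documented here.

```python
import pickle
import tempfile
from pathlib import Path


def save_bundle(paper_emb, interest_emb, interest_text, output_dir):
    """Hypothetical helper: pickle all three objects into one file."""
    path = Path(output_dir) / "embeddings.pkl"  # file name is an assumption
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(
            {"papers": paper_emb, "interest": interest_emb, "text": interest_text}, fh
        )
    return path


def load_bundle(output_dir):
    """Hypothetical helper: inverse of save_bundle."""
    with (Path(output_dir) / "embeddings.pkl").open("rb") as fh:
        bundle = pickle.load(fh)  # only unpickle files you created yourself
    return bundle["papers"], bundle["interest"], bundle["text"]


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    save_bundle({0: [0.1, 0.2]}, [0.3, 0.4], "crop yield prediction", tmp)
    papers, interest, text = load_bundle(tmp)
```

Storing the interest text alongside the vectors makes it possible to tell later which query a cached interest embedding belongs to.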

Example Notebook

An end-to-end walkthrough is provided in docs/examples/example.ipynb.
Place your Web of Science .xls export in docs/examples/data/ before running.

Requirements

Package	Purpose
`openai`	Text embeddings via the OpenAI API
`numpy`	Numerical arrays
`pandas`	DataFrame I/O
`scikit-learn`	Cosine similarity
`tiktoken`	Token counting
`openpyxl`	Excel export
`python-dotenv`	`.env` file support

License

MIT © Krishnagopal Halder

Release history

This version

0.0.1

Apr 25, 2026

Download files

Download the file for your platform.

Source Distribution

smartreview-0.0.1.tar.gz (623.9 kB)

Uploaded Apr 25, 2026 Source

Built Distribution

smartreview-0.0.1-py2.py3-none-any.whl (14.9 kB)

Uploaded Apr 25, 2026 Python 2, Python 3

File details

Details for the file smartreview-0.0.1.tar.gz.

File metadata

Download URL: smartreview-0.0.1.tar.gz
Upload date: Apr 25, 2026
Size: 623.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for smartreview-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`648b4b5e5bd5014c94d36e8676e9c816d98dd01c882fc80cf50f3a53eef8b8e2`
MD5	`aa8fdfcd676b230d3aaef7a515069c32`
BLAKE2b-256	`53c43fa6703b2a91a8de5668622f33c6b1d8e9d90dc5f5a41ba12312cf04a50c`

File details

Details for the file smartreview-0.0.1-py2.py3-none-any.whl.

File metadata

Download URL: smartreview-0.0.1-py2.py3-none-any.whl
Upload date: Apr 25, 2026
Size: 14.9 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for smartreview-0.0.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`979a8caa94aebbb5512591233e689c20255edfe9a616e3a8a2cd67854b2039b9`
MD5	`c0f02b4b190aee56422e2980205f1e3e`
BLAKE2b-256	`ac8227f63f0570fd6699771f9c93960d570cda6e6ecf142b537772d17d4520c6`
