Command-line tools for Digital Article notebook application

Digital Article

Transform computational notebooks from code-first to article-first. Write what you want to analyze in natural language; let AI generate the code.

What is Digital Article?

Digital Article inverts the traditional computational notebook paradigm. Instead of writing code to perform analysis, you describe your analysis in natural language, and the system generates, executes, and documents the code for you, automatically producing publication-ready scientific methodology text.

Traditional Notebook

[Code: Data loading, cleaning, analysis]
[Output: Plots and tables]

Digital Article

[Prompt: "Analyze gene expression distribution across experimental conditions"]
[Generated Methodology: "To assess gene expression patterns, data from 6 samples..."]
[Results: Plots and tables]
[Code: Available for inspection and editing]

Key Features

Natural Language Analysis: Write prompts like "create a heatmap of gene correlations" instead of Python code
Intelligent Code Generation: LLM-powered code generation using AbstractCore (supports LMStudio, Ollama, OpenAI, and more)
Auto-Retry Error Fixing: System automatically debugs and fixes generated code (up to 3 attempts)
Scientific Methodology Generation: Automatically creates article-style explanations of your analysis
Rich Output Capture: Matplotlib plots, Plotly interactive charts, Pandas tables, and text output
Publication-Ready PDF Export: Generate scientific article PDFs with methodology, results, and optional code
Transparent Code Access: View, edit, and understand all generated code
Persistent Execution Context: Variables and DataFrames persist across cells (like Jupyter)
Workspace Isolation: Each notebook has its own data workspace
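
The auto-retry behaviour listed above can be pictured as a loop that feeds the last error back into code generation. The following is a minimal sketch under stated assumptions, not Digital Article's actual implementation: `run_with_retry`, `fake_llm`, and the generator signature are hypothetical names chosen for illustration, and the LLM is replaced by a stub.

```python
import traceback
from typing import Callable, Optional, Tuple

MAX_ATTEMPTS = 3  # mirrors the "up to 3 attempts" limit described above


def run_with_retry(
    prompt: str,
    generate_code: Callable[[str, Optional[str]], str],
    namespace: dict,
) -> Tuple[bool, str, int]:
    """Generate code for `prompt`, execute it, and retry on failure.

    Returns (succeeded, last_code, attempts_used).
    """
    last_error: Optional[str] = None
    code = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        # The previous traceback (if any) is handed back to the generator.
        code = generate_code(prompt, last_error)
        try:
            exec(code, namespace)
            return True, code, attempt
        except Exception:
            last_error = traceback.format_exc()
    return False, code, MAX_ATTEMPTS


# Hypothetical stand-in for the LLM: the first draft has a bug, the retry fixes it.
def fake_llm(prompt: str, error: Optional[str]) -> str:
    if error is None:
        return "result = undefined_name + 1"
    return "result = 41 + 1"


ns: dict = {}
ok, code, attempts = run_with_retry("add one to 41", fake_llm, ns)
print(ok, attempts, ns["result"])  # True 2 42
```

Passing the traceback back to the generator is the design choice that matters here: without it, each retry would be a blind regeneration rather than a fix.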

Who Is This For?

Domain Experts (biologists, clinicians, social scientists): Perform sophisticated analyses without programming expertise
Data Scientists: Accelerate exploratory analysis and documentation
Researchers: Create reproducible analyses with built-in methodology text
Educators: Teach data analysis concepts without syntax barriers
Anyone who wants to think in terms of what to analyze rather than how to code it

Quick Start

Prerequisites

Python 3.8+
Node.js 16+
LMStudio or Ollama (for a local LLM), or an OpenAI API key

Installation

# Clone repository
git clone https://github.com/lpalbou/digitalarticle.git
cd digitalarticle

# Set up Python environment
python -m venv .venv
source .venv/bin/activate  # On macOS/Linux; on Windows use: .venv\Scripts\activate
pip install -r requirements.txt
pip install -e .

# Set up frontend
cd frontend
npm install
cd ..

Start the Application

# Terminal 1: Backend
da-backend

# Terminal 2: Frontend
da-frontend

Then open http://localhost:3000

Full setup guide: See Getting Started

LLM Configuration

Digital Article requires an LLM provider to generate code from prompts. The provider and model can be set globally and overridden per notebook:

Global Configuration

Click the Settings button in the header to select your provider and model
Changes persist across sessions and apply to all new notebooks
Configuration is saved to config.json in the project root

Per-Notebook Configuration

Each notebook can use a different provider/model if needed
New notebooks automatically inherit the global configuration
Notebook-specific settings override global defaults during execution
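
The inheritance rule above (global defaults, overridden per notebook) can be expressed as a dictionary merge. This is a sketch only: the keys `provider` and `model`, the default values, and the function names are assumptions made for illustration, and the real layout of config.json may differ.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical defaults, used only when no config.json exists yet.
DEFAULTS = {"provider": "ollama", "model": "example-model"}


def load_global_config(path: Path) -> dict:
    """Read global settings from config.json, falling back to DEFAULTS."""
    if path.exists():
        return {**DEFAULTS, **json.loads(path.read_text())}
    return dict(DEFAULTS)


def effective_config(global_cfg: dict, notebook_cfg: Optional[dict]) -> dict:
    """Notebook-specific values win; unset (None) values inherit the global ones."""
    overrides = {k: v for k, v in (notebook_cfg or {}).items() if v is not None}
    return {**global_cfg, **overrides}


global_cfg = {"provider": "lmstudio", "model": "model-a"}
print(effective_config(global_cfg, {"model": "model-b"}))
# {'provider': 'lmstudio', 'model': 'model-b'}
```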

Visual Feedback

The status footer at the bottom shows the current provider, model, and context size
Real-time updates when configuration changes
Click the footer's Settings button for quick access to configuration

Remote Access

Configuration also works when Digital Article is accessed from a remote machine (e.g., http://server-ip:3000). The settings modal and status footer call the API through relative paths, so requests go to whichever host served the page rather than to a hard-coded localhost address.
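
To see why relative paths matter for remote access, the sketch below resolves the same relative path against two different page origins using Python's standard `urljoin`. The path `/api/config` is a hypothetical endpoint used only for illustration; the actual frontend is TypeScript and its routes are not shown in this document.

```python
from urllib.parse import urljoin


def resolve_api_url(page_url: str, relative_path: str) -> str:
    """Resolve a relative API path against the URL the page was loaded from."""
    return urljoin(page_url, relative_path)


# "/api/config" is a hypothetical endpoint path.
print(resolve_api_url("http://localhost:3000/", "/api/config"))
# http://localhost:3000/api/config
print(resolve_api_url("http://server-ip:3000/", "/api/config"))
# http://server-ip:3000/api/config
```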

Example Usage

Simple Analysis

Prompt:

Load gene_expression.csv and show the distribution of expression values

Generated Code:

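import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset from the notebook's data directory
df = pd.read_csv('data/gene_expression.csv')

# Keep only numeric columns so identifier columns (e.g. gene names) are ignored
values = df.select_dtypes(include='number').to_numpy().ravel()

# Plot the distribution of all expression values
plt.figure(figsize=(10, 6))
sns.histplot(values, bins=50, kde=True)
plt.title('Distribution of Gene Expression Values')
plt.xlabel('Expression Level')
plt.ylabel('Frequency')
plt.show()

# Print summary statistics
print(f"Dataset shape: {df.shape}")
print(f"Mean expression: {values.mean():.2f}")
print(f"Std expression: {values.std():.2f}")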

Generated Methodology:

To assess the overall distribution of gene expression levels, the dataset
containing 20 genes across 6 experimental conditions was examined. The
analysis revealed a mean expression level of 15.3 ± 4.2 across all genes,
with a right-skewed distribution indicative of heterogeneous expression
patterns.

Progressive Analysis

Cell 1: "Load patient_data.csv and show basic statistics"
Cell 2: "Create a scatter plot of age vs blood_pressure colored by gender"
Cell 3: "Perform t-test comparing blood pressure between genders"
Cell 4: "Generate a summary table with mean values by gender"

Each cell builds on the previous context, with variables persisting across cells.
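
One simple way to get this kind of persistence is to execute every cell in the same namespace dictionary. The sketch below illustrates the idea with plain Python lists instead of DataFrames; `NotebookContext` and `run_cell` are hypothetical names, not Digital Article's API.

```python
class NotebookContext:
    """Holds one namespace that every cell of a notebook executes in."""

    def __init__(self) -> None:
        self.namespace: dict = {}

    def run_cell(self, code: str) -> None:
        # Executing in the same dict is what makes names persist across cells.
        exec(code, self.namespace)


ctx = NotebookContext()
ctx.run_cell("values = [120, 135, 128, 142]")           # cell 1 defines data
ctx.run_cell("mean_value = sum(values) / len(values)")  # cell 2 reuses it
print(ctx.namespace["mean_value"])  # 131.25
```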

Architecture Overview

Frontend (React + TypeScript)
    ↓ HTTP/REST
Backend (FastAPI)
    ↓
Services Layer
    ├─ LLMService (AbstractCore → LMStudio/Ollama/OpenAI)
    ├─ ExecutionService (Python code execution sandbox)
    ├─ NotebookService (orchestration)
    └─ PDFService (scientific article generation)
    ↓
Data Layer
    ├─ Notebooks (JSON files)
    └─ Workspaces (isolated data directories)

Detailed architecture: See Architecture Documentation
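
As an illustration of the data layer in the diagram, the sketch below stores a notebook as a JSON file and gives it its own workspace directory. The directory names, the JSON fields, and both function names are assumptions made for this example; the real on-disk layout may differ.

```python
import json
import tempfile
import uuid
from pathlib import Path


def create_notebook(root: Path, title: str) -> dict:
    """Create a notebook JSON file and a dedicated workspace directory."""
    notebook_id = uuid.uuid4().hex
    workspace = root / "workspaces" / notebook_id
    workspace.mkdir(parents=True)
    notebook = {"id": notebook_id, "title": title, "cells": []}
    notebooks_dir = root / "notebooks"
    notebooks_dir.mkdir(exist_ok=True)
    (notebooks_dir / f"{notebook_id}.json").write_text(json.dumps(notebook, indent=2))
    return notebook


def workspace_path(root: Path, notebook_id: str, filename: str) -> Path:
    """Resolve a data file inside a notebook's workspace, rejecting escapes."""
    base = (root / "workspaces" / notebook_id).resolve()
    target = (base / filename).resolve()
    if base not in target.parents:
        raise ValueError(f"{filename!r} is outside the notebook workspace")
    return target


root = Path(tempfile.mkdtemp())
nb = create_notebook(root, "Gene expression")
print(workspace_path(root, nb["id"], "gene_expression.csv").name)
```

Resolving paths before comparing them means a filename such as `../other/data.csv` is rejected instead of silently reaching into another notebook's workspace.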

Technology Stack

Backend

FastAPI - Modern Python web framework
AbstractCore - LLM provider abstraction
Pandas, NumPy, Matplotlib, Plotly - Data analysis and visualization
Pydantic - Data validation and serialization
ReportLab/WeasyPrint - PDF generation

Frontend

React 18 + TypeScript - UI framework with type safety
Vite - Fast dev server and build tool (configured to run on port 3000)
Tailwind CSS - Utility-first styling
Monaco Editor - Code viewing
Plotly.js - Interactive visualizations
Axios - HTTP client

Project Philosophy

Digital Article is built on the belief that analytical tools should adapt to how scientists think, not the other way around. Key principles:

Article-First: The narrative is primary; code is a derived implementation
Transparent Generation: All code is inspectable and editable
Scientific Rigor: Auto-generate methodology text suitable for publications
Progressive Disclosure: Show complexity only when needed
Intelligent Recovery: Auto-fix errors before asking for user intervention

Full philosophy: See Philosophy Documentation

Documentation

Getting Started Guide - Installation and first analysis
Architecture Documentation - System design and component breakdown
Philosophy - Design principles and motivation
Roadmap - Planned features and development timeline

Current Status

Version: 0.1.5 (Alpha)

Working Features:

✅ Natural language to code generation
✅ Code execution with rich output capture
✅ Auto-retry error correction (up to 3 attempts)
✅ Scientific methodology generation
✅ Matplotlib and Plotly visualization support
✅ Pandas DataFrame capture and display
✅ Multi-format export (JSON, HTML, Markdown)
✅ Scientific PDF export
✅ File upload and workspace management
✅ Persistent execution context across cells
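
For the text part of output capture, a common approach is to redirect standard output while the cell runs. The sketch below shows that approach only; it is an assumption about how such capture could work, it does not handle plots or tables, and `execute_and_capture` is a hypothetical name.

```python
import contextlib
import io
import traceback


def execute_and_capture(code: str, namespace: dict) -> dict:
    """Run `code` and return its printed output plus any error traceback."""
    stdout = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(stdout):
            exec(code, namespace)
    except Exception:
        error = traceback.format_exc()
    return {"stdout": stdout.getvalue(), "error": error}


result = execute_and_capture("print('Dataset shape:', (20, 6))", {})
print(result)
# {'stdout': 'Dataset shape: (20, 6)\n', 'error': None}
```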

Known Limitations:

⚠️ Single-user deployment only (no multi-user authentication)
⚠️ Code execution in same process as server (not production-safe)
⚠️ JSON file storage (not scalable to many notebooks)
⚠️ No real-time collaboration
⚠️ LLM latency makes it unsuitable for real-time applications

Production Readiness: This is a research prototype suitable for single-user deployment, or for a small trusted team sharing one instance without per-user authentication. Production use requires:

Containerized code execution
Database storage (PostgreSQL)
Authentication and authorization
Job queue for LLM requests
See Architecture - Deployment Considerations
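
As a first step away from running generated code inside the server process, the code can be launched in a separate Python interpreter with a timeout. The sketch below shows process separation only; it is not a sandbox and does not replace the containerized execution listed above. `run_isolated` is a hypothetical helper name.

```python
import subprocess
import sys


def run_isolated(code: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh Python interpreter rather than the server process."""
    return subprocess.run(
        # -I: isolated mode, ignores the user site directory and PYTHON* env vars
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired if exceeded
    )


done = run_isolated("print(sum(range(10)))")
print(done.returncode, done.stdout.strip())  # 0 45
```

A crash or an infinite loop in the child then cannot take the server down, at the cost of losing the shared in-memory context between cells.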

Example Use Cases

Bioinformatics

"Load RNA-seq counts and perform differential expression analysis between treatment and control"
"Create a volcano plot highlighting significantly differentially expressed genes"
"Generate a heatmap of top 50 DE genes with hierarchical clustering"

Clinical Research

"Analyze patient outcomes by treatment group with survival curves"
"Test for significant differences in biomarkers across cohorts"
"Create a forest plot of hazard ratios for different risk factors"

Data Exploration

"Load the dataset and identify missing values and outliers"
"Perform PCA and visualize the first two principal components"
"Fit a linear model predicting outcome from predictors and show coefficients"

Comparison to Alternatives

Feature	Digital Article	Jupyter	ChatGPT Code Interpreter	Observable
Natural language prompts	✅ Primary	❌	✅	❌
Code transparency	✅ Always visible	✅	⚠️ Limited	⚠️ Limited
Local LLM support	✅	❌	❌	❌
Auto-error correction	✅ 3 retries	❌	⚠️ Manual	❌
Scientific methodology	✅ Auto-generated	❌	❌	❌
Publication PDF export	✅	⚠️ Via nbconvert	❌	❌
Persistent context	✅	✅	⚠️ Session-based	✅
Self-hosted	✅	✅	❌	❌

Roadmap Highlights

Near Term (Q2 2025):

Enhanced LLM prompt templates for specific domains
Version control integration (git-style cell history)
Improved error diagnostics and suggestions
Additional export formats (LaTeX, Quarto)

Medium Term (Q3-Q4 2025):

Collaborative editing (real-time multi-user)
Database backend (PostgreSQL)
Containerized code execution (Docker)
Template library (common analysis workflows)

Long Term (2026+):

LLM-suggested analysis strategies
Active learning from user corrections
Integration with laboratory information systems
Plugin architecture for domain-specific extensions

Full roadmap: See ROADMAP.md

Contributing

We welcome contributions! Areas where help is needed:

Testing: Try the system with your data and report issues
Documentation: Improve guides, add examples
LLM Prompts: Enhance code generation quality
UI/UX: Improve the interface
Domain Templates: Add analysis workflows for specific fields

See CONTRIBUTING.md for development guidelines.

License

MIT License - see LICENSE file for details.

Citation

If you use Digital Article in your research, please cite:

@software{digital_article_2025,
  title = {Digital Article: Natural Language Computational Notebooks},
  author = {Laurent-Philippe Albou},
  year = {2025},
  url = {https://github.com/lpalbou/digitalarticle}
}

Acknowledgments

AbstractCore for LLM provider abstraction
LMStudio and Ollama for local LLM serving
FastAPI and React communities for excellent frameworks
Inspired by literate programming (Knuth), computational essays (Wolfram), and Jupyter notebooks

Support and Contact

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: contact@abstractcore.ai

We're not building a better notebook. We're building a different kind of thinking tool, one that speaks the language of science, not just the language of code.

Release history

0.1.5 (this version): Oct 21, 2025
0.1.1: Oct 21, 2025

Download files

Source Distribution

digitalarticle-0.1.5.tar.gz (11.4 kB view details)

Uploaded Oct 21, 2025 Source

Built Distribution

digitalarticle-0.1.5-py3-none-any.whl (12.2 kB view details)

Uploaded Oct 21, 2025 Python 3

File details

Details for the file digitalarticle-0.1.5.tar.gz.

File metadata

Download URL: digitalarticle-0.1.5.tar.gz
Upload date: Oct 21, 2025
Size: 11.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for digitalarticle-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`f161c035fdcdbbb6b82f228dfc6d1524500049045f481ff0bccba27e5e16695f`
MD5	`73bfb5617c92c6b890b23cd3289604a8`
BLAKE2b-256	`12f6d0615b8de04cfede11280f5593d363e3be0deac8791bcc1111425d62a048`
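
A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib`. The helper names below are hypothetical, and the local file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> bool:
    """Compare a file's digest with the published value, ignoring case."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For the source distribution listed above, the call would be `verify_download(Path("digitalarticle-0.1.5.tar.gz"), "f161c035fdcdbbb6b82f228dfc6d1524500049045f481ff0bccba27e5e16695f")`, assuming the archive sits in the current directory.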

File details

Details for the file digitalarticle-0.1.5-py3-none-any.whl.

File metadata

Download URL: digitalarticle-0.1.5-py3-none-any.whl
Upload date: Oct 21, 2025
Size: 12.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for digitalarticle-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ff69ec65d0b16bcc7c169d3e1fab041a1df2d1eb56a29d7e1cd58aee1bfb1281`
MD5	`ed22fb664fa0b906fefbefce173de1b4`
BLAKE2b-256	`2b8b6c7a4d11b40afd29116621ff8bb2aba3e94a7bffe8620e4df1f338c308bb`
