Karboni

Mirror a Zotero library into a SQL database.

Features

  • Fast one-way synchronization from Zotero to a SQL database.
  • Fetch library items, collections, tags, and saved-search metadata.
  • Download file attachments.
  • Fetch formatted references in multiple bibliographic styles, for multiple locales.
  • Fetch multiple export formats.
  • Fetch the full text content of items.
  • Fetch the labels of item types, field names and creator types, for multiple locales.
  • Python API for managing synchronization.
  • Command line interface for managing synchronization.
  • Support for a wide range of database systems (through SQLAlchemy).

Installation

It is recommended that you install the package in a virtual environment.

The installation steps might look like the following. Replace DIR with the desired path for your new virtual environment.

Unix/macOS

Create the virtual environment:

python3 -m venv DIR

Activate the virtual environment:

source DIR/bin/activate

Install Karboni:

python3 -m pip install karboni

Windows

Create the virtual environment:

py -m venv DIR

Activate the virtual environment:

DIR\Scripts\activate

Install Karboni:

py -m pip install karboni

Command line interface

In order to use the command line interface, you must first configure your Zotero credentials. With a text editor, create a .env file in your working directory with the following content:

ZOTERO_LIBRARY_PREFIX=your_library_prefix
ZOTERO_LIBRARY_ID=your_library_id
ZOTERO_API_KEY=your_api_key

Replace your_library_prefix with users for a personal library, or groups for a group library.

Replace your_library_id with the identifier of your library. For a personal library, this value is your user ID, as found on https://www.zotero.org/settings/keys (you must be logged in). For a group library, this value is the group ID of the library, as found in the URL of the library (e.g., the group ID of the library at https://www.zotero.org/groups/1234567/example is 1234567).

Replace your_api_key with your Zotero API key. You may create one for your library on https://www.zotero.org/settings/keys/new (you must be logged in). Karboni does not need to write to your library. We therefore recommend that your API key be read-only, and that it grant no more access to your Zotero data than strictly necessary.
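As a minimal sketch of how the three variables fit together, the following Python snippet parses .env-style text and checks it against the rules described above. The helper names parse_env and check_zotero_settings are hypothetical and are not part of Karboni, which reads the .env file on its own. The numeric check on the library ID is an assumption based on the example group ID.

```python
REQUIRED = ("ZOTERO_LIBRARY_PREFIX", "ZOTERO_LIBRARY_ID", "ZOTERO_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def check_zotero_settings(settings: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the settings look plausible."""
    problems = [f"{name} is missing" for name in REQUIRED if not settings.get(name)]
    prefix = settings.get("ZOTERO_LIBRARY_PREFIX")
    if prefix and prefix not in ("users", "groups"):
        problems.append("ZOTERO_LIBRARY_PREFIX must be 'users' or 'groups'")
    library_id = settings.get("ZOTERO_LIBRARY_ID")
    # Assumption: Zotero user and group IDs are plain integers.
    if library_id and not library_id.isdigit():
        problems.append("ZOTERO_LIBRARY_ID should be numeric")
    return problems
```

Calling check_zotero_settings(parse_env(text)) on the content of your .env file before running any Karboni command can catch a mistyped prefix early.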

By default, Karboni commands will manage data in a data/karboni directory under your current directory, and use SQLite as the relational database. You may change those defaults by setting the following variables in your .env file:

  • KARBONI_DATA_PATH. Defaults to ./data/karboni/ZOTERO_LIBRARY_PREFIX-ZOTERO_LIBRARY_ID/. If the directory does not already exist, Karboni will create it.
  • KARBONI_DATABASE_URL. Defaults to sqlite:///data/karboni/ZOTERO_LIBRARY_PREFIX-ZOTERO_LIBRARY_ID/library.sqlite. When using SQLite, the directory specified in the database URL must either exist prior to running the Karboni command, or match the directory specified by KARBONI_DATA_PATH. For other relational databases, see the SQLAlchemy documentation on database URLs. While SQLite support is readily available through the Python standard library, other database backends usually require that you install additional Python packages.
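To make the default values concrete, here is a small Python sketch that derives them from the library prefix and ID, following the patterns stated above. The function names are hypothetical and only illustrate the naming scheme; they are not how Karboni computes its defaults internally.

```python
from pathlib import PurePosixPath


def default_data_path(prefix: str, library_id: str) -> PurePosixPath:
    """Data directory, relative to the current working directory."""
    return PurePosixPath("data/karboni") / f"{prefix}-{library_id}"


def default_database_url(prefix: str, library_id: str) -> str:
    """SQLite URL; three slashes followed by a relative path mean a relative file."""
    return f"sqlite:///{default_data_path(prefix, library_id)}/library.sqlite"
```

For the example group library mentioned earlier, default_database_url("groups", "1234567") gives sqlite:///data/karboni/groups-1234567/library.sqlite.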

Once the required variables have been set, you may use Karboni commands. If you have installed Karboni in a virtual environment, make sure it is active before attempting to use the commands (see the activation command in the Installation section). Some example commands follow.

Initialize the mirror database (create the tables):

karboni init

Synchronize from Zotero:

karboni sync

List the available commands and general options:

karboni --help

List the options of a specific command:

karboni COMMAND --help

A more complex example synchronizes from Zotero with some data options enabled: format references in APA and Vancouver styles, fetch BibTeX and RIS formats, download file attachments, and fetch any available full text:

karboni sync --style apa --style vancouver --export-format bibtex --export-format ris --files --fulltext

Once an initial synchronization has completed, subsequent invocations of the karboni sync command will perform incremental synchronization by default, i.e., fetching just the modified data from Zotero. However, that only works if you use the same data options as on the initial synchronization. To change the data options, add the --full option to force a full synchronization. For example:

karboni sync --style apa-5th-edition --full

For the formatting styles available for the --style option, refer to the Zotero Style Repository.

For the export formats available for the --export-format option, refer to the Zotero API documentation on export formats.

For the locales available for the --locale option, refer to the Citation Style Language locales. Note that some styles use a fixed locale and will ignore the --locale option.
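If you drive the command line interface from a script, it can help to keep the data options in one place so that every incremental run uses the same ones. The following Python sketch builds the argument list for karboni sync from the options shown above. The function names are hypothetical, and passing --locale more than once is an assumption based on the "multiple locales" feature; check karboni sync --help for the actual behavior.

```python
import subprocess


def build_sync_command(styles=(), export_formats=(), locales=(),
                       files=False, fulltext=False, full=False):
    """Return the argv list for a `karboni sync` invocation."""
    argv = ["karboni", "sync"]
    for style in styles:
        argv += ["--style", style]
    for export_format in export_formats:
        argv += ["--export-format", export_format]
    for locale in locales:  # assumption: the option may be repeated
        argv += ["--locale", locale]
    if files:
        argv.append("--files")
    if fulltext:
        argv.append("--fulltext")
    if full:
        argv.append("--full")  # force a full sync when the data options change
    return argv


def run_sync(**options):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_sync_command(**options), check=True)
```

For example, run_sync(styles=["apa", "vancouver"], export_formats=["bibtex", "ris"], files=True, fulltext=True) corresponds to the more complex command shown earlier.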

Python interface

The karboni Python module provides the main entry points, with functions such as initialize() and synchronize().

If you wish to use the SQLAlchemy ORM to query the database, you might want to import models from karboni.database.schema.
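If you only want to inspect a mirrored SQLite database without SQLAlchemy, the standard library is enough. The sketch below lists the tables of a database file opened in read-only mode. It makes no assumption about Karboni's table names; list_tables is a hypothetical helper, and the path you pass is whatever file your KARBONI_DATABASE_URL points to.

```python
import contextlib
import pathlib
import sqlite3


def list_tables(database_path):
    """Return the table names of a SQLite file, opened read-only."""
    # mode=ro ensures this inspection cannot modify the mirror.
    uri = pathlib.Path(database_path).resolve().as_uri() + "?mode=ro"
    with contextlib.closing(sqlite3.connect(uri, uri=True)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

From there, you can query any listed table with ordinary SQL, or switch to the models in karboni.database.schema for ORM access.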

Design choices

Here are some of the design choices that have guided the development of Karboni:

  • Perform Zotero API requests and file IO asynchronously to minimize idle time.
  • Use SQLite as the baseline database system (reducing the need for additional dependencies), but interface it through SQLAlchemy in order to support other databases as well.
  • Since Karboni itself only needs a few simple database operations, encapsulate SQLAlchemy under a thin abstraction layer to decouple the synchronization process from the database toolkit. Thus, only the database module has direct SQLAlchemy dependencies.
  • Stay close to the Zotero schema. Store data in the JSON format provided by the Zotero API whenever possible, for consistency and better adaptability to future Zotero schema changes. Add SQL columns where they can be useful to the synchronization process or to allow basic queries.
  • Don't fuss too much with database-level referential integrity constraints. Leave that to Zotero. In particular, the keys of parent items and parent collections are not validated (this simplifies the synchronization process).
  • Don't worry about database schema migrations. The database is just a mirror, thus its tables can be wiped when necessary and re-synchronized from Zotero.
  • Synchronization of file attachments is not atomic. If library synchronization finishes but file downloads fail, we accept that and don't roll back the database changes.
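The "stay close to the Zotero schema" choice can be pictured with a small Python sketch: the JSON payload is stored as-is, and only a couple of values are promoted to SQL columns. The table and column names here are hypothetical and do not reflect Karboni's actual schema; the key, version and data fields mirror the shape of item objects returned by the Zotero API.

```python
import json
import sqlite3


def create_items_table(connection):
    # Hypothetical layout: two queryable columns plus the raw JSON payload.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "key TEXT PRIMARY KEY, version INTEGER NOT NULL, data TEXT NOT NULL)"
    )


def store_item(connection, item):
    """Insert or overwrite an item, keeping its data as serialized JSON."""
    connection.execute(
        "INSERT OR REPLACE INTO items (key, version, data) VALUES (?, ?, ?)",
        (item["key"], item["version"], json.dumps(item["data"])),
    )


def load_item(connection, key):
    """Return (version, data) for a key, or None if the key is unknown."""
    row = connection.execute(
        "SELECT version, data FROM items WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else (row[0], json.loads(row[1]))
```

Because the payload is opaque to the database, a new field added by Zotero needs no schema change; only values needed for synchronization or basic queries become columns.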

Known limitations

  • Database operations are synchronous because SQLAlchemy cannot (at least not easily) share a session between concurrent tasks.
  • During transactions, SQLite locks database access from other threads or processes. When synchronizing from Zotero, Karboni applies all changes in a single transaction (to allow rollback in case of failure), which means the database can remain locked for some time. To ensure availability during synchronization, use a database system that has more advanced locking mechanisms (such as PostgreSQL or MariaDB/MySQL).
  • Python 3.11+ is required (it facilitates exception handling with asynchronous tasks).
