Convert EndNote XML to CSV with streaming parse and TXT report.

EndNote Exporter

Convert EndNote XML files into clean CSVs with automatic TXT reports.
Supports both Python API and command-line interface (CLI).


Features

  • ✅ Parse one XML file (--xml) or an entire folder of *.xml (--folder)
  • ✅ Streams <record> elements using iterparse (low memory usage)
  • ✅ Extracts fields:
    database, ref_type, title, journal, authors, year, volume, number, abstract, doi, urls, extracted_date
  • ✅ Adds a database column from the XML filename stem (IEEE.xml → IEEE)
  • ✅ Normalizes DOI (10.xxxx → https://doi.org/...)
  • ✅ Always generates a TXT report (default: <csv>_report.txt) with:
    • per-file counts (exported/skipped records)
    • totals, files processed
    • run timestamp & duration
  • ✅ Auto-creates output folders if missing
  • ✅ CLI options for CSV formatting, filters, verbosity
  • ✅ Importable Python API for scripting & integration
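
The filename-stem and DOI bullets above can be sketched in a few lines. This is a minimal illustration, not the package's actual code: the helper names `normalize_doi` and `database_from_filename` are hypothetical, and the set of DOI prefixes it strips is an assumption.

```python
import re
from pathlib import Path

# Hypothetical helpers for illustration; the package's internals may differ.
# Assumed prefixes to strip: doi.org / dx.doi.org URLs and a leading "doi:".
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw):
    """Return a https://doi.org/ URL for a bare or prefixed DOI."""
    text = (raw or "").strip()
    if not text:
        return ""
    bare = _DOI_PREFIX.sub("", text)
    if not bare.startswith("10."):
        return text  # not recognisably a DOI, so leave it untouched
    return "https://doi.org/" + bare


def database_from_filename(xml_path):
    """Derive the database column from the filename stem (IEEE.xml -> IEEE)."""
    return Path(xml_path).stem
```

Returning unrecognised values unchanged is a design choice in this sketch; the real exporter may drop or flag them instead.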

Installation

From PyPI (recommended)

pip install endnote-exporter

From local source

If you have the source code in a folder or .zip:

cd /path/to/endnote-exporter
pip install .

Requires Python 3.8+.


Usage

Command Line

Single file

endnote-xml-export --xml data/IEEE.xml --csv output/ieee.csv

Folder with multiple files

endnote-xml-export --folder data/xmls --csv output/all_records.csv

Custom report path

endnote-xml-export \
  --xml data/Scopus.xml \
  --csv output/scopus.csv \
  --report reports/scopus_run.txt

If --report is not provided, it defaults to <csv>_report.txt.
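
One plausible way to derive that default is shown below. The function name is hypothetical, and the sketch assumes `<csv>` means the CSV filename without its extension (so `output/ieee.csv` gives `output/ieee_report.txt`); the tool may instead append the suffix to the full filename.

```python
from pathlib import Path


def default_report_path(csv_path):
    """Hypothetical helper: place <stem>_report.txt next to the CSV."""
    csv_path = Path(csv_path)
    # Assumption: the ".csv" extension is dropped before adding the suffix.
    return csv_path.with_name(csv_path.stem + "_report.txt")
```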


CLI Options

Option         Description                                         Default
--xml          Path to a single EndNote XML file
--folder       Path to a folder containing multiple *.xml files
--csv          Output CSV path
--report       Output TXT report path                              <csv>_report.txt
--delimiter    CSV delimiter                                       ,
--quoting      CSV quoting: minimal, all, nonnumeric, none         minimal
--no-header    Suppress CSV header row
--encoding     Output CSV encoding                                 utf-8
--ref-type     Only include records with this ref_type name
--year         Only include records with this year
--max-records  Stop after N records per file (useful for testing)
--verbose      Verbose logging with debug details
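
The four --quoting names match the quoting constants in Python's standard `csv` module, so a mapping like the one below is a reasonable guess at how the option is applied. This is a sketch under that assumption: `QUOTING` and `render_rows` are hypothetical names, and the backslash escape character used for `none` is chosen here for illustration only.

```python
import csv
import io

# Assumed mapping from the CLI names to the stdlib csv constants.
QUOTING = {
    "minimal": csv.QUOTE_MINIMAL,
    "all": csv.QUOTE_ALL,
    "nonnumeric": csv.QUOTE_NONNUMERIC,
    "none": csv.QUOTE_NONE,
}


def render_rows(rows, delimiter=",", quoting="minimal"):
    """Write rows to a string using the chosen delimiter and quoting mode."""
    kwargs = {
        "delimiter": delimiter,
        "quoting": QUOTING[quoting],
        "lineterminator": "\n",
    }
    if QUOTING[quoting] == csv.QUOTE_NONE:
        # QUOTE_NONE needs an escape character for fields containing the delimiter.
        kwargs["escapechar"] = "\\"
    buf = io.StringIO()
    csv.writer(buf, **kwargs).writerows(rows)
    return buf.getvalue()
```

With the default `minimal` mode only fields that contain the delimiter, a quote, or a newline are quoted, which is why titles with commas survive a round trip.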

Example Report

Run started: 2025-09-11 14:30:22
IEEE.xml: 120 exported, 0 skipped
Scopus.xml: 95 exported, 2 skipped
TOTAL exported: 215
Files processed: 2
Duration: 3.14 seconds
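
For scripts that want to produce or compare reports in the same shape, the layout above can be reproduced with a small formatter. This is a sketch based only on the example report; `format_report` and its arguments are hypothetical and not part of the package's API.

```python
from datetime import datetime


def format_report(started, per_file, duration_seconds):
    """Build report text from {filename: (exported, skipped)} counts."""
    lines = [f"Run started: {started:%Y-%m-%d %H:%M:%S}"]
    for name, (exported, skipped) in per_file.items():
        lines.append(f"{name}: {exported} exported, {skipped} skipped")
    total = sum(exported for exported, _skipped in per_file.values())
    lines.append(f"TOTAL exported: {total}")
    lines.append(f"Files processed: {len(per_file)}")
    lines.append(f"Duration: {duration_seconds:.2f} seconds")
    return "\n".join(lines) + "\n"
```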

Python API

You can also use it directly in Python scripts:

from pathlib import Path
from endnote_exporter import export, export_folder

# Single file
total, csv_out, report_out = export(
    Path("data/IEEE.xml"), Path("output/ieee.csv")
)

# Folder
total, csv_out, report_out = export_folder(
    Path("data/xmls"), Path("output/all.csv"),
    ref_type="Conference Proceedings", year="2024"
)
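
After an export, the combined CSV can be inspected with the standard library alone. The sketch below counts rows per source database; it assumes the default header row is present and relies only on the `database` column listed under Features. The function name `count_by_database` is hypothetical.

```python
import csv
from collections import Counter


def count_by_database(handle, delimiter=","):
    """Count exported rows per value of the 'database' column."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return Counter(row["database"] for row in reader)
```

To use it on a real export, open the file with `open("output/all.csv", newline="", encoding="utf-8")` and pass the handle in.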

Development Notes

  • Pure Python; uses only the standard library (argparse, csv, xml.etree.ElementTree, logging, pathlib).
  • Streaming XML parsing avoids high memory usage.
  • Robust error handling: skips malformed records but logs them in verbose mode.
  • Follows PEP 621 packaging (pyproject.toml).
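
The streaming approach mentioned above can be sketched with `xml.etree.ElementTree.iterparse`. This is a minimal illustration rather than the package's implementation: the element paths (`titles/title`, `dates/year`, the `name` attribute on `ref-type`) are assumptions based on a typical EndNote XML export, and `iter_records` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def _text(record, path):
    """Join all text under a child element (EndNote often nests <style> tags)."""
    node = record.find(path)
    return "".join(node.itertext()).strip() if node is not None else ""


def iter_records(source):
    """Yield one dict per <record>, releasing each element after use."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "record":
            continue
        ref = elem.find("ref-type")
        row = {
            "ref_type": ref.get("name", "") if ref is not None else "",
            "title": _text(elem, "titles/title"),
            "year": _text(elem, "dates/year"),
        }
        # Drop the record's children and text so processed records do not pile up.
        elem.clear()
        yield row
```

Because rows are yielded one at a time, a caller can write each row to the CSV immediately instead of holding the whole file in memory.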

License

MIT License © 2025 Minh Quach
