NewsML converter (AFP news)
pyconverters_newsml
Convert AFP NewsML-G2 XML feeds into pymultirole Document objects.
Supports text articles, picture captions, video descriptions, and graphic items. Extracts IPTC media topics, AFP-specific subjects (persons, organisations, locations), keywords, urgency, genre, language, and other metadata.
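To give a feel for the kind of metadata involved, here is a minimal sketch that reads urgency, language, and subjects from a hand-written, heavily simplified NewsML-G2 fragment using only the standard library. It is not the converter's implementation: the `extract_metadata` helper, the sample XML, and the idea that the subject type is the prefix of the `qcode` attribute are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# NewsML-G2 (IPTC "nar") namespace.
NS = {"nar": "http://iptc.org/std/nar/2006-10-01/"}

# Hand-written, simplified sample; real AFP items carry far more structure.
SAMPLE = """\
<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/">
  <contentMeta>
    <urgency>3</urgency>
    <language tag="en"/>
    <subject qcode="medtop:04000000"><name>economy, business and finance</name></subject>
    <subject qcode="afpperson:example-id"><name>Jane Doe</name></subject>
  </contentMeta>
</newsItem>
"""


def extract_metadata(xml_text: str) -> dict:
    """Hypothetical helper: collect urgency, language and subjects by scheme."""
    root = ET.fromstring(xml_text)
    meta = root.find("nar:contentMeta", NS)
    if meta is None:
        return {"urgency": None, "language": None, "subjects": {}}

    subjects = defaultdict(list)
    for subject in meta.findall("nar:subject", NS):
        # Assumption: the qcode prefix (before ':') identifies the subject type.
        scheme, _, code = subject.get("qcode", "").partition(":")
        name = subject.findtext("nar:name", default="", namespaces=NS)
        subjects[scheme].append((code, name))

    language = meta.find("nar:language", NS)
    return {
        "urgency": meta.findtext("nar:urgency", namespaces=NS),
        "language": language.get("tag") if language is not None else None,
        "subjects": dict(subjects),
    }


info = extract_metadata(SAMPLE)
print(info["urgency"], info["language"], sorted(info["subjects"]))
```

The converter itself returns pymultirole Document objects rather than plain dictionaries; the dictionary here only keeps the sketch self-contained.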
Installation
pip install pyconverters_newsml
Usage
The converter is registered as a pyconverters.plugins entry point under the name newsml
and integrates automatically with the pymultirole plugin system.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| subjects_as_metadata | str | "" | Comma-separated subject types to extract as metadata: medtop, afpperson, afporganization, afplocation, or all. |
| subjects_code | bool | False | When True, metadata values are "code:name" strings; when False, only the name is stored. |
| mediatopics_as_categories | bool | False | When True, IPTC media-topic codes are added as hierarchical Category objects. |
| keywords_as_categories | bool | False | When True, AFP slug keywords are added as Category objects. |
| natures | str | "text" | Comma-separated list of item natures to include: text, video, picture, graphic. |
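The following sketch illustrates how the comma-separated `subjects_as_metadata` value and the `subjects_code` flag described above could be interpreted. The helpers `parse_subject_types` and `format_subject` are hypothetical and written only for illustration; the package's actual parsing and validation may differ.

```python
# Subject types listed in the parameter table.
KNOWN_SUBJECT_TYPES = ("medtop", "afpperson", "afporganization", "afplocation")


def parse_subject_types(value: str) -> list[str]:
    """Hypothetical helper: turn 'medtop, afpperson' into a list of types."""
    items = [part.strip() for part in value.split(",") if part.strip()]
    if "all" in items:
        # Assumption: 'all' expands to every known subject type.
        return list(KNOWN_SUBJECT_TYPES)
    unknown = [item for item in items if item not in KNOWN_SUBJECT_TYPES]
    if unknown:
        raise ValueError(f"unknown subject types: {unknown}")
    return items


def format_subject(code: str, name: str, subjects_code: bool = False) -> str:
    """Hypothetical helper: 'code:name' when subjects_code is True, else name."""
    return f"{code}:{name}" if subjects_code else name


parameters = {"subjects_as_metadata": "medtop, afpperson", "subjects_code": True}
types = parse_subject_types(parameters["subjects_as_metadata"])
value = format_subject("04000000", "economy, business and finance",
                       parameters["subjects_code"])
print(types, value)
```

With the default empty string, no subject types are selected, which matches the table's default of extracting no subjects as metadata.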
Developing
Prerequisites
You will need uv and Python 3.12.
Clone the repository:
git clone https://github.com/oterrier/pyconverters_newsml
cd pyconverters_newsml
Install dependencies (including test extras):
uv sync --extra test
Running the test suite
uv run pytest
Linting
uv run ruff check .
uv run ruff format --check .
Building the documentation
uv run --extra docs sphinx-build docs docs/_build
The built documentation is available at docs/_build/index.html.