A probability-based anime filename parser

Maintainers

meganeko

Project description

Aniparse

A probability-based anime filename parser for Python.

Aniparse parses anime video filenames into structured metadata. Unlike regex-based approaches, it uses a scoring engine where confidence accumulates from multiple signals — position, context, keywords, and patterns — so it handles the wild variety of fansub naming conventions gracefully.

Based on the C++ library Anitomy, redesigned from the ground up in v2.

Installation

pip install aniparse

Usage

import aniparse

aniparse.parse('[TaigaSubs]_Toradora!_(2008)_-_01v2_-_Tiger_and_Dragon_[1280x720_H.264_FLAC][1234ABCD].mkv')

{
    'file_name': '[TaigaSubs]_Toradora!_(2008)_-_01v2_-_Tiger_and_Dragon_[1280x720_H.264_FLAC][1234ABCD].mkv',
    'audio_term': ['FLAC'],
    'file_extension': 'mkv',
    'file_checksum': '1234ABCD',
    'video_resolution': [{'video_height': 720, 'video_width': 1280}],
    'release_version': ['2'],
    'release_group': ['TaigaSubs'],
    'series': [{
        'title': 'Toradora!',
        'year': [{'number': 2008}],
        'episode': [{'number': 1, 'release_version': '2', 'title': 'Tiger and Dragon'}],
    }],
    'video_term': ['H.264'],
}

The parse function returns a dict with all identified metadata, or None if the input is empty.
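Since `parse` can return None and absent keys are omitted, it helps to read the result defensively. The sketch below works on a plain dict shaped like the sample output above, so it runs without aniparse installed. `first_episode_number` is a hypothetical helper, not part of the library.

```python
from typing import Optional


def first_episode_number(result: Optional[dict]) -> Optional[int]:
    """Return the first episode number found in a parse result, or None."""
    if not result:  # parse() returns None for empty input
        return None
    for series in result.get("series", []):
        for episode in series.get("episode", []):
            if "number" in episode:
                return episode["number"]
    return None


# A trimmed copy of the sample output shown above.
sample = {
    "release_group": ["TaigaSubs"],
    "series": [{
        "title": "Toradora!",
        "episode": [{"number": 1, "release_version": "2", "title": "Tiger and Dragon"}],
    }],
}

print(first_episode_number(sample))  # 1
print(first_episode_number(None))    # None
```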

Alternative titles

Pipe | is a first-class separator. Each segment after the first becomes an alternative series entry:

aniparse.parse('[TROLLORANGE] Hell Girl Season 4 (CR WEB-DL 1080p x264 AAC) | Hell Girl: Fourth Twilight')

{
    'file_name': '[TROLLORANGE] Hell Girl Season 4 (CR WEB-DL 1080p x264 AAC) | Hell Girl: Fourth Twilight',
    'audio_term': ['AAC'],
    'video_resolution': [{'video_height': 1080, 'scan_method': 'p'}],
    'release_group': ['TROLLORANGE'],
    'release_information': ['CR'],
    'series': [
        {'title': 'Hell Girl', 'season': [{'number': 4}]},
        {'title': 'Hell Girl: Fourth Twilight'},
    ],
    'source': ['WEB-DL'],
    'video_term': ['x264'],
}
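To gather the main and alternative titles from a result like the one above, one option is to walk the `series` list. This is a minimal sketch on a plain dict copied from that output; `all_titles` is a hypothetical helper name.

```python
def all_titles(result: dict) -> list[str]:
    """Collect every series title, main entry first, then alternatives."""
    return [s["title"] for s in result.get("series", []) if "title" in s]


# The 'series' part of the sample output shown above.
sample = {
    "series": [
        {"title": "Hell Girl", "season": [{"number": 4}]},
        {"title": "Hell Girl: Fourth Twilight"},
    ],
}

print(all_titles(sample))  # ['Hell Girl', 'Hell Girl: Fourth Twilight']
```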

Path and folder context

# Parse from a full path
aniparse.parse('', path='/anime/Toradora/[Group] Toradora! - 01.mkv')

# Or pass folder separately
aniparse.parse('[Group] Toradora! - 01.mkv', folder='/anime/Toradora')

# aniparse.parse('[Group] Toradora! - 01.mkv', folder='/anime/Toradora')
{
    'file_name': '[Group] Toradora! - 01.mkv',
    'file_extension': 'mkv',
    'release_group': ['Group'],
    'series': [
        {'title': 'Toradora!', 'episode': [{'number': 1}]},
    ],
    'folder_name': '/anime/Toradora',
}

# aniparse.parse('01 - surge.mkv', path='series s1+s2+s3/season1/01 - surge.mkv')
{
    'file_name': '01 - surge.mkv',
    'file_extension': 'mkv',
    'series': [{
        'episode': [{'number': 1, 'title': 'surge'}],
        'title': 'series',
        'season': [{'number': 1}],
    }],
    'folder_name': 'series s1+s2+s3/season1',
}
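In the outputs above, `folder_name` and `file_name` together correspond to the directory part and the final component of `path`. The sketch below reproduces that split with the standard library. It only mirrors the sample outputs; how aniparse splits paths internally is not stated here, and `split_path` is a hypothetical name.

```python
import posixpath


def split_path(path: str) -> tuple[str, str]:
    """Split a POSIX-style path into (folder, file name)."""
    folder, file_name = posixpath.split(path)
    return folder, file_name


print(split_path("series s1+s2+s3/season1/01 - surge.mkv"))
# ('series s1+s2+s3/season1', '01 - surge.mkv')
print(split_path("/anime/Toradora/[Group] Toradora! - 01.mkv"))
# ('/anime/Toradora', '[Group] Toradora! - 01.mkv')
```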

Custom instance

For repeated parsing with custom settings:

from aniparse import Aniparse, ParserConfig

parser = Aniparse(config=ParserConfig(fuzzy=True))
result = parser.parse('[Group] Title - 01 [1080p].mkv')

Custom keywords

Provide your own WordListManager to extend or replace the built-in keyword lists:

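from aniparse import Aniparse, WordListManager

# my_wordlist_manager is a placeholder for your own WordListManager instance
parser = Aniparse(wordlist_provider=my_wordlist_manager)
# filename is a placeholder for the string you want to parse
result = parser.parse(filename)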

Debug mode

Pass debug=True to include the token scoring breakdown in the output:

aniparse.parse(filename, debug=True)

Output structure

The output is a flat dict with these top-level keys:

Key	Type	Description
`file_name`	`str`	Original input filename
`file_extension`	`str`	File extension
`file_checksum`	`str`	CRC32 checksum (e.g. `1234ABCD`)
`file_index`	`int`	File index number
`folder_name`	`str`	Folder part of the input, present when `folder` or `path` is given
`series`	`list[SeriesInfo]`	Series metadata (title, episodes, seasons, etc.)
`audio_term`	`list[str]`	Audio codec terms (FLAC, AAC, etc.)
`video_term`	`list[str]`	Video codec terms (H.264, x265, etc.)
`video_resolution`	`list[VideoResolution]`	Resolution info (height, width, scan method)
`source`	`list[str]`	Source terms (Blu-ray, WEB-DL, etc.)
`release_group`	`list[str]`	Release group names
`release_information`	`list[str]`	Release info (BATCH, REMASTER, etc.)
`release_version`	`list[str]`	Version strings
`language`	`list[str]`	Language tags
`subs_term`	`list[str]`	Subtitle terms (Subbed, Hardsub, etc.)
`device_compatibility`	`list[str]`	Device compatibility tags

Each SeriesInfo contains:

Key	Type	Description
`title`	`str`	Series title
`type`	`str`	Series type (OVA, Movie, TV, Special, etc.)
`year`	`list[Sequence]`	Year(s)
`season`	`list[Sequence]`	Season number(s)
`episode`	`list[Sequence]`	Episode number(s), with optional `title`, `release_version`, `part`
`volume`	`list[Sequence]`	Volume number(s)
`content_type`	`list[Sequence]`	Content type (NCOP, NCED, PV, etc.) with optional `identifier`

Episode/season/volume entries support ranges (start/end), totals (number, total for "X of Y"), and alternatives.

Only present keys are included — None values are omitted.
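A consumer that wants a flat list of episode numbers has to handle both single entries and ranges. The sketch below assumes that `start`, `end`, and `number` are integer keys on an entry. Those key names are mentioned above, but the exact shape of a range entry is an assumption, and `expand_entry` is a hypothetical helper.

```python
def expand_entry(entry: dict) -> list[int]:
    """Turn one episode/season/volume entry into a list of numbers."""
    # Assumed range shape: {'start': 1, 'end': 3}
    if "start" in entry and "end" in entry:
        return list(range(entry["start"], entry["end"] + 1))
    # Single entry, optionally with a 'total' for "X of Y"
    if "number" in entry:
        return [entry["number"]]
    return []


print(expand_entry({"start": 1, "end": 3}))      # [1, 2, 3]
print(expand_entry({"number": 5, "total": 12}))  # [5]
```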

Configuration

ParserConfig options:

Attribute	Type	Default	Description
`year_min`	`int`	`1900`	Minimum valid year
`year_max`	`int`	`2099`	Maximum valid year
`range_total`	`set[str]`	`{"of"}`	Connectors for "X of Y" patterns
`range_separator`	`set[str]`	`{"-", "~", "&", "+"}`	Range delimiters
`fuzzy`	`bool`	`False`	Enable fuzzy keyword matching
`fuzzy_threshold`	`float`	`0.8`	Fuzzy match threshold
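
To give a feel for what a threshold of `0.8` means, here is a sketch of fuzzy keyword lookup built on the standard library's `difflib.SequenceMatcher`. This is an illustration only: the matcher aniparse actually uses is not described here and may score strings differently. `fuzzy_lookup` is a hypothetical name.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def fuzzy_lookup(word: str, keywords: Iterable[str],
                 threshold: float = 0.8) -> Optional[str]:
    """Return the best keyword whose similarity to word meets the threshold."""
    best, best_ratio = None, 0.0
    for keyword in keywords:
        # Compare case-insensitively; ratio() is in the range [0, 1].
        ratio = SequenceMatcher(None, word.lower(), keyword.lower()).ratio()
        if ratio >= threshold and ratio > best_ratio:
            best, best_ratio = keyword, ratio
    return best


print(fuzzy_lookup("bluray", ["Blu-ray", "FLAC"]))  # Blu-ray
print(fuzzy_lookup("aac", ["FLAC"]))                # None
```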

How does it work?

Aniparse processes filenames through a six-stage pipeline:

1. Tokenize: Split input into tokens, detecting brackets, delimiters, and text boundaries
2. Identify: Match tokens against keyword lists, assigning initial possibilities with base scores
3. Expand: Pattern-based rules add new possibilities (checksums, numbers, years, titles, etc.)
4. Score: Context-aware rules adjust confidence based on position, neighbors, brackets, and structural zones
5. Resolve: Pick the winning possibility per token based on highest score
6. Compose: Assemble tokens into the final metadata dict

This approach avoids hardcoded rules like "first bracket = release group". Instead, each token accumulates evidence from multiple signals, and the highest-confidence interpretation wins.
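
The idea of accumulating evidence and picking the highest score can be shown with a toy example. Everything in it is made up for illustration: the labels, the rules, and the weights are assumptions, not aniparse's actual rules or scores.

```python
from collections import defaultdict


def score(tokens: list[str], i: int) -> dict[str, float]:
    """Accumulate illustrative scores for each possible label of tokens[i]."""
    token = tokens[i]
    scores: dict[str, float] = defaultdict(float)
    if token == "-":
        scores["delimiter"] += 1.0
    elif token.isdigit():
        scores["episode_number"] += 0.4           # any number might be an episode
        if len(token) == 4 and 1900 <= int(token) <= 2099:
            scores["year"] += 0.6                 # four digits in the year range
        if i > 0 and tokens[i - 1] == "-":
            scores["episode_number"] += 0.3       # context: follows a dash
    else:
        scores["title"] += 0.5
    return dict(scores)


def resolve(tokens: list[str]) -> list[tuple[str, str]]:
    """Pick the highest-scoring label for every token."""
    resolved = []
    for i, token in enumerate(tokens):
        scores = score(tokens, i)
        resolved.append((token, max(scores, key=scores.get)))
    return resolved


print(resolve(["Toradora!", "2008", "-", "01"]))
# [('Toradora!', 'title'), ('2008', 'year'), ('-', 'delimiter'), ('01', 'episode_number')]
```

Here `2008` is labeled a year because the year rule outweighs the generic number rule, while `01` is labeled an episode because the dash before it adds evidence.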

Why use Aniparse?

Anime filenames are notoriously inconsistent:

- Element order varies between groups
- Brackets and parentheses may be metadata containers or part of the title
- Multiple delimiter styles coexist in a single filename
- Numbers are ambiguous (episode? season? year? resolution?)

Regex-based parsers can't cover the combinatorial explosion of conventions. Aniparse's scoring approach handles tens of thousands of filenames with high accuracy.

Known limitations

- Single-letter "E" episode prefix can be too aggressive in brackets
- Number-dash-number in titles (e.g., 009-1) may be parsed as episode ranges
- CJK language descriptors may be included in the title
- Parenthesized alternative series after metadata may not be detected

License

Aniparse is licensed under Mozilla Public License 2.0.

Project details

Release history

- 2.0.0 (this version): Feb 23, 2026
- 1.2.2: Feb 23, 2024
- 1.2.1: Mar 7, 2023
- 1.1.3: Mar 4, 2023
- 1.1.2: Dec 17, 2022
- 1.1.1: Nov 30, 2022
- 1.1.0: Nov 30, 2022
- 1.0: Nov 24, 2022

Download files

Source Distribution

aniparse-2.0.0.tar.gz (88.5 kB), uploaded Feb 23, 2026, Source

Built Distribution

aniparse-2.0.0-py3-none-any.whl (86.5 kB), uploaded Feb 23, 2026, Python 3

File details

Details for the file aniparse-2.0.0.tar.gz.

File metadata

Download URL: aniparse-2.0.0.tar.gz
Upload date: Feb 23, 2026
Size: 88.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for aniparse-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`5630844597904415b5f0f9aa20d627f4eaa72599db0b89cdb044b4edbe47e801`
MD5	`6652297b90a2b8b96d186fac724c7a28`
BLAKE2b-256	`6e78593ec9b12630da151bf5e70bda6754d21445c11ed9799529bdda1e6569df`
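
A downloaded file can be checked against the SHA256 digest above with the standard library. This is a minimal sketch; the local file name in the commented-out check is an assumption about where the sdist was saved, and `sha256_of` is a hypothetical helper.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 listed above for aniparse-2.0.0.tar.gz
EXPECTED = "5630844597904415b5f0f9aa20d627f4eaa72599db0b89cdb044b4edbe47e801"

# Assumes the sdist sits in the current directory:
# assert sha256_of("aniparse-2.0.0.tar.gz") == EXPECTED
```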

Provenance

The following attestation bundles were made for aniparse-2.0.0.tar.gz:

Publisher: python-publish.yml on MeGaNeKoS/Aniparse

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aniparse-2.0.0.tar.gz
- Subject digest: 5630844597904415b5f0f9aa20d627f4eaa72599db0b89cdb044b4edbe47e801
- Sigstore transparency entry: 983465355
- Sigstore integration time: Feb 23, 2026
Source repository:
- Permalink: MeGaNeKoS/Aniparse@a95d0cee486db93ac5570bfd6136bc992920698e
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/MeGaNeKoS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@a95d0cee486db93ac5570bfd6136bc992920698e
- Trigger Event: release

File details

Details for the file aniparse-2.0.0-py3-none-any.whl.

File metadata

Download URL: aniparse-2.0.0-py3-none-any.whl
Upload date: Feb 23, 2026
Size: 86.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for aniparse-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2ded80e9d0975a2bbd4bb099cf4a42cf098fad60bc363d170bbe42dc38978756`
MD5	`b77f4b0bd8c7407892a9f7932dc8fe40`
BLAKE2b-256	`5bd87e2bc8e830226f60093a6aec65e45db77ffae2e6c786c34a99b95e0c78c0`

Provenance

The following attestation bundles were made for aniparse-2.0.0-py3-none-any.whl:

Publisher: python-publish.yml on MeGaNeKoS/Aniparse

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aniparse-2.0.0-py3-none-any.whl
- Subject digest: 2ded80e9d0975a2bbd4bb099cf4a42cf098fad60bc363d170bbe42dc38978756
- Sigstore transparency entry: 983465361
- Sigstore integration time: Feb 23, 2026
Source repository:
- Permalink: MeGaNeKoS/Aniparse@a95d0cee486db93ac5570bfd6136bc992920698e
- Branch / Tag: refs/tags/v2.0.0
- Owner: https://github.com/MeGaNeKoS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@a95d0cee486db93ac5570bfd6136bc992920698e
- Trigger Event: release
