isitpublic
A lightweight, standalone Python library for determining if works are likely in the public domain using multiple heuristics and validation methods.
Find the package on PyPI at: https://pypi.org/project/isitpublic/
Overview
The isitpublic library provides focused tools to assess whether a work is likely in the public domain based on:
- Title and content analysis for public domain indicators
- Heuristic checks for historical authors and time periods
- Copyright calculations based on author death years or publication dates
- Jurisdiction-specific copyright law analysis
- Advanced analysis for databases, audio/video, and software
- JSON-based data storage for configuration and results
Installation
pip install isitpublic
Quick Start
import asyncio
from isitpublic import PublicDomainValidator, ContentItem
# Create a validator instance
validator = PublicDomainValidator()
# Create a content item to validate
item = ContentItem(
    title="Shakespeare's Hamlet",
    content="A classic play by William Shakespeare",
    snippet="To be or not to be..."
)
# Check if the item is likely in the public domain (async function)
async def main():
    is_pd = await validator.is_likely_public_domain(item)
    print(f"Is likely public domain: {is_pd}")  # True
# Run the async function
asyncio.run(main())
Core Features
1. Content-based Validation
The library checks titles and content for public domain indicators:
import asyncio
from isitpublic import validate_public_domain_status, ContentItem
item = ContentItem(
    title="A Public Domain Work",
    content="This work is in the public domain"
)
# Async function for validation
async def main():
    is_pd = await validate_public_domain_status(item)
    print(is_pd)  # True
# Run the async function
asyncio.run(main())
2. Heuristic Analysis
The library applies heuristics based on:
- Historical authors (Shakespeare, Darwin, etc.)
- Time periods (19th century, ancient, etc.)
- Content types (biblical, folk tales, etc.)
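As a rough illustration of how keyword heuristics of this kind can work, the sketch below scans a title and description for author names, period markers, and content types. The keyword lists and the function name `heuristic_pd_hints` are hypothetical and are not taken from isitpublic's source.

```python
import re

# Hypothetical keyword lists for illustration; the library's real lists may differ.
HISTORICAL_AUTHORS = {"shakespeare", "darwin", "homer"}
PERIOD_MARKERS = ("19th century", "ancient", "medieval")
CONTENT_TYPES = ("biblical", "folk tale", "fairy tale")


def heuristic_pd_hints(title, content=""):
    """Return the heuristic categories matched by the title and content."""
    text = f"{title} {content}".lower()
    # Split on non-alphanumerics so "Shakespeare's" yields the word "shakespeare".
    words = set(re.findall(r"[a-z0-9]+", text))
    hints = []
    if words & HISTORICAL_AUTHORS:
        hints.append("historical_author")
    # Multi-word markers are matched as substrings of the lowercased text.
    if any(marker in text for marker in PERIOD_MARKERS):
        hints.append("time_period")
    if any(kind in text for kind in CONTENT_TYPES):
        hints.append("content_type")
    return hints
```

A match is only a hint: a title such as "Homer Simpson" would also trigger the author check, which is why heuristics like these are best combined with date-based calculations.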
3. Copyright Calculation
Calculate public domain status based on copyright information:
from isitpublic import calculate_pd_from_metadata
metadata = {
    "author_death_year": 1601,  # Over 400 years ago!
    "publication_year": 1600,
    "country": "worldwide"
}
result = calculate_pd_from_metadata(metadata)
print(result) # {'is_public_domain': True, 'pd_year': 1672, ...}
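The `pd_year` value in the sample output is consistent with a "life plus 70 years, expiring at the end of the calendar year" rule: 1601 + 70 + 1 = 1672. The sketch below shows that arithmetic. The function names and the 70-year default are assumptions made for illustration, not the library's internals.

```python
from datetime import date
from typing import Optional


def pd_year_from_death(death_year: int, term_years: int = 70) -> int:
    """First calendar year in which the work is out of copyright.

    Assumes protection runs through 31 December of death_year + term_years,
    so the work enters the public domain on 1 January of the next year.
    """
    return death_year + term_years + 1


def is_public_domain(death_year: int, term_years: int = 70,
                     today: Optional[date] = None) -> bool:
    """Compare the current year against the computed public domain year."""
    current_year = (today or date.today()).year
    return current_year >= pd_year_from_death(death_year, term_years)
```

Passing `today` explicitly keeps the check reproducible in tests, since the result otherwise depends on the date the code runs.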
4. Jurisdiction-Specific Analysis
Comprehensive analysis across multiple jurisdictions:
from isitpublic import PublicDomainValidator
validator = PublicDomainValidator()
# Generate a comprehensive jurisdiction report
report = validator.generate_jurisdiction_report(
    author_death_year=1616,  # Shakespeare died in 1616
    work_title="Shakespeare's Works",
    work_type="individual"
)
print(f"PD in {report['risk_assessment']['public_domain_percentage']}% of jurisdictions")
print(f"Legal recommendation: {report['legal_recommendations'][0]}")
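To make the "percentage of jurisdictions" figure concrete, here is a minimal sketch that applies a "life plus N years" term per jurisdiction and reports the share in which the work is free. The `SAMPLE_TERMS` table is a deliberately simplified assumption (real rules have many exceptions), and the function names are hypothetical rather than part of the library.

```python
# Simplified "life + N years" terms for a handful of jurisdictions.
# Illustrative only: wartime extensions, older works, and special
# categories are ignored.
SAMPLE_TERMS = {"DE": 70, "FR": 70, "US": 70, "NZ": 50, "MX": 100}


def pd_by_jurisdiction(author_death_year, current_year, terms=SAMPLE_TERMS):
    """Map each jurisdiction code to True if the term has fully elapsed."""
    return {
        code: current_year > author_death_year + term
        for code, term in terms.items()
    }


def pd_percentage(author_death_year, current_year, terms=SAMPLE_TERMS):
    """Share of the listed jurisdictions in which the work is public domain."""
    status = pd_by_jurisdiction(author_death_year, current_year, terms)
    return 100.0 * sum(status.values()) / len(status)
```

For an author who died in 1960, only the life+50 entry has expired by 2024, so the sketch reports 20% for this five-entry table.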
5. Database and Compilation Rights Recognition
Advanced analysis of database and compilation rights beyond standard copyright:
# Analyze database rights that exist in addition to copyright
db_analysis = validator.analyze_database_compilation_rights(
    title="Historical Database",
    creation_year=2000,
    compilation_type="database",
    jurisdiction="DE",  # EU jurisdiction with database rights
    substantial_investment_claim=True,
    is_licensed_dataset=False
)
print(f"Has database rights: {db_analysis['database_rights_analysis']['has_rights']}")
print(f"In public domain: {db_analysis['database_rights_analysis']['is_public_domain']}")
print(f"Risk level: {db_analysis['risk_level']}")
6. Audio/Video Copyright Analysis (Sampling & Fair Use)
Analyze copyright status for audio, video, and sampled content:
# Analyze audio/video content with sampling considerations
av_analysis = validator.analyze_audio_video_copyright(
    title="Musical Composition",
    creator="Artist Name",
    creation_year=1990,
    sampling_info={
        "sampled_from_year": 1950,  # Original sample source
        "sample_length_seconds": 5,  # Length of sample
        "sampled_from_work": "Old Song"
    },
    intended_use="commercial"  # "personal", "educational", "commercial"
)
print(f"Original work PD: {av_analysis['is_original_pd']}")
print(f"Sampling analysis: {av_analysis['sampling_analysis']}")
print(f"Risk level: {av_analysis['risk_level']}")
print(f"Recommendations: {av_analysis['recommendations']}")
7. Software and Source Code Analysis
Analyze software licenses and public domain status for code:
# Analyze if software is in public domain based on license
software_analysis = validator.analyze_software_source_pd(
    project_name="Open Source Project",
    license_type="MIT",  # or "GPL-3.0", "CC0", "Unlicense", etc.
    creation_year=1995,  # Must not be later than the author's death year
    author_death_year=2000,  # For individual-authored software
    repository_info={
        "has_license_file": True,
        "license_spdx_id": "MIT",  # Should match license_type
        "copyright_holders": ["Author Name"]
    }
)
print(f"Is in public domain: {software_analysis['is_pd']}")
print(f"License analysis: {software_analysis['license_analysis']}")
print(f"Risk level: {software_analysis['risk_level']}")
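An open-source license is not the same as a public domain dedication: MIT-licensed code remains under copyright, while CC0 and the Unlicense waive it. The sketch below shows one way to separate those groups by SPDX identifier. The groupings are a small illustrative subset and `classify_license` is a hypothetical helper, not the library's `license_analysis` logic.

```python
# SPDX identifiers that dedicate a work to the public domain (or come as close
# as the jurisdiction allows).
PD_DEDICATIONS = {"CC0-1.0", "Unlicense"}
# Permissive licenses: broad reuse rights, but copyright still exists.
PERMISSIVE = {"MIT", "MIT-0", "0BSD", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0"}
# Copyleft licenses: reuse is conditional on sharing under the same terms.
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "LGPL-3.0-only", "AGPL-3.0-only"}

_GROUPS = {
    "public_domain_dedication": PD_DEDICATIONS,
    "permissive": PERMISSIVE,
    "copyleft": COPYLEFT,
}


def classify_license(spdx_id: str) -> str:
    """Return the license family for an SPDX identifier, or 'unknown'."""
    wanted = spdx_id.strip().casefold()
    for family, identifiers in _GROUPS.items():
        # SPDX identifiers are matched case-insensitively.
        if any(wanted == known.casefold() for known in identifiers):
            return family
    return "unknown"
```

Only the `public_domain_dedication` family would justify treating code as public domain on the basis of its license alone.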
8. Database Rights and Compilation Analysis
Handle special rights for databases beyond standard copyright:
# Analyze database rights which vary significantly by jurisdiction
db_rights = validator.analyze_database_compilation_rights(
    title="Statistical Database",
    creation_year=2010,
    compilation_type="database",
    jurisdiction="EU",  # EU has special database rights (sui generis)
    database_contents=["tables", "records", "statistics"],
    substantial_investment_claim=True
)
print(f"Database rights status: {db_rights['database_rights_analysis']['has_rights']}")
print(f"Years until PD: {db_rights['database_rights_analysis']['years_until_pd']}")
print(f"Protection type: {db_rights['database_rights_analysis']['protection_type']}")
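For context on a value like `years_until_pd`: under the EU Database Directive (96/9/EC), the sui generis right lasts 15 years counted from 1 January of the year after the database was completed. The sketch below works through that arithmetic. The function names are hypothetical, and the sketch ignores the rule that a substantial new investment can start a fresh term.

```python
SUI_GENERIS_TERM_YEARS = 15  # Term under Directive 96/9/EC, Article 10


def sui_generis_free_from(completion_year: int) -> int:
    """First calendar year in which the database right no longer applies.

    The term starts on 1 January of completion_year + 1 and runs 15 years,
    so protection covers completion_year + 1 through completion_year + 15.
    """
    return completion_year + SUI_GENERIS_TERM_YEARS + 1


def years_until_free(completion_year: int, current_year: int) -> int:
    """Whole calendar years remaining before the right has expired (0 if expired)."""
    return max(0, sui_generis_free_from(completion_year) - current_year)
```

A database completed in 2010 is therefore protected through the end of 2025 under this rule and is free of the sui generis right from 2026, independently of any copyright in its structure or contents.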
9. Performance and Neighboring Rights
Analyze rights in performances beyond the underlying work:
# Analyze performance and neighboring rights (different from the underlying composition)
perf_analysis = validator.analyze_performance_neighboring_rights(
    title="Live Performance Recording",
    performer="Performer Name",
    performance_year=2000,
    recording_year=2001,
    jurisdiction="US"
)
print(f"Performance rights PD: {perf_analysis['performance_rights_analysis']['is_public_domain']}")
print(f"Recording rights PD: {perf_analysis['recording_rights_analysis']['is_public_domain']}")
print(f"Overall risk: {perf_analysis['risk_level']}")
10. Historical Copyright Law Timeline
Track and analyze changes in copyright law over time:
# Add historical law changes to track legal evolution
law_changes = [
    {
        "effective_date": "1995-01-01",
        "terms": 70,  # Extended from life+50 to life+70
        "description": "Extension of copyright term",
        "law_type": "standard",
        "change_reason": "International treaty obligation"
    }
]
# Track timeline of copyright law changes
timeline_result = validator.track_copyright_law_timeline(
    country="DE",
    law_changes=law_changes,
    source="official_government_record",
    is_historical_data=True
)
# Get historical law at a specific date
law_at_date = validator.get_copyright_law_at_date(
    country="DE",
    target_date="1998-06-01",
    law_type="standard"
)
print(f"Laws in effect in 1998: {law_at_date['current_terms']} years")
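Looking up the law in force on a given date amounts to finding the most recent change whose effective date is not after the target. A minimal sketch of that lookup using a sorted list and `bisect` follows. The timeline data in the usage note is invented for illustration, and `term_at_date` is a hypothetical helper rather than the library's implementation.

```python
from bisect import bisect_right
from datetime import date


def term_at_date(changes, target):
    """Return the term in force on `target`, or None if no law applies yet.

    `changes` is an iterable of (effective_date, term_years) pairs.
    """
    ordered = sorted(changes)
    effective_dates = [effective for effective, _ in ordered]
    # bisect_right counts the changes effective on or before the target date.
    index = bisect_right(effective_dates, target)
    if index == 0:
        return None
    return ordered[index - 1][1]
```

With a made-up timeline such as `[(date(1900, 1, 1), 50), (date(1995, 1, 1), 70)]`, a lookup for 1 June 1998 returns 70 and a lookup for 1990 returns 50.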
11. Historical Analysis Reports
Generate comprehensive reports showing law evolution over time:
# Generate historical analysis report for a time period
start_year, end_year = 1980, 2020
historical_report = validator.create_historical_analysis_report(
    country="FR",
    start_year=start_year,
    end_year=end_year,
    include_database_rights=True
)
print(f"Report for {historical_report['country']} ({start_year}-{end_year})")
print(f"Standard changes: {historical_report['standard_changes_count']}")
print(f"Database changes: {historical_report['database_changes_count']}")
# Access year-by-year analysis
for year in ["1985", "1995", "2005"]:
    if year in historical_report['analysis_by_year']:
        year_analysis = historical_report['analysis_by_year'][year]
        print(f"  {year}: Standard={year_analysis['standard_copyright']} years, "
              f"DB Rights={year_analysis['database_rights']} years")
12. Law Change Impact Analysis
Analyze how specific law changes affect work public domain status:
# Analyze the impact of a specific law change on a work
impact_analysis = validator.analyze_impact_of_law_change(
    country="UK",
    change_date="2013-01-01",  # When UK extended some terms
    work_creation_year=1940,
    author_death_year=1970
)
print(f"Impact of law change on {impact_analysis['change_date']}")
print(f"Work created in {impact_analysis['work_creation_year']}")
print(f"Author died in {impact_analysis['author_death_year']}")
print(f"Potential impact: {impact_analysis['potential_impact']}")
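One way to reason about the impact of a term extension is to compare the expiry year before and after the change. The sketch below does this under a stated assumption: works already in the public domain when the change takes effect stay there. Some real extensions did revive expired copyrights, so this is an illustration only, and `term_change_impact` is a hypothetical helper.

```python
def term_change_impact(author_death_year, old_term, new_term, change_year):
    """Compare public domain years under the old and new 'life + N' terms.

    Assumes no revival: a work already public domain in change_year is unaffected.
    """
    pd_year_before = author_death_year + old_term + 1
    pd_year_extended = author_death_year + new_term + 1
    already_pd = pd_year_before <= change_year
    pd_year_after = pd_year_before if already_pd else pd_year_extended
    return {
        "pd_year_before": pd_year_before,
        "pd_year_after": pd_year_after,
        "already_public_domain_at_change": already_pd,
        "extra_years": pd_year_after - pd_year_before,
    }
```

For an author who died in 1970, moving from life+50 to life+70 in 2013 pushes the public domain year from 2021 to 2041 in this model.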
13. JSON Data Management
Save and load configuration and results in structured format:
import asyncio
from isitpublic import PublicDomainValidator, ContentItem
validator = PublicDomainValidator()
# Save country copyright data to JSON
validator.save_country_copyright_data('data/copyright_terms.json')
# Validate multiple items and store results in an async function
async def validate_multiple():
    items = [
        ContentItem(title="Work 1", content="Content of work 1"),
        ContentItem(title="Work 2", content="Content of work 2")
    ]
    await validator.validate_and_store_results(items, 'data/validation_results.json')
# Run the async function
asyncio.run(validate_multiple())
# Load educational resources about public domain
pd_basics = validator.get_educational_resource('what_is_pd')
print(f"PD basics: {pd_basics['content'][0]['section']}")
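If you want to post-process stored results yourself, plain `json` from the standard library is enough. The sketch below shows a save/load round trip. The record layout (`title`, `is_public_domain`) is an assumption for illustration and may not match the schema the library writes.

```python
import json
import tempfile
from pathlib import Path


def save_results(results, path):
    """Write a list of result dicts as pretty-printed UTF-8 JSON."""
    target = Path(path)
    # Create parent folders such as "data/" if they do not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(
        json.dumps(results, indent=2, ensure_ascii=False), encoding="utf-8"
    )


def load_results(path):
    """Read results back from a JSON file written by save_results."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Round trip in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    results_file = Path(tmp) / "data" / "validation_results.json"
    save_results([{"title": "Work 1", "is_public_domain": True}], results_file)
    loaded = load_results(results_file)
```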
14. Decision Tree Workflow
The library includes a structured decision tree workflow that determines public domain status systematically. It provides a detailed, step-by-step analysis based on factors such as publication date, author death year, work type, and country.
from isitpublic import assess_public_domain_status_with_decision_tree
# Assess using the structured decision tree approach
result = assess_public_domain_status_with_decision_tree(
    title="A Novel",
    author_death_year=1990,
    publication_year=1985,
    work_type="individual",  # "individual", "corporate", "anonymous", "government"
    country="US",
    nationality="US",
    copyright_renewed=True  # For US works published 1928-1963
)
print(f"Is in Public Domain: {result['is_public_domain']}")
print(f"Explanation: {result['explanation']}")
print(f"Decision Path: {result['decision_path']}")
print(f"Confidence: {result['confidence']}%")
You can also use the method directly on a validator instance for more control:
from isitpublic import PublicDomainValidator
validator = PublicDomainValidator()
# Using the decision tree method on a validator instance
result = validator.assess_public_domain_status_with_decision_tree(
    title="Historical Document",
    publication_year=1920,  # Published before 1928, so in US PD
    country="US"
)
print(f"Result: {result}")
The decision tree workflow follows these steps:
- Determines work type (literary, cinematographic, musical, artistic, anonymous/pseudonymous, corporate, etc.)
- Applies appropriate copyright term rules based on:
  - Publication date
  - Author death date (if individual work)
  - Work type (individual vs corporate vs anonymous)
  - Country of origin/publishing
  - Author nationality
  - Copyright renewal status (for US works 1928-1963)
- Provides detailed explanations of each decision point
- Returns confidence level in the determination
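The steps above can be pictured as a chain of ordered checks that records each branch taken. The sketch below is a heavily simplified US-only version, assuming a 95-year term for pre-1978 publications and life plus 70 years afterwards. It ignores works made for hire, unpublished works, and restored foreign copyrights, and `us_decision_tree` is a hypothetical function, not the library's decision tree.

```python
from datetime import date


def us_decision_tree(publication_year, author_death_year=None,
                     copyright_renewed=True, published_with_notice=True,
                     current_year=None):
    """Return (is_public_domain, decision_path); the result is None if undetermined."""
    if current_year is None:
        current_year = date.today().year
    path = []
    if publication_year < 1978:
        path.append("published before 1978")
        # Pre-1978 publications get at most 95 years from publication.
        if current_year > publication_year + 95:
            path.append("95-year term has expired")
            return True, path
        if not published_with_notice:
            path.append("published without copyright notice")
            return True, path
        # Works published through 1963 needed a renewal filing.
        if publication_year <= 1963 and not copyright_renewed:
            path.append("copyright was not renewed")
            return True, path
        path.append("still within 95-year term")
        return False, path
    path.append("published 1978 or later")
    if author_death_year is None:
        path.append("author death year unknown")
        return None, path
    if current_year > author_death_year + 70:
        path.append("life + 70 years has expired")
        return True, path
    path.append("still within life + 70 years")
    return False, path
```

Returning the path alongside the verdict mirrors the `decision_path` idea: the caller can show which rule decided the outcome instead of a bare boolean.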
15. Alternative Usage Without Async
For simpler use cases, the library also provides a synchronous function that doesn't require async/await:
from isitpublic import calculate_pd_from_metadata
metadata = {
    "author_death_year": 1601,  # Over 400 years ago!
    "publication_year": 1600,
    "country": "worldwide"
}
result = calculate_pd_from_metadata(metadata)
print(result) # {'is_public_domain': True, 'pd_year': 1672, ...}
Note that the more advanced validation methods like is_likely_public_domain() are async, while methods like calculate_pd_from_metadata() and assess_public_domain_status_with_decision_tree() are synchronous and can be used directly without async/await.
API Reference
PublicDomainValidator
Main validator class with comprehensive validation methods.
Methods:
- is_likely_public_domain(item, use_wikidata=False) - [ASYNC] Check if content is likely in public domain
- is_likely_public_domain_with_details(item) - [ASYNC] Detailed analysis with confidence and explanations
- assess_public_domain_status_with_decision_tree(title="", author_death_year=None, publication_year=None, work_type="individual", country="worldwide", nationality="worldwide", published_with_copyright_notice=True, copyright_renewed=True) - [SYNC] Structured decision tree analysis following systematic workflow
- calculate_pd_status_from_copyright_info(author_death_year=None, publication_year=None, country="worldwide", work_type="individual", is_government_work=False) - [SYNC] Calculate status from copyright data
- generate_jurisdiction_report(author_death_year, publication_year, work_title, work_type, is_government_work) - [SYNC] Comprehensive jurisdiction analysis
- assess_use_risk(author_death_year, publication_year, intended_jurisdictions, commercial_use) - [SYNC] Risk assessment for usage
- save_country_copyright_data(filepath) - [SYNC] Save country copyright data to JSON
- load_country_copyright_data(filepath) - [SYNC] Load country copyright data from JSON
- validate_and_store_results(items, output_file, country, work_type, is_government_work) - [ASYNC] Validate multiple items and store results
- store_pd_calculation_results(metadata_list, output_file) - [SYNC] Perform multiple calculations and store results
- get_educational_resources(category) - [SYNC] Retrieve educational materials about public domain
- get_educational_resource(resource_name) - [SYNC] Retrieve specific educational resource
ContentItem
Simple data class for content to be validated.
Attributes:
- title: Title of the work
- content: Full content (optional)
- url: URL of the content (optional)
- snippet: Snippet or excerpt (optional)
Standalone Functions
- validate_public_domain_status(item, use_wikidata=False) - [ASYNC] Basic PD validation
- validate_public_domain_with_explanation(item, country, work_type, is_government_work, use_wikidata) - [ASYNC] PD validation with detailed explanations
- assess_public_domain_status_with_decision_tree(title="", author_death_year=None, publication_year=None, work_type="individual", country="worldwide", nationality="worldwide", published_with_copyright_notice=True, copyright_renewed=True) - [SYNC] Structured decision tree analysis following systematic workflow
- calculate_pd_from_metadata(metadata) - [SYNC] Calculate status from metadata dict
Async vs Sync Usage
When using async methods, wrap your code in an async function and use asyncio.run():
import asyncio
from isitpublic import PublicDomainValidator, ContentItem
async def main():
    validator = PublicDomainValidator()
    item = ContentItem(title="Shakespeare's Hamlet", content="To be or not to be...")
    is_pd = await validator.is_likely_public_domain(item)
    print(f"Is likely public domain: {is_pd}")
asyncio.run(main())
For sync methods, use them directly:
from isitpublic import calculate_pd_from_metadata
result = calculate_pd_from_metadata({
    "author_death_year": 1601,
    "country": "worldwide"
})
print(result)
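When many items need checking, async methods can be awaited concurrently with `asyncio.gather` and then exposed through a small synchronous wrapper. The sketch below demonstrates the pattern with stand-ins: `FakeItem` and `fake_is_likely_public_domain` are hypothetical placeholders so the example runs without the library installed, and they do not reproduce its validation logic.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeItem:
    """Stand-in for a content item; not the library's ContentItem."""
    title: str
    content: str = ""


async def fake_is_likely_public_domain(item):
    """Stand-in async check that only looks for a literal phrase."""
    await asyncio.sleep(0)  # Yield control, as an I/O-bound check would.
    return "public domain" in item.content.lower()


async def check_all(items):
    """Run all checks concurrently and map titles to results."""
    results = await asyncio.gather(
        *(fake_is_likely_public_domain(item) for item in items)
    )
    return {item.title: result for item, result in zip(items, results)}


def check_all_sync(items):
    """Synchronous entry point for callers that do not use async/await."""
    return asyncio.run(check_all(items))
```

`asyncio.gather` returns results in the order the coroutines were passed, which is what makes the `zip` with the original items safe.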
About Directory
The library includes educational materials about public domain concepts stored in JSON format in the data/about/ directory:
- what_is_pd.json - Basic definitions and concepts
- copyright_terms.json - Information about copyright terms
- jurisdiction_rules.json - Country-specific public domain rules
- historical_works.json - Examples of famous public domain works
- misconceptions.json - Common myths and misunderstandings
- index.json - Master index of all educational resources
License
AGPLv3 License for code. See the LICENSE file for details.
Data files are licensed under Creative Commons Attribution Share Alike 4.0 International (CC BY-SA 4.0).
Development & Code Quality
This project uses Skylos for automated code quality and security analysis:
🔍 Code Quality Features
- Dead Code Detection: Automatically identifies and removes unused imports, variables, and unreachable code
- Security Scanning: Checks for potential vulnerabilities, path traversal issues, and hardcoded secrets
- Pre-commit Integration: Automated quality checks before each commit
- CI/CD Pipeline: GitHub Actions workflow for continuous quality monitoring
🛠️ Development Setup
# Install pre-commit hooks
pre-commit install
# Run skylos manually
uv run skylos src/ --secrets --danger
# Run with verbose output
uv run skylos src/ --verbose
📊 Quality Metrics
- Dead Code: 0 detected ✅
- Security Issues: Continuously monitored
- Code Coverage: Maintained through automated testing
- Type Safety: Pydantic models ensure data validation
Architecture Note
This is the core isitpublic library focused solely on public domain determination logic.
Web API, GraphQL, and advanced application features have been separated into a dedicated application layer
that builds upon this library, ensuring the core library remains lightweight and focused on its primary function.