Spark Ingestion Framework for Tables (SIFFT) - a Python library providing a consistent interface for ingesting files into Apache Spark environments.
SIFFT
File ingestion pipelines are repetitive and fragile. SIFFT (Spark Ingestion Framework For Tables) is a Python library that aims to:
- Make file reading format-agnostic with automatic delimiter and header detection
- Validate data quality with CSVW metadata and constraint checking before it enters your tables
- Provide consistent error handling across CSV, Excel, TSV, and pipe-delimited files
- Support schema evolution and merge/upsert operations for Delta tables
- Reduce boilerplate code for common Spark ingestion patterns
Use cases
Data engineers
- Ingest files from multiple sources without writing custom parsers for each format
- Validate data constraints (required fields, patterns, ranges) before writing to tables
- Handle schema mismatches and type conversions safely with detailed error reporting
Data platform teams
- Standardize file ingestion patterns across projects with SIFFT
- Leverage CSVW metadata for self-documenting data pipelines
- Support both batch processing and single-file workflows
SIFFT and Databricks AutoLoader
SIFFT and AutoLoader solve different problems and work well together.
AutoLoader answers: "Which files are new?" It watches cloud storage for new files, tracks what's been seen, and incrementally feeds them into your pipeline using Structured Streaming. It handles file discovery and delivery at scale.
SIFFT answers: "What's in this file, and can I trust it?" It parses file contents with format detection, validates data against CSVW constraints, handles schema mismatches, and writes to tables with merge support. It handles data quality and transformation.
| Concern | AutoLoader | SIFFT |
|---|---|---|
| File discovery | ✅ Cloud file notification/listing | ❌ You provide file paths |
| Incremental processing | ✅ Streaming with checkpointing | ✅ Checksum-based deduplication |
| Format detection | ❌ You configure the schema | ✅ Auto-detects delimiters, headers, types |
| Schema validation | ❌ Relies on Spark inference or manual schema | ✅ CSVW metadata with constraint checking |
| Data quality checks | ❌ Not its job | ✅ Required fields, patterns, ranges, enums, primary keys |
| Schema evolution | ✅ mergeSchema on read | ✅ schema_mode on write (strict/merge/overwrite) |
| Write modes | ❌ Append only (you handle merge downstream) | ✅ Append, overwrite, merge/upsert, errorIfExists |
Using them together
AutoLoader discovers and delivers files. SIFFT validates and transforms the contents before they hit your tables.
Pattern 1: SIFFT handles everything, AutoLoader triggers the pipeline.
AutoLoader discovers new files and passes paths to SIFFT, which reads, validates, and writes:
from file_processing import process_file
from dataframe_validation import validate_csvw_constraints
from table_writing import write_table, TableWriteOptions
def process_new_file(file_path, spark):
    """Called per file from AutoLoader's foreachBatch or downstream processing."""
    result = process_file(file_path, spark)
    if not result.success:
        return  # Log and handle
    if result.metadata:
        report = validate_csvw_constraints(result.dataframe, result.metadata)
        if not report.valid:
            return  # Quarantine bad data
    options = TableWriteOptions(mode="append", add_source_metadata=True)
    write_table(result.dataframe, "catalog.schema.target", spark, options,
                source_file_path=file_path)
Pattern 2: AutoLoader reads the file, SIFFT validates and writes.
When AutoLoader has already parsed the file into a DataFrame, skip process_file and use SIFFT for validation and writing only:
from dataframe_validation import validate_csvw_constraints
from table_writing import write_table, TableWriteOptions
def validate_and_write(batch_df, batch_id):
    """Used as a foreachBatch function with AutoLoader's streaming DataFrame."""
    metadata = load_csvw_metadata("path/to/metadata.json")
    if metadata:
        report = validate_csvw_constraints(batch_df, metadata)
        if not report.valid:
            return  # Quarantine bad data
    options = TableWriteOptions(mode="append")
    write_table(batch_df, "catalog.schema.target", spark, options)
# AutoLoader discovers and reads files, SIFFT validates each micro-batch
(
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("header", "true")
    .schema(my_schema)
    .load("s3://bucket/landing/")
    .writeStream
    .foreachBatch(validate_and_write)
    .option("checkpointLocation", "/checkpoints/pipeline")
    .start()
)
When to use which
- AutoLoader alone: Simple ingestion where files have consistent schemas and you trust the data quality. AutoLoader + COPY INTO or readStream is sufficient.
- SIFFT alone: Batch processing, backfills, ad-hoc loads, or environments without Databricks. SIFFT's own checksum tracking handles deduplication.
- Both together: Production pipelines where files arrive continuously AND you need data quality gates before writing. AutoLoader handles the "when", SIFFT handles the "what".
Requirements
- Python 3.10+
- PySpark 3.5.7
- Optional: pandas, openpyxl (for Excel support)
Installation
pip install sifft
Quick Start
1. Read a File
The toolkit automatically detects file format by extension, applies default delimiters, and handles header inference and schema parsing:
from pyspark.sql import SparkSession
from file_processing import process_file
spark = SparkSession.builder.appName("Demo").getOrCreate()
result = process_file("data.csv", spark)
if result.success:
    df = result.dataframe
    print(f"Loaded {result.rows_processed} rows")
    df.show()
Exception mode (fail-fast):
For pipelines where you want to fail immediately on errors:
from file_processing import process_file, FileProcessingException
try:
    result = process_file("data.csv", spark, raise_on_error=True)
    df = result.dataframe  # No need to check success
except FileProcessingException as e:
    print(f"Error: {e.message}")
Skip already-processed files:
For scheduled jobs that shouldn't reprocess the same files:
from file_processing import process_file, clear_tracking
# First run - processes file, creates marker
result = process_file(
    "data.csv",
    spark,
    tracking="marker_file",
    tracking_location="/markers"
)
print(result.rows_processed) # 1000
# Second run - skips (same file, same content)
result = process_file(
    "data.csv",
    spark,
    tracking="marker_file",
    tracking_location="/markers"
)
print(result.rows_processed) # 0 (skipped)
# Force reprocessing
clear_tracking("data.csv", "marker_file", "/markers", spark)
result = process_file(
    "data.csv",
    spark,
    tracking="marker_file",
    tracking_location="/markers"
)
print(result.rows_processed) # 1000 (reprocessed)
Delta table tracking:
For queryable audit trails and easier cleanup:
# Track processing in a Delta table
result = process_file(
    "data.csv",
    spark,
    tracking="delta_table",
    tracking_location="catalog.schema.file_tracking"
)
# Query what's been processed
spark.sql("SELECT * FROM catalog.schema.file_tracking").show()
Deduplication behaviour:
Files are skipped only if the same path AND same content have been processed before:
# Same path, same content - skipped
process_file("landing/sales.csv", spark, tracking="marker_file", tracking_location="/markers")
process_file("landing/sales.csv", spark, tracking="marker_file", tracking_location="/markers")
# Second call skipped - same path and content
# Different paths, same content - both processed
process_file("2026/03/01/sales.csv", spark, tracking="marker_file", tracking_location="/markers")
process_file("2026/03/02/sales.csv", spark, tracking="marker_file", tracking_location="/markers")
# Both processed - different paths
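The skip decision comes down to a key built from the full path plus a content checksum: a file is skipped only when both match a previous record. Here is a minimal plain-Python sketch of that rule (illustrative only — SIFFT's actual checksum scheme and marker storage are internal to the library):

```python
import hashlib

seen = set()

def dedup_key(path: str, content: bytes) -> tuple[str, str]:
    """Build the (path, checksum) pair that must BOTH match for a skip."""
    return (path, hashlib.sha256(content).hexdigest())

def should_skip(path: str, content: bytes) -> bool:
    """True if this exact path+content combination was already processed."""
    key = dedup_key(path, content)
    if key in seen:
        return True  # same path AND same content -> skip
    seen.add(key)
    return False

data = b"id,name\n1,Alice\n"
print(should_skip("landing/sales.csv", data))     # False - first time
print(should_skip("landing/sales.csv", data))     # True  - same path, same content
print(should_skip("2026/03/02/sales.csv", data))  # False - same content, new path
```

Because the path is part of the key, re-delivering identical content under a new dated path is treated as a new file, which matches the behaviour shown above.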
For cloud storage (S3, Azure, GCS):
result = process_file(
    "s3://bucket/data.csv",
    spark,
    tracking="marker_file",
    tracking_location="s3://bucket/markers",
    storage_options={"key": "...", "secret": "..."}
)
2. Validate Data
If your file has CSVW metadata, validate data quality constraints before loading into tables:
from dataframe_validation import validate_csvw_constraints
# Validates if CSVW metadata exists
if result.metadata:
    report = validate_csvw_constraints(result.dataframe, result.metadata)
    if report.valid:
        print("✅ All constraints passed")
    else:
        for violation in report.violations:
            print(f"{violation.column}: {violation.message}")
3. Write to Table
Write validated data to Delta, Parquet, or ORC tables with support for append, overwrite, and merge operations:
from table_writing import write_table, TableWriteOptions
# Simple append
result = write_table(df, "my_table", spark)
# With options
options = TableWriteOptions(
    format="delta",
    mode="overwrite",
    partition_by=["date"]
)
result = write_table(df, "my_table", spark, options)
if result.success:
    print(f"Wrote {result.rows_written} rows in {result.duration_seconds:.2f}s")
CSVW Metadata Example
CSVW (CSV on the Web) metadata allows you to define schemas and validation constraints for your CSV files. Place a metadata file alongside your CSV with the naming pattern {filename}-metadata.json:
employees.csv:
id,name,age,department,salary
1,Alice,30,Engineering,75000
2,Bob,25,Sales,50000
3,Charlie,35,Engineering,85000
employees-metadata.json:
{
  "@context": "http://www.w3.org/ns/csvw",
  "tableSchema": {
    "columns": [
      {
        "name": "id",
        "datatype": "integer",
        "required": true,
        "constraints": {
          "unique": true
        }
      },
      {
        "name": "name",
        "datatype": "string",
        "required": true,
        "constraints": {
          "minLength": 2,
          "maxLength": 50
        }
      },
      {
        "name": "age",
        "datatype": "integer",
        "constraints": {
          "minimum": 18,
          "maximum": 65
        }
      },
      {
        "name": "department",
        "datatype": "string",
        "constraints": {
          "enum": ["Engineering", "Sales", "Marketing", "HR"]
        }
      },
      {
        "name": "salary",
        "datatype": "integer",
        "constraints": {
          "minimum": 30000
        }
      }
    ],
    "primaryKey": ["id"]
  }
}
Using CSVW metadata:
from file_processing import process_file
from dataframe_validation import validate_csvw_constraints
# Metadata is automatically discovered
result = process_file("employees.csv", spark)
if result.metadata:
    # Schema is applied from metadata
    print(f"Schema: {result.dataframe.schema}")

    # Validate constraints
    report = validate_csvw_constraints(result.dataframe, result.metadata)
    if not report.valid:
        for violation in report.violations:
            print(f"❌ {violation.column}: {violation.message}")
            print(f"   Violating rows: {violation.violating_rows}")
Supported constraints:
- required - No null values
- unique - No duplicates
- minimum / maximum - Numeric bounds
- minLength / maxLength - String length
- pattern - Regex validation
- enum - Allowed values
- primaryKey - Composite uniqueness
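To make the constraint semantics concrete, here is a plain-Python sketch of how a few of them behave on individual values. This is illustrative only (the helper name check_value is hypothetical) — SIFFT evaluates constraints as Spark expressions over whole columns, not value by value:

```python
import re

def check_value(value, constraints):
    """Return the names of constraints this single value violates."""
    violations = []
    if constraints.get("required") and value is None:
        violations.append("required")
    if value is None:
        return violations  # remaining checks only apply to present values
    if "minimum" in constraints and value < constraints["minimum"]:
        violations.append("minimum")
    if "maximum" in constraints and value > constraints["maximum"]:
        violations.append("maximum")
    if "minLength" in constraints and len(value) < constraints["minLength"]:
        violations.append("minLength")
    if "pattern" in constraints and not re.fullmatch(constraints["pattern"], value):
        violations.append("pattern")
    if "enum" in constraints and value not in constraints["enum"]:
        violations.append("enum")
    return violations

print(check_value(17, {"minimum": 18, "maximum": 65}))             # ['minimum']
print(check_value("Finance", {"enum": ["Engineering", "Sales"]}))  # ['enum']
print(check_value(None, {"required": True}))                       # ['required']
```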
Complete Pipeline Example
from pyspark.sql import SparkSession
from file_processing import process_file
from dataframe_validation import validate_csvw_constraints
from table_writing import write_table, TableWriteOptions
spark = SparkSession.builder.appName("Pipeline").getOrCreate()
# Step 1: Read file
result = process_file("data.csv", spark)
# Step 2: Validate
if result.metadata:
    report = validate_csvw_constraints(result.dataframe, result.metadata)
    if not report.valid:
        print(f"Validation failed: {len(report.violations)} violations")
        exit(1)
# Step 3: Write to table
options = TableWriteOptions(format="delta", mode="overwrite")
write_result = write_table(result.dataframe, "output_table", spark, options)
print(f"Pipeline complete: {write_result.rows_written} rows written")
Extensibility
The toolkit supports custom handlers and validators across all modules, allowing you to extend functionality for your specific needs.
File Processing - Custom Format Handlers
Register custom handlers for file formats:
from file_processing import register_handler, FileProcessingResult
def custom_dat_handler(file_path, spark, options):
    """Handle .dat files with summary rows."""
    lines = open(file_path).readlines()[1:]  # Skip first line
    df = spark.createDataFrame(...)
    return FileProcessingResult(
        success=True,
        file=file_path,
        dataframe=df,
        message="Processed .dat file",
        rows_processed=df.count()
    )
register_handler(".dat", custom_dat_handler, priority=10)
Registry functions:
from file_processing import register_handler, unregister_handler, list_registered_extensions
Table Writing - Custom Mode and Format Handlers
Register custom write modes or table formats:
from table_writing import register_mode_handler, TableWriteResult
def incremental_mode(df, table_name, spark, options, raise_on_error):
    """Custom incremental load logic."""
    # Your implementation
    return TableWriteResult(
        success=True,
        table_name=table_name,
        rows_written=df.count(),
        message="Incremental load complete"
    )
register_mode_handler("incremental", incremental_mode)
# Use it
options = TableWriteOptions(mode="incremental", format="parquet")
result = write_table(df, "my_table", spark, options)
Custom format handler:
from table_writing import register_format_handler
class IcebergFormatHandler:
    def supports_merge(self):
        return True

    def write(self, df, table_name, mode, options):
        df.write.format("iceberg").mode(mode).saveAsTable(table_name)
register_format_handler("iceberg", IcebergFormatHandler())
Registry functions:
from table_writing import (
    register_mode_handler, unregister_mode_handler, list_registered_modes,
    register_format_handler, get_format_handler
)
DataFrame Validation - Custom Constraint Validators
Register custom validation constraints for CSVW validation:
from pyspark.sql.functions import col, count, when
from pyspark.sql.functions import sum as spark_sum
from dataframe_validation import register_constraint_validator, ConstraintViolation
def balance_limit_validator(df, column_name, config):
    """Check that balance doesn't exceed credit_limit."""
    limit_col = config.get("limitColumn", "credit_limit")
    result = df.select(
        count("*").alias("total"),
        spark_sum(when(col(column_name) > col(limit_col), 1).otherwise(0)).alias("violations"),
    ).collect()[0]
    if result["violations"] > 0:
        return [ConstraintViolation(
            column=column_name,
            constraint_type="balanceLimit",
            message=f"{column_name} exceeds {limit_col}",
            violating_rows=result["violations"],
            total_rows=result["total"],
        )]
    return []
register_constraint_validator("balanceLimit", balance_limit_validator)
Use in CSVW metadata:
{
  "tableSchema": {
    "columns": [{
      "name": "balance",
      "constraints": {
        "balanceLimit": {"limitColumn": "credit_limit"}
      }
    }]
  }
}
Registry functions:
from dataframe_validation import (
    register_constraint_validator, unregister_constraint_validator, list_registered_constraints
)
Supported Formats
Input Formats
The toolkit supports reading the following file formats:
- CSV (.csv, .txt, .eot) - Comma-delimited files following RFC 4180
- TSV (.tsv) - Tab-delimited files
- Pipe-delimited (.dat, .out) - Pipe-separated values
- Excel (.xlsx, .xls) - Microsoft Excel files via openpyxl
Metadata Standards
- CSVW - CSV on the Web (W3C Recommendation) for schema and constraint definitions
Output Formats
The toolkit supports writing to the following table formats:
- Delta Lake - Delta Lake open-source storage layer with ACID transactions
- Parquet - Apache Parquet columnar storage format
- ORC - Apache ORC optimized row columnar format
API Reference
File Processing
Format is detected by file extension. Default delimiters are applied automatically.
Functions:
process_file(file_path, spark, read_options=None, metadata_path=None, raise_on_error=False, tracking=None, tracking_location=None, storage_options=None)
- Process a single file
- raise_on_error: If True, raises FileProcessingException instead of returning an error result
- tracking: Tracking method - "marker_file" or "delta_table". If None, no tracking.
- tracking_location: Directory for markers or Delta table name. Required if tracking is set.
- storage_options: Optional dict for cloud storage authentication
- Returns: FileProcessingResult with success, dataframe, rows_processed, metadata, checksum
process_files_batch(file_paths, spark, read_options=None, raise_on_error=False)
- Process multiple files
- raise_on_error: If True, raises FileProcessingException after processing all files if any failed
- Returns: List of FileProcessingResult
process_directory(directory_path, spark, read_options=None, raise_on_error=False)
- Process all files in a directory
- raise_on_error: If True, raises FileProcessingException after processing all files if any failed
- Returns: List of FileProcessingResult
Tracking Methods:
| Method | Location | Use Case |
|---|---|---|
| "marker_file" | /markers/ab/c1/{checksum}_{filename}.processed | Simple file-based tracking |
| "delta_table" | catalog.schema.tracking_table | Queryable audit trail |
Deduplication Logic:
Files are skipped only when both checksum AND full path match a previous record:
- Same path, same content → skipped
- Same path, different content → processed (file was updated)
- Different path, same content → both processed
- Different path, different content → both processed
Tracking Functions:
from file_processing import check_already_processed, record_processed, clear_tracking
# Check if file was processed
check_already_processed(file_path, "marker_file", "/markers", spark)
# Manually record processing
record_processed(file_path, "marker_file", "/markers", spark, rows_processed=100)
# Clear tracking to allow reprocessing
clear_tracking(file_path, "marker_file", "/markers", spark)
Options:
read_options = {
    "delimiter": "|",                    # Override default delimiter
    "header": "true",                    # First row is header (default: "true")
                                         # Options: "true", "false", "auto"
    "encoding": "utf-8",                 # File encoding
    "nullValue": "NULL",                 # String to treat as null
    "max_file_size_mb": 100,             # Maximum Excel file size in MB
    "max_corrupt_records_percent": 0.1,  # Maximum corruption threshold (10%)
    "sheet_name": 0,                     # Excel sheet to read
}
Header Detection:
- Default: "true" (assumes first row is header - most common case)
- "false": No header row; Spark generates column names (_c0, _c1, etc.)
- "auto": Automatically detect if a header exists (uses heuristics)
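SIFFT's exact "auto" heuristics aren't documented here, but header detection commonly works by comparing the types in the first row against the rows below it: header rows are rarely numeric while data rows often are. A simplified sketch of that idea (an illustration, not the library's actual code):

```python
def looks_numeric(value: str) -> bool:
    """True if the string parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def detect_header(rows: list[list[str]]) -> bool:
    """Guess whether row 0 is a header by counting numeric cells:
    fewer numeric cells than the next row suggests a header."""
    if len(rows) < 2:
        return True  # nothing to compare against; assume a header
    first_numeric = sum(looks_numeric(v) for v in rows[0])
    rest_numeric = sum(looks_numeric(v) for v in rows[1])
    return first_numeric < rest_numeric

print(detect_header([["id", "name", "age"], ["1", "Alice", "30"]]))  # True
print(detect_header([["1", "Alice", "30"], ["2", "Bob", "25"]]))     # False
```

Heuristics like this can misfire on all-string data, which is why the default stays "true" rather than "auto".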
CSVW Metadata:
Place a {filename}-metadata.json file alongside your CSV to define schema and constraints. Alternatively, specify a custom path:
result = process_file("data.csv", spark, metadata_path="/path/to/metadata.json")
Data Validation
Supported Constraints:
Column-level:
- required - No null values
- unique - No duplicates
- minimum / maximum - Numeric bounds
- minLength / maxLength - String length
- pattern - Regex validation
- enum - Allowed values
Table-level:
- primaryKey - Composite uniqueness
Functions:
validate_csvw_constraints(df, metadata)
- Validate DataFrame against CSVW metadata
- Returns: CSVWConstraintReport with valid, violations, total_violations
validate_schema(df, expected_schema)
- Validate DataFrame schema against expected schema
- Returns: Dictionary mapping column names to ValidationLog objects
apply_schema(df, expected_schema, raise_on_cast_failure=False)
- Apply expected schema to DataFrame (rename columns and cast types)
- raise_on_cast_failure: If True, raises ValueError if any casts would produce nulls
- Returns: Transformed DataFrame
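The raise_on_cast_failure semantics mirror Spark's permissive cast behaviour, where a failed cast yields null; the flag turns that silent null into an error. A plain-Python sketch of that behaviour (the helper cast_column is hypothetical and only illustrates the semantics, not the library's implementation):

```python
def cast_column(values, target_type, raise_on_cast_failure=False):
    """Cast each value, mimicking Spark's permissive cast-to-null behaviour."""
    out = []
    for v in values:
        try:
            out.append(target_type(v))
        except (TypeError, ValueError):
            if raise_on_cast_failure:
                raise ValueError(f"cast of {v!r} to {target_type.__name__} would produce null")
            out.append(None)  # permissive mode: a failed cast becomes null
    return out

print(cast_column(["1", "2", "oops"], int))  # [1, 2, None]
# cast_column(["oops"], int, raise_on_cast_failure=True)  # raises ValueError
```

Note that on Databricks ANSI-mode clusters Spark itself raises on incompatible casts, as the Compatibility Notes section below explains.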
format_violations(report)
- Format validation violations into readable string
- Returns: Formatted string with violation details
Example:
from dataframe_validation import validate_csvw_constraints, format_violations
report = validate_csvw_constraints(df, metadata)
if not report.valid:
    print(format_violations(report))
Table Writing
Write Modes:
- append - Add rows to existing table (default)
- overwrite - Replace all data
- merge - Upsert based on merge keys (Delta only)
- errorIfExists - Fail if table exists
Formats:
- delta (default)
- parquet
- orc
Schema Modes:
- strict - Exact schema match required (default)
- merge - Add new columns from the DataFrame
- overwrite - Replace the table schema
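The three schema modes can be pictured over simple {column: type} mappings — a minimal sketch of the decision, assuming a hypothetical resolve_schema helper (Delta Lake handles this natively on real tables):

```python
def resolve_schema(table_cols: dict, df_cols: dict, schema_mode: str) -> dict:
    """Illustrate strict/merge/overwrite over {name: type} dicts."""
    if schema_mode == "strict":
        if table_cols != df_cols:
            raise ValueError("schema mismatch: strict mode requires an exact match")
        return table_cols
    if schema_mode == "merge":
        # keep existing columns, append any new ones from the DataFrame
        return {**table_cols, **{k: v for k, v in df_cols.items() if k not in table_cols}}
    if schema_mode == "overwrite":
        return df_cols
    raise ValueError(f"unknown schema_mode: {schema_mode}")

table = {"id": "long", "name": "string"}
incoming = {"id": "long", "name": "string", "email": "string"}
print(resolve_schema(table, incoming, "merge"))
# {'id': 'long', 'name': 'string', 'email': 'string'}
```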
Functions:
write_table(df, table_name, spark, options=None)
- Write DataFrame to table
- Returns: TableWriteResult with success, rows_written, duration_seconds, message, error
TableWriteOptions:
TableWriteOptions(
    format="delta",              # Table format
    mode="append",               # Write mode
    partition_by=None,           # List of partition columns
    schema_mode="strict",        # Schema evolution mode
    merge_keys=None,             # Columns for merge mode
    create_if_not_exists=True,   # Auto-create table
    add_source_metadata=False,   # Add source file metadata columns
    extra_metadata=None          # Additional custom metadata columns
)
Examples:
Simple append:
result = write_table(df, "my_table", spark)
Overwrite with partitioning:
options = TableWriteOptions(
    mode="overwrite",
    partition_by=["year", "month"]
)
result = write_table(df, "my_table", spark, options)
With source metadata and custom columns:
options = TableWriteOptions(
    add_source_metadata=True,
    extra_metadata={
        "batch_id": "2026-03-10-001",
        "checksum": result.checksum,
        "environment": "production"
    }
)
result = write_table(
    df, "my_table", spark, options,
    source_file_path="s3://bucket/data.csv"
)
Merge/upsert (Delta only):
options = TableWriteOptions(
    format="delta",
    mode="merge",
    merge_keys=["id"]
)
result = write_table(df, "my_table", spark, options)
TableWriteResult:
result.success # bool - True if write succeeded
result.rows_written # int - Number of rows written
result.duration_seconds # float - Time taken
result.message # str - Success or error message
result.error # dict - Error details if failed
File Management
Functions:
safe_move(src_path, dst_path, storage_options=None, raise_on_error=False)
- Move a file safely with validation checks
- storage_options: Optional dict for custom authentication credentials
- Supports local and cloud storage (S3, Azure, GCS)
- raise_on_error: If True, raises FileManagementException on failure
- Returns: FileOperationResult with success, source_path, destination_path, error
list_files_in_directory(directory, storage_options=None, raise_on_error=False)
- List all files in a directory (non-recursive)
- storage_options: Optional dict for custom authentication credentials
- Supports local and cloud storage (S3, Azure, GCS)
- raise_on_error: If True, raises FileManagementException on failure
- Returns: FileOperationResult with success, files list, error
Supported Storage:
- Local filesystem: /path/to/file.csv
- AWS S3: s3://bucket/path/to/file.csv
- Azure Blob: az://container/path/to/file.csv
- Google Cloud Storage: gs://bucket/path/to/file.csv
Example:
from file_management import safe_move, list_files_in_directory
# Move a file
result = safe_move("input/data.csv", "processed/data.csv")
if result.success:
    print("File moved successfully")
# Move from S3 to S3
result = safe_move("s3://bucket/input/data.csv", "s3://bucket/processed/data.csv")
# S3 with custom credentials
storage_options = {
    "key": "AWS_ACCESS_KEY_ID",
    "secret": "AWS_SECRET_ACCESS_KEY"
}
result = safe_move(
    "s3://bucket/input/data.csv",
    "s3://bucket/processed/data.csv",
    storage_options=storage_options
)
# List files in a directory
result = list_files_in_directory("input/")
if result.success:
    for file in result.files:
        print(f"{file['name']}: {file['size']} bytes")
# List files in S3 bucket
result = list_files_in_directory("s3://bucket/input/")
if result.success:
    for file in result.files:
        print(f"{file['name']}: {file['size']} bytes")
Development
Setup
# Clone the repository
git clone <repository-url>
cd file-ingestion-framework
# Install dependencies with uv (recommended)
uv sync
# Or with pip
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
Running Tests
The project uses Docker to provide a consistent Spark environment for testing. This ensures tests run the same way locally and in CI/CD pipelines without requiring a local Spark installation.
# Run all tests with Docker (recommended)
just test
# Run tests locally (requires local Spark installation)
just test-local
# Run specific test file
just test-file tests/test_file_processing.py
Docker setup:
- docker-compose.yml defines the Spark test environment
- Dockerfile contains Apache Spark 3.4.0 with PySpark 3.5.7 and test dependencies
- Tests run in an isolated container to avoid environment conflicts
Available Commands
See justfile for all available commands:
just --list
Common commands:
- just format - Format code with ruff
- just lint - Run linting checks
- just build - Build package
- just demo - Run demo examples
Design Decisions
Architectural Decision Records (ADRs) are in the design_decisions directory:
- Architecture Overview — why four independent packages
- File Processing API — file-to-DataFrame conversion choices
- DataFrame Validation API — schema and constraint validation choices
- Table Writing API — DataFrame-to-table persistence choices
- File Management API — file operations and storage abstraction choices
- File Tracking and Deduplication — duplicate detection and tracking choices
Compatibility Notes
Databricks / Unity Catalog
- ANSI mode cast behaviour: Databricks clusters run with ANSI mode enabled by default. This means apply_schema() with incompatible type casts (e.g., "Alice" to IntegerType) will raise a Spark AnalysisException rather than silently producing nulls. The raise_on_cast_failure flag is effectively redundant on ANSI-mode clusters since Spark enforces strict casting natively.
- Schema strictness on overwrite: When using mode="overwrite" with schema_mode="strict" (the default), Delta Lake on Databricks will reject writes where the DataFrame schema doesn't exactly match the existing table schema (e.g., IntegerType vs LongType from inference differences). Use schema_mode="overwrite" if the incoming schema may differ.
License
MIT License - see LICENSE file for details.
Contributors
- Iwan Dyke
- Fahad Khan
- Michal Poreba