Singer tap for spreadsheets, built with the Meltano Singer SDK.
tap-spreadsheets
tap-spreadsheets is a Singer tap for spreadsheets.
Built with the Meltano Tap SDK for Singer Taps.
Capabilities
- catalog
- state
- discover
- activate-version
- about
- stream-maps
- schema-flattening
- batch
Supported Python Versions
- 3.10
- 3.11
- 3.12
- 3.13
- 3.14
Settings
| Setting | Required | Default | Description |
|---|---|---|---|
| files | True | None | List of file configurations. |
| stream_maps | False | None | Config object for stream maps capability. For more information check out Stream Maps. |
| stream_maps.else | False | None | Currently, only setting this to __NULL__ is supported. This will remove all other streams. |
| stream_map_config | False | None | User-defined config values to be used within map expressions. |
| faker_config | False | None | Config for the Faker instance variable fake used within map expressions. Only applicable if the plugin specifies faker as an additional dependency (through the singer-sdk faker extra or directly). |
| faker_config.seed | False | None | Value to seed the Faker generator for deterministic output: https://faker.readthedocs.io/en/master/#seeding-the-generator |
| faker_config.locale | False | None | One or more LCID locale strings to produce localized output for: https://faker.readthedocs.io/en/master/#localization |
| flattening_enabled | False | None | 'True' to enable schema flattening and automatically expand nested properties. |
| flattening_max_depth | False | None | The max depth to flatten schemas. |
| batch_config | False | None | Configuration for BATCH message capabilities. |
| batch_config.encoding | False | None | Specifies the format and compression of the batch files. |
| batch_config.encoding.format | False | None | Format to use for batch files. |
| batch_config.encoding.compression | False | None | Compression format to use for batch files. |
| batch_config.storage | False | None | Defines the storage layer to use when writing batch files. |
| batch_config.storage.root | False | None | Root path to use when writing batch files. |
| batch_config.storage.prefix | False | None | Prefix to use when writing batch files. |
A full list of supported settings and capabilities is available by running: tap-spreadsheets --about
Configuration
Accepted Config Options
files (array) List of file configurations. Each entry is an object with keys:
- path (string, required): Glob expression (local or S3).
- format (string): 'excel' or 'csv'.
- worksheet (string, required when format is 'excel'): Worksheet index, name, or regular expression (Excel only). When a regular expression is used, every matching worksheet is processed.
- table_name (string): Optional stream name (defaults to the file name).
- primary_keys (array): List of primary key (PK) column names.
- drop_empty (boolean): Drop rows with empty/null PKs.
- skip_columns (integer): Number of leading columns to skip.
- skip_rows (integer): Rows to skip before headers.
- sample_rows (integer): Rows to sample for schema inference.
- column_headers (array): Explicit column headers.
- delimiter (string): CSV delimiter. Inferred when not provided, falling back to ",".
- quotechar (string): CSV quote character. Inferred when not provided, falling back to '"'.
- schema_overrides (dict): Overrides the JSON schema definition per field, e.g. schema_overrides: { my_column_name: { type: [string, "null"] } }
Example
config:
  files:
    - path: data/*.xlsx
      format: excel
      # table_name: test_sheet1
      primary_keys: [date]
      drop_empty: true
      worksheet: Sheet1
    - path: data/*.xlsx
      format: excel
      worksheet: "Report 20[0-9]{2}"
      table_name: my_xlsx_sheet2
      primary_keys: [date, total]
      drop_empty: true
      skip_columns: 1
      skip_rows: 4
    - path: s3://my-bucket/reports/*.csv
      format: csv
      table_name: csv_reports
      primary_keys: [id]
      delimiter: ";"
      quotechar: "'"
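The example above uses the YAML layout of a Meltano project. When the tap is run directly, --config takes a path to a JSON file, so the same settings need to be expressed as JSON. The following sketch renders two of the entries above; the file name config.json and the helper names are assumptions chosen for illustration.

```python
import json
from pathlib import Path

# Same keys as the YAML example, minus the Meltano "config:" wrapper.
CONFIG = {
    "files": [
        {
            "path": "data/*.xlsx",
            "format": "excel",
            "worksheet": "Sheet1",
            "primary_keys": ["date"],
            "drop_empty": True,
        },
        {
            "path": "s3://my-bucket/reports/*.csv",
            "format": "csv",
            "table_name": "csv_reports",
            "primary_keys": ["id"],
            "delimiter": ";",
            "quotechar": "'",
        },
    ]
}


def render_config() -> str:
    """Serialize the settings as the JSON document passed to --config."""
    return json.dumps(CONFIG, indent=2)


def write_config(target: Path) -> Path:
    """Write the JSON config to disk (hypothetical helper)."""
    target.write_text(render_config(), encoding="utf-8")
    return target
```

After calling write_config(Path("config.json")), the tap could be invoked as tap-spreadsheets --config config.json --discover.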
Configure using environment variables
When --config=ENV is passed, this Singer tap automatically loads environment variables from the .env file in the working directory.
A config value is then picked up whenever a matching environment variable is set,
either in the terminal context or in the .env file.
Installation
Install from PyPI:
Install from GitHub:
uv tool install git+https://github.com/celine-eu/tap-spreadsheets.git@main
Usage
You can easily run tap-spreadsheets by itself or in a pipeline using Meltano.
Executing the Tap Directly
tap-spreadsheets --version
tap-spreadsheets --help
tap-spreadsheets --config CONFIG --discover > ./catalog.json
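The --discover command writes a Singer catalog, a JSON document with a top-level streams array. A minimal sketch for inspecting such a file follows. The sample catalog is made up for illustration and only mirrors the general Singer catalog shape; the stream and column names are assumptions, not real output of this tap.

```python
import json

# Made-up catalog in the general Singer layout, for illustration only.
SAMPLE_CATALOG = """
{
  "streams": [
    {
      "tap_stream_id": "csv_reports",
      "stream": "csv_reports",
      "key_properties": ["id"],
      "schema": {
        "type": "object",
        "properties": {
          "id": {"type": ["integer", "null"]},
          "total": {"type": ["number", "null"]}
        }
      }
    }
  ]
}
"""


def list_streams(catalog_text: str) -> list[tuple[str, list[str]]]:
    """Return (stream id, sorted column names) for each stream in a catalog."""
    catalog = json.loads(catalog_text)
    return [
        (
            stream["tap_stream_id"],
            sorted(stream.get("schema", {}).get("properties", {})),
        )
        for stream in catalog.get("streams", [])
    ]


if __name__ == "__main__":
    for stream_id, columns in list_streams(SAMPLE_CATALOG):
        print(stream_id, columns)
```

To inspect a real catalog, pass the contents of ./catalog.json to list_streams instead of the sample string.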
Developer Resources
Follow these instructions to contribute to this project.
Initialize your Development Environment
Prerequisites:
- Python 3.10+
- uv
uv sync
Create and Run Tests
Create tests within the tests subfolder, then run:
uv run pytest
You can also test the tap-spreadsheets CLI directly using uv run:
uv run tap-spreadsheets --help
Testing with Meltano
Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.
Next, install Meltano (if you haven't already) and any needed plugins:
# Install meltano
uv tool install meltano
# Initialize meltano within this directory
cd tap-spreadsheets
meltano install
Now you can test and orchestrate using Meltano:
# Test invocation:
meltano invoke tap-spreadsheets --version
# OR run a test ELT pipeline:
meltano run tap-spreadsheets target-jsonl
SDK Dev Guide
See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.
File details
Details for the file tap_spreadsheets-1.0.2.tar.gz.
File metadata
- Download URL: tap_spreadsheets-1.0.2.tar.gz
- Upload date:
- Size: 174.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 619daaace5338a03311478fdf4e93952c2e997c214b0bfcf6b675cb71dd596c0 |
| MD5 | 67abf6dadba4b5429e54177332536c1c |
| BLAKE2b-256 | ed313d9ac494b33c5088f766502ae5d6ba03a312c4caa90039f6302671e2d607 |
Provenance
The following attestation bundles were made for tap_spreadsheets-1.0.2.tar.gz:
Publisher: build.yml on celine-eu/tap-spreadsheets
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: tap_spreadsheets-1.0.2.tar.gz
  - Subject digest: 619daaace5338a03311478fdf4e93952c2e997c214b0bfcf6b675cb71dd596c0
  - Sigstore transparency entry: 584113779
  - Sigstore integration time:
- Permalink: celine-eu/tap-spreadsheets@00b009e2315d978a321026f4be9f264c3cd3c642
- Branch / Tag: refs/tags/v1.0.2
- Owner: https://github.com/celine-eu
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build.yml@00b009e2315d978a321026f4be9f264c3cd3c642
- Trigger Event: push
File details
Details for the file tap_spreadsheets-1.0.2-py3-none-any.whl.
File metadata
- Download URL: tap_spreadsheets-1.0.2-py3-none-any.whl
- Upload date:
- Size: 14.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eddf41f2af1d670c22ca04dbe08afbf4d10d07e3484345586e39a7e88c2a2566 |
| MD5 | 7b75067f9b3f4d48935a215d093d3513 |
| BLAKE2b-256 | b8c906e790d117f26488a7f98efe7bec87bfbf02b504fc704215f6c1bf2bf8a6 |
Provenance
The following attestation bundles were made for tap_spreadsheets-1.0.2-py3-none-any.whl:
Publisher: build.yml on celine-eu/tap-spreadsheets
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: tap_spreadsheets-1.0.2-py3-none-any.whl
  - Subject digest: eddf41f2af1d670c22ca04dbe08afbf4d10d07e3484345586e39a7e88c2a2566
  - Sigstore transparency entry: 584113782
  - Sigstore integration time:
- Permalink: celine-eu/tap-spreadsheets@00b009e2315d978a321026f4be9f264c3cd3c642
- Branch / Tag: refs/tags/v1.0.2
- Owner: https://github.com/celine-eu
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build.yml@00b009e2315d978a321026f4be9f264c3cd3c642
- Trigger Event: push