Load data from CSV files into database tables
csv-ingestor
Load data from CSV files into PostgreSQL tables.
Installation
pip install csv-ingestor
Examples
In the simplest case, load data from a CSV file into a table:
from csv_ingestor import Ingestor, ingest_file


class MyIngestor(Ingestor):
    # File names matching this pattern are handled by this ingestor.
    filename_pattern = r'simple\.\d{8}_\d{4}\.csv(\.gz)?'
    tables = [
        {
            'table': 'my_table',
            'csv_columns': ('id', 'value'),
        }
    ]


ingest_file('simple.20240910_1430.csv.gz')
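To check which file names a `filename_pattern` would accept, you can try the pattern directly with Python's `re` module. This is a minimal sketch: `matches` is a hypothetical helper, and it assumes the whole file name has to match the pattern (hence `re.fullmatch`), which this README does not spell out.

```python
import re

# Pattern copied from MyIngestor above.
FILENAME_PATTERN = r'simple\.\d{8}_\d{4}\.csv(\.gz)?'


def matches(filename: str) -> bool:
    """Return True if the whole file name fits the pattern (assumed semantics)."""
    return re.fullmatch(FILENAME_PATTERN, filename) is not None


for name in (
    'simple.20240910_1430.csv.gz',  # date, time, and gzip suffix
    'simple.20240910_1430.csv',     # the .gz suffix is optional
    'simple.20240910.csv',          # missing the _HHMM part
):
    print(name, matches(name))
```

The first two names match; the third does not, because the pattern requires the four-digit time after the underscore.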
But maybe you have multiple tables to load from different CSV files, or from different fields in each file. Maybe the column names don't match what's in the CSV files, the data isn't quite the right shape, you'd like to skip some CSV records, and you'd like to update existing DB records:
from csv_ingestor import CSVPicker, Ingestor, SkipRecord, ingest_file


class MyPicker(CSVPicker):
    def check_skip(self, record):
        # Skip any record whose value starts with 'SKIP!'.
        if record['value'].startswith('SKIP!'):
            raise SkipRecord

    def modify_record(self, record):
        # Clean up the value in place before it is loaded.
        record['value'] = record['value'].replace('bad words', '@!#$*%&')


class OneIngestor(Ingestor):
    filename_pattern = r'data\.\d{8}_\d{4}\.csv(\.gz)?'
    tables = [
        {
            'table': 'my_first_table',
            'csv_columns': ('their_id', 'their_value'),
            # CSV column name -> DB column name.
            'column_map': {'their_id': 'id', 'their_value': 'value'},
            'on_conflict': '(id) DO UPDATE SET value = excluded.value',
        }
    ]


class AnotherIngestor(Ingestor):
    filename_pattern = r'other_data\.\d{8}\.csv(\.gz)?'
    csv_picker = MyPicker
    tables = [
        {
            'table': 'my_other_table',
            'csv_columns': ('id', 'value'),
            'on_conflict': '(id) DO UPDATE SET value = excluded.value',
        },
        {
            'table': 'a_third_table',
            'csv_columns': ('id', 'metadata'),
            'on_conflict': '(id) DO NOTHING',
        }
    ]


ingest_file('data.20240910_1430.csv.gz')
ingest_file('other_data.20240910.csv')
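The `column_map` and `on_conflict` settings read like pieces of a PostgreSQL `INSERT ... ON CONFLICT` statement. The sketch below shows one way such a table entry could translate to SQL. `build_insert` is a hypothetical helper, not part of csv-ingestor, and both the `%s` placeholders and the use of `INSERT` (rather than, say, `COPY`) are assumptions; the statements the library actually issues may differ.

```python
def build_insert(table_cfg: dict) -> str:
    """Build a parameterized INSERT for one `tables` entry (illustrative only)."""
    column_map = table_cfg.get('column_map', {})
    # Rename CSV columns to DB columns wherever a mapping is given.
    db_columns = [column_map.get(col, col) for col in table_cfg['csv_columns']]
    placeholders = ', '.join(['%s'] * len(db_columns))
    sql = (
        f"INSERT INTO {table_cfg['table']} ({', '.join(db_columns)}) "
        f"VALUES ({placeholders})"
    )
    # Assumption: the on_conflict text is appended after ON CONFLICT as-is.
    if 'on_conflict' in table_cfg:
        sql += f" ON CONFLICT {table_cfg['on_conflict']}"
    return sql


first_table = {
    'table': 'my_first_table',
    'csv_columns': ('their_id', 'their_value'),
    'column_map': {'their_id': 'id', 'their_value': 'value'},
    'on_conflict': '(id) DO UPDATE SET value = excluded.value',
}
print(build_insert(first_table))
# INSERT INTO my_first_table (id, value) VALUES (%s, %s) ON CONFLICT (id) DO UPDATE SET value = excluded.value
```

Read this way, `(id) DO UPDATE SET value = excluded.value` overwrites `value` on an existing row with the same `id`, while `(id) DO NOTHING` leaves the existing row untouched.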
Each Ingestor subclass will be tried in turn until one matches the filename, and that one will
be used to parse and load the data into its DB tables.
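The matching described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: `BaseIngestor` and `find_ingestor` are hypothetical names, subclasses are tried in definition order, and the whole file name is assumed to have to match the pattern.

```python
import re


class BaseIngestor:
    """Hypothetical stand-in for csv_ingestor.Ingestor."""
    filename_pattern = None


class OneIngestor(BaseIngestor):
    filename_pattern = r'data\.\d{8}_\d{4}\.csv(\.gz)?'


class AnotherIngestor(BaseIngestor):
    filename_pattern = r'other_data\.\d{8}\.csv(\.gz)?'


def find_ingestor(filename):
    """Return the first subclass whose pattern matches, or None."""
    # __subclasses__() lists direct subclasses in definition order.
    for cls in BaseIngestor.__subclasses__():
        if re.fullmatch(cls.filename_pattern, filename):
            return cls
    return None


print(find_ingestor('data.20240910_1430.csv.gz').__name__)  # OneIngestor
print(find_ingestor('other_data.20240910.csv').__name__)    # AnotherIngestor
print(find_ingestor('unknown.csv'))                         # None
```

Because the first match wins in this sketch, overlapping patterns would make the result depend on the order in which the classes are defined, so it is worth keeping each `filename_pattern` specific.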
Download files
File details
Details for the file csv_ingestor-0.1.3.tar.gz.
File metadata
- Download URL: csv_ingestor-0.1.3.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | affcd91111b1a1466bdcc20ef542dd2f74bd3b22774b6f73c59092a633fd5f54 |
| MD5 | 1993c4f65075297aff892c4894ff8b0f |
| BLAKE2b-256 | 3896aad656e77e129e175c75579d83021a02b12d65c10dbe4b15fcf4c718a4f2 |
File details
Details for the file csv_ingestor-0.1.3-py3-none-any.whl.
File metadata
- Download URL: csv_ingestor-0.1.3-py3-none-any.whl
- Upload date:
- Size: 5.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d3b49898cb851c654e9ae576947ee6b8bd6f66e6bcfa1ba28a6b2b8366a470ff |
| MD5 | f6c8b911f942c8630e2b26ad051387c3 |
| BLAKE2b-256 | e7124b6e45694e386a6375e4b3f5891e48eb1a49fc9a937ee318d82cc54bab77 |