sqlitebiter is a CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.
Project description
Summary
sqlitebiter is a CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.
Features
- Create a SQLite database file from:
- File(s):
CSV / Tab separated values (TSV) / Space separated values (SSV)
Microsoft Excel TM
HTML
- JSON
See also: Limitations of JSON conversion
Line-delimited JSON(LDJSON) / NDJSON / JSON Lines
Markdown
Mediawiki
SQLite
URL (scrape tabular data from web pages)
Multi-byte character support
Automatic file encoding detection
Usage
Create SQLite database from files
Create SQLite database from URL
Following is an example that converts HTML table tags within a web page to SQLite tables by the web page URL.
- Example:
$ sqlitebiter url "https://en.wikipedia.org/wiki/Comparison_of_firewalls" [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html1' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html2' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html3' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html4' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html5' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html6' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html7' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html8' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html9' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html10' table [INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html11' table [INFO] sqlitebiter url: converted results: source=1, success=11, created-table=11 [INFO] sqlitebiter url: database path: out.sqlite
- Output:
$ sqlite3 out.sqlite .schema CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html1' (Firewall TEXT, License TEXT, [Cost and usage limits] TEXT, OS TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html2' (Firewall TEXT, License TEXT, Cost TEXT, OS TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html3' ([Can Target:] TEXT, [Changing default policy to accept/reject (by issuing a single rule)] TEXT, [IP destination address(es)] TEXT, [IP source address(es)] TEXT, [TCP/UDP destination port(s)] TEXT, [TCP/UDP source port(s)] TEXT, [Ethernet MAC destination address] TEXT, [Ethernet MAC source address] TEXT, [Inbound firewall (ingress)] TEXT, [Outbound firewall (egress)] TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html4' ([Can:] TEXT, [work at OSI Layer 4 (stateful firewall)] TEXT, [work at OSI Layer 7 (application inspection)] TEXT, [Change TTL? (Transparent to traceroute)] TEXT, [Configure REJECT-with answer] TEXT, [DMZ (de-militarized zone) - allows for single/several hosts not to be firewalled.] TEXT, [Filter according to time of day] TEXT, [Redirect TCP/UDP ports (port forwarding)] TEXT, [Redirect IP addresses (forwarding)] TEXT, [Filter according to User Authorization] TEXT, [Traffic rate-limit / QoS] TEXT, Tarpit TEXT, Log TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html5' ([Features:] TEXT, "Configuration: GUI_ text or both modes?" TEXT, "Remote Access: Web (HTTP)_ Telnet_ SSH_ RDP_ Serial COM RS232_ ..." TEXT, [Change rules without requiring restart?] TEXT, [Ability to centrally manage all firewalls together] TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html6' ([Features:] TEXT, [Modularity: supports third-party modules to extend functionality?] TEXT, [IPS : Intrusion prevention system] TEXT, [Open-Source License?] TEXT, [supports IPv6 ?] TEXT, [Class: Home / Professional] TEXT, [Operating Systems on which it runs?] TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html7' ([Can:] TEXT, "NAT44 (static_ dynamic w/o ports_ PAT)" TEXT, "NAT64_ NPTv6" TEXT, [IDS (Intrusion Detection System)] TEXT, [VPN (Virtual Private Network)] TEXT, [AV (Anti-Virus)] TEXT, Sniffer TEXT, [Profile selection] TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html8' ([vteFirewall software] TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html9' (A TEXT, B TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html10' (A TEXT, B TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html11' (A TEXT, B TEXT);
The attributes within the converted SQLite database may include symbols as the above. Symbols within attributes can be replaced by using --replace-symbol option. The following example shows replacing symbols with underscores.
- Example:
$ sqlitebiter --replace-symbol _ -q url "https://en.wikipedia.org/wiki/Comparison_of_firewalls"
- Output:
$ sqlite3 out.sqlite .schema CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html1' (Firewall TEXT, License TEXT, "Cost_and_usage_limits" TEXT, OS TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html2' (Firewall TEXT, License TEXT, Cost TEXT, OS TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html3' ("Can_Target" TEXT, "Changing_default_policy_to_accept_reject_by_issuing_a_single_rule" TEXT, "IP_destination_address_es" TEXT, "IP_source_address_es" TEXT, "TCP_UDP_destination_port_s" TEXT, "TCP_UDP_source_port_s" TEXT, "Ethernet_MAC_destination_address" TEXT, "Ethernet_MAC_source_address" TEXT, "Inbound_firewall_ingress" TEXT, "Outbound_firewall_egress" TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html4' (Can TEXT, "work_at_OSI_Layer_4_stateful_firewall" TEXT, "work_at_OSI_Layer_7_application_inspection" TEXT, "Change_TTL_Transparent_to_traceroute" TEXT, "Configure_REJECT_with_answer" TEXT, "DMZ_de_militarized_zone_allows_for_single_several_hosts_not_to_be_firewalled" TEXT, "Filter_according_to_time_of_day" TEXT, "Redirect_TCP_UDP_ports_port_forwarding" TEXT, "Redirect_IP_addresses_forwarding" TEXT, "Filter_according_to_User_Authorization" TEXT, "Traffic_rate_limit_QoS" TEXT, Tarpit TEXT, Log TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html5' (Features TEXT, "Configuration_GUI_text_or_both_modes" TEXT, "Remote_Access_Web_HTTP_Telnet_SSH_RDP_Serial_COM_RS232" TEXT, "Change_rules_without_requiring_restart" TEXT, "Ability_to_centrally_manage_all_firewalls_together" TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html6' (Features TEXT, "Modularity_supports_third_party_modules_to_extend_functionality" TEXT, "IPS _Intrusion_prevention_system" TEXT, "Open_Source_License" TEXT, "supports_IPv6" TEXT, "Class_Home_Professional" TEXT, "Operating_Systems_on_which_it_runs" TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html7' (Can TEXT, "NAT44_static_dynamic_w_o_ports_PAT" TEXT, "NAT64_NPTv6" TEXT, "IDS_Intrusion_Detection_System" TEXT, "VPN_Virtual_Private_Network" TEXT, "AV_Anti_Virus" TEXT, Sniffer TEXT, "Profile_selection" TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html8' ("vteFirewall_software" TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html9' (A TEXT, B TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html10' (A TEXT, B TEXT); CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html11' (A TEXT, B TEXT);
Create SQLite database from stdin
Examples
A data format is a mandatory argument for sqlitebiter stdin subcommand:
- Example:
$ echo '[{"hoge": 4, "foo": "abc"}, {"hoge": 12, "foo": "xyz"}]' | sqlitebiter stdin json [INFO] convert 'stdin' to 'json1' table [INFO] converted results: source=1, success=1, created-table=1 [INFO] database path: out.sqlite
Command help
Usage: sqlitebiter [OPTIONS] COMMAND [ARGS]... Options: -o, --output-path PATH Output path of the SQLite database file. [default: out.sqlite] -a, --append Append table(s) to existing database. --add-primary-key PRIMARY_KEY_NAME Add 'PRIMARY KEY AUTOINCREMENT' column to a converted table with the specified name. --convert-config TEXT [experimental] Configurations for data conversion. The option can be used only for url subcommand. -i, --index INDEX_ATTR Comma separated attribute names to create indices. --no-type-inference All of the columns assume as TEXT data type in creating tables. --type-hint-header Use header suffixes as type hints. If there are type hints, convert columns by datatype corresponding with type hints. The following suffixes can be recognized as type hints (case insensitive): "text": TEXT datatype. "integer": INTEGER datatype. "real": REAL datatype. --matrix-formatting [header_aligned|trim] header_aligned: fitting table data to header columns. trim: fitting table data to minimum column size. [default: header_aligned] --replace-symbol TEXT Replace symbols in attributes. -v, --verbose Verbosity level. [default: 0] --max-workers WORKERS Specify the maximum number of workers that the command may use. [default: 1] --debug For debug print. -q, --quiet Suppress execution log messages. -h, --help Show this message and exit. Commands: completion A helper command to setup command completion. configure Configure the following application settings: file Convert tabular data within CSV/Excel/HTML/JSON/Jupyter... gs Convert a spreadsheet in Google Sheets to a SQLite database... stdin Convert tabular data within CSV/HTML/JSON/Jupyter... url Scrape tabular data from a URL and convert data to a SQLite... version Show version information
For more information
More examples are available at https://sqlitebiter.rtfd.io/en/latest/pages/usage/index.html
Installation
Installation: pip (Python package manager)
pip install sqlitebiter
Installation: apt
You can install the package by apt via a Personal Package Archive (PPA):
sudo add-apt-repository ppa:thombashi/ppa sudo apt update sudo apt install sqlitebiter
Installation: dpkg (.deb package)
The following commands will download the latest .deb package from the release page, and install it.
curl -sSL https://raw.githubusercontent.com/thombashi/sqlitebiter/master/scripts/installer.sh | sudo bash
Installation: Windows
sqlitebiter can be used in Windows environments without Python installation as follows:
Navigate to https://github.com/thombashi/sqlitebiter/releases
Download the latest version of the sqlitebiter_win_x64.zip
Unzip the file
Execute sqlitebiter.exe in either Command Prompt or PowerShell
Installation: Windows (PowerShell)
The following commands will download the latest execution binary from the release page to the current directory.
wget https://github.com/thombashi/sqlitebiter/raw/master/scripts/get-sqlitebiter.ps1 -OutFile get-sqlitebiter.ps1 Set-ExecutionPolicy Unrestricted -Scope Process -Force; .\get-sqlitebiter.ps1
Installation: brew for macOS
$ brew tap thombashi/sqlitebiter
$ brew install sqlitebiter
Command Completion (bash/zsh)
setup command completion for bash:
sqlitebiter completion bash >> ~/.bashrc
setup command completion for zsh:
sqlitebiter completion zsh >> ~/.zshrc
Dependencies
Python 3.7+
Python package dependencies
Google Sheets dependencies (Optional)
Extra Python packages are required to install to use the Google Sheets feature (gs subcommand):
The extra packages can be installed with the following pip command;
$ pip install sqlitebiter[gs]
note: binary packages include these dependencies
Misc dependencies (Optional)
Dependencies other than Python packages (Optional)
libxml2 (faster HTML/Markdown conversion)
pandoc (required when converting MediaWiki files)
Documentation
Sponsors
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file sqlitebiter-0.36.3.tar.gz
.
File metadata
- Download URL: sqlitebiter-0.36.3.tar.gz
- Upload date:
- Size: 50.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 9c810741bd6578c4370e4b01f5398b68f42eb26184bc0735d7cafc511f18195a |
|
MD5 | ecda0f7c17cf830dd22ed6efa3f15585 |
|
BLAKE2b-256 | afdf671b132f798f5732c22a9ad59767039165de8387dcaaba38024165b84f52 |
File details
Details for the file sqlitebiter-0.36.3-py3-none-any.whl
.
File metadata
- Download URL: sqlitebiter-0.36.3-py3-none-any.whl
- Upload date:
- Size: 33.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.4
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | fed73ff356429acba335fdd46a9c7a164f96124b7caae828fa5271b3dacb4419 |
|
MD5 | 49fdbb294b3e3805c6700ba60c7b761a |
|
BLAKE2b-256 | 96e9b9b0c21bf519c9efb1c4cbf0cfc2fde6961ac799c9560da4b07d6b59bd7a |