Skip to main content

sqlitebiter is a CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.

Project description

Summary

sqlitebiter is a CLI tool to convert CSV / Excel / HTML / JSON / Jupyter Notebook / LDJSON / LTSV / Markdown / SQLite / SSV / TSV / Google-Sheets to a SQLite database file.

PyPI package version Supported Python versions CI status of Linux/macOS/Windows CodeQL

Features

Usage

Create SQLite database from files

https://cdn.jsdelivr.net/gh/thombashi/sqlitebiter@master/docs/svg/usage_example.svg

Create SQLite database from URL

Following is an example that converts HTML table tags within a web page to SQLite tables by the web page URL.

Example:
$ sqlitebiter url "https://en.wikipedia.org/wiki/Comparison_of_firewalls"
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html1' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html2' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html3' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html4' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html5' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html6' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html7' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html8' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html9' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html10' table
[INFO] sqlitebiter url: convert 'Comparison_of_firewalls' to 'Comparison_of_firewalls_Wikipedia_html11' table
[INFO] sqlitebiter url: converted results: source=1, success=11, created-table=11
[INFO] sqlitebiter url: database path: out.sqlite
Output:
$ sqlite3 out.sqlite .schema
CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html1' (Firewall TEXT, License TEXT, [Cost and usage limits] TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html2' (Firewall TEXT, License TEXT, Cost TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html3' ([Can Target:] TEXT, [Changing default policy to accept/reject (by issuing a single rule)] TEXT, [IP destination address(es)] TEXT, [IP source address(es)] TEXT, [TCP/UDP destination port(s)] TEXT, [TCP/UDP source port(s)] TEXT, [Ethernet MAC destination address] TEXT, [Ethernet MAC source address] TEXT, [Inbound firewall (ingress)] TEXT, [Outbound firewall (egress)] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html4' ([Can:] TEXT, [work at OSI Layer 4 (stateful firewall)] TEXT, [work at OSI Layer 7 (application inspection)] TEXT, [Change TTL? (Transparent to traceroute)] TEXT, [Configure REJECT-with answer] TEXT, [DMZ (de-militarized zone) - allows for single/several hosts not to be firewalled.] TEXT, [Filter according to time of day] TEXT, [Redirect TCP/UDP ports (port forwarding)] TEXT, [Redirect IP addresses (forwarding)] TEXT, [Filter according to User Authorization] TEXT, [Traffic rate-limit / QoS] TEXT, Tarpit TEXT, Log TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html5' ([Features:] TEXT, "Configuration: GUI_ text or both modes?" TEXT, "Remote Access: Web (HTTP)_ Telnet_ SSH_ RDP_ Serial COM RS232_ ..." TEXT, [Change rules without requiring restart?] TEXT, [Ability to centrally manage all firewalls together] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html6' ([Features:] TEXT, [Modularity: supports third-party modules to extend functionality?] TEXT, [IPS : Intrusion prevention system] TEXT, [Open-Source License?] TEXT, [supports IPv6 ?] TEXT, [Class: Home / Professional] TEXT, [Operating Systems on which it runs?] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html7' ([Can:] TEXT, "NAT44 (static_ dynamic w/o ports_ PAT)" TEXT, "NAT64_ NPTv6" TEXT, [IDS (Intrusion Detection System)] TEXT, [VPN (Virtual Private Network)] TEXT, [AV  (Anti-Virus)] TEXT, Sniffer TEXT, [Profile selection] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html8' ([vteFirewall software] TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html9' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html10' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html11' (A TEXT, B TEXT);

The attributes within the converted SQLite database may include symbols as the above. Symbols within attributes can be replaced by using --replace-symbol option. The following example shows replacing symbols with underscores.

Example:
$ sqlitebiter --replace-symbol _ -q url "https://en.wikipedia.org/wiki/Comparison_of_firewalls"
Output:
$ sqlite3 out.sqlite .schema
CREATE TABLE IF NOT EXISTS '_source_info_' ("source_id" INTEGER NOT NULL, "dir_name" TEXT, "base_name" TEXT NOT NULL, "format_name" TEXT NOT NULL, "dst_table" TEXT NOT NULL, size INTEGER, mtime INTEGER);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html1' (Firewall TEXT, License TEXT, "Cost_and_usage_limits" TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html2' (Firewall TEXT, License TEXT, Cost TEXT, OS TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html3' ("Can_Target" TEXT, "Changing_default_policy_to_accept_reject_by_issuing_a_single_rule" TEXT, "IP_destination_address_es" TEXT, "IP_source_address_es" TEXT, "TCP_UDP_destination_port_s" TEXT, "TCP_UDP_source_port_s" TEXT, "Ethernet_MAC_destination_address" TEXT, "Ethernet_MAC_source_address" TEXT, "Inbound_firewall_ingress" TEXT, "Outbound_firewall_egress" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html4' (Can TEXT, "work_at_OSI_Layer_4_stateful_firewall" TEXT, "work_at_OSI_Layer_7_application_inspection" TEXT, "Change_TTL_Transparent_to_traceroute" TEXT, "Configure_REJECT_with_answer" TEXT, "DMZ_de_militarized_zone_allows_for_single_several_hosts_not_to_be_firewalled" TEXT, "Filter_according_to_time_of_day" TEXT, "Redirect_TCP_UDP_ports_port_forwarding" TEXT, "Redirect_IP_addresses_forwarding" TEXT, "Filter_according_to_User_Authorization" TEXT, "Traffic_rate_limit_QoS" TEXT, Tarpit TEXT, Log TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html5' (Features TEXT, "Configuration_GUI_text_or_both_modes" TEXT, "Remote_Access_Web_HTTP_Telnet_SSH_RDP_Serial_COM_RS232" TEXT, "Change_rules_without_requiring_restart" TEXT, "Ability_to_centrally_manage_all_firewalls_together" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html6' (Features TEXT, "Modularity_supports_third_party_modules_to_extend_functionality" TEXT, "IPS _Intrusion_prevention_system" TEXT, "Open_Source_License" TEXT, "supports_IPv6" TEXT, "Class_Home_Professional" TEXT, "Operating_Systems_on_which_it_runs" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html7' (Can TEXT, "NAT44_static_dynamic_w_o_ports_PAT" TEXT, "NAT64_NPTv6" TEXT, "IDS_Intrusion_Detection_System" TEXT, "VPN_Virtual_Private_Network" TEXT, "AV_Anti_Virus" TEXT, Sniffer TEXT, "Profile_selection" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html8' ("vteFirewall_software" TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html9' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html10' (A TEXT, B TEXT);
CREATE TABLE IF NOT EXISTS 'Comparison_of_firewalls_Wikipedia_html11' (A TEXT, B TEXT);

Create SQLite database from stdin

Examples

A data format is a mandatory argument for sqlitebiter stdin subcommand:

Example:
$ echo '[{"hoge": 4, "foo": "abc"}, {"hoge": 12, "foo": "xyz"}]' | sqlitebiter stdin json
[INFO] convert 'stdin' to 'json1' table
[INFO] converted results: source=1, success=1, created-table=1
[INFO] database path: out.sqlite

Command help

Usage: sqlitebiter [OPTIONS] COMMAND [ARGS]...

Options:
  -o, --output-path PATH          Output path of the SQLite database file.
                                  [default: out.sqlite]
  -a, --append                    Append table(s) to existing database.
  --add-primary-key PRIMARY_KEY_NAME
                                  Add 'PRIMARY KEY AUTOINCREMENT' column to a
                                  converted table with the specified name.
  --convert-config TEXT           [experimental] Configurations for data
                                  conversion. The option can be used only for
                                  url subcommand.
  -i, --index INDEX_ATTR          Comma separated attribute names to create
                                  indices.
  --no-type-inference             All of the columns assume as TEXT data type
                                  in creating tables.
  --type-hint-header              Use header suffixes as type hints. If there
                                  are type hints, convert columns by datatype
                                  corresponding with type hints. The following
                                  suffixes can be recognized as type hints
                                  (case insensitive): "text": TEXT datatype.
                                  "integer": INTEGER datatype. "real": REAL
                                  datatype.
  --matrix-formatting [header_aligned|trim]
                                  header_aligned: fitting table data to header
                                  columns. trim: fitting table data to minimum
                                  column size.  [default: header_aligned]
  --replace-symbol TEXT           Replace symbols in attributes.
  -v, --verbose                   Verbosity level.  [default: 0]
  --max-workers WORKERS           Specify the maximum number of workers that
                                  the command may use.  [default: 1]
  --debug                         For debug print.
  -q, --quiet                     Suppress execution log messages.
  -h, --help                      Show this message and exit.

Commands:
  completion  A helper command to setup command completion.
  configure   Configure the following application settings:
  file        Convert tabular data within CSV/Excel/HTML/JSON/Jupyter...
  gs          Convert a spreadsheet in Google Sheets to a SQLite database...
  stdin       Convert tabular data within CSV/HTML/JSON/Jupyter...
  url         Scrape tabular data from a URL and convert data to a SQLite...
  version     Show version information

For more information

More examples are available at https://sqlitebiter.rtfd.io/en/latest/pages/usage/index.html

Installation

Installation: pip (Python package manager)

pip install sqlitebiter

Installation: apt

You can install the package by apt via a Personal Package Archive (PPA):

sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install sqlitebiter

Installation: dpkg (.deb package)

The following commands will download the latest .deb package from the release page, and install it.

curl -sSL https://raw.githubusercontent.com/thombashi/sqlitebiter/master/scripts/installer.sh | sudo bash

Installation: Windows

sqlitebiter can be used in Windows environments without Python installation as follows:

  1. Navigate to https://github.com/thombashi/sqlitebiter/releases

  2. Download the latest version of the sqlitebiter_win_x64.zip

  3. Unzip the file

  4. Execute sqlitebiter.exe in either Command Prompt or PowerShell

Installation: Windows (PowerShell)

The following commands will download the latest execution binary from the release page to the current directory.

wget https://github.com/thombashi/sqlitebiter/raw/master/scripts/get-sqlitebiter.ps1 -OutFile get-sqlitebiter.ps1
Set-ExecutionPolicy Unrestricted -Scope Process -Force; .\get-sqlitebiter.ps1

Installation: brew for macOS

$ brew tap thombashi/sqlitebiter
$ brew install sqlitebiter

Command Completion (bash/zsh)

setup command completion for bash:

    sqlitebiter completion bash >> ~/.bashrc

setup command completion for zsh:

    sqlitebiter completion zsh >> ~/.zshrc

Dependencies

Python 3.7+

Python package dependencies

Google Sheets dependencies (Optional)

Extra Python packages are required to install to use the Google Sheets feature (gs subcommand):

The extra packages can be installed with the following pip command;

$ pip install sqlitebiter[gs]

note: binary packages include these dependencies

Misc dependencies (Optional)

Dependencies other than Python packages (Optional)

  • libxml2 (faster HTML/Markdown conversion)

  • pandoc (required when converting MediaWiki files)

Documentation

https://sqlitebiter.rtfd.io/

Sponsors

Charles Becker (chasbecker) onetime: Arturi0 onetime: Dmitry Belyaev (b4tman)

Become a sponsor

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlitebiter-0.36.3.tar.gz (50.5 kB view details)

Uploaded Source

Built Distribution

sqlitebiter-0.36.3-py3-none-any.whl (33.2 kB view details)

Uploaded Python 3

File details

Details for the file sqlitebiter-0.36.3.tar.gz.

File metadata

  • Download URL: sqlitebiter-0.36.3.tar.gz
  • Upload date:
  • Size: 50.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for sqlitebiter-0.36.3.tar.gz
Algorithm Hash digest
SHA256 9c810741bd6578c4370e4b01f5398b68f42eb26184bc0735d7cafc511f18195a
MD5 ecda0f7c17cf830dd22ed6efa3f15585
BLAKE2b-256 afdf671b132f798f5732c22a9ad59767039165de8387dcaaba38024165b84f52

See more details on using hashes here.

File details

Details for the file sqlitebiter-0.36.3-py3-none-any.whl.

File metadata

File hashes

Hashes for sqlitebiter-0.36.3-py3-none-any.whl
Algorithm Hash digest
SHA256 fed73ff356429acba335fdd46a9c7a164f96124b7caae828fa5271b3dacb4419
MD5 49fdbb294b3e3805c6700ba60c7b761a
BLAKE2b-256 96e9b9b0c21bf519c9efb1c4cbf0cfc2fde6961ac799c9560da4b07d6b59bd7a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page