sqoopy
======

Python CLI to generate custom [sqoop](http://sqoop.apache.org/) import statements.
Modified from [https://github.com/wikimedia/sqoopy/](https://github.com/wikimedia/sqoopy/).


## Installation

You can install `sqoopy` via `pip`:

```bash
$ pip install sqoopy
```

## Usage

`sqoopy` will generate custom [sqoop](http://sqoop.apache.org/) import statements given a few simple options:

```bash
usage: sqoopy [-h] [-c CONNECT] [-d TARGET_DIR] [-t TABLES] [-x EXCL_TABLES]
              [--generate] [--pool-size POOL_SIZE]
              [--max-pool-maps MAX_POOL_MAPS] [--min-mbs MIN_MBS]
              [--max-mbs MAX_MBS]

Python CLI to generate custom sqoop import statements.

optional arguments:
  -h, --help            show this help message and exit
  -c CONNECT, --connect CONNECT
                        A jdbc connection string.
  -d TARGET_DIR, --target-dir TARGET_DIR
                        The directory to send output to. If sending to s3, use
                        "{table}" to insert the table name into the directory.
                        EG: s3://my-bucket/{table}/
  -t TABLES, --tables TABLES
                        (Optional) comma-separated list of tables that need to
                        be inspected. If not supplied, all tables will be
                        imported.
  -x EXCL_TABLES, --excl-tables EXCL_TABLES
                        (Optional) comma-separated list of tables to exclude.
                        If not supplied and --tables not specified, all tables
                        will be imported.
  --generate            Just generate the sqoop commands and print them to the
                        console.
  --pool-size POOL_SIZE
                        The number of commands to execute concurrently
  --max-pool-maps MAX_POOL_MAPS
                        The number of mappers at which the import of a table
                        will occur serially, after all other pooled imports
                        are complete
  --min-mbs MIN_MBS     The minimum chunk size (in MBs). Used to determine the
                        number of mappers needed for a given table
  --max-mbs MAX_MBS     The maximum chunk size (in MBs). Used to determine the
                        number of mappers needed for a given table
```
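The help text does not spell out how `--min-mbs`, `--max-mbs`, and `--max-pool-maps` interact, so the following is only a rough sketch of one plausible reading, not `sqoopy`'s actual logic. The function names `estimate_mappers` and `split_pooled_and_serial`, the table sizes, and the rounding rules are all assumptions made for illustration.

```python
import math


def estimate_mappers(table_mb, min_mbs, max_mbs):
    """Hypothetical heuristic: pick a mapper count from a table's size.

    Uses the fewest mappers that keep each chunk at or below max_mbs,
    but never so many that a chunk would fall below min_mbs.
    """
    fewest = max(1, math.ceil(table_mb / max_mbs))   # keeps chunks <= max_mbs
    most = max(1, math.floor(table_mb / min_mbs))    # keeps chunks >= min_mbs
    return min(fewest, most)


def split_pooled_and_serial(mappers_by_table, max_pool_maps):
    """Hypothetical partition: tables at or above max_pool_maps run serially."""
    pooled = [t for t, m in mappers_by_table.items() if m < max_pool_maps]
    serial = [t for t, m in mappers_by_table.items() if m >= max_pool_maps]
    return pooled, serial


# Made-up table sizes in MB, purely for illustration.
sizes = {"users": 1000, "events": 20000, "settings": 10}
mappers = {t: estimate_mappers(mb, min_mbs=64, max_mbs=256) for t, mb in sizes.items()}
pooled, serial = split_pooled_and_serial(mappers, max_pool_maps=32)
print(mappers, pooled, serial)
```

Under this reading, the pooled tables would be imported `--pool-size` at a time, and the serial ones would follow one by one once the pool has drained.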

You can also pass any other `sqoop import` arguments through to the generated command:

```bash
$ sqoopy --connect=sqlite:///tests/test.db --target-dir=s3://foo-bar/{table} --tables=test --split-by id --num-mappers 4
```

This will output:

```
sqoop import --connect=sqlite:///tests/test.db --table=test --target-dir=s3://foo-bar/test/ --split-by id --num-mappers 4
```
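As a minimal sketch of how this kind of pass-through can work, the snippet below uses `argparse`'s `parse_known_args` to separate the options it recognises from everything else, then appends the leftovers to each generated command. It is not taken from the `sqoopy` source: `build_commands` is a hypothetical helper, and only three of the options are modelled.

```python
import argparse


def build_commands(argv):
    """Hypothetical helper: build one `sqoop import` string per table."""
    parser = argparse.ArgumentParser(prog="sqoopy-sketch")
    parser.add_argument("-c", "--connect")
    parser.add_argument("-d", "--target-dir")
    parser.add_argument("-t", "--tables", default="")

    # Unrecognised arguments are returned in order instead of raising an error.
    known, passthrough = parser.parse_known_args(argv)

    commands = []
    for table in [t for t in known.tables.split(",") if t]:
        # Substitute the "{table}" placeholder in the target directory.
        target = known.target_dir.replace("{table}", table)
        parts = [
            "sqoop", "import",
            f"--connect={known.connect}",
            f"--table={table}",
            f"--target-dir={target}",
        ] + passthrough
        commands.append(" ".join(parts))
    return commands


for command in build_commands([
    "--connect=sqlite:///tests/test.db",
    "--target-dir=s3://foo-bar/{table}",
    "--tables=test",
    "--split-by", "id",
    "--num-mappers", "4",
]):
    print(command)
```

This sketch leaves the target directory as given and does not add the trailing slash that appears in the output shown above.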

## Tests

To run the tests, install `nose` and then run `nosetests`:

```
$ pip install nose
$ nosetests
```
