Airtable Download CSV helper

Airscraper

A simple scraper that downloads CSV from any Airtable shared view programmatically. Use it if:

  • You want to download a shared view periodically
  • You don't mind that the shared view can be accessed without authorization

Requirements

Because it's a simple scraper, only two libraries are needed:

  • BeautifulSoup4
  • Pandas

Installation

Using pip (Recommended)

pip install airscraper

Build From Source

  • Install build dependencies:
pip install --upgrade pip setuptools wheel
pip install tqdm
pip install --user --upgrade twine
  • Build the Package
    • python setup.py bdist_wheel
  • Install the built Package
    • pip install --upgrade dist/airscraper-0.1-py3-none-any.whl
  • Run it directly, without putting python in front of it
    • airscraper [url]

Direct Execution (Testing Purpose)

  • Clone this project
  • Install the requirements
    • pip install -r requirements.txt
  • Run the code
    • python airscraper/airscraper.py [url]

Usage

Create a shared view link and use that link to download the shared view as CSV. Every [url] in the examples refers to the shared view link you get from this step.

As CLI

# Print Result to Terminal
python airscraper/airscraper.py [url]

# Pipe the result to csv file
python airscraper/airscraper.py [url] > [filename].csv

As Python Package

from airscraper import AirScraper

url = "[url]"  # replace with your shared view link
client = AirScraper(url)
data = client.get_table().text

# print the result
print(data)

# save as file
with open('data.csv','w') as f:
  f.write(data)

# use it with pandas
from io import StringIO
import pandas as pd

df = pd.read_csv(StringIO(data), sep=',')
df.head()
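
For the periodic-download use case, a minimal sketch of writing each download to its own timestamped file could look like the following. `save_snapshot` and its parameters are hypothetical names, not part of airscraper; the CSV text is assumed to come from `client.get_table().text` as in the example above, and the timestamp is assumed to be in UTC.

```python
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(csv_text, out_dir, when=None):
    """Write csv_text to out_dir/<timestamp>.csv and return the file path.

    Hypothetical helper for illustration; not part of airscraper.
    `when` is assumed to be a UTC datetime (defaults to the current time).
    """
    when = when or datetime.now(timezone.utc)
    out_path = Path(out_dir) / f"{when:%Y%m%dT%H%M%SZ}.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" writes the CSV text as-is, without translating line endings
    with open(out_path, "w", encoding="utf-8", newline="") as f:
        f.write(csv_text)
    return out_path


# Possible usage (requires network access and a real shared view link):
# from airscraper import AirScraper
# data = AirScraper(url).get_table().text
# save_snapshot(data, "snapshots")
```

Running a script like this from a scheduler such as cron would keep one file per download instead of overwriting data.csv.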

Help

usage: airscraper [-h] [-l LOCALE] [-tz TIMEZONE] view_url

Download CSV from Airtable Shared View Link, You can pass the result to file using
'> name.csv'

positional arguments:
  view_url              url generated from sharing view using link in airtable

optional arguments:
  -h, --help            show this help message and exit
  -l LOCALE, --locale LOCALE
                        Your locale, default to 'en'
  -tz TIMEZONE, --timezone TIMEZONE
                        Your timezone, use URL encoded string, default to
                        'Asia/Jakarta'
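
The help text asks for a URL-encoded timezone. As a rough sketch, the argument list could be built from Python like this. `build_command` is a hypothetical helper, the flags are taken from the help output above, and the `airscraper` command is assumed to be on your PATH. Whether the tool expects `/` to be percent-encoded (`%2F`) is an assumption based on the help text; if it encodes the value itself, pass the timezone unchanged.

```python
from urllib.parse import quote


def build_command(view_url, locale="en", timezone="Asia/Jakarta"):
    """Return an argv list for the airscraper CLI.

    Hypothetical helper for illustration; the -l and -tz flags come from the
    help output. Percent-encoding the timezone is an assumption.
    """
    return [
        "airscraper",
        "-l", locale,
        "-tz", quote(timezone, safe=""),  # safe="" also encodes "/"
        view_url,
    ]


# Possible usage (requires the CLI to be installed and a real shared view link):
# import subprocess
# result = subprocess.run(build_command(url), capture_output=True, text=True, check=True)
# csv_text = result.stdout
```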

What's next

Currently I have several things in mind:

  • ✅ Make this an installable package
  • Make it usable on FaaS platforms (most use cases I can think of are related to this)
  • ✅ Create a proper package that can be imported (so I could use it in my ETL script)
  • ✅ Fill in LICENSE and setup.py (to be honest, I had no idea what to put into them)
    • It turns out there are a lot of resources out there if you know what to look for :)

Contributing

If you have a similar problem or any ideas to improve this package, please let me know in the issues or just hit me up on Twitter @BanditelolRP.

Development

If you're going to develop it yourself, here's my overall workflow:

1. Create a virtual environment

I usually use venv on Python 3.8 to create a new virtual environment:

python -m venv venv
# and activate the environment
source venv/bin/activate

2. Install the requirements

Install the necessary requirements, then install the package for development in editable mode:

pip install wheel pytest -q
pip install -r requirements.txt
pip install -e .

3. Play around with the code

You can browse the notebook for an explanation of how it works and some example use cases. I'd really appreciate help with documentation and testing. Have fun!
