Bridging the gap across different file formats and streamlining access to ingested data via Python objects
pyobjectify
Bridge the gap across different file formats and streamline access to ingested data via Python objects
Overview
Open data abounds. For example, NYC Open Data has over 3,000 datasets spanning more than 97 agencies in New York City. This data comes in many different formats, including CSV, JSON, XML, XLS/XLSX, KML, KMZ, Shapefile, GeoJSON, and more.
Importing and analyzing this data in Python involves sending a request to download the raw data, then converting it into a Python object so that its contents can be parsed with that object's methods. However, this process varies across the many different data types.
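To make that variation concrete, here is a minimal sketch of the manual route using only the Python standard library. The `parse_raw` helper and its `fmt` argument are hypothetical names for illustration and are not part of pyobjectify. The download step is left out so the sketch runs offline; in practice the bytes would come from an HTTP response or a file read.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_raw(raw: bytes, fmt: str):
    """Convert raw bytes into a Python object (hypothetical helper).

    Each format needs a different parser and yields a different kind of object.
    """
    text = raw.decode("utf-8")
    if fmt == "json":
        return json.loads(text)  # dict or list
    if fmt == "csv":
        return list(csv.reader(io.StringIO(text)))  # list of rows
    if fmt == "tsv":
        return list(csv.reader(io.StringIO(text), delimiter="\t"))
    if fmt == "xml":
        return ET.fromstring(text)  # xml.etree.ElementTree.Element
    raise ValueError(f"unsupported format: {fmt}")


# Sample payloads stand in for downloaded data.
print(parse_raw(b'{"borough": "Queens"}', "json"))
print(parse_raw(b"id,name\n1,Alice\n", "csv"))
```

Even in this small sketch, every format has its own parser and its own return type, which is the friction described above.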
This project aims to streamline this process and bridge the gap across the different file formats, so that the end user can get started on data analytics with a single function call.
Install from pip
pip install pyobjectify
Quick start
import pyobjectify
import pandas as pd
json_dict = pyobjectify.from_url("https://bit.ly/42KCUSv") # URL holds JSON data, returns data in dict
json_df = pyobjectify.from_url("https://bit.ly/42KCUSv", pd.DataFrame) # User-specified output data type
Supported types
Connectivity types
- Local files (e.g. ./relative/example.json, /absolute/path/example.json)
- Online, static (e.g. https://some.website/example.json, http://bit.ly/some-json-endpoint)
Other connectivity types are not supported at the moment; for example, a data stream from the Internet cannot be read.
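The distinction between the two connectivity types can be illustrated with a short sketch. The `connectivity_type` helper below is a hypothetical name and is not necessarily how pyobjectify tells them apart; it assumes that anything with an `http` or `https` scheme is online and that everything else is a local path.

```python
from urllib.parse import urlparse


def connectivity_type(resource: str) -> str:
    """Classify a resource string as 'online' or 'local' (hypothetical helper)."""
    scheme = urlparse(resource).scheme.lower()
    if scheme in ("http", "https"):
        return "online"
    # Assumption: anything without an HTTP(S) scheme is treated as a file path.
    return "local"


for resource in ("./relative/example.json", "https://some.website/example.json"):
    print(resource, "->", connectivity_type(resource))
```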
Resource (input) data types
- JSON
- CSV
- TSV
- XML
- XLSX
Supported conversions
- JSON → dict, list, pandas.DataFrame
- CSV → list
- TSV → list
- XML → dict
- XLSX → dict
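The conversions listed above can be written down as a lookup table, which makes it easy to check a requested conversion before attempting it. This is only a sketch: `SUPPORTED_CONVERSIONS` and `is_supported` are hypothetical names, not part of the pyobjectify API, and output types are written as strings so the example needs no third-party imports.

```python
# Supported conversions from the list above, keyed by input type.
SUPPORTED_CONVERSIONS = {
    "JSON": ("dict", "list", "pandas.DataFrame"),
    "CSV": ("list",),
    "TSV": ("list",),
    "XML": ("dict",),
    "XLSX": ("dict",),
}


def is_supported(input_type: str, output_type: str) -> bool:
    """Return True if the table lists input_type -> output_type."""
    return output_type in SUPPORTED_CONVERSIONS.get(input_type.upper(), ())


print(is_supported("JSON", "pandas.DataFrame"))
print(is_supported("CSV", "dict"))
```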