
Transform Pandas DataFrames into Exports to be sent to DGraph


dgraphpandas

License: MIT

A library (with an accompanying CLI tool) that transforms Pandas DataFrames into RDF exports for the DGraph Live Loader.

python -m pip install dgraphpandas

Usage

Command Line

 dgraphpandas --help
usage: dgraphpandas [-h] [-x {upserts,schema,types}] [-f FILE] -c CONFIG
                    [-ck CONFIG_FILE_KEY] [-o OUTPUT_DIR] [--console]
                    [--export_csv] [--encoding ENCODING]
                    [--chunk_size CHUNK_SIZE]
                    [--gz_compression_level GZ_COMPRESSION_LEVEL]
                    [--key_separator KEY_SEPARATOR]
                    [--add_dgraph_type_records ADD_DGRAPH_TYPE_RECORDS]
                    [--drop_na_intrinsic_objects DROP_NA_INTRINSIC_OBJECTS]
                    [--drop_na_edge_objects DROP_NA_EDGE_OBJECTS]
                    [--illegal_characters ILLEGAL_CHARACTERS]
                    [--illegal_characters_intrinsic_object ILLEGAL_CHARACTERS_INTRINSIC_OBJECT]
                    [--version] [-v {DEBUG,INFO,WARNING,ERROR,NOTSET}]

This is a real example which you can find in the samples folder and run from the root of this repository:

dgraphpandas \
  --config samples/planets/dgraphpandas.json \
  --config_file_key planet \
  --file samples/planets/solar_system.csv \
  --output samples/planets/output
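If you would rather prepare and launch the same command from a script, the following is a minimal sketch. It assumes the JSON file passed to `--config` has the same shape as the dictionary shown in the Module section; the `build_command` helper and the trimmed-down config are hypothetical, and only the flags already shown above are used.

```python
import json
import tempfile
from pathlib import Path


def build_command(config_path, file_key, csv_path, output_dir):
    """Assemble the CLI invocation shown above as an argument list (hypothetical helper)."""
    return [
        "dgraphpandas",
        "--config", str(config_path),
        "--config_file_key", file_key,
        "--file", str(csv_path),
        "--output", str(output_dir),
    ]


# Assumed minimal config; a real one would also carry type_overrides and ignore_fields.
config = {
    "transform": "horizontal",
    "files": {"planet": {"subject_fields": ["id"], "edge_fields": ["type"]}},
}

# Write the config to disk so the CLI can read it.
workdir = Path(tempfile.mkdtemp())
config_path = workdir / "dgraphpandas.json"
config_path.write_text(json.dumps(config, indent=2))

command = build_command(
    config_path, "planet", "samples/planets/solar_system.csv", workdir / "output"
)
print(" ".join(command))

# To actually run it (requires dgraphpandas to be installed and the CSV to exist):
# import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.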

Module

This example can also be found in Notebook form.

import dgraphpandas as dpd

# Define a Configuration for your data file(s). Explained further in the Configuration section.
config = {
  "transform": "horizontal",
  "files": {
    "planet": {
      "subject_fields": ["id"],
      "edge_fields": ["type"],
      "type_overrides": {
        "order_from_sun": "int32",
        "diameter_earth_relative": "float32",
        "diameter_km": "float32",
        "mass_earth_relative": "float32",
        "mean_distance_from_sun_au": "float32",
        "orbital_period_years": "float32",
        "orbital_eccentricity": "float32",
        "mean_orbital_velocity_km_sec": "float32",
        "rotation_period_days": "float32",
        "inclination_axis_degrees": "float32",
        "mean_temperature_surface_c": "float32",
        "gravity_equator_earth_relative": "float32",
        "escape_velocity_km_sec": "float32",
        "mean_density": "float32",
        "number_moons": "int32",
        "rings": "bool"
      },
      "ignore_fields": ["image", "parent"]
    }
  }
}

# Perform a Horizontal Transform on the passed file using the config/key
# Generate RDF Upsert statements
intrinsic, edges = dpd.to_rdf('solar_system.csv', config, 'planet', output_dir='.', export_rdf=True)

# Do something with these statements, e.g. write them to a zip and ship to DGraph
# The CLI zips this output automatically
# In module mode, providing output_dir and export_rdf zips the output and writes it to disk
print(intrinsic)
print(edges)
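To make the idea of a horizontal transform more concrete, here is a conceptual sketch using only the standard library. It assumes "horizontal" means one row per entity with one column per predicate, so each cell becomes one triple. The `rows_to_triples` function, the subject naming scheme, and the printed triple layout are illustrative assumptions and not the library's actual implementation or output format.

```python
import csv
import io


def rows_to_triples(csv_text, subject_field, type_name, key_separator="_"):
    """Illustrative only: emit one (subject, predicate, object) triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed subject naming: the type name joined to the row's key.
        subject = f"{type_name}{key_separator}{row[subject_field]}"
        for column, value in row.items():
            if column == subject_field or value == "":
                continue  # skip the key column and missing values
            triples.append((subject, column, value))
    return triples


sample = "id,name,number_moons\n1,Mercury,0\n3,Earth,1\n"
triples = rows_to_triples(sample, "id", "planet")
for subject, predicate, obj in triples:
    print(f'<{subject}> <{predicate}> "{obj}" .')
```

The real library additionally applies the configured `type_overrides`, separates edge fields from intrinsic fields, and honours `ignore_fields`, none of which this sketch attempts.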

Alternatively, you can call the underlying methods directly:

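# Import paths are assumed from the package layout; adjust them if your installed version differs
from dgraphpandas.strategies.horizontal import horizontal_transform
from dgraphpandas.writers.upserts import generate_upserts

# Perform a Horizontal Transform on the passed file using the config/key
intrinsic, edges = horizontal_transform('solar_system.csv', config, "planet")
# Generate RDF Upsert statements
intrinsic_upserts, edges_upserts = generate_upserts(intrinsic, edges)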
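When you handle the statements yourself, they still need to reach disk in a form the Live Loader can read, and gzip-compressed RDF is one such form. The sketch below assumes the upserts are available as an iterable of statement strings, which may not match the library's actual return type. The `write_rdf_gz` helper, the sample statements, and the file name are hypothetical.

```python
import gzip
import tempfile
from pathlib import Path


def write_rdf_gz(statements, path, compresslevel=9):
    """Write one RDF statement per line to a gzip-compressed file (hypothetical helper)."""
    with gzip.open(path, "wt", encoding="utf-8", compresslevel=compresslevel) as handle:
        for statement in statements:
            handle.write(statement.rstrip("\n") + "\n")
    return path


# Made-up statements standing in for real upserts.
statements = ['<planet_1> <name> "Mercury" .', '<planet_3> <name> "Earth" .']
target = Path(tempfile.mkdtemp()) / "planet_intrinsic.rdf.gz"
write_rdf_gz(statements, target)

# Read the file back to confirm the round trip.
with gzip.open(target, "rt", encoding="utf-8") as handle:
    round_trip = handle.read().splitlines()
print(round_trip)
```

A lower `compresslevel` trades file size for speed, which is presumably the same trade-off the CLI exposes through `--gz_compression_level`.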
