Skip to main content

No project description provided

Project description

Datashack

Pre-requisites

Getting Started

run in your terminal

pip install datashack
git clone .../datashack/exmples
datashack plan ./datashack-examples/py/my_app/models

What datashack do

Streaming data platform

Spark's streaming pipelines enable real-time data analysis, empowering businesses to monitor their operations, identify trends, and make informed decisions. Datashack's streaming data platform provides a seamless solution for processing, storage, and analytics, with Spark's streaming capabilities creating trailing tables while still keeping the schema and evolving it. Datashack also provides storage on S3, with tables and databases in Glue Catalog for Athena analytics, which can be integrated with BI tools. Additionally, Datashack's platform allows events to be sent to machine learning models or data warehouses.

Datashack capabilities

  1. Datashack allows you to create a fully operational data platform with just a few lines of code. The platform includes infrastructure for:
    1. Storing data in S3 as Parquet/Delta/Iceberg, with partitions and other relevant configurations.
    2. Preparing Spark streaming jobs.
    3. Managing data using the Glue catalog, including tables and databases.
    4. Leveraging big data features such as:
      1. Compaction using Delta optimize/Iceberg compact.
      2. CRUD operations using Delta optimize/Iceberg compact.
      3. Schema evolutions using Delta/Iceberg.
    5. Ensuring proper catalog governance using Lake Formation*.
    6. Saving costs by setting a retention policy on your data platform and utilizing intelligent tiering to move data to archives or cold storage.
  2. You can send data directly from your code to the platform for processing, storage, and analytics. You can combine events from various organizational tools, such as Salesforce or Google Analytics, and integrate them into your data platform.
    1. Ingestion:
      1. Send events to Datashack from your organization's environment.
      2. Datashack can run ingestion from different organizational tools such as Google Analytics, Shopify, and more.
    2. Processing: This is done with streaming capabilities (Apache Spark) and creates trailing tables while still keeping the schema, and evolving the schema.
      1. Build your bronze, silver, and gold data models with Spark processing.
      2. Process your data with various transformations such as joins, aggregations, PII masking, and more.
    3. Analytics & querying:
      1. Athena: Tables and databases are instantly queryable with AWS Athena.
      2. Warehouse: You can set the data to be written to a warehouse or synced with an S3 table, supporting various warehouses such as Redshift, RDS(SQL), and DynamoDB.
  3. Business Tools and Integrations:
    1. Business Intelligence (BI) tools that can integrate with AWS Athena, AWS S3, and AWS Glue Table include Amazon QuickSight, Tableau, Power BI, QlikView, and Looker. These tools provide data visualization and analysis capabilities, enabling businesses to extract insights and make informed decisions based on their data.
    2. Machine Learning (ML) tools such as QWAK and SageMaker are powerful tools for your data science needs. With Datashack, you no longer need ML Ops and data engineers for your data, as you can connect your Data Scientists directly to your data.
    3. API and Exporting tools - connect your customers directly to Datashack's powerful data exporting tool.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

datashack-0.0.4-py3-none-any.whl (34.6 kB view details)

Uploaded Python 3

File details

Details for the file datashack-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: datashack-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 34.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.16

File hashes

Hashes for datashack-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 a542b49131fb36a9717bb5f24d5f12056278b75efb0d4bdb48d5c21e1bb1204d
MD5 af5599056f90034e9b8648d0e0ed3c39
BLAKE2b-256 10fbae4e7a834f40bcf5942b764ef8826ae55b216d64af779be77b2eb5f390ec

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page