Magical tools to interact with web APIs from a data scientist's perspective.

Apicadabri

Apicadabri is a magical set of tools to interact with web APIs from a data scientist's perspective to "just get the damn data"™.

It focuses on simplicity and speed while being agnostic about what kind of API you're calling. If you know how to send a single call to the API you're interested in, you should be able to scale up to 100k calls with apicadabri.

Current status

This is still an early alpha. Some basic examples already work, though (see below).

Assumptions

For now, apicadabri assumes that you want to solve a task for which the following holds:

  • All inputs fit into memory
  • All results fit into memory (alternatively, you can write results directly to a JSONL file)
  • The number of requests will not overwhelm the asyncio event loop (which is apparently hard to achieve anyway unless you have tens of millions of calls).
  • You want to observe and process results as they come in.
  • You want your results in the same order as the input with no gaps in between.
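
The last two assumptions pull in different directions: calls finish in arbitrary order, yet results should come out in input order. The following is a minimal sketch of one way to reconcile them with plain asyncio while writing each result to a JSONL sink. It does not use apicadabri's API and is not a description of its internals; `fake_call`, `ordered_results`, and `write_jsonl` are hypothetical names, and the sleep stands in for a real HTTP request.

```python
import asyncio
import io
import json


async def fake_call(index: int, delay: float) -> dict:
    # Hypothetical stand-in for an HTTP request; the sleep simulates latency.
    await asyncio.sleep(delay)
    return {"index": index, "delay": delay}


async def ordered_results(delays):
    # Start all calls at once so they run concurrently.
    tasks = [asyncio.create_task(fake_call(i, d)) for i, d in enumerate(delays)]
    for task in tasks:
        # Awaiting in input order: a result is yielded as soon as it and
        # every earlier result are available, so there are no gaps.
        yield await task


async def write_jsonl(delays, sink) -> int:
    # Write one JSON document per line while results are still arriving.
    count = 0
    async for result in ordered_results(delays):
        sink.write(json.dumps(result) + "\n")
        count += 1
    return count


# An in-memory sink keeps the sketch free of side effects; a file opened
# in text mode would work the same way.
sink = io.StringIO()
written = asyncio.run(write_jsonl([0.03, 0.01, 0.02], sink))
lines = sink.getvalue().splitlines()
```

The trade-off of this approach is that a slow early call delays the emission of later results that have already finished, even though it does not delay their execution.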

Future relaxing of constraints

  • For extreme numbers of calls (>> 1M), add another layer of batching that avoids creating all asyncio tasks at the same time, without letting one slow call in a batch slow down the whole task.
    • Through the same mechanism, allow loading inputs one batch at a time.
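
One possible shape for such a mechanism is a small pool of workers that pull inputs from a shared iterator, rather than fixed-size batches. This is only a sketch under that assumption, written with plain asyncio; it is not apicadabri's planned design, and `fake_call` and `bounded_map` are hypothetical names.

```python
import asyncio


async def fake_call(value: int) -> int:
    # Hypothetical stand-in for an HTTP request; input 0 is deliberately slow.
    await asyncio.sleep(0.05 if value == 0 else 0.001)
    return value * 2


async def bounded_map(inputs, limit: int):
    # Shared iterator: a worker pulls the next input only when it is free,
    # so at most `limit` calls are in flight and inputs are consumed lazily.
    pending = iter(enumerate(inputs))
    results = {}

    async def worker():
        for index, value in pending:
            results[index] = await fake_call(value)

    await asyncio.gather(*(worker() for _ in range(limit)))
    # Restore input order before returning.
    return [results[i] for i in sorted(results)]


doubled = asyncio.run(bounded_map(range(20), limit=4))
```

Because each worker moves on as soon as its own call finishes, the slow call for input 0 occupies one worker while the other three keep processing, which is the property that fixed batches lack.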

Examples

Multiple URLs

import apicadabri

# Build one URL per Pokémon and fetch them all in a single bulk call.
pokemon = ["bulbasaur", "squirtle", "charmander"]
data = apicadabri.bulk_get(
    urls=(f"https://pokeapi.co/api/v2/pokemon/{p}" for p in pokemon),
).json().to_list()

Multiple payloads

TODO

Multivariate (zipped)

TODO

Multivariate (multiply)

TODO

Multivariate (pipeline)

TODO

Error Handling

API calls can always fail, and you don't want a script with 100k API calls to crash on call number 10k because you forgot to handle a None somewhere. At other times, you might not care about errors at all and just want to set up a quick-and-dirty test scenario. Apicadabri supports both cases by offering three options for error handling, selected with the on_error parameter:

  • raise: The exception is not caught at all. It is raised as usual and the bulk call fails.

  • return: The exception is caught and encapsulated in an ApicadabriErrorResponse object that also contains the input that triggered the exception.

  • A lambda function: The exception is caught and the provided error handling function is called with the triggering input, the error message, and the error type. The function must return a result of the same type that a successful call would return. This can be used, for example, to return an "empty" result that does not cause exceptions in further processing.

    ℹ️ If you need to return a different type of object in case of an error, you can instead use map with on_error="return" and then do another map that transforms the error response into the type you want.

The on_error parameter is available on several central methods of the objects returned by bulk calls, most notably map and reduce.
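
To make the three modes concrete, here is a minimal standalone sketch of the semantics described above. It does not call apicadabri: `safe_map`, `ErrorResponse`, and `parse_height` are hypothetical names, `ErrorResponse` is only a stand-in for ApicadabriErrorResponse, and the handler signature (input, message, type name) is an assumption based on the description of the lambda option.

```python
from dataclasses import dataclass
from typing import Any, Callable, Union


@dataclass
class ErrorResponse:
    # Hypothetical stand-in for ApicadabriErrorResponse.
    input: Any
    message: str
    type: str


def safe_map(func, inputs, on_error: Union[str, Callable] = "raise"):
    results = []
    for item in inputs:
        try:
            results.append(func(item))
        except Exception as exc:
            if on_error == "raise":
                raise  # propagate unchanged; the whole call fails
            if on_error == "return":
                results.append(ErrorResponse(item, str(exc), type(exc).__name__))
            else:
                # Assumed handler signature: (input, message, type name).
                results.append(on_error(item, str(exc), type(exc).__name__))
    return results


def parse_height(raw: dict) -> int:
    return int(raw["height"])


records = [{"height": "7"}, {"name": "missingno"}, {"height": "5"}]

returned = safe_map(parse_height, records, on_error="return")
defaulted = safe_map(parse_height, records, on_error=lambda item, msg, kind: -1)

# Two-step variant from the note: keep error objects first, then convert
# them to a different type (here None) in a second pass.
converted = [None if isinstance(r, ErrorResponse) else r for r in returned]
```

The handler in `defaulted` returns an int, matching the type of a successful result, while the two-step variant is free to switch to a different type for failures.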
