
Generic ETL job template that can be imported


aind-data-transformation


Usage

Import this package and extend the abstract base class to define a new transformation job:

from datetime import datetime
from typing import Optional

from aind_data_transformation.core import (
    BasicJobSettings,
    GenericEtl,
    JobResponse,
)

# An example JobSettings
class NewTransformJobSettings(BasicJobSettings):
    # Add extra fields needed, for example, a random seed
    random_seed: Optional[int] = 0

# An example EtlJob
class NewTransformJob(GenericEtl[NewTransformJobSettings]):

    # This method needs to be defined
    def run_job(self) -> JobResponse:
        """
        Main public method to run the transformation job
        Returns
        -------
        JobResponse
          Information about the job that can be used for metadata downstream.

        """
        job_start_time = datetime.now()
        # Do something here
        job_end_time = datetime.now()
        return JobResponse(
            status_code=200,
            message=f"Job finished in: {job_end_time-job_start_time}",
            data=None,
        )

Contributing

The development dependencies can be installed with

pip install -e .[dev]

Adding a new transformation job

Any new job needs a settings class that inherits from BasicJobSettings. The base class requires the fields input_source and output_directory, and it gives the job's environment variables the TRANSFORMATION_JOB prefix.
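To illustrate how prefixed environment variables can map onto settings fields, here is a minimal standard-library sketch. It is not the library's implementation: SketchJobSettings and settings_from_env are hypothetical stand-ins, and the exact variable names (prefix joined to the upper-cased field name with an underscore, e.g. TRANSFORMATION_JOB_INPUT_SOURCE) are an assumption that the text above does not confirm.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping

# Assumption: the prefix is joined to the upper-cased field name with "_".
ENV_PREFIX = "TRANSFORMATION_JOB_"


@dataclass
class SketchJobSettings:
    """Hypothetical stand-in for a settings class (not BasicJobSettings)."""

    input_source: str
    output_directory: str


def settings_from_env(environ: Mapping[str, str] = os.environ) -> SketchJobSettings:
    """Build settings from variables such as TRANSFORMATION_JOB_INPUT_SOURCE."""
    values = {}
    for field in fields(SketchJobSettings):
        key = ENV_PREFIX + field.name.upper()
        if key not in environ:
            raise KeyError(f"Missing required environment variable: {key}")
        values[field.name] = environ[key]
    return SketchJobSettings(**values)


# Use an explicit mapping here so the example does not depend on the real environment.
example_env = {
    "TRANSFORMATION_JOB_INPUT_SOURCE": "/data/raw",
    "TRANSFORMATION_JOB_OUTPUT_DIRECTORY": "/data/processed",
}
settings = settings_from_env(example_env)
print(settings)
```

Both required fields must be present; a missing variable raises a KeyError in this sketch, which mirrors the idea that input_source and output_directory are mandatory.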

Any new job class must inherit from GenericEtl. This requires that the main public method is named run_job and returns a JobResponse.
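The run_job requirement can be pictured with a small standard-library sketch of an abstract, generic base class. SketchEtl and SketchResponse are hypothetical stand-ins written for this illustration, and the job_settings attribute name is an assumption; the real GenericEtl and JobResponse may be implemented differently.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Generic, Optional, TypeVar


@dataclass
class SketchResponse:
    """Stand-in for JobResponse, using the three fields from the Usage example."""

    status_code: int
    message: Optional[str] = None
    data: Optional[Any] = None


SettingsT = TypeVar("SettingsT")


class SketchEtl(ABC, Generic[SettingsT]):
    """Stand-in for GenericEtl, parameterized by a settings type."""

    def __init__(self, job_settings: SettingsT):
        # Assumption: settings are stored on the instance under this name.
        self.job_settings = job_settings

    @abstractmethod
    def run_job(self) -> SketchResponse:
        """Subclasses must implement the main public method."""


class IncompleteJob(SketchEtl[dict]):
    """Does not define run_job, so it cannot be instantiated."""


class CompleteJob(SketchEtl[dict]):
    def run_job(self) -> SketchResponse:
        return SketchResponse(status_code=200, message="done")


try:
    IncompleteJob(job_settings={})
except TypeError as err:
    print(f"Cannot instantiate: {err}")

response = CompleteJob(job_settings={}).run_job()
print(response.status_code)
```

With an abstract method, forgetting to define run_job fails at instantiation time rather than later when the job is executed.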

Linters and testing

There are several libraries used to run linters, check documentation, and run tests.

  • Please test your changes using the coverage library, which will run the tests and log a coverage report:
coverage run -m unittest discover && coverage report
  • Use interrogate to check that modules, methods, etc. have been documented thoroughly:
interrogate .
  • Use flake8 to check that code is up to standards (no unused imports, etc.):
flake8 .
  • Use black to automatically format the code to PEP 8 standards:
black .
  • Use isort to automatically sort import statements:
isort .

Pull requests

For internal members, please create a branch. For external members, please fork the repository and open a pull request from the fork. We'll primarily use Angular style for commit messages. Roughly, they should follow the pattern:

<type>(<scope>): <short summary>

where scope (optional) describes the packages affected by the code changes and type (mandatory) is one of:

  • build: Changes that affect build tools or external dependencies (example scopes: pyproject.toml, setup.py)
  • ci: Changes to our CI configuration files and scripts (examples: .github/workflows/ci.yml)
  • docs: Documentation-only changes
  • feat: A new feature
  • fix: A bug fix
  • perf: A code change that improves performance
  • refactor: A code change that neither fixes a bug nor adds a feature
  • test: Adding missing tests or correcting existing tests
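The header pattern and the type list above can be checked with a short regular expression. This is only an illustrative sketch: parse_header is a hypothetical helper, not part of this package or of semantic-release, and it validates only the first line of a commit message.

```python
import re
from typing import Dict, Optional

# The mandatory types listed above.
ALLOWED_TYPES = ("build", "ci", "docs", "feat", "fix", "perf", "refactor", "test")

# <type>(<scope>): <short summary>, where the scope is optional.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()]+)\))?"
    r": (?P<summary>\S.*)$"
)


def parse_header(header: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the parts of a commit header, or None if it does not match."""
    match = HEADER_PATTERN.match(header)
    return match.groupdict() if match else None


print(parse_header("feat(pencil): add 'graphiteWidth' option"))
print(parse_header("docs: fix typo in README"))
print(parse_header("chore: bump version"))  # "chore" is not in the list above
```

A header without a scope parses with scope set to None, and a header with an unlisted type returns None.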

Semantic Release

The examples below, from semantic release, show which commit message gets you which release type when semantic-release runs (using the default configuration):

  • Patch Fix Release, Default release:
fix(pencil): stop graphite breaking when too much pressure applied
  • Minor Feature Release:
feat(pencil): add 'graphiteWidth' option
  • Major Breaking Release:
perf(pencil): remove graphiteWidth option

BREAKING CHANGE: The graphiteWidth option has been removed.
The default graphite width of 10mm is always used for performance reasons.

(Note that the BREAKING CHANGE: token must be in the footer of the commit.)
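The mapping from commit message to release type can be sketched as a small function. This is a simplified illustration, not semantic-release's actual analyzer: release_type is a hypothetical helper, it treats any line after the header that starts with BREAKING CHANGE: as a footer token, and the perf-to-patch rule for non-breaking commits is an assumption about the default configuration that the examples above do not show.

```python
from typing import Optional


def release_type(commit_message: str) -> Optional[str]:
    """Return "major", "minor", "patch", or None for a commit message."""
    lines = commit_message.strip().splitlines()
    if not lines:
        return None
    header, rest = lines[0], lines[1:]

    # Simplification: any later line starting with the token counts as a footer.
    if any(line.startswith("BREAKING CHANGE:") for line in rest):
        return "major"

    # The type is the text before the optional "(scope)" and the colon.
    commit_type = header.split(":", 1)[0].split("(", 1)[0].strip()
    if commit_type == "feat":
        return "minor"
    # Assumption: "perf" without a breaking change also yields a patch release.
    if commit_type in ("fix", "perf"):
        return "patch"
    return None


breaking_commit = (
    "perf(pencil): remove graphiteWidth option\n"
    "\n"
    "BREAKING CHANGE: The graphiteWidth option has been removed.\n"
    "The default graphite width of 10mm is always used for performance reasons."
)

print(release_type("fix(pencil): stop graphite breaking when too much pressure applied"))
print(release_type("feat(pencil): add 'graphiteWidth' option"))
print(release_type(breaking_commit))
```

The breaking-change check runs first, so a commit of any type is reported as a major release when the footer token is present.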

