Feathr Online Transformation Python Support
This is the Python wrapper for the Feathr online transformation service.
There are 2 major classes in this package:
- `PiperService`: the service class, used to start an HTTP service that handles transformation requests. It doesn't support HTTPS or authentication, so you may need to set up a gateway or proxy to handle security. The `PiperService` class has a `start` method to start the HTTP service in blocking mode, and a `start_async` method to start the service in an `async` context.
- `Piper`: the transformation engine. It can be used to transform data directly, mainly for development and testing purposes. The `Piper` class has a `process` method to transform data in blocking mode, and a `process_async` method to transform data in an `async` context.
Both classes support UDFs and UDLFs written in Python.
NOTE: Because of the GIL, pure Python code cannot run concurrently, which means that Python UDFs can slow down the transformation service, especially under heavy load.
Value Types
All values passed to the pipeline must be one of the following types:
- `None`
- Simple types: `bool`, `int`, `float`, `str`
- Date/time, represented as `datetime.datetime`
- List: a list of supported types.
- Map: a map of supported types; keys must be strings, and values can be any supported type.
All values returned by the pipeline are also of the above types.
NOTE: Python integers have arbitrary precision; an exception is thrown if a value exceeds the range of a 64-bit signed integer.
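The rules above can be checked before data is handed to the pipeline. The following is a minimal sketch of such a check, not part of `feathrpiper`: the helper name `is_supported_value` is hypothetical, it assumes date/time values are `datetime.datetime` instances, and it conservatively treats everything not listed above (tuples, sets, custom objects) as unsupported.

```python
from datetime import datetime
from typing import Any

# Bounds of a signed 64-bit integer.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def is_supported_value(value: Any) -> bool:
    """Return True if `value` matches the value types listed above (hypothetical helper)."""
    if value is None or isinstance(value, (bool, float, str, datetime)):
        return True
    if isinstance(value, int):
        # Python ints are unbounded; only the signed 64-bit range is accepted.
        return INT64_MIN <= value <= INT64_MAX
    if isinstance(value, list):
        return all(is_supported_value(item) for item in value)
    if isinstance(value, dict):
        # Map keys must be strings; values may be any supported type.
        return all(
            isinstance(k, str) and is_supported_value(v) for k, v in value.items()
        )
    # Assumption of this sketch: anything else is rejected.
    return False
```

For example, `is_supported_value({"a": [1, 2.0, None]})` is `True`, while `is_supported_value(2**63)` and `is_supported_value({1: "a"})` are both `False`.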
User Defined Function (UDF) in Python
The UDF is implemented as an ordinary Python function, and it must be registered with the service before it can be used in the pipeline.
- The UDF can only accept positional arguments; keyword arguments are not supported.
- The UDF must be callable in the way the DSL script invokes it, e.g. a UDF with 2 fixed arguments and 1 optional argument can be invoked as `udf(1, 2)` or `udf(1, 2, 3)`, but not as `udf(1, 2, 3, 4)` or `udf(1)`.
- The UDF may raise any exception; if it does, the result is recorded as an error. This error is propagated to the caller.
- Every function and operator that takes an error as input returns the error.
- At the final output stage of the pipeline, the error value is converted to `None`, and the error is recorded in a separate error list.
- The UDF never sees an error as input: if any argument is an error, the invocation is bypassed before the UDF is called.
- The execution order is non-deterministic, so the UDF must not make any assumptions about it.
- The UDF should not block. Blocking is not strictly forbidden, but it impacts performance significantly.
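The arity rule above can be verified with the standard `inspect` module before a function is used in a DSL script. This is a minimal sketch: `my_udf` and `accepts_positional` are hypothetical names, not part of `feathrpiper`, and the check only mirrors Python's own call-binding rules.

```python
import inspect
from typing import Callable


def my_udf(a, b, c=0):
    """A UDF with 2 fixed arguments and 1 optional argument."""
    return a + b + c


def accepts_positional(func: Callable, arg_count: int) -> bool:
    """Check whether `func` can be called with `arg_count` positional arguments."""
    try:
        # bind() applies the same rules as a real call, without invoking func.
        inspect.signature(func).bind(*([None] * arg_count))
        return True
    except TypeError:
        return False


# Report which call shapes are valid for my_udf.
for n in (1, 2, 3, 4):
    print(n, accepts_positional(my_udf, n))
```

With this definition, calls with 2 or 3 arguments bind successfully, while calls with 1 or 4 arguments do not.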
User Defined Lookup Function (UDLF) in Python
A lookup usually fetches external data, for example from a database or a web service, so the lookup data source is implemented as a Python async function, and it must be registered with the `Piper` or the `PiperService` before it can be used in the pipeline.
The lookup function is called with a single key and a list of requested field names. It should return a list of rows, where each row is a list of values aligned with the requested fields, or an empty list when the lookup fails. The key must be one of the supported simple types; lists and dicts cannot be used as keys. Using `None` as the key yields `None` for all returned fields without actually calling the lookup function.

    from typing import Any, List

    async def my_fancy_lookup_function(key: Any, fields: List[str]) -> List[List[Any]]:
        ...
        return [
            [some_data[f] for f in fields],
            [some_other_data[f] for f in fields],
        ]
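As a concrete illustration of the signature above, here is a minimal sketch of a lookup function backed by an in-memory dictionary. The `USERS` table and the `user_lookup` name are hypothetical; a real lookup would query a database or a web service. Returning `None` for a field that is missing from a row is an assumption of this sketch, not documented behavior.

```python
import asyncio
from typing import Any, Dict, List

# Hypothetical in-memory table standing in for an external data source.
# Each key maps to zero or more rows.
USERS: Dict[Any, List[Dict[str, Any]]] = {
    1: [{"name": "alice", "age": 30}],
    2: [{"name": "bob", "age": 25}, {"name": "bobby", "age": 26}],
}


async def user_lookup(key: Any, fields: List[str]) -> List[List[Any]]:
    """Return one list per matching row, aligned with the requested fields."""
    rows = USERS.get(key, [])  # unknown key -> empty list (lookup failed)
    return [[row.get(f) for f in fields] for row in rows]


# The function can be exercised on its own before it is registered.
print(asyncio.run(user_lookup(2, ["age", "name"])))  # [[25, 'bob'], [26, 'bobby']]
```

Note that the values in each row follow the order of `fields`, not the order in which the data is stored.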
The lookup function must be added to the `Piper` or `PiperService` before it can be used in the pipeline:

    piper = Piper(pipeline_def, {"lookup_name": my_fancy_lookup_function}, ...)

or

    svc = PiperService(pipeline_def, {"lookup_name": my_fancy_lookup_function}, ...)

Then you can use the lookup data source in the pipeline in a `lookup` transformation:

    pipeline_name(...)
    | ...
    | lookup field1, field2 from lookup_name on key
    | ...
    ;

or a `join` transformation:

    pipeline_name(...)
    | ...
    | join kind=left-inner field1, field2 from lookup_name on key
    | ...
    ;

Once a user-defined lookup function is used, the `Piper` and `PiperService` must be used in an `async` context; otherwise the async functions will never be executed and the program may hang forever.
You also need to replace `process` with `process_async`, and `start` with `start_async`.

    import asyncio

    piper = Piper(pipeline_def, {"lookup_name": lookup_function})

    async def test():
        await piper.process_async(...)

    asyncio.run(test())

For more information about Python async programming, please refer to Python Asyncio.
NOTE:
- Because of the asynchronous nature of the lookup function, it's recommended to use `asyncio`-compatible libraries to implement it. Traditional blocking libraries may cause performance issues, e.g. use `aiohttp` or `HTTPX` instead of `Requests`.
- This package only supports `asyncio`; `Twisted`- or `Gevent`-based libraries are not supported.
- To look up data from a standard JSON-based HTTP API, you can use the built-in HTTP client instead of implementing your own lookup function: register the lookup data source either as a JSON string or as a `dict` with the correct content. The detailed doc is at here.
- `feathrpiper` also has built-in support for SqlServer/AzureSQL, Sqlite3, and Azure CosmosDb.
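When only a blocking client is available, one common `asyncio` technique is to run the blocking call in a thread pool so the event loop stays free. The sketch below uses only the standard library; `blocking_fetch` and `lookup_via_executor` are hypothetical names, and whether this approach performs well inside the `feathrpiper` runtime is an assumption that should be measured rather than taken for granted.

```python
import asyncio
import time
from typing import Any, Dict, List


def blocking_fetch(key: Any) -> Dict[str, Any]:
    """Hypothetical blocking call, e.g. a legacy client without asyncio support."""
    time.sleep(0.05)  # simulates network latency
    return {"id": key, "score": key * 10}


async def lookup_via_executor(key: Any, fields: List[str]) -> List[List[Any]]:
    """Lookup function that offloads the blocking call to the default thread pool."""
    loop = asyncio.get_running_loop()
    record = await loop.run_in_executor(None, blocking_fetch, key)
    return [[record.get(f) for f in fields]]


async def main() -> List[List[List[Any]]]:
    # The three lookups overlap instead of running back to back.
    return await asyncio.gather(
        *(lookup_via_executor(k, ["id", "score"]) for k in (1, 2, 3))
    )


results = asyncio.run(main())
print(results)
```

A native `asyncio` client remains the preferred option; the thread pool only keeps a blocking call from stalling other coroutines.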
Integration with Other Web-Service Frameworks
`feathrpiper` contains a built-in web service, but it doesn't support HTTPS or authentication, and it has a specific HTTP API spec which cannot be changed from the Python side. If you need to use it in any other scenario, you can integrate it with other web service frameworks.
- Flask: prefer an async version of Flask, such as Flask-Async, Flask-RESTful-Async, Flask-RESTX-Async, etc., and use `process_async` to process the request.
- FastAPI: FastAPI is fully async-based; use `process_async` to process the request.
- Any other web framework that doesn't support async: you can use `process` in a non-async context, but the user-defined lookup function feature will be unavailable.
A demo of integrating with FastAPI is at here.
Packaging and Deployment
The `feathrpiper` package is a standard Python package without external dependencies; you need to write your own code using the package to implement your own transformation service.
The packaging and deployment process is also standard. Refer to the official documentation if you need to build a Docker image; currently we don't provide any pre-built Docker image for the Python package.
In most cases, the packaging process looks like this:
- Prepare the `requirements.txt` file, which includes the `feathrpiper` package and all the other dependencies.

      # This package
      feathrpiper >= 0.4.3
      # Any other dependencies
      pandas == 1.5.2
      torch >= 1.0.0
      ...

- Prepare a `Dockerfile` which includes the `requirements.txt` file and the code to run the service.

      FROM python:3.9-slim-buster
      COPY requirements.txt /tmp/
      RUN pip install -r /tmp/requirements.txt
      COPY . /app
      WORKDIR /app
      # In case you want to use the built-in web service provided by `PiperService` class and it's listening at the port 8000
      # Or you write your own web service and it's listening at the port 8000
      EXPOSE 8000
      CMD ["python", "main.py"]

- Build the Docker image:

      docker build -t my_image .

- Run the Docker image:

      docker run -p 8000:8000 my_image

Building from Source
The `feathrpiper` package is written in Rust, so you need to set up the Rust toolchain to build it from source. The Rust toolchain can be installed from here. Development is done with Rust 1.65; older versions may not work.
- Install `maturin`: `pip install maturin`
- Build the package under the `feathrpiper_root/python` directory: `maturin build --release`
More information about `maturin` can be found here. Please note that running `cargo build` in the top-level directory won't build the Python package, because the Python package project is excluded from the workspace due to technical issues.
Limitations and Known Issues
- The `PiperService` class supports plain HTTP only, and it doesn't support any kind of authentication.
- `feathrpiper` supports Python 3.7~3.11; there is no support for Python 3.6 or earlier, and no support for Python 2.
- The package published on PyPI only supports the following platforms:
  - Linux arm64
  - Linux armv7
  - Linux x86_64
  - macOS x86_64/AppleSilicon universal
  - Windows x86_64
You need to build the package from source if you need to use it on other platforms.
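The platform and version limits above can be turned into a quick pre-install check. This is a minimal sketch using only the standard library; `has_prebuilt_wheel` is a hypothetical helper, and the machine strings (`aarch64`, `armv7l`, `AMD64`, and so on) are assumptions about what `platform.machine()` typically reports on those systems.

```python
import platform
import sys
from typing import Tuple

# (system, machine) pairs corresponding to the platforms listed above.
# The exact strings are assumptions based on common platform.machine() values.
PREBUILT_PLATFORMS = {
    ("Linux", "aarch64"),
    ("Linux", "armv7l"),
    ("Linux", "x86_64"),
    ("Darwin", "x86_64"),
    ("Darwin", "arm64"),
    ("Windows", "AMD64"),
}


def has_prebuilt_wheel(system: str, machine: str, py_version: Tuple[int, ...]) -> bool:
    """Return True if a pre-built package is expected for this platform and Python."""
    version_ok = (3, 7) <= tuple(py_version[:2]) <= (3, 11)
    return version_ok and (system, machine) in PREBUILT_PLATFORMS


if __name__ == "__main__":
    # Check the interpreter that is running this script.
    ok = has_prebuilt_wheel(platform.system(), platform.machine(), sys.version_info)
    print("pre-built package expected" if ok else "build from source")
```

If the check fails, the steps under Building from Source apply.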