A package with helper tools to build an API Inference docker app for Hugging Face API inference using huggingface_hub
This repository enables third-party libraries integrated with `huggingface_hub` to create
their own Docker images so that the widgets on the hub can work like the `transformers`
ones do.
The hardware to run the API will be provided by Hugging Face for now.
The `docker_images/common` folder is intended to be a starting point for all new libraries that
want to be integrated.
## Adding a new container from a new lib

1. Copy the `docker_images/common` folder into your library's name: `docker_images/example`.
2. Edit:
   - `docker_images/example/requirements.txt`
   - `docker_images/example/app/main.py`
   - `docker_images/example/app/pipelines/{task_name}.py`

   to implement the desired functionality. All required code is marked with `IMPLEMENT_THIS` markup.
3. Remove:
   - any pipeline files in `docker_images/example/app/pipelines/` that are not used;
   - any tests associated with deleted pipelines in `docker_images/example/tests`;
   - any imports of the deleted pipelines from `docker_images/example/app/pipelines/__init__.py`.
4. Feel free to customize anything required by your lib everywhere you want. The only real requirement is to honor the HTTP endpoints, in the same fashion as the `common` folder, for all your supported tasks.
5. Edit `example/tests/test_api.py` to add `TESTABLE_MODELS`.
6. Pass the test suite: `pytest -sv --rootdir docker_images/example/ docker_images/example/`
7. Submit your PR and enjoy!
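The pipeline files mentioned in step 2 generally expose a class constructed from a `model_id` and called with the task's input. Below is a minimal sketch of such a pipeline; the class name, label set, and return shape are illustrative only — follow the `IMPLEMENT_THIS` markers in `docker_images/common` for the exact interface your task expects.

```python
from typing import Dict


class TextClassificationPipeline:
    def __init__(self, model_id: str):
        # IMPLEMENT_THIS: load your library's model from the Hub here.
        # This sketch fakes a model with a fixed label set.
        self.model_id = model_id
        self.labels = ["POSITIVE", "NEGATIVE"]

    def __call__(self, inputs: str) -> Dict[str, float]:
        # IMPLEMENT_THIS: run inference and return a JSON-serializable result.
        # Here we return a dummy uniform score per label.
        return {label: 1.0 / len(self.labels) for label in self.labels}
```

The key point is that `__call__` must return something the web app can serialize to JSON for the hub widget.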
## Going the full way
Doing the first 7 steps is good enough to get started; however, going further lets you anticipate and correct problems early on. Maintainers will help you along the way if you don't feel confident following these steps yourself.
- Test your creation within Docker: `./manage.py docker MY_MODEL` should work and respond on port 8000, e.g. `curl -X POST -d "test" http://localhost:8000` if the pipeline deals with simple text. If it doesn't work out of the box, and/or Docker is slow for some reason, you can test locally (using your local Python environment) with: `./manage.py start MY_MODEL`
- Test that your Docker image uses the cache properly. When doing subsequent `docker` launches with the same `model_id`, the container should start up very fast and not re-download the whole model. If you see the model/repo being downloaded over and over, the cache is not being used correctly. You can edit `docker_images/{framework}/Dockerfile` and add an environment variable (by default it assumes `HUGGINGFACE_HUB_CACHE`), or change your code directly, to put the model files in the `/data` folder.
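For instance, the relevant Dockerfile line might look like the following sketch (the exact base image and layout depend on your framework's Dockerfile):

```dockerfile
# Point huggingface_hub's cache at the /data volume so that
# model files persist across container launches instead of being
# re-downloaded every time.
ENV HUGGINGFACE_HUB_CACHE=/data
```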
- Add a Docker test. Edit the `tests/test_dockers.py` file to add a new test for your new framework (`def test_{framework}(self):` for instance). As a baseline, you should have one line per task in this test function, each with a real working model on the hub. These tests are relatively slow, but they automatically check that your API returns the correct errors and that the cache works properly. To run them, simply do:
  `RUN_DOCKER_TESTS=1 pytest -sv tests/test_dockers.py::DockerImageTests::test_{framework}`
## Modifying files within `api-inference-community/{routes,validation,..}.py`

If you ever come across a bug within the `api-inference-community` package or want to update it,
the development process is slightly more involved.
- First, make sure you actually need to change this package: each framework is very autonomous, so if your code can get away with being standalone, go that way first, as it's much simpler.
- If you can make the change in `api-inference-community` only, without your framework depending on it, that's also a great option. Make sure to add the proper tests to your PR.
- Finally, the best way to go is to develop locally using the `manage.py` command:
  - Make the necessary modifications within `api-inference-community` first.
  - Install it locally in your environment with `pip install -e .`
  - Install your package dependencies locally.
  - Run your webserver locally: `./manage.py start --framework example --task audio-source-separation --model-id MY_MODEL`
  - When everything is working, you will need to split your PR in two: one for the `api-inference-community` part, and a second one for your package-specific modifications, which will only land once the `api-inference-community` tag has landed.
  - This workflow is still a work in progress, so don't hesitate to ask questions to the maintainers.

A similar command, `./manage.py docker --framework example --task audio-source-separation --model-id MY_MODEL`, will also launch the server, but this time in a protected, controlled Docker environment, making sure the behavior matches exactly what it will be in the API.
## Available tasks

- Automatic speech recognition: input is a file, output is a dict of the understood words being said within the file.
- Text generation: input is a text, output is a dict of generated text.
- Image recognition: input is an image, output is a dict of generated text.
- Question answering: input is a question + some context, output is a dict containing the necessary information to locate the answer to the `question` within the `context`.
- Audio source separation: input is some audio, and the output is n audio files that sum up to the original audio but contain the individual sources of sound (either speakers or instruments, for instance).
- Token classification: input is some text, and the output is a list of entities mentioned in the text. Entities can be anything remarkable, like locations, organisations, persons, times, etc.
- Text to speech: input is some text, and the output is an audio file saying the text.
- Sentence similarity: input is a source sentence and a list of reference sentences, and the output is a list of similarity scores.
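To make the input/output shapes concrete, here is an illustrative question-answering exchange. The field names (`question`, `context`, `answer`, `score`, `start`, `end`) follow the common Hub widget convention, but the exact schema for a given framework is defined by its pipeline, so treat this as a sketch:

```python
# Illustrative question-answering payload sent to the endpoint.
payload = {
    "inputs": {
        "question": "Where does Clara live?",
        "context": "Clara lives in Berlin and works as an engineer.",
    }
}

# A typical response locates the answer within the context.
response = {
    "answer": "Berlin",
    "score": 0.98,  # model confidence (made-up value here)
    "start": 15,    # character offset of the answer in the context
    "end": 21,
}

# The offsets let a client highlight the answer span in the context.
span = payload["inputs"]["context"][response["start"]:response["end"]]
```

The character offsets are what a widget uses to render the answer highlighted inside the original context.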