gcs file sharding map functions, for traversing gcs text files at any scale, for Google App Engine, Python standard environment
# appenginetaskutils
This is the repo for the App Engine task utils library. It generates the appenginetaskutils package.
## Install
Use the python package for this library. You can find the package online [here](https://pypi.python.org/pypi/appenginetaskutils).
Change to your Python App Engine project's root folder and do the following:
> pip install appenginetaskutils --target lib
Or add it to your requirements.txt. You'll also need to set up vendoring; see the [App Engine vendoring instructions](https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27).
## @task
The most basic element of the taskutils library is task(). This decorator function is designed to be used as a replacement for [deferred](https://cloud.google.com/appengine/articles/deferred).
### Configuring @task
When using deferred you have a builtin to configure in app.yaml to make it work. For taskutils.task, you need to add the following to your app.yaml and/or \<servicename\>.yaml file:

    handlers:
    - url: /_ah/task/.*
      script: taskutils.app
      login: admin

This rule creates a generic handler for task to defer work to background push tasks.
Add it at the top of the list (to make sure other rules don't override it).
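
To see why the position in the list matters, here is a minimal sketch of first-match-wins routing, which is how App Engine evaluates `handlers`. The `first_match` helper and the catch-all `/.*` rule pointing at `main.app` are hypothetical and only illustrate the ordering; they are not part of taskutils.

```python
import re

def first_match(handlers, path):
    """Return the script of the first handler whose url pattern matches path.

    Hypothetical helper that mimics first-match-wins evaluation of handlers.
    """
    for pattern, script in handlers:
        # Anchor the pattern so it must match the whole path
        # (an assumption of this sketch).
        if re.match(pattern + r"$", path):
            return script
    return None

# Task rule first: task requests reach taskutils.app.
good = [(r"/_ah/task/.*", "taskutils.app"), (r"/.*", "main.app")]
# Catch-all first: it swallows task requests before the task rule is tried.
bad = [(r"/.*", "main.app"), (r"/_ah/task/.*", "taskutils.app")]

print(first_match(good, "/_ah/task/mymodule/somefunction"))  # taskutils.app
print(first_match(bad, "/_ah/task/mymodule/somefunction"))   # main.app
```
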
### Importing task
You can import task into your modules like this:

    from taskutils import task

### Using task as a decorator
You can take any function and make it run in a separate task, like this:

    @task
    def myfunction():
        pass  # ... do stuff ...

Just call the function normally, e.g.:

    myfunction()

You can use @task on any function, including nested functions, recursive functions, and recursive nested functions. This is possible because [yccloudpickle](https://medium.com/the-infinite-machine/python-function-serialisation-with-yccloudpickle-b2ff6b2ad5da#.zei3n0ibu) is used as the underlying serialisation library.
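
For context, the standard library's pickle stores functions by reference to an importable name, so it cannot serialise a nested function. The sketch below only demonstrates that limitation with stdlib `pickle`; it does not use yccloudpickle, and `make_inner` and `can_pickle` are hypothetical names.

```python
import pickle

def make_inner():
    # A nested function: it has no importable module-level name.
    def inner(x):
        return x + 1
    return inner

def can_pickle(obj):
    """Report whether stdlib pickle can serialise obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies between Python versions.
        return False

print(can_pickle(len))           # True: looked up by name in builtins
print(can_pickle(make_inner()))  # False: no importable name to refer to
```
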
Your function can also have arguments, including other functions:

    def myouterfunction(mapf):
        @task
        def myinnerfunction(objects):
            for object in objects:
                mapf(object)

        # ... get some list of lists of objects ...
        for objects in objectslist:
            myinnerfunction(objects)

    def dosomethingwithobject(object):
        pass  # ... do something with an object ...

    myouterfunction(dosomethingwithobject)

The functions and arguments are serialised and deserialised for you behind the scenes.
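
The round trip can be pictured with a toy, in-memory stand-in. This is not how taskutils is implemented: it swaps in a name registry and JSON in place of yccloudpickle and the Task Queue, and `REGISTRY`, `QUEUE`, `toy_task` and `run_next` are hypothetical names.

```python
import json

REGISTRY = {}   # function name -> original function
QUEUE = []      # serialised payloads standing in for a push queue

def toy_task(f):
    """Decorator: calling the function enqueues a payload instead of running it."""
    REGISTRY[f.__name__] = f

    def enqueue(*args, **kwargs):
        payload = {"name": f.__name__, "args": args, "kwargs": kwargs}
        QUEUE.append(json.dumps(payload))
    return enqueue

def run_next():
    """Worker side: deserialise the oldest payload and run the real function."""
    payload = json.loads(QUEUE.pop(0))
    return REGISTRY[payload["name"]](*payload["args"], **payload["kwargs"])

@toy_task
def add(a, b):
    return a + b

add(2, 3)            # returns immediately; nothing has run yet
print(len(QUEUE))    # 1
print(run_next())    # 5
```

The caller only pays for serialising the call; the work itself happens later, when the worker picks the payload up.
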
When enqueuing a background task, the App Engine Task and TaskQueue libraries can take a set of parameters. You can pass these to the decorator:

    @task(queue="myqueue", countdown=5)
    def anotherfunction():
        pass  # ... do stuff ...

Details of the arguments allowed to Tasks are available [here](https://cloud.google.com/appengine/docs/python/refdocs/google.appengine.api.taskqueue), under **class google.appengine.api.taskqueue.Task(payload=None, \*\*kwargs)**. The task decorator supports a couple of extra ones, detailed below.
### Using task as a factory
You can also use task to decorate a function on the fly, like this:

    def somefunction(a, b):
        pass  # ... does something ...

    somefunctionintask = task(somefunction, queue="myqueue")

Then you can call the function returned by task when you are ready:

    somefunctionintask(1, 2)

You could do both of these steps at once, too:

    task(somefunction, queue="myqueue")(1, 2)

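
One common way to let a single callable work as a bare decorator, as a decorator with arguments, and as a factory is to make the function parameter optional. The sketch below shows that general Python pattern; whether taskutils does it this way is an assumption, and `sketch_task` and `ENQUEUED` are hypothetical names. It records calls in a list instead of enqueuing real tasks.

```python
import functools

ENQUEUED = []  # records (function name, task options, args, kwargs)

def sketch_task(f=None, **taskkwargs):
    if f is None:
        # Used as @sketch_task(queue=...): return a decorator that awaits f.
        return functools.partial(sketch_task, **taskkwargs)

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        # A real implementation would enqueue a task here.
        ENQUEUED.append((f.__name__, taskkwargs, args, kwargs))
    return wrapper

@sketch_task
def plain():
    pass

@sketch_task(queue="myqueue", countdown=5)
def configured():
    pass

def somefunction(a, b):
    pass

plain()                                           # bare decorator
configured()                                      # decorator with options
sketch_task(somefunction, queue="myqueue")(1, 2)  # factory form

for entry in ENQUEUED:
    print(entry)
```
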
### transactional
Pass transactional=True to have your [task launch transactionally](https://cloud.google.com/appengine/docs/python/datastore/transactions#transactional_task_enqueuing), e.g.:

    @task(transactional=True)
    def myserioustransactionaltask():
        pass  # ...

### includeheaders
If you'd like access to the headers in your function (a dictionary of headers passed to your task; it's a web request, after all), set includeheaders=True in your call to @task. You'll also need to accept the headers argument in your function. Callers still pass only the ordinary arguments:

    @task(includeheaders=True)
    def myfunctionwithheaders(amount, headers):
        pass  # ... stuff ...

    myfunctionwithheaders(10)

App Engine passes useful information to your task in headers, for example X-Appengine-TaskRetryCount.
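
As an illustration, a function that receives headers might use the retry count to give up after a few attempts. This is only a sketch: `should_give_up` and the limit of 3 are hypothetical, and it assumes the headers argument behaves like a dictionary of strings, as described above.

```python
def should_give_up(headers, max_retries=3):
    """Return True once the task has been retried max_retries times or more."""
    # Header values arrive as strings; a missing header is treated as 0 retries.
    retries = int(headers.get("X-Appengine-TaskRetryCount", 0))
    return retries >= max_retries

print(should_give_up({"X-Appengine-TaskRetryCount": "0"}))  # False
print(should_give_up({"X-Appengine-TaskRetryCount": "5"}))  # True
```
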
### other bits
When using deferred, all your calls are logged as /_ah/queue/deferred. But @task uses a url of the form /_ah/task/\<module\>/\<function\>, e.g.:

    /_ah/task/mymodule/somefunction

which makes debugging a lot easier.
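
A URL of that shape could be derived from a function's module and name, as in the sketch below. That the library builds it exactly this way is an assumption, and `task_url` is a hypothetical helper.

```python
import json

def task_url(f):
    """Build a /_ah/task/<module>/<function> path from a function object."""
    return "/_ah/task/%s/%s" % (f.__module__, f.__name__)

print(task_url(json.dumps))  # /_ah/task/json/dumps
```
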