
GCS file sharding map functions for traversing GCS text files at any scale, for the Google App Engine Python standard environment

# appenginetaskutils
This is the repo for the App Engine task utils library. It generates the appenginetaskutils package.

## Install

Install this library from its Python package, which you can find online [here](https://pypi.python.org/pypi/appenginetaskutils).

Change to your Python App Engine project's root folder and do the following:

> pip install appenginetaskutils --target lib

Or add it to your requirements.txt. You'll also need to set up vendoring; see the [App Engine vendoring instructions here](https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27).

## @task

The most basic element of the taskutils library is task(). This decorator function is designed to be used as a replacement for [deferred](https://cloud.google.com/appengine/articles/deferred).

### Configuring @task

To make deferred work, you configure a builtin in app.yaml. For taskutils.task, you instead need to add the following to your app.yaml and/or \<servicename\>.yaml file:

    handlers:
    - url: /_ah/task/.*
      script: taskutils.app
      login: admin

This rule creates a generic handler for task to defer work to background push tasks.

Add it at the top of the list (to make sure other rules don't override it).

### Importing task

You can import task into your modules like this:

    from taskutils import task

### Using task as a decorator

You can take any function and make it run in a separate task, like this:

    @task
    def myfunction():
        ... do stuff ...

Just call the function normally, e.g.:

    myfunction()

You can use @task on any function, including nested functions, recursive functions, and recursive nested functions. This is possible because taskutils uses [yccloudpickle](https://medium.com/the-infinite-machine/python-function-serialisation-with-yccloudpickle-b2ff6b2ad5da#.zei3n0ibu) as the underlying serialisation library.

Your function can also have arguments, including other functions:

    def myouterfunction(mapf):

        @task
        def myinnerfunction(objects):
            for object in objects:
                mapf(object)

        ...get some list of lists of objects...
        for objects in objectslist:
            myinnerfunction(objects)

    def dosomethingwithobject(object):
        ... do something with an object ...

    myouterfunction(dosomethingwithobject)

The functions and arguments are serialised and deserialised for you behind the scenes.
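To see why a cloudpickle-style serialiser matters for nested functions, here is a minimal sketch using only the standard library. It does not use taskutils or yccloudpickle; `make_nested` and `can_pickle` are hypothetical names made up for the illustration. The standard pickle module stores a function as a reference to an importable name, so a function defined inside another function cannot be pickled that way.

```python
import json
import pickle


def make_nested():
    # A nested function only exists inside make_nested's scope, so there is
    # no importable module-level name that pickle could store as a reference.
    def nested(x):
        return x + 1
    return nested


def can_pickle(obj):
    """Return True if the standard pickle module can serialise obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # The exact exception type differs between Python versions.
        return False


# json.dumps is a module-level function, so pickle can store it by reference.
print(can_pickle(json.dumps))
# The nested function has no importable name, so standard pickle refuses it.
print(can_pickle(make_nested()))
```

A serialiser that pickles functions by value, as the cloudpickle family does, avoids this limitation, which is what lets @task wrap nested functions.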

When enqueuing a background task, the App Engine Task and TaskQueue libraries can take a set of parameters. You can pass these to the decorator:

    @task(queue="myqueue", countdown=5)
    def anotherfunction():
        ... do stuff ...

Details of the arguments allowed to Tasks are available [here](https://cloud.google.com/appengine/docs/python/refdocs/google.appengine.api.taskqueue), under **class google.appengine.api.taskqueue.Task(payload=None, \*\*kwargs)**. The task decorator supports a couple of extra ones, detailed below.

### Using task as a factory

You can also use task to decorate a function on the fly, like this:

    def somefunction(a, b):
        ... does something ...

    somefunctionintask = task(somefunction, queue="myqueue")

Then you can call the function returned by task when you are ready:

    somefunctionintask(1, 2)

You could do both of these steps at once, too:

    task(somefunction, queue="myqueue")(1, 2)
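The three calling styles above (bare decorator, decorator with keyword arguments, and factory) can all be supported by one function. The following is a sketch of that general pattern, not the library's actual code: `task_sketch` and `ENQUEUED` are hypothetical stand-ins, and calls are recorded in a list instead of being enqueued as App Engine push tasks.

```python
import functools

# Stand-in for a task queue; the real library enqueues App Engine push tasks.
ENQUEUED = []


def task_sketch(f=None, **taskkwargs):
    """Hypothetical stand-in for task(); records calls instead of enqueuing."""
    if f is None:
        # Used as @task_sketch(queue=...): return a decorator that
        # remembers the keyword arguments until it receives the function.
        return functools.partial(task_sketch, **taskkwargs)

    @functools.wraps(f)
    def enqueue(*args, **kwargs):
        # A real implementation would serialise f and its arguments here.
        ENQUEUED.append((f.__name__, args, kwargs, taskkwargs))

    return enqueue


@task_sketch
def plain():
    pass


@task_sketch(queue="myqueue", countdown=5)
def configured():
    pass


def somefunction(a, b):
    pass


plain()                                           # bare decorator
configured()                                      # decorator with arguments
task_sketch(somefunction, queue="myqueue")(1, 2)  # factory style

for entry in ENQUEUED:
    print(entry)
```

The check on `f is None` is what distinguishes `@task_sketch(...)` from the other two styles, since only in that case is the function not yet available.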

### transactional

Pass transactional=True to have your [task launch transactionally](https://cloud.google.com/appengine/docs/python/datastore/transactions#transactional_task_enqueuing). For example:

    @task(transactional=True)
    def myserioustransactionaltask():
        ...

### includeheaders

If you'd like access to the request headers in your function (a task is a web request, after all, so it receives a dictionary of headers), set includeheaders=True in your call to @task. Your function must also accept a headers argument:

    @task(includeheaders=True)
    def myfunctionwithheaders(amount, headers):
        ... stuff ...

    myfunctionwithheaders(10)

App Engine passes useful information to your task in headers, for example X-Appengine-TaskRetryCount.
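As an illustration of what a function might do with the headers, here is a minimal sketch that stops work after too many retries. It assumes the headers argument behaves like a plain dict with string values; `get_header`, `process_with_retry_limit`, and `MAX_RETRIES` are hypothetical names, and the function is left undecorated so the snippet runs outside App Engine. In a real app it would be wrapped with @task(includeheaders=True).

```python
MAX_RETRIES = 3  # hypothetical limit chosen for this sketch


def get_header(headers, name, default=None):
    """Case-insensitive lookup, since HTTP header names ignore case."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return default


def process_with_retry_limit(amount, headers):
    # Header values are strings; default to "0" when the header is absent,
    # for example when the function is called directly in a test.
    retries = int(get_header(headers, "X-Appengine-TaskRetryCount", "0"))
    if retries >= MAX_RETRIES:
        return "gave up after %d retries" % retries
    return "processing %d (retry %d)" % (amount, retries)


print(process_with_retry_limit(10, {"X-Appengine-TaskRetryCount": "0"}))
print(process_with_retry_limit(10, {"X-Appengine-TaskRetryCount": "5"}))
```

The case-insensitive lookup is a defensive choice, because the exact capitalisation of the header name may differ between environments.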

### other bits

When using deferred, all your calls are logged as /_ah/queue/deferred. But @task uses a URL of the form /_ah/task/\<module\>/\<function\>, e.g.:

    /_ah/task/mymodule/somefunction

which makes debugging a lot easier.
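To make the naming scheme concrete, here is a small sketch of how a path of that form could be derived from a function object. This only illustrates the URL pattern described above and is not the library's implementation; `task_url_sketch` is a hypothetical helper, and `json.dumps` is used simply because its module and name are predictable.

```python
import json


def task_url_sketch(f):
    """Hypothetical helper: build a /_ah/task/<module>/<function> path."""
    return "/_ah/task/%s/%s" % (f.__module__, f.__name__)


# json.dumps lives in module "json" and is named "dumps".
print(task_url_sketch(json.dumps))
```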






