Skip to main content

UNKNOWN

Project description

<div id="container" align="center">
<img
src="http://tellapart.com/wp-content/uploads/2012/06/taba-kanji.gif"
alt="Taba" />
</div>

Introduction
====================

Taba is a service for aggregating instrumentation events from large distributed
systems in near real-time. It was built to handle high throughput and scale
easily.

Check out an overview of Taba's architecture on the TellApart Eng Blog:
[http://www.tellapart.com/taba-low-latency-event-aggregation
](http://www.tellapart.com/taba-low-latency-event-aggregation)

Example
====================

Taba helps you instrument your services and provide a near real-time view into
what's happening across a large cluster. For example, you could use it to track
the winning bid price for a certain type of bid:

from taba import client

...

client.Counter('bids_won', 1)
client.Counter('winning_bid_price', wincpm)

When those Events reach the Taba Server and are aggregated, they produce an
output like the following:

$ taba-cli agg winning_bid_price
winning_bid_price: {
"1m": {"count": 436, "total": 571.64},
"10m": {"count": 5285, "total": 6884.57},
"1h": {"count": 34265, "total": 44175.47},
"1d": {"count": 569787, "total": 744423.87},
"pct": [0.09, 0.47, 2.19, 3.55, 4.37, 14.09, 17.59]}

There are many other input data types and aggregation methods available. See the
Types and Handlers documentation.

Overview
====================

A Taba deployment consists of 6 layers, each horizontally scalable. These
layers are:

- Taba Client code integrated into applications
- Taba Agent process running locally to the application servers
- Taba Server processes in the frontend ('fe') role
- Taba Server processes in the backend ('be') role
- Redis sentinel processes
- Redis database processes

The Taba Client is integrated into the application it is instrumenting, and
exposes an API for recording events to different counter types. The Python
distribution includes a default Client implementation based on threads, and
a Gevent engine. There is also a Java Client available
([https://github.com/tellapart/taba-java-client]
(https://github.com/tellapart/taba-java-client))

The Client typically sends events to a Taba Agent process running on the same
server. While the Client will only forward events on a best-effort basis, the
Agent provides more robust buffering and failure recovery. It is also
significantly smarter about load balancing.

The will forward events to one of the Taba Server end-points it has been
configured to connect to. Any Server in the cluster can receive any set of
events.

The Taba Server processes are split into two groups: frontend ('fe') and backend
('be'). These roles are configured when the process starts. Assigning a process
a 'fe' role has no effect on its operation -- it is simply a marker that the
process will use to advertise its role. (The intention it to allow a
load-balancer to use that indicator to route traffic to just the 'fe'
processes). Assigning the 'be' role will configure the process to launch a
background worker that processes queued events. There must be at least one 'be'
Server process in the cluster.

A Server process can be assigned both 'fe' and 'be' roles. For small clusters,
this will work well. However, once a cluster becomes large enough to require
multiple Server processes, separating 'fe' and 'be' processes will perform
better.

There is a third role 'ctrl', which essentially marks a Server process as
neither 'fe' nor 'be'. This is useful for maintaining a separate set of
processes for querying.

The Taba Server uses a group of Redis databases and Sentinels. Having at least
one Sentinel is a requirement, as it is used for service discovery of the
individual database processes. Sharding across the databases is accomplished by
splitting the key-space into a large number of virtual buckets, and assigning
ranges of buckets to each process.

Installing Taba
====================

Taba was designed to run on Python 2.6/2.7. It has the following Python package
dependencies, which should be installed automatically:

- gevent (>= 0.13.1)
- python-cjson (>= 1.0.5)
- cython (>= 0.13)
- redis (>= 2.9)
- requests (>= 1.2.0)

Additionally, building the Python dependencies requires the following. These
dependencies are _not_ installed automatically:

- gcc
- make
- python-dev
- libevent-dev

The latest stable release can be installed from PyPi:

pip install taba

Or Taba can be installed directly from the repository:

git clone https://github.com/tellapart/taba.git
cd taba
python setup.py install

Installing Redis
====================

The Taba Server uses a group of Redis instances with Sentinels as its
database. It requires at least Redis 2.8. For details about installing Redis,
please visit the [Redis Downloads page](http://redis.io/download)

Deploying Taba
====================

There are many ways to deploy Taba, depending on the use case. See
`examples/EXAMPLES` for pointers on how to get started.

About
====================

Taba is a project at TellApart led by
[Kevin Ballard](https://github.com/kevinballard) to create a reliable, high
performance platform for real-time monitoring. It is used to monitor over
30,000 Tabs, consuming nearly 10,000,000 Events per second, and an average
latency of under 15s.

Any questions or comments can be forwarded to
[taba@tellapart.com](mailto:taba@tellapart.com)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

taba-0.3.5.tar.gz (131.3 kB view details)

Uploaded Source

File details

Details for the file taba-0.3.5.tar.gz.

File metadata

  • Download URL: taba-0.3.5.tar.gz
  • Upload date:
  • Size: 131.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for taba-0.3.5.tar.gz
Algorithm Hash digest
SHA256 b1dc0e099d2bde86dd05b7894a69b5ce052154958d7be3b1ed89f3858f95be71
MD5 cfd0a85b51cf1c993215b2a97128093f
BLAKE2b-256 f6efa96fb4a9d990cfe24b2c321f470ae7a9510979705f865b5790aa7d7f1182

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page