Python client for Apache Tika App

Project Description



tika-app-python is a wrapper for Apache Tika App.

Apache 2 Open Source License

tika-app-python can be downloaded, used, and modified free of charge. It is available under the Apache 2 license.


Main Author

Fedele Mantuano (Twitter: @fedelemantuano)


Clone repository

git clone

and install tika-app-python with

cd tika-app-python

python setup.py install

or use pip:

pip install tika-app

Usage in a project

Import the TikaApp class and create a client:

from tikapp import TikaApp

tika_client = TikaApp(file_jar="/opt/tika/tika-app-1.15.jar")

To get the content type:


To detect the language:


To extract all metadata and content:


To extract only the content:
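
The four calls listed above can be sketched together as follows. This is a minimal illustration, not the project's own snippet: the method names detect_content_type, detect_language, extract_all_content and extract_only_content are assumed from the library and should be checked against the installed version, and the helper describe_file is hypothetical.

```python
def describe_file(client, path):
    """Run the four calls on one file and collect the results.

    `client` is expected to be a TikaApp instance. The method names used
    here are assumptions; verify them against the installed tikapp version.
    """
    return {
        "content_type": client.detect_content_type(path),
        "language": client.detect_language(path),
        "metadata_and_content": client.extract_all_content(path),
        "content": client.extract_only_content(path),
    }


# Usage with the real client (needs Java and the Apache Tika app JAR):
#   from tikapp import TikaApp
#   tika_client = TikaApp(file_jar="/opt/tika/tika-app-1.15.jar")
#   print(describe_file(tika_client, "your_file"))
```

Passing the client in as an argument keeps the helper independent of where the JAR lives.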


To submit a base64-encoded payload instead of a file path, call the same methods with the payload argument:
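
A minimal sketch of preparing such a payload with the Python standard library. The helper name file_to_base64 is hypothetical, and the commented call assumes a method named detect_content_type that accepts a payload keyword, as the sentence above suggests.

```python
import base64


def file_to_base64(path):
    """Read a file and return its content as a base64 text string."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("ascii")


# Usage with the real client (method name is an assumption):
#   payload = file_to_base64("your_file")
#   tika_client.detect_content_type(payload=payload)
```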


Usage from command-line

If you installed tika-app-python with pip or from the cloned repository, you can use it from the command line. To use tika-app-python you must supply the Apache Tika app JAR. You can:

- leave the default value: /opt/tika/tika-app-1.15.jar
- set the environment variable TIKA_APP_JAR
- use the --jar switch

The --jar switch overrides the other two.
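
The precedence described above can be sketched as follows. This is an illustration, not the project's code: the function resolve_jar is hypothetical, and it assumes that TIKA_APP_JAR takes priority over the built-in default when no --jar value is given.

```python
import os

DEFAULT_TIKA_APP_JAR = "/opt/tika/tika-app-1.15.jar"


def resolve_jar(cli_jar=None, environ=None):
    """Pick the JAR path: --jar value, then TIKA_APP_JAR, then the default."""
    if environ is None:
        environ = os.environ
    if cli_jar:
        # An explicit --jar value wins over everything else.
        return cli_jar
    # Assumed order: environment variable first, built-in default last.
    return environ.get("TIKA_APP_JAR") or DEFAULT_TIKA_APP_JAR
```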

These are all the switches:

usage: tikapp [-h] (-f FILE | -p PAYLOAD) [-j JAR] [-d] [-t] [-l] [-a]

Wrapper for Apache Tika App.

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  File to submit (default: None)
  -p PAYLOAD, --payload PAYLOAD
                        Base64 payload to submit (default: None)
  -j JAR, --jar JAR     Apache Tika app JAR (default: None)
  -d, --detect          Detect document type (default: False)
  -t, --text            Output plain text content (default: False)
  -l, --language        Output only language (default: False)
  -a, --all             Output metadata and content from all embedded files
                        (default: False)
  -v, --version         show program's version number and exit


```shell
$ tikapp -f example_file -a
```
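
To call the command-line tool from a Python script, one option is to build the argument list from the switches in the usage text and hand it to subprocess. The sketch below is hedged: build_tikapp_command and run_tikapp are hypothetical helpers, and decoding the output as UTF-8 is an assumption.

```python
import subprocess

# Output modes and their switches, taken from the usage text.
MODE_SWITCHES = {"detect": "-d", "text": "-t", "language": "-l", "all": "-a"}


def build_tikapp_command(mode, file_path=None, payload=None, jar=None):
    """Build the argv list for one tikapp invocation."""
    # -f and -p are mutually exclusive and one of them is required.
    if (file_path is None) == (payload is None):
        raise ValueError("pass exactly one of file_path or payload")
    command = ["tikapp"]
    command += ["-f", file_path] if file_path is not None else ["-p", payload]
    if jar is not None:
        command += ["-j", jar]
    command.append(MODE_SWITCHES[mode])
    return command


def run_tikapp(mode, **kwargs):
    """Run tikapp and return its standard output as text (UTF-8 assumed)."""
    command = build_tikapp_command(mode, **kwargs)
    return subprocess.check_output(command).decode("utf-8")
```

For instance, build_tikapp_command("all", file_path="example_file") yields the same arguments as the shell example.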

Performance tests

These are the results of the performance tests in the tests folder:

(Python 2)
tika_content_type()             0.704840 sec
tika_detect_language()          1.592066 sec
magic_content_type()            0.000215 sec
tika_extract_all_content()      0.816366 sec
tika_extract_only_content()     0.788667 sec

(Python 3)
tika_content_type()             0.698357 sec
tika_detect_language()          1.593452 sec
magic_content_type()            0.000226 sec
tika_extract_all_content()      0.785915 sec
tika_extract_only_content()     0.766517 sec
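
Timings of this kind can be taken with a small wall-clock helper. The sketch below is not the script in the tests folder; time_call and report are hypothetical names, and it measures a single call rather than an average.

```python
import timeit


def time_call(func, *args, **kwargs):
    """Return the wall-clock seconds taken by a single call to func."""
    start = timeit.default_timer()
    func(*args, **kwargs)
    return timeit.default_timer() - start


def report(name, seconds):
    """Format one result line as 'name()' padded to 32 columns, then seconds."""
    return "{:<32}{:.6f} sec".format(name + "()", seconds)


# Usage with the real client (method name is an assumption):
#   elapsed = time_call(tika_client.detect_content_type, "your_file")
#   print(report("tika_content_type", elapsed))
```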

Download Files

tika-app-1.1.1.tar.gz (6.7 kB), source distribution, uploaded Jun 25, 2017