Textable add-on for Orange 3 data mining software package.

Project Description

Textable is an open-source add-on that brings advanced text-analysis functionality to the Orange Canvas data mining software package (itself open source).

The project’s website is http://textable.io. It hosts a repository of recipes to help you get started with Textable.

Documentation is hosted at http://orange3-textable.readthedocs.io/, and further support is available at https://textable.freshdesk.com/ or by e-mail at support@textable.io.

Orange Textable was designed and implemented by LangTech Sarl on behalf of the department of language and information sciences (SLI) at the University of Lausanne (see Credits and How to cite Orange Textable).

Features

Basic text analysis

  • use regular expressions to segment text into letters, words, sentences, etc., or to run full-text queries
  • use regexes to extract annotations from many input formats
  • import in-line XML markup (e.g. TEI)
  • include/exclude segments based on user-defined lists (stoplists)
  • filter segments based on frequency
  • easily generate random text samples
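As a rough illustration of what regex-based segmentation, stoplist exclusion, and frequency filtering amount to, here is a minimal sketch in plain Python. It uses only the standard library and is not Textable's own API; the function names `segment`, `exclude`, and `filter_by_frequency` are hypothetical and chosen for this example only.

```python
import re
from collections import Counter


def segment(text, pattern=r"\w+"):
    """Return one (start, end, content) tuple per regex match."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]


def exclude(segments, stoplist):
    """Drop segments whose content appears in the stoplist (case-insensitive)."""
    stop = {w.lower() for w in stoplist}
    return [s for s in segments if s[2].lower() not in stop]


def filter_by_frequency(segments, min_count=2):
    """Keep segments whose content occurs at least min_count times."""
    counts = Counter(s[2].lower() for s in segments)
    return [s for s in segments if counts[s[2].lower()] >= min_count]


text = "The cat sat on the mat. The cat slept."
words = segment(text)                          # word-level segments
content_words = exclude(words, ["the", "on"])  # apply a small stoplist
frequent = filter_by_frequency(content_words)  # keep repeated words only

print([s[2] for s in frequent])
```

Changing `pattern` to `r"."` or `r"[^.!?]+[.!?]"` would give a crude letter-level or sentence-level segmentation instead.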

Quantitative text analysis

  • concordances and collocations, also based on annotations
  • segment distribution, document-term matrix, transition matrix, etc.
  • co-occurrence tables, also between different types of segments
  • robust linguistic complexity measures, incl. mean length of word, lexical diversity, etc.
  • many advanced data mining algorithms: clustering, classification, factor analyses, etc.
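To make two of the tables and measures listed above more concrete, the sketch below builds a small document-term matrix and computes mean word length and a type-token ratio in plain Python. This is not Textable's implementation: the sample documents and the helper names are invented for illustration, and the type-token ratio is only one simple way of measuring lexical diversity.

```python
import re
from collections import Counter

# Hypothetical sample documents, for illustration only.
docs = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog chased the cat",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


# Count term occurrences per document.
counts = {name: Counter(tokenize(text)) for name, text in docs.items()}

# Columns of the matrix: every term seen in any document, sorted.
vocabulary = sorted(set().union(*counts.values()))

# Rows of the matrix: one list of term counts per document.
matrix = {name: [c[term] for term in vocabulary] for name, c in counts.items()}


def type_token_ratio(tokens):
    """Number of distinct tokens divided by the total number of tokens."""
    return len(set(tokens)) / len(tokens)


def mean_word_length(tokens):
    """Average number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens)


print(vocabulary)
for name, row in matrix.items():
    print(name, row)
```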

Text recoding

  • Unicode-aware preprocessing functions, e.g. remove accents from Ancient Greek text
  • recode and restructure texts using regexes, e.g. rewrite CSV as XML

Extensibility

  • handles hundreds of text files
  • use Python script for custom text processing or to access external tools: NLTK, Pattern, GenSim, etc.

Interoperability

  • import text from keyboard, files, or URLs
  • process any kind of raw text format: TXT, HTML, XML, CSV, etc.
  • supports many text encodings, incl. Unicode
  • export results to text files or by copy-paste

Ease of access

  • user-friendly visual interface
  • ready-made recipes for a range of frequent use cases
  • extensive documentation
  • support and community forums

Release history

  • 3.1.0b3
  • 3.1.0b2
  • 3.1.0b1
  • 3.1.0b0
  • 3.1.0a0
  • 3.0.7 (this version)
  • 3.0.6
  • 3.0.5
  • 3.0.4
  • 3.0.3
  • 3.0.2
  • 3.0.1
  • 3.0.0
  • 3.0b0
  • 3.0a5
  • 3.0a4
  • 3.0a3

Download files


Filename                                  Size      File type  Python version  Upload date
Orange3_Textable-3.0.7-py3-none-any.whl   180.3 kB  Wheel      py3             Oct 30, 2017
Orange3-Textable-3.0.7.tar.gz             138.1 kB  Source     None            Oct 30, 2017
