Textable add-on for Orange 3 data mining software package.

Project description

Textable is an open-source add-on that brings advanced text-analysis functionality to the Orange Canvas data mining software package (itself open source).

The project’s website is http://textable.io. It hosts a repository of recipes to help you get started with Textable.

Documentation is hosted at http://orange3-textable.readthedocs.io/. Further support is available at https://textable.freshdesk.com/ or by e-mail at support@textable.io.

Orange Textable was designed and implemented by LangTech Sarl on behalf of the department of language and information sciences (SLI) at the University of Lausanne (see Credits and How to cite Orange Textable).

Features

Basic text analysis

  • use regular expressions to segment text into letters, words, sentences, etc., or to run full-text queries

  • use regexes to extract annotations from many input formats

  • import in-line XML markup (e.g. TEI)

  • include/exclude segments based on user-defined lists (stoplists)

  • filter segments based on frequency

  • easily generate random text samples
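The basic operations listed above can be illustrated outside of Orange with a few lines of standard-library Python. This is only a minimal sketch of the underlying ideas (regex segmentation, stoplist exclusion, frequency filtering); the function names `segment`, `exclude`, and `filter_by_frequency` are hypothetical and are not part of Textable's own API.

```python
import re
from collections import Counter


def segment(text, pattern):
    """Return a (start, end, content) tuple for every regex match in text."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]


def exclude(segments, stoplist):
    """Drop segments whose content appears in the stoplist (case-insensitive)."""
    stop = {w.lower() for w in stoplist}
    return [s for s in segments if s[2].lower() not in stop]


def filter_by_frequency(segments, min_count=1, max_count=None):
    """Keep segments whose content occurs between min_count and max_count times."""
    counts = Counter(s[2].lower() for s in segments)
    kept = []
    for s in segments:
        n = counts[s[2].lower()]
        if n >= min_count and (max_count is None or n <= max_count):
            kept.append(s)
    return kept


text = "The cat saw the dog. The dog saw the bird."
words = segment(text, r"\w+")                    # word-level segmentation
content_words = exclude(words, ["the"])          # stoplist exclusion
frequent = filter_by_frequency(content_words, min_count=2)
frequent_forms = [s[2] for s in frequent]        # ['saw', 'dog', 'dog', 'saw']
```

Keeping the character offsets alongside each segment is a design choice made for this sketch: it lets later steps refer back to positions in the original text.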

Advanced text analysis

  • concordances and collocations, also based on annotations

  • segment distribution, document-term matrix, transition matrix, etc.

  • co-occurrence tables, also between different types of segments

  • lemmatization and POS-tagging via TreeTagger

  • robust linguistic complexity measures, incl. mean length of word, lexical diversity, etc.

  • many advanced data mining algorithms: clustering, classification, factor analyses, etc.
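To make some of these analyses concrete, here is a small plain-Python sketch of a concordance (keyword in context), a document-term matrix, and two simple complexity measures. It assumes naive `\w+` tokenization and uses the type-token ratio as a stand-in for lexical diversity; Textable's own measures and data structures may well differ, and all function names below are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens, using a naive \\w+ pattern."""
    return [w.lower() for w in re.findall(r"\w+", text)]


def concordance(tokens, keyword, width=2):
    """Return (left context, keyword, right context) for each occurrence."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            rows.append((" ".join(left), tok, " ".join(right)))
    return rows


def document_term_matrix(docs):
    """Map each document name to a row of term counts over a sorted vocabulary."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    vocab = sorted(set().union(*counts.values()))
    matrix = {name: [c[term] for term in vocab] for name, c in counts.items()}
    return vocab, matrix


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens (a naive lexical diversity measure)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_word_length(tokens):
    """Average number of characters per token."""
    return sum(map(len, tokens)) / len(tokens) if tokens else 0.0


docs = {"a": "the cat sat on the mat", "b": "the dog sat"}
vocab, dtm = document_term_matrix(docs)
kwic = concordance(tokenize(docs["a"]), "sat", width=2)
# vocab    -> ['cat', 'dog', 'mat', 'on', 'sat', 'the']
# dtm["a"] -> [1, 0, 1, 1, 1, 2]
# kwic     -> [('the cat', 'sat', 'on the')]
```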

Text recoding

  • Unicode-aware preprocessing functions, e.g. remove accents from Ancient Greek text

  • recode and restructure texts using regexes, e.g. rewrite CSV as XML
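Both recoding examples can be sketched with the Python standard library. The snippet below is an illustration under simplifying assumptions, not Textable's implementation: accents are removed by Unicode decomposition, and the CSV input is assumed to have exactly two unquoted, comma-separated fields per line. The element names `row`, `word`, and `tag` are made up for the example.

```python
import re
import unicodedata
from xml.sax.saxutils import escape


def strip_accents(text):
    """Remove diacritics by decomposing (NFD) and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


# One line = two fields separated by a single comma (no quoting supported).
ROW = re.compile(r"^([^,\n]*),([^,\n]*)$", re.MULTILINE)


def csv_to_xml(csv_text):
    """Rewrite each two-field CSV line as a <row> element, escaping &, < and >."""
    def row_to_xml(match):
        return "<row><word>{}</word><tag>{}</tag></row>".format(
            escape(match.group(1)), escape(match.group(2))
        )
    return ROW.sub(row_to_xml, csv_text)


plain = strip_accents("μῆνιν ἄειδε θεά")   # Ancient Greek with diacritics
xml = csv_to_xml("cat,NOUN\nR&D,NOUN")
```

For real CSV files with quoted fields or embedded commas, a proper parser such as Python's `csv` module would be a safer choice than a regular expression.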

Extensibility

  • handles hundreds of text files

  • use Python scripts for custom text processing or to access external tools: NLTK, Pattern, GenSim, etc.

Interoperability

  • import text from keyboard, files, or URLs

  • process any kind of raw text format: TXT, HTML, XML, CSV, etc.

  • supports many text encodings, incl. Unicode

  • export results to text files or via copy-paste

  • easy interfacing with Orange’s Text Mining add-on
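As a rough illustration of what handling several text encodings involves, the sketch below tries a list of candidate encodings on raw bytes and reports which one succeeded. The helper name `decode_text` and the choice of candidate encodings are assumptions made for this example; they do not describe how Textable itself detects or selects encodings.

```python
def decode_text(raw, encodings=("utf-8", "cp1252")):
    """Decode bytes with the first encoding that works; return (text, encoding)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so it never fails.
    return raw.decode("latin-1"), "latin-1"


text_utf8, used_utf8 = decode_text("café".encode("utf-8"))
text_legacy, used_legacy = decode_text("café".encode("cp1252"))
```

Trying encodings in order is only a heuristic: a file can decode without error under the wrong encoding, so an explicit, user-selected encoding remains the more reliable option.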

Ease of access

  • user-friendly visual interface

  • ready-made recipes for a range of frequent use cases

  • extensive documentation

  • support and community forums


Download files

Source Distribution

orange3_textable-3.2.7.tar.gz (7.0 MB)

Uploaded Source

Built Distribution

orange3_textable-3.2.7-py3-none-any.whl (205.0 kB)

Uploaded Python 3

File details

Details for the file orange3_textable-3.2.7.tar.gz.

File metadata

  • Download URL: orange3_textable-3.2.7.tar.gz
  • Upload date:
  • Size: 7.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for orange3_textable-3.2.7.tar.gz
Algorithm Hash digest
SHA256 92721d90acc1c76776b67f42587a9ef6b36a1dcffdc0c4eebd6e9b22afa750f4
MD5 c233bfad1c0ce88a4ef28f8c2b44f111
BLAKE2b-256 6e4b82a1cd2a4521d8dc06b441ea3ec9be8cad98570aee4496fb9cc758e4d6d2
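The published digests can be used to check a downloaded file. The following is a minimal sketch, assuming the source archive has been saved locally; the helper names `sha256_of` and `verify` are hypothetical, and the expected value is the SHA256 digest listed above.

```python
import hashlib

# SHA256 digest listed above for orange3_textable-3.2.7.tar.gz
EXPECTED_SHA256 = "92721d90acc1c76776b67f42587a9ef6b36a1dcffdc0c4eebd6e9b22afa750f4"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, `verify("orange3_textable-3.2.7.tar.gz")` should return True for an intact download saved under that name in the current directory.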

File details

Details for the file orange3_textable-3.2.7-py3-none-any.whl.

File hashes

Hashes for orange3_textable-3.2.7-py3-none-any.whl
Algorithm Hash digest
SHA256 2758987393e7a54a6c80d630a619b6a4527b87ebbc43140fdecacec4dd048c65
MD5 6fb2cced4101911db53ff0b5ab3b75ee
BLAKE2b-256 b4fc64b2504d41603e1b53c411492eca3d4f5b09133494194823da4ade75f346
