Typology-based semantic labelling of numeric columns

Project description

TTLA

This project is an automated experiment, not a standalone application for annotating numeric columns. Nonetheless, we plan to create an application based on this approach; details will be posted here once we start.

Prerequisites (one time)

  1. pip
  2. virtualenv
  3. create virtualenv: virtualenv -p /usr/bin/python2.7 .venv
  4. access the virtualenv: source .venv/bin/activate
  5. install dependencies: pip install -r requirements.txt

Run the experiments

To download the T2Dv2 data automatically:

python data/preprocessing.py

Detection

python experiments/web_commons_v2.py detect

Labeling

  1. Label (may take up to an hour and requires an internet connection)
python experiments/web_commons_v2.py label
  2. Get the kinds (offline, quick)
python experiments/web_commons_v2.py addkinds
  3. Show scores (offline, quick)
python experiments/web_commons_v2.py scores
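
The detection and labeling commands above can also be chained from a small driver script. The following is a minimal sketch, not part of the project: the names `STAGES`, `build_commands` and `run_pipeline` are hypothetical, and it assumes that the script is run from the repository root, that the T2Dv2 data has already been downloaded, and that the stages are meant to run in the order they are listed in this README.

```python
import subprocess
import sys

# Stage names copied from the commands in this README, in the order listed.
STAGES = ("detect", "label", "addkinds", "scores")
SCRIPT = "experiments/web_commons_v2.py"


def build_commands(stages=STAGES, python=sys.executable):
    """Return one argument list per stage, e.g. [python, SCRIPT, "detect"]."""
    return [[python, SCRIPT, stage] for stage in stages]


def run_pipeline(stages=STAGES, dry_run=False):
    """Run each stage in order and stop at the first failure."""
    for command in build_commands(stages):
        print(" ".join(command))
        if not dry_run:
            # check_call raises CalledProcessError on a non-zero exit code,
            # so a failed stage prevents the later ones from running.
            subprocess.check_call(command)
```

Calling `run_pipeline(dry_run=True)` only prints the four commands, which is a cheap way to check the order before starting the label stage, which may take up to an hour.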
 

Tests

Quick tests (test the algorithms, but do not include the T2Dv2 experiment)

sh run_tests.sh

Run the tests with the T2Dv2 experiment (may take up to an hour)

sh run_t2dv2_tests.sh

Note that some tests may fail over time, as they depend on DBpedia.
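
Since some tests depend on DBpedia, it can be useful to check that a SPARQL endpoint is reachable before starting a long run. The following is a minimal sketch, not part of the project: the function names are hypothetical, and the endpoint URL is an assumption (the public DBpedia endpoint), since this README does not say which endpoint the tests use.

```python
import json

try:  # Python 3
    from urllib.parse import urlencode
    from urllib.request import urlopen
except ImportError:  # Python 2.7
    from urllib import urlencode
    from urllib2 import urlopen

# Assumption: the public DBpedia SPARQL endpoint; the project may use another.
ENDPOINT = "https://dbpedia.org/sparql"


def build_ask_url(endpoint=ENDPOINT):
    """Build the URL of a trivial ASK query asking for a JSON result."""
    params = {
        "query": "ASK { ?s ?p ?o }",
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


def dbpedia_is_up(endpoint=ENDPOINT, timeout=10):
    """Return True if the endpoint answers the ASK query with true."""
    try:
        response = urlopen(build_ask_url(endpoint), timeout=timeout)
        body = json.loads(response.read().decode("utf-8"))
        return bool(body.get("boolean"))
    except Exception:
        # Any network, HTTP or parsing error is treated as "not reachable".
        return False
```

A reachable endpoint does not guarantee that the tests pass, because the data in DBpedia can also change over time; the check only rules out the simplest cause of failure.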

Coverage

Coverage of the quick tests

sh run_cov.sh

Coverage of T2Dv2 tests

sh run_t2dv2_cov.sh

To publish

python setup.py sdist bdist_wheel
twine upload dist/*

Contribution

To contribute, please read the following and follow the same conventions.

Code structure

  • The source code related to the detection of data types (e.g. categorical, continuous, ...) is located under detect.
  • The files related to the annotation of semantic types (e.g. the height of a person) are located under label.
