
A simple utility that takes in a sentence and outputs information about the Academic Word List (AWL) words in it.


Awlify


A very basic tool that takes in a sentence of text and outputs the same text, annotated with information about whether any of its words are in the Academic Word List.

installing

pip install awlify

If you haven't used spaCy on your system before, you'll also need to download the English model that awlify uses:

python -m spacy download en_core_web_sm

tests

python -m unittest

usage inside a file

from awlify import awlify

result = awlify('please inform me of the academic words in this sentence')

print(result)
{"data": {"sentence": "please inform me of the academic words in this sentence", "awl_words": [{"index": 5, "word": "academic", "meta": {"head": "academy", "sublist": 5}}]}}
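The printed value above uses double quotes, which suggests a JSON string rather than a Python dict, but this README does not say which. Here is a minimal sketch that handles either case and pulls out the matched words. `as_dict` and `awl_word_list` are hypothetical helper names, not part of awlify, and `sample` is copied from the output above so the snippet runs without the package installed.

```python
import json


def as_dict(result):
    """Return the result as a dict, decoding it first if it is a JSON string."""
    if isinstance(result, (str, bytes)):
        return json.loads(result)
    return result


def awl_word_list(result):
    """Return the AWL words that were found, in sentence order."""
    data = as_dict(result)["data"]
    return [entry["word"] for entry in data["awl_words"]]


# Copied from the README output above, standing in for awlify(...)
sample = (
    '{"data": {"sentence": "please inform me of the academic words in this sentence", '
    '"awl_words": [{"index": 5, "word": "academic", '
    '"meta": {"head": "academy", "sublist": 5}}]}}'
)

print(awl_word_list(sample))  # ['academic']
```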

usage from the command line

python -m awlify 'this is a sentence to check'

{"data": {"sentence": "this is a sentence to check", "awl_words": []}}
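To call the command line from another Python script, something like the following sketch may work. It assumes the command prints nothing but the JSON object to standard output and that Python 3.7+ is available (for `capture_output`). `run_awlify_cli` and `parse_cli_output` are hypothetical names, not part of awlify.

```python
import json
import subprocess
import sys


def run_awlify_cli(sentence):
    """Run `python -m awlify <sentence>` and return whatever it printed."""
    completed = subprocess.run(
        [sys.executable, "-m", "awlify", sentence],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


def parse_cli_output(stdout):
    """Decode the JSON object printed by the command (assumed to be the only output)."""
    return json.loads(stdout.strip())


# Parsing the output shown above, without actually invoking the command:
parsed = parse_cli_output(
    '{"data": {"sentence": "this is a sentence to check", "awl_words": []}}\n'
)
print(parsed["data"]["awl_words"])  # []
```

With awlify installed, `parse_cli_output(run_awlify_cli('this is a sentence to check'))` would chain the two steps.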

expected input / output

format for output:

{
  "data": {
    "sentence": "THIS IS THE ORIGINAL SENTENCE",
    "awl_words": [
      {
        "index": INDEX_OF_AWL_WORD_FOUND,
        "word": "AWL_WORD_FOUND",
        "meta": {
          "head": "THE_HEADWORD_FROM_THE_AWL",
          "sublist": THE_AWL_SUBLIST_OF_THE_WORD
        }
      }
    ]
  }
}
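The format above can also be written down as types, which may help when passing results around. This is only a sketch based on the layout shown here: the class names and `looks_like_awl_result` are hypothetical, awlify itself may not define anything like them, and `TypedDict` needs Python 3.8 or newer.

```python
from typing import List, TypedDict


class AwlMeta(TypedDict):
    head: str     # the headword from the AWL
    sublist: int  # the AWL sublist of the word


class AwlWord(TypedDict):
    index: int    # index of the AWL word found
    word: str     # the AWL word as it appears in the sentence
    meta: AwlMeta


class AwlData(TypedDict):
    sentence: str             # the original sentence
    awl_words: List[AwlWord]  # empty when no AWL words were found


class AwlResult(TypedDict):
    data: AwlData


def looks_like_awl_result(obj):
    """Cheap runtime check that obj has the documented shape."""
    try:
        data = obj["data"]
        if not isinstance(data["sentence"], str):
            return False
        for entry in data["awl_words"]:
            if not isinstance(entry["index"], int):
                return False
            if not isinstance(entry["word"], str):
                return False
            if not isinstance(entry["meta"]["head"], str):
                return False
            if not isinstance(entry["meta"]["sublist"], int):
                return False
    except (KeyError, TypeError):
        # A missing key or a value of the wrong container type
        return False
    return True
```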

example input for a simple sentence (no AWL words):

simple_sentence = awlify('this is a sentence')

example output for a simple sentence (no AWL words):

{
  "data": {
    "sentence": "this is a sentence",
    "awl_words": []
  }
}

example input for a complex sentence (a few AWL words):

complex_sentence = awlify('the economic recovery is ongoing and potentially problematic')

example output for a complex sentence (a few AWL words):

{
  "data": {
    "sentence": "the economic recovery is ongoing and potentially problematic",
    "awl_words": [
      {
        "index": 1,
        "word": "economic",
        "meta": {
          "head": "economy",
          "sublist": 1
        }
      },
      {
        "index": 2,
        "word": "recovery",
        "meta": {
          "head": "recover",
          "sublist": 6
        }
      },
      {
        "index": 6,
        "word": "potentially",
        "meta": {
          "head": "potential",
          "sublist": 2
        }
      }
    ]
  }
}
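One way to use the `index` values is to write the annotations back into the sentence. The sketch below assumes that `index` counts whitespace-separated words starting from zero. That matches the examples here but is not confirmed for sentences with punctuation. `annotate` is a hypothetical helper, not part of awlify, and it takes the result as a dict.

```python
def annotate(result):
    """Return the sentence with each AWL word rewritten as word[head, sublist N]."""
    data = result["data"]
    tokens = data["sentence"].split()
    for entry in data["awl_words"]:
        i = entry["index"]
        # Only rewrite when the token at that position matches the reported
        # word, since the whitespace-token assumption may not always hold.
        if 0 <= i < len(tokens) and tokens[i] == entry["word"]:
            meta = entry["meta"]
            tokens[i] = "{}[{}, sublist {}]".format(
                entry["word"], meta["head"], meta["sublist"]
            )
    return " ".join(tokens)


# Copied from the complex-sentence output above
complex_result = {
    "data": {
        "sentence": "the economic recovery is ongoing and potentially problematic",
        "awl_words": [
            {"index": 1, "word": "economic", "meta": {"head": "economy", "sublist": 1}},
            {"index": 2, "word": "recovery", "meta": {"head": "recover", "sublist": 6}},
            {"index": 6, "word": "potentially", "meta": {"head": "potential", "sublist": 2}},
        ],
    }
}

print(annotate(complex_result))
# the economic[economy, sublist 1] recovery[recover, sublist 6] is ongoing and potentially[potential, sublist 2] problematic
```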

NOTES

Sentence tokenization currently uses spaCy, which is heavier than strictly necessary, since awlify doesn't take advantage of any of the package's more advanced features.

In theory, a simple regex could probably perform about 98% as well, so I might add that as an option in the future if no real use case turns up that needs the full weight of spaCy.
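A rough sketch of what a regex-only version might look like follows. This is not how awlify is implemented. `SAMPLE_AWL` is a tiny illustrative lookup built only from the examples in this README, and `WORD_RE` and `awlify_regex` are hypothetical names. Because the regex skips punctuation entirely, its indices may differ from awlify's for sentences that contain punctuation.

```python
import re

# Tiny illustrative lookup taken from the examples in this README;
# the real AWL is much larger and awlify's own data format may differ.
SAMPLE_AWL = {
    "academic": {"head": "academy", "sublist": 5},
    "economic": {"head": "economy", "sublist": 1},
    "recovery": {"head": "recover", "sublist": 6},
    "potentially": {"head": "potential", "sublist": 2},
}

# A word is a run of letters, optionally followed by an apostrophe part ("don't")
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def awlify_regex(sentence, awl=SAMPLE_AWL):
    """Tokenize with a regex and look each word up in the given AWL mapping."""
    found = []
    for index, match in enumerate(WORD_RE.finditer(sentence)):
        word = match.group(0)
        meta = awl.get(word.lower())
        if meta is not None:
            found.append({"index": index, "word": word, "meta": dict(meta)})
    return {"data": {"sentence": sentence, "awl_words": found}}


result = awlify_regex("the economic recovery is ongoing and potentially problematic")
print([(w["index"], w["word"]) for w in result["data"]["awl_words"]])
# [(1, 'economic'), (2, 'recovery'), (6, 'potentially')]
```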

REFERENCES

Coxhead, Averil (2000) A New Academic Word List. TESOL Quarterly, 34(2): 213-238.
