Reports coverage given a set of Unicode values

Smörgåsbord tests coverage over Unicode character sets.

The Smorgasbord class inherits from UnicodeSet and supports the same features.

Supports Python 2.6 – 3.x

$ pip install smorgasbord

from smorgasbord import Smorgasbord

# Provide a path to a file or folder of character sets
Smorgasbord.paths.prepend("/my/path/to/language/en.txt")

>>> bord = Smorgasbord([97, "b", "c", u"ü", u"\u0660"])
>>> bord
Smorgasbord([u"a", u"c", u"b", u"\xfc", u"\u0660"])

# Reports are accessed through the "reports" dict using the language code
>>> en = bord.reports["en"]

# Basic information about the report's language is accessible
>>> en.language.code
"en"
>>> en.language.name
"English"
>>> en.language.characters
FrozenUnicodeSet([u"a", u"b", u"c", ...])

# Amount of coverage is available as float and string representations
>>> en.coverage
0.057
>>> en.coverage.percentage
u"5.7%"

# Sets of glyphs can be accessed
>>> en.covered
FrozenUnicodeSet([u"a", u"b", u"c"])
>>> en.uncovered
FrozenUnicodeSet([u"d", u"e", u"f", ...])

# Reports can also return a boolean value for completeness:
>>> en.complete
False
>>> en.incomplete
True
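
To make the report fields above concrete, here is a minimal sketch of the same idea using only built-in sets. It is not Smorgasbord's implementation: `coverage_report` is a hypothetical helper, the English set is assumed to be the 52 unaccented ASCII letters, and the library may round or format its coverage value differently.

```python
import string


def coverage_report(characters, language_characters):
    """Compare a set of characters against a language's character set."""
    wanted = frozenset(language_characters)
    covered = wanted & frozenset(characters)
    uncovered = wanted - covered
    # float() forces true division on Python 2 as well as Python 3.
    coverage = float(len(covered)) / len(wanted) if wanted else 0.0
    return {
        "coverage": coverage,
        "covered": covered,
        "uncovered": uncovered,
        "complete": not uncovered,
    }


# Assumed English set: the 52 unaccented ASCII letters.
english = string.ascii_letters
report = coverage_report([u"a", u"b", u"c", u"\xfc", u"\u0660"], english)

print(sorted(report["covered"]))  # on Python 3: ['a', 'b', 'c']
print(report["complete"])         # False
```

Characters outside the language's set (here ü and U+0660) do not affect the result; only the share of the language's own characters that is present counts.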

Character Set Files

Each character set is defined in a text file. Characters are separated by spaces, and lines starting with a # are ignored as comments. Line breaks can and should be used to wrap lines to 80 characters maximum.

For example, an en.txt definition for an English coverage character set:

# Language: English
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

The first line is a special comment that will be parsed as the language name. Other special comments may be added in the future, but for now only Language is supported.
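
As an illustration of the file format described above, the following sketch parses a character set definition into a language name and a set of characters. `parse_character_set` is a hypothetical function, not part of Smorgasbord's API, and it makes one simplifying assumption: the Language comment is accepted on any line, not only the first.

```python
def parse_character_set(text):
    """Parse character set text into (language_name, characters)."""
    language = None
    characters = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A "# Language: <name>" comment carries the language name;
            # every other comment is ignored.
            key, sep, value = line[1:].partition(":")
            if sep and key.strip() == "Language" and language is None:
                language = value.strip()
            continue
        # Characters on a line are separated by whitespace.
        characters.update(line.split())
    return language, frozenset(characters)


sample = """# Language: English
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
"""

name, chars = parse_character_set(sample)
print(name, len(chars))  # English 52
```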

Supplying Character Sets

Character sets are made available by supplying a path to a folder, or directly to a language file.

Smorgasbord.paths.prepend("/my/path/to/language/files/dir")
Smorgasbord.paths.prepend("/my/path/to/language/file.txt")

Character set files are searched for in each successive folder, using the first matching file.

Alternatively, the paths array can be replaced entirely:

Smorgasbord.paths = ["/my/path/to/language/files/dir"]
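
The "first matching file wins" lookup can be sketched as follows. This is an assumption about the behavior, not the library's code: `find_character_set` is a hypothetical helper, and it assumes a language code maps to a file named `<code>.txt`, as the `en.txt` example suggests. The demo requires Python 3 for `tempfile.TemporaryDirectory`.

```python
import os
import tempfile


def find_character_set(code, paths):
    """Return the first file matching `<code>.txt` in `paths`, or None."""
    filename = code + ".txt"
    for path in paths:
        if os.path.isdir(path):
            # A folder entry: look for the language file inside it.
            candidate = os.path.join(path, filename)
            if os.path.isfile(candidate):
                return candidate
        elif os.path.basename(path) == filename and os.path.isfile(path):
            # A direct path to a language file.
            return path
    return None


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    for folder in (first, second):
        with open(os.path.join(folder, "en.txt"), "w") as handle:
            handle.write("# Language: English\na b c\n")
    # Both folders contain en.txt; the earlier path takes precedence.
    found = find_character_set("en", [first, second])
    print(found == os.path.join(first, "en.txt"))  # True
```

Under this model, prepending a path gives its files priority over any later entry that defines the same language.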

Roadmap

This is a quick list of features that will need to be added in the near future (and will probably comprise a 1.0 release).

  • Lazily evaluate reports. Currently the library loads all language files when a Smorgasbord is initialized, which will get slow, fast. Loading should instead happen at the latest possible moment.

  • Unicode ranges in language files. Adding support for Unicode ranges will probably be necessary for languages with large character sets.
