Tags and sets of tags with __format__ support and optional ontology information.
Latest release 20210404: Bugfix TagBasedTest.COMPARISON_FUNCS["="]: if cmp_value is None, return true (the tag is present).
See `cs.fstags` for support for applying these to filesystem objects
such as directories and files.
See `cs.sqltags` for support for databases of entities with tags,
not directly associated with filesystem objects.
This is suited to both log entries (entities with no "name")
and large collections of named entities;
both accept `Tag`s and can be searched on that basis.
All of the available complexity is optional:
you can use `Tag`s without bothering with `TagSet`s
or `TagsOntology`s.
This module contains the following main classes:
* `Tag`: an object with a `.name` and optional `.value` (default `None`)
and also an optional reference `.ontology`
for associating semantics with tag values.
The `.value`, if not `None`, will often be a string,
but may be any Python object.
If you're using these via `cs.fstags`,
the object will need to be JSON transcribeable.
* `TagSet`: a `dict` subclass representing a set of `Tag`s
to associate with something;
it also has setlike `.add` and `.discard` methods.
As such it only supports a single `Tag` for a given tag name,
but that tag value can of course be a sequence or mapping
for more elaborate tag values.
* `TagsOntology`:
a mapping of type names to `TagSet`s defining the type.
This mapping also contains entries for the metadata
for specific type values.
Here's a simple example with some `Tag`s and a `TagSet`.
>>> tags = TagSet()
>>> # add a "bare" Tag named 'blue' with no value
>>> tags.add('blue')
>>> # add a "topic=tagging" Tag
>>> tags.add('topic','tagging')
>>> # make a "subtopic" Tag and add it
>>> subtopic = Tag('subtopic', 'ontologies')
>>> tags.add(subtopic)
>>> # Tags have nice repr() and str()
>>> subtopic
Tag(name='subtopic',value='ontologies',ontology=None)
>>> print(subtopic)
subtopic=ontologies
>>> # TagSets also have nice repr() and str()
>>> tags
TagSet:{'blue': None, 'topic': 'tagging', 'subtopic': 'ontologies'}
>>> print(tags)
blue subtopic=ontologies topic=tagging
>>> tags2 = TagSet({'a': 1}, b=3, c=[1,2,3], d='dee')
>>> tags2
TagSet:{'a': 1, 'b': 3, 'c': [1, 2, 3], 'd': 'dee'}
>>> print(tags2)
a=1 b=3 c=[1,2,3] d=dee
>>> # since you can print a TagSet to a file as a line of text
>>> # you can get it back from a line of text
>>> TagSet.from_line('a=1 b=3 c=[1,2,3] d=dee')
TagSet:{'a': 1, 'b': 3, 'c': [1, 2, 3], 'd': 'dee'}
>>> # because TagSets are dicts you can format strings with them
>>> print('topic:{topic} subtopic:{subtopic}'.format_map(tags))
topic:tagging subtopic:ontologies
>>> # TagSets have convenient membership tests
>>> # test for blueness
>>> 'blue' in tags
True
>>> # test for redness
>>> 'red' in tags
False
>>> # test for any "subtopic" tag
>>> 'subtopic' in tags
True
>>> # test for subtopic=ontologies
>>> subtopic in tags
True
>>> # test for subtopic=libraries
>>> subtopic2 = Tag('subtopic', 'libraries')
>>> subtopic2 in tags
False
== Ontologies ==
`Tag`s and `TagSet`s suffice to apply simple annotations to things.
However, an ontology brings meaning to those annotations.
See the `TagsOntology` class for implementation details,
access methods and more examples.
Consider a record about a movie, with this `TagSet`:
title="Avengers Assemble"
series="Avengers (Marvel)"
cast={"Scarlett Johansson":"Black Widow (Marvel)"}
where we have the movie title, a name for the series in which it resides, and a cast as an association of actors with roles.
An ontology lets us associate implied types and metadata with these values.
Here's an example ontology supporting the above `TagSet`:
type.cast type=dict key_type=person member_type=character description="members of a production"
type.character description="an identified member of a story"
type.series type=str
meta.character.marvel.black_widow type=character names=["Natasha Romanov"]
meta.person.scarlett_johansson fullname="Scarlett Johansson" bio="Known for Black Widow in the Marvel stories."
The type information for a `cast`
is defined by the ontology entry named `type.cast`,
which tells us that a `cast` `Tag` is a `dict`,
whose keys are of type `person`
and whose values are of type `character`.
(The default type is `str`.)
To find out the underlying type for a `character`
we look that up in the ontology in turn;
because it does not have a specified `type` `Tag`, it is taken to be a `str`.
Having the types for a `cast`,
it is now possible to look up the metadata for the described cast members.
The key `"Scarlett Johansson"` is a `person`
(from the type definition of `cast`).
The ontology entry for her is named `meta.person.scarlett_johansson`
which is computed as:
* `meta`: the name prefix for metadata entries
* `person`: the type name
* `scarlett_johansson`: obtained by downcasing `"Scarlett Johansson"`
  and replacing whitespace with an underscore.
  The full conversion process is defined
  by the `TagsOntology.value_to_tag_name` function.
The key `"Black Widow (Marvel)"` is a `character`
(again, from the type definition of `cast`).
The ontology entry for her is named `meta.character.marvel.black_widow`
which is computed as:
* `meta`: the name prefix for metadata entries
* `character`: the type name
* `marvel.black_widow`: obtained by downcasing `"Black Widow (Marvel)"`,
  replacing whitespace with an underscore,
  and moving a bracketed suffix to the front as an unbracketed prefix.
  The full conversion process is defined
  by the `TagsOntology.value_to_tag_name` function.
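The conversion steps just described can be sketched in plain Python. This is only an illustration of the three stated rules (downcase, underscore for whitespace, bracketed suffix moved to the front); the function name `sketch_value_to_tag_name` is hypothetical, and the real `TagsOntology.value_to_tag_name` may handle further cases not covered here.

```python
import re

def sketch_value_to_tag_name(value: str) -> str:
    """Hypothetical stand-in for the conversion described above."""
    text = value.strip().lower()
    prefix = ""
    # Move a trailing bracketed suffix such as "(marvel)" to the front.
    m = re.fullmatch(r"(.*?)\s*\(([^()]*)\)", text)
    if m:
        text, prefix = m.group(1), m.group(2).strip()
    # Replace each run of whitespace with a single underscore.
    name = re.sub(r"\s+", "_", text)
    if prefix:
        name = re.sub(r"\s+", "_", prefix) + "." + name
    return name

print(sketch_value_to_tag_name("Scarlett Johansson"))    # scarlett_johansson
print(sketch_value_to_tag_name("Black Widow (Marvel)"))  # marvel.black_widow
```

Prefixing the result with `meta.` and the type name gives the ontology entry names used in the example above.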
== Format Strings ==
While you can just use `str.format_map` as shown above
for the direct values in a `TagSet`
(and some command line tools like `fstags`
use this in output format specifications),
you can also use `TagSet`s in format strings.
There is a `TagSet.ns()` method which constructs
an enhanced type of `SimpleNamespace`
from the tags in the set
which allows convenient dot notation use in format strings,
for example:
tags = TagSet(colour='blue', labels=['a','b','c'], size=9, _ontology=ont)
ns = tags.ns()
print(f'colour={ns.colour}, info URL={ns.colour._meta.url}')
colour=blue, info URL=https://en.wikipedia.org/wiki/Blue
There is a detailed rundown of this in the `TagSetNamespace` docstring below.
Function as_unixtime(*a, **kw)
Convert a tag value to a UNIX timestamp.
This accepts `int`, `float` (already a timestamp)
and `date` or `datetime`
(use `datetime.timestamp()` for a non-naive `datetime`,
otherwise `time.mktime(tag_value.timetuple())`,
which assumes the local time zone).
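The conversion rules above can be sketched with the standard library alone. The name `sketch_as_unixtime` is hypothetical, and the real `as_unixtime` may accept further types or raise different errors.

```python
import time
from datetime import date, datetime, timezone

def sketch_as_unixtime(tag_value):
    """Convert an int, float, date or datetime to a UNIX timestamp."""
    if isinstance(tag_value, (int, float)):
        # Already a timestamp.
        return float(tag_value)
    if isinstance(tag_value, datetime) and tag_value.tzinfo is not None:
        # Timezone-aware datetime: timestamp() honours the attached zone.
        return tag_value.timestamp()
    if isinstance(tag_value, date):
        # A date or naive datetime: interpret it in the local time zone.
        return time.mktime(tag_value.timetuple())
    raise ValueError(f"cannot convert {tag_value!r} to a UNIX timestamp")

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(sketch_as_unixtime(epoch))  # 0.0
```

Note that `datetime` is a subclass of `date`, so a naive `datetime` falls through to the local time zone branch.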
Class ExtendedNamespace(types.SimpleNamespace)
Subclass `SimpleNamespace` with inferred attributes
intended primarily for use in format strings.
As such it also presents attributes as `[]` elements via `__getitem__`.
Because `[:alpha:]*` attribute names
are reserved for "public" keys/attributes,
most methods commence with an underscore (`_`).
Method ExtendedNamespace.__format__(self, *a, **kw)
The default formatted form of this node.
The value to format is `'{`*type*`':'`*path*`'['`*public_keys*`']'`.
Method ExtendedNamespace.__getattr__(self, attr)
Just a stub so that (a) subclasses can call `super().__getattr__`
and (b) a path-based `AttributeError` gets raised for better context.
Method ExtendedNamespace.__len__(self)
The number of public keys.
Method ExtendedNamespace.__str__(self)
Return a visible placeholder, supporting exposing this object in a format string so that the user knows there wasn't a value at this point in the dotted path.
Function main(_)
Test code.
Class RegexpTagRule
A regular expression based `Tag` rule.
This applies a regular expression to a string
and returns inferred `Tag`s.
Method RegexpTagRule.infer_tags(self, *a, **kw)
Apply the rule to the string `s`, return a list of `Tag`s.
Class Tag(Tag,builtins.tuple)
A `Tag` has a `.name` (`str`) and a `.value`
and an optional `.ontology`.
The `name` must be a dotted identifier.
Terminology:
- A "bare" `Tag` has a `value` of `None`.
- A "naive" `Tag` has an `ontology` of `None`.
The constructor for a `Tag` is unusual:
- both the `value` and `ontology` are optional, defaulting to `None`
- if `name` is a `str` then we always construct a new `Tag` with the supplied values
- if `name` is not a `str` it should be a `Tag`-like object to promote; it is an error if the `value` parameter is not `None` in this case
The promotion process is as follows:
- if `name` is a `Tag` subinstance then if the supplied `ontology` is not `None` and is not the ontology associated with `name` then a new `Tag` is made, otherwise `name` is returned unchanged
- otherwise a new `Tag` is made from `name` using its `.value` and overriding its `.ontology` if the `ontology` parameter is not `None`
Method Tag.__str__(self)
Encode `name` and `value`.
Property Tag.basetype
The base type name for this tag.
Returns `None` if there is no ontology.
This calls `TagsOntology.basetype(self.ontology,self.type)`.
Method Tag.from_str(s, offset=0, ontology=None)
Parse a `Tag` definition from `s` at `offset` (default `0`).
Method Tag.is_valid_name(name)
Test whether a tag name is valid: a dotted identifier.
Method Tag.key_metadata(self, *a, **kw)
Return the metadata definition for `key`.
The metadata `TagSet` is obtained from the ontology entry
`'meta.`*type*`.`*key_tag_name*`'`
where *type* is the `Tag`'s `key_type`
and *key_tag_name* is the key converted
into a dotted identifier by `TagsOntology.value_to_tag_name`.
Property Tag.key_type
The type name for the keys of this tag.
This is required if `.value` is a mapping.
Property Tag.key_typedata
The typedata definition for this `Tag`'s keys.
This is for `Tag`s which store mappings,
for example a movie cast, mapping actors to roles.
The name of the key type comes from
the `key_type` entry from `self.typedata`.
That name is then looked up in the ontology's types.
Method Tag.matches(self, name, value=None, *a, **kw)
Test whether this `Tag` matches `(tag_name,value)`.
Method Tag.member_metadata(self, *a, **kw)
Return the metadata definition for `self[member_key]`.
The metadata `TagSet` is obtained from the ontology entry
`'meta.`*type*`.`*member_tag_name*`'`
where *type* is the `Tag`'s `member_type`
and *member_tag_name* is the member value converted
into a dotted identifier by `TagsOntology.value_to_tag_name`.
Property Tag.member_type
The type name for members of this tag.
This is required if `.value` is a sequence or mapping.
Property Tag.member_typedata
The typedata definition for this `Tag`'s members.
This is for `Tag`s which store mappings or sequences,
for example a movie cast, mapping actors to roles,
or a list of scenes.
The name of the member type comes from
the `member_type` entry from `self.typedata`.
That name is then looked up in the ontology's types.
Property Tag.meta
The `Tag` metadata derived from the `Tag`'s ontology.
Method Tag.metadata(self, ontology=None, convert=None)
Fetch the metadata information about this specific tag value,
derived through the ontology
from the tag name and value.
The default `ontology` is `self.ontology`.
For a scalar type (`int`, `float`, `str`)
this is the ontology `TagSet` for `self.value`.
For a sequence (`list`) this is a list of the metadata
for each member.
For a mapping (`dict`) this is a mapping of `key`->`value_metadata`.
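The three result shapes described for `Tag.metadata` can be sketched as a small dispatch on the value's type. Here `sketch_metadata` and its `lookup` parameter are hypothetical; `lookup` stands in for whatever the ontology does to find the metadata for one scalar value.

```python
def sketch_metadata(value, lookup):
    """Dispatch on the shape of a tag value.

    `lookup` is a hypothetical callable returning the metadata for one scalar.
    """
    if isinstance(value, dict):
        # Mapping: key -> metadata for the corresponding value.
        return {k: lookup(v) for k, v in value.items()}
    if isinstance(value, list):
        # Sequence: one metadata entry per member.
        return [lookup(member) for member in value]
    # Scalar (int, float, str): the metadata for the value itself.
    return lookup(value)

# A toy metadata table standing in for an ontology.
meta = {"blue": {"wavelengths": "450nm-495nm"}}
print(sketch_metadata("blue", meta.get))
print(sketch_metadata(["blue", "red"], meta.get))
```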
Method Tag.parse(s, offset=0, *, ontology)
Parse *tag_name*[`=`*value*], return `(Tag,offset)`.
Method Tag.parse_name(s, offset=0)
Parse a tag name from `s` at `offset`: a dotted identifier.
Method Tag.parse_value(s, offset=0)
Parse a value from `s` at `offset` (default `0`).
Return the value, or `None` on no data.
Method Tag.transcribe_value(value)
Transcribe `value` for use in `Tag` transcription.
Property Tag.type
The type name for this `Tag`.
Unless the definition for `self.name` has a `type` tag,
the type is `self.ontology.value_to_tag_name(self.name)`.
For example, the tag `series="Avengers (Marvel)"`
would look up the definition for `series`.
If that had no `type=` tag, then the type
would default to `series`
which is what would be returned.
The corresponding metadata `TagSet` for that tag
would have the name `series.marvel.avengers`.
By contrast, the tag `cast={"Scarlett Johansson":"Black Widow (Marvel)"}`
would look up the definition for `cast`
which might look like this:
cast type=dict key_type=person member_type=character
That says that the type name is `dict`,
which is what would be returned.
Because the type is `dict`
the definition also has `key_type` and `member_type` tags
identifying the type names for the keys and values
of the `cast=` tag.
As such, the corresponding metadata `TagSet`s
in this example would be named
`person.scarlett_johansson`
and `character.marvel.black_widow` respectively.
Property Tag.typedata
The defining `TagSet` for this tag's name.
This is how its type is defined,
and is obtained from:
self.ontology['type.'+self.name]
For example, a `Tag` `colour=blue`
gets its type information from the `type.colour` entry in an ontology.
Method Tag.with_prefix(name, value, *, ontology=None, prefix)
Make a new `Tag` whose `name` is prefixed with `prefix+'.'`.
Function tag_or_tag_value(*da, **dkw)
A decorator for functions or methods which may be called as:
func(name, [value])
or as:
func(Tag, [None])
The optional decorator argument `no_self` (default `False`)
should be supplied for plain functions
as they have no leading `self` parameter to accommodate.
Example:
@tag_or_tag_value
def add(self, tag_name, value, *, verbose=None):
This defines a `.add()` method
which can be called with `name` and `value`
or with a single `Tag`-like object
(something with `.name` and `.value` attributes),
for example:
tags = TagSet()
....
tags.add('colour', 'blue')
....
tag = Tag('size', 9)
tags.add(tag)
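To show how a decorator can support both calling conventions, here is a minimal sketch for methods only. The names `sketch_tag_or_tag_value` and `SketchTags` are hypothetical, the `no_self` option is not modelled, and the real decorator may differ in its checks and error handling.

```python
import functools
from types import SimpleNamespace

def sketch_tag_or_tag_value(method):
    """Let `method(self, name, value)` also accept a single Tag-like object."""
    @functools.wraps(method)
    def wrapper(self, name, value=None, **kw):
        if not isinstance(name, str):
            # Tag-like object: unpack it, refusing a conflicting value.
            if value is not None:
                raise ValueError("value must be None when passing a Tag-like object")
            name, value = name.name, name.value
        return method(self, name, value, **kw)
    return wrapper

class SketchTags(dict):
    @sketch_tag_or_tag_value
    def add(self, tag_name, value, *, verbose=None):
        self[tag_name] = value

tags = SketchTags()
tags.add("colour", "blue")
tags.add(SimpleNamespace(name="size", value=9))
print(tags)  # {'colour': 'blue', 'size': 9}
```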
Class TagBasedTest(TagBasedTest,builtins.tuple,TagSetCriterion)
A test based on a `Tag`.
Attributes:
- `spec`: the source text from which this choice was parsed, possibly `None`
- `choice`: the apply/reject flag
- `tag`: the `Tag` representing the criterion
- `comparison`: an indication of the test comparison
The following comparison values are recognised:
- `None`: test for the presence of the `Tag`
- `'='`: test that the tag value equals `tag.value`
- `'<'`: test that the tag value is less than `tag.value`
- `'<='`: test that the tag value is less than or equal to `tag.value`
- `'>'`: test that the tag value is greater than `tag.value`
- `'>='`: test that the tag value is greater than or equal to `tag.value`
- `'~/'`: test if the tag value as a regexp is present in `tag.value`
- `'~'`: test if a matching tag value is present in `tag.value`
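The presence, equality and ordering comparisons can be sketched as a lookup table over a plain `dict` of tags. The names `SKETCH_COMPARISON_FUNCS` and `sketch_match` are hypothetical; the `'~/'` and `'~'` forms are left out because their exact matching rules are not spelled out here.

```python
import operator

# Hypothetical table in the spirit of the comparison values listed above.
SKETCH_COMPARISON_FUNCS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

def sketch_match(tags: dict, name: str, comparison=None, cmp_value=None) -> bool:
    """Test the tag `name` in the plain mapping `tags`."""
    if name not in tags:
        # A missing tag never matches.
        return False
    if comparison is None:
        # Presence test only.
        return True
    if comparison == "=":
        # With no comparison value, "=" degrades to a presence test.
        return cmp_value is None or tags[name] == cmp_value
    return SKETCH_COMPARISON_FUNCS[comparison](tags[name], cmp_value)

tags = {"size": 9, "colour": "blue"}
print(sketch_match(tags, "size", ">=", 5))   # True
print(sketch_match(tags, "weight", "<", 5))  # False
```

The `"="` branch follows the release note at the top of this page: when the comparison value is `None`, the test succeeds because the tag is present.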
Method TagBasedTest.by_tag_value(name, value=None, *a, **kw)
Return a `TagBasedTest` based on a `Tag` or `tag_name,tag_value`.
Method TagBasedTest.match_tagged_entity(self, te: 'TagSet') -> bool
Test against the `Tag`s in `tags`.
Note: comparisons when `self.tag.name` is not in `tags`
always return `False` (possibly inverted by `self.choice`).
Method TagBasedTest.parse(s, offset=0, delim=None)
Parse *tag_name*[{`<`|`<=`|`'='`|`'>='`|`>`|`'~'`}*value*]
and return `(dict,offset)`
where the `dict` contains the following keys and values:
- `tag`: a `Tag` embodying the tag name and value
- `comparison`: an indication of the test comparison
Class TagFile(cs.obj.SingletonMixin,TagSets,cs.resources.MultiOpenMixin)
A reference to a specific file containing tags.
This manages a mapping of `name` => `TagSet`,
itself a mapping of tag name => tag value.
Method TagFile.__setitem__(self, name, te)
Set item `name` to `te`.
Method TagFile.get(self, name, default=None)
Get from the tagsets.
Method TagFile.items(self, prefix=None)
`tagsets.items`
If the optional `prefix` is supplied,
yield only those items whose keys start with `prefix`.
Method TagFile.keys(self, prefix=None)
`tagsets.keys`
If the optional `prefix` is supplied,
yield only those keys starting with `prefix`.
Method TagFile.load_tagsets(filepath, ontology)
Load `filepath` and return `(tagsets,unparsed)`.
The returned `tagsets` are a mapping of `name`=>`tag_name`=>`value`.
The returned `unparsed` is a list of `(lineno,line)`
for lines which failed the parse (excluding the trailing newline).
Property TagFile.names
The names from this `FSTagsTagFile` as a list.
Method TagFile.parse_tags_line(*a, **kw)
Parse a "name tags..." line as from a `.fstags` file,
return `(name,TagSet)`.
Method TagFile.save(self)
Save the tag map to the tag file.
Method TagFile.save_tagsets(*a, **kw)
Save `tagsets` and `unparsed` to `filepath`.
This method will create the required intermediate directories if missing.
Method TagFile.shutdown(self)
Save the tagsets if modified.
Method TagFile.startup(self)
No special startup.
Method TagFile.tags_line(name, tags)
Transcribe a `name` and its `tags`
for use as a `.fstags` file line.
Property TagFile.tagsets
The tag map from the tag file,
a mapping of `name`=>`TagSet`.
This is loaded on demand.
Method TagFile.update(self, name, tags, *, prefix=None, verbose=None)
Update the tags for `name` from the supplied `tags`
as for `TagSet.update`.
Method TagFile.values(self, prefix=None)
`tagsets.values`
If the optional `prefix` is supplied,
yield only those values whose keys start with `prefix`.
Class TagsCommandMixin
Utility methods for `cs.cmdutils.BaseCommand` classes working with tags.
Optional subclass attributes:
- `TAGSET_CRITERION_CLASS`: a `TagSetCriterion` duck class, default `TagSetCriterion`. For example, `cs.sqltags` has a subclass with an `.extend_query` method for computing an SQL JOIN used in searching for tagged entities.
Method TagsCommandMixin.parse_tag_choices(argv)
Parse `argv` as an iterable of [`!`]*tag_name*[`=`*tag_value*] `Tag`
additions/deletions.
Method TagsCommandMixin.parse_tagset_criteria(argv, tag_based_test_class=None)
Parse tag specifications from `argv` until an unparseable item is found.
Return `(criteria,argv)`
where `criteria` is a list of the parsed criteria
and `argv` is the remaining unparsed items.
Each item is parsed via
`cls.parse_tagset_criterion(item,tag_based_test_class)`.
Method TagsCommandMixin.parse_tagset_criterion(arg, tag_based_test_class=None)
Parse `arg` as a tag specification
and return a `tag_based_test_class` instance
via its `.from_str` factory method.
Raises `ValueError` on a misparse.
The default `tag_based_test_class`
comes from `cls.TAGSET_CRITERION_CLASS`,
which itself defaults to class `TagSetCriterion`.
The default `TagSetCriterion.from_str` recognises:
- `-`*tag_name*: a negative requirement for *tag_name*
- *tag_name*[`=`*value*]: a positive requirement for a *tag_name* with optional *value*.
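The two criterion forms above (with `!` accepted as an alternative negation marker, as described for `TagSetCriterion.from_str2`) can be sketched as a tiny parser. `sketch_parse_criterion` is a hypothetical helper returning a plain tuple; the real factory returns a criterion object and also parses typed values and other comparison operators.

```python
def sketch_parse_criterion(spec: str):
    """Parse `[!|-]tag_name[=value]` into `(choice, name, value)`.

    `choice` is True for a positive requirement, False for a negative one.
    Values are kept as plain strings in this sketch.
    """
    choice = True
    if spec[:1] in ("!", "-"):
        choice = False
        spec = spec[1:]
    name, sep, value = spec.partition("=")
    if not name:
        raise ValueError(f"missing tag name in {spec!r}")
    return choice, name, (value if sep else None)

print(sketch_parse_criterion("colour=blue"))  # (True, 'colour', 'blue')
print(sketch_parse_criterion("-colour"))      # (False, 'colour', None)
```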
Class TagSet(builtins.dict,cs.lex.FormatableMixin,cs.mappings.AttrableMappingMixin)
A setlike class associating a set of tag names with values.
This actually subclasses `dict`, so a `TagSet` is a direct
mapping of tag names to values.
It accepts attribute access to simple tag values when they
do not conflict with the class methods;
the reliable method is normal item access.
NOTE: iteration yields `Tag`s, not dict keys.
Also note that all the `Tag`s from a `TagSet` share its ontology.
Subclasses should override the `set` and `discard` methods;
the `dict` and mapping methods
are defined in terms of these two basic operations.
`TagSet`s have a few special properties:
- `id`: a domain specific identifier; this may reasonably be `None` for entities not associated with database rows; the `cs.sqltags.SQLTags` class associates this with the database row id.
- `name`: the entity's name; a read only alias for the `'name'` `Tag`. The `cs.sqltags.SQLTags` class defines "log entries" as `TagSet`s with no `name`.
- `unixtime`: a UNIX timestamp, a `float` holding seconds since the UNIX epoch (midnight, 1 January 1970 UTC). This is typically the row creation time for entities associated with database rows.
Because `TagSet` subclasses `cs.mappings.AttrableMappingMixin`
you can also access tag values as attributes
provided that they do not conflict with instance attributes
or class methods or properties.
The `TagSet` class defines the class attribute `ATTRABLE_MAPPING_DEFAULT`
as `None` which causes attribute access to return `None`
for missing tag names.
This supports code like:
if tags.title:
# use the title in something
else:
# handle a missing title tag
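The attribute fallback behaviour can be illustrated with a small `dict` subclass. `SketchAttrableDict` and its `DEFAULT` attribute are hypothetical stand-ins; this is not how `cs.mappings.AttrableMappingMixin` is actually implemented, only a sketch of the observable effect.

```python
class SketchAttrableDict(dict):
    """A dict whose keys are also readable as attributes."""

    # Value returned for missing keys, mirroring the role described
    # for ATTRABLE_MAPPING_DEFAULT.
    DEFAULT = None

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails,
        # so real methods and properties take precedence.
        return self.get(attr, self.DEFAULT)

tags = SketchAttrableDict(title="Avengers Assemble")
if tags.title:
    print("title:", tags.title)
if not tags.director:
    print("no director tag")
```

Because `__getattr__` is only a fallback, a tag named after a method such as `keys` is still only reachable via item access, which is why item access is described as the reliable method.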
Method TagSet.__init__(self, *a, **kw)
Initialise the `TagSet`.
Parameters:
- positional parameters initialise the `dict` and are passed to `dict.__init__`
- `_id`: optional identity value for database-like implementations
- `_ontology`: optional `TagsOntology` to use for this `TagSet`
- other alphabetic keyword parameters are also used to initialise the `dict` and are passed to `dict.__init__`
Method TagSet.__getattr__(self, attr)
Support access to dotted name attributes
if `attr` is not found via the superclass `__getattr__`.
This is done by returning the subtags
commencing with `attr+'.'`.
Example:
>>> tags=TagSet(a=1,b=2)
>>> tags.a
1
>>> tags.c
>>> tags['c.z']=9
>>> tags['c.x']=8
>>> tags
TagSet:{'a': 1, 'b': 2, 'c.z': 9, 'c.x': 8}
>>> tags.c
TagSet:{'z': 9, 'x': 8}
>>> tags.c.z
9
However, this is not supported when there is a tag named `'c'`
because `tags.c` has to return the `'c'` tag value:
>>> tags=TagSet(a=1,b=2,c=3)
>>> tags.a
1
>>> tags.c
3
>>> tags['c.z']=9
>>> tags.c.z
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'z'
Method TagSet.__iter__(self, prefix=None, ontology=None)
Yield the tag data as `Tag`s.
Method TagSet.__setattr__(self, attr, value)
Attribute based `Tag` access.
If `attr` is in `self.__dict__` then that is updated,
supporting "normal" attributes set on the instance.
Otherwise the `Tag` named `attr` is set to `value`.
The `__init__` methods of subclasses should do something like this
(from `TagSet.__init__`)
to set up the ordinary instance attributes
which are not to be treated as `Tag`s:
self.__dict__.update(id=_id, ontology=_ontology, modified=False)
Method TagSet.__str__(self)
The `TagSet` suitable for writing to a tag file.
Method TagSet.add(self, name, value=None, *a, **kw)
Set `self[tag_name]=value`.
If `verbose`, emit an info message if this changes the previous value.
Method TagSet.as_dict(self)
Return a `dict` mapping tag name to value.
Method TagSet.as_tags(self, prefix=None, ontology=None)
Yield the tag data as `Tag`s.
Property TagSet.csvrow
This `TagSet` as a list useful to a `csv.writer`.
The inverse of `from_csvrow`.
Method TagSet.discard(self, name, value=None, *a, **kw)
Discard the tag matching `(tag_name,value)`.
Return a `Tag` with the old value,
or `None` if there was no matching tag.
Note that if the supplied `value` is `None`
then the tag is unconditionally discarded.
Otherwise the tag is only discarded
if its value matches.
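The discard rule can be sketched over a plain `dict`. `sketch_discard` is a hypothetical helper that returns a `(name, old_value)` tuple where the real method returns a `Tag`.

```python
def sketch_discard(tags: dict, name: str, value=None):
    """Remove `name` from `tags`; return `(name, old_value)` or `None`."""
    if name not in tags:
        return None
    old_value = tags[name]
    if value is not None and old_value != value:
        # A value was supplied and it does not match: leave the tag alone.
        return None
    del tags[name]
    return name, old_value

tags = {"colour": "blue", "size": 9}
print(sketch_discard(tags, "colour", "red"))  # None
print(sketch_discard(tags, "colour"))         # ('colour', 'blue')
```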
Method TagSet.edit(self, editor=None, verbose=None)
Edit this `TagSet`.
Method TagSet.edit_many(*a, **kw)
Edit an iterable of `TagSet`s.
Return a list of `(old_name,new_name,TagSet)`
for those which were modified.
This function supports modifying both `name` and `Tag`s.
Method TagSet.format_kwargs(self, *a, **kw)
Return a `TagSetNamespace` for this `TagSet`.
This has many convenience facilities for use in format strings.
Method TagSet.from_csvrow(csvrow)
Construct a `TagSet` from a CSV row like that from
`TagSet.csvrow`, being `unixtime,id,name,tags...`.
Method TagSet.from_line(line, offset=0, *, ontology=None, verbose=None)
Create a new `TagSet` from a line of text.
Property TagSet.name
Read only `name` property, `None` if there is no `'name'` tag.
Method TagSet.ns(self, *a, **kw)
Return a `TagSetNamespace` for this `TagSet`.
This has many convenience facilities for use in format strings.
Method TagSet.set(self, name, value=None, *a, **kw)
Set `self[tag_name]=value`.
If `verbose`, emit an info message if this changes the previous value.
Method TagSet.set_from(self, other, verbose=None)
Completely replace the values in `self`
with the values from `other`,
a `TagSet` or any other `name`=>`value` dict.
This has the feature of logging changes
by calling `.set` and `.discard` to effect the changes.
Method TagSet.subtags(self, prefix)
Return a new `TagSet` containing tags commencing with `prefix+'.'`
with the key prefixes stripped off.
Example:
>>> tags = TagSet({'a.b':1, 'a.d':2, 'c.e':3})
>>> tags.subtags('a')
TagSet:{'b': 1, 'd': 2}
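The prefix selection behind `subtags` can be sketched on a plain `dict`. `sketch_subtags` is a hypothetical helper which returns a `dict` rather than a `TagSet`.

```python
def sketch_subtags(tags: dict, prefix: str) -> dict:
    """Select the keys starting with `prefix + '.'` and strip that prefix."""
    dotted = prefix + "."
    return {
        key[len(dotted):]: value
        for key, value in tags.items()
        if key.startswith(dotted)
    }

print(sketch_subtags({"a.b": 1, "a.d": 2, "c.e": 3}, "a"))  # {'b': 1, 'd': 2}
```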
Method TagSet.tag(self, tag_name, prefix=None, ontology=None)
Return a `Tag` for `tag_name`, or `None` if missing.
Property TagSet.unixtime
`unixtime` property, autosets to `time.time()` if accessed.
Method TagSet.update(self, other, *, prefix=None, verbose=None)
Update this `TagSet` from `other`,
a dict of `{name:value}`
or an iterable of `Tag`-like or `(name,value)` things.
Class TagSetCriterion
A testable criterion for a `TagSet`.
TagSetCriterion.TAG_BASED_TEST_CLASS
Method TagSetCriterion.from_any(*a, **kw)
Convert some suitable object `o` into a `TagSetCriterion`.
Various possibilities for `o` are:
- `TagSetCriterion`: returned unchanged
- `str`: a string tests for the presence of a tag with that name and optional value;
- an object with a `.choice` attribute; this is taken to be a `TagSetCriterion` ducktype and returned unchanged
- an object with `.name` and `.value` attributes; this is taken to be `Tag`-like and a positive test is constructed
- `Tag`: an object with a `.name` and `.value` is equivalent to a positive equality `TagBasedTest`
- `(name,value)`: a 2 element sequence is equivalent to a positive equality `TagBasedTest`
Method TagSetCriterion.from_str(*a, **kw)
Prepare a `TagSetCriterion` from the string `s`.
Method TagSetCriterion.from_str2(s, offset=0, delim=None)
Parse a criterion from `s` at `offset`
and return `(TagSetCriterion,offset)`.
This method recognises an optional leading `'!'` or `'-'`
indicating negation of the test,
followed by a criterion recognised by the `.parse` method
of one of the classes in `cls.CRITERION_PARSE_CLASSES`.
Method TagSetCriterion.match_tagged_entity(self, te: 'TagSet') -> bool
Apply this `TagSetCriterion` to a `TagSet`.
Class TagSetNamespace(ExtendedNamespace,types.SimpleNamespace)
A formattable nested namespace for a `TagSet`,
subclassing `ExtendedNamespace`,
providing attribute based access to tag data.
`TagSet`s have a `.ns()` method which returns a `TagSetNamespace`
derived from that `TagSet`.
This class exists particularly to help with format strings because tools like fstags and sqltags use these for their output formats. As such, I wanted to be able to put some expressive stuff in the format strings.
However, this also gets you attribute style access to various
related values without mucking with format strings.
For example for some `TagSet` `tags` with a `colour=blue` `Tag`,
if I set `ns=tags.ns()`:
- `ns.colour` is itself a namespace based on the `colour` `Tag`
- `ns.colour_s` is the string `'blue'`
- `ns.colour._tag` is the `colour` `Tag` itself
If the `TagSet` had an ontology:
- `ns.colour._meta` is a namespace based on the metadata for the `colour` `Tag`
This provides an assortment of special names derived from the `TagSet`.
See the docstring for `__getattr__` for the special attributes provided
beyond those already provided by `ExtendedNamespace.__getattr__`.
Example with a simple `TagSet`:
>>> tags = TagSet(colour='blue', labels=['a','b','c'], size=9)
>>> 'The colour is {colour}.'.format_map(tags)
'The colour is blue.'
>>> # the natural way to obtain a TagSetNamespace from a TagSet
>>> ns = tags.ns() # returns TagSetNamespace.from_tagset(tags)
>>> # the ns object has additional computed attributes
>>> 'The colour tag is {colour._tag}.'.format_map(ns)
'The colour tag is colour=blue.'
>>> # also, the direct name for any Tag can be used
>>> # which returns its value
>>> 'The colour is {colour}.'.format_map(ns)
'The colour is blue.'
>>> 'The colours are {colours}. The labels are {labels}.'.format_map(ns)
"The colours are ['blue']. The labels are ['a', 'b', 'c']."
>>> 'The first label is {label}.'.format_map(ns)
'The first label is a.'
The same `TagSet` with an ontology:
>>> ont = TagsOntology({
... 'type.colour': TagSet(description="a colour, a hue", type="str"),
... 'meta.colour.blue': TagSet(
... url='https://en.wikipedia.org/wiki/Blue',
... wavelengths='450nm-495nm'),
... })
>>> tags = TagSet(colour='blue', labels=['a','b','c'], size=9, _ontology=ont)
>>> # the colour Tag
>>> tags.tag('colour') # doctest: +ELLIPSIS
Tag(name='colour',value='blue',ontology=TagsOntology<...>)
>>> # type information about a colour
>>> tags.tag('colour').type
'str'
>>> tags.tag('colour').typedata
TagSet:{'description': 'a colour, a hue', 'type': 'str'}
>>> # metadata about this particular colour value
>>> tags.tag('colour').meta
TagSet:{'url': 'https://en.wikipedia.org/wiki/Blue', 'wavelengths': '450nm-495nm'}
Using a namespace view of the Tag, useful for format strings:
>>> # the TagSet as a namespace for use in format strings
>>> ns = tags.ns()
>>> # The namespace .colour node, which has the Tag attached.
>>> # When there is a Tag attached, the repr is that of the Tag value.
>>> ns.colour # doctest: +ELLIPSIS
'blue'
>>> # The underlying colour Tag itself.
>>> ns.colour._tag # doctest: +ELLIPSIS
Tag(name='colour',value='blue',ontology=TagsOntology<...>)
>>> # The str() of a namespace with a ._tag is the Tag value
>>> # making for easy use in a format string.
>>> f'{ns.colour}'
'blue'
>>> # the type information about the colour Tag
>>> ns.colour._tag.typedata
TagSet:{'description': 'a colour, a hue', 'type': 'str'}
>>> # The metadata: a TagSetNamespace for the metadata TagSet
>>> ns.colour._meta # doctest: +ELLIPSIS
TagSetNamespace(_path='.', _pathnames=(), _ontology=None, wavelengths='450nm-495nm', url='https://en.wikipedia.org/wiki/Blue')
>>> # the _meta.url is itself a namespace with a ._tag for the URL
>>> ns.colour._meta.url # doctest: +ELLIPSIS
'https://en.wikipedia.org/wiki/Blue'
>>> # but it formats nicely because it has a ._tag
>>> f'colour={ns.colour}, info URL={ns.colour._meta.url}'
'colour=blue, info URL=https://en.wikipedia.org/wiki/Blue'
Method TagSetNamespace.__bool__(self)
Truthiness: `True` unless the `._bool` attribute overrides that.
Method TagSetNamespace.__format__(self, *a, **kw)
Format this node.
If there's a `Tag` on the node, format its value.
Otherwise use the superclass format.
Method TagSetNamespace.__getattr__(self, *a, **kw)
Look up an indirect node attribute, whose value is inferred from another.
The following attribute names and forms are supported:
- `_keys`: the keys of the value for the `Tag` associated with this node; meaningful if `self._tag.value` has a `keys` method
- `_meta`: a namespace containing the meta information for the `Tag` associated with this node: `self._tag.meta.ns()`
- `_type`: a namespace containing the type definition for the `Tag` associated with this node: `self._tag.typedata.ns()`
- `_values`: the values within the `Tag.value` for the `Tag` associated with this node
- *baseattr*`_lc`: lowercase and titled forms. If *baseattr* exists, return its value lowercased via `cs.lex.lc_()`. Conversely, if *baseattr* is required and does not directly exist but its *baseattr*`_lc` form does, return the value of *baseattr*`_lc` titleified using `cs.lex.titleify_lc()`.
- *baseattr*`s`, *baseattr*`es`: singular/plural. If *baseattr* exists return `[self.`*baseattr*`]`. Conversely, if *baseattr* does not exist but one of its plural attributes does, return the first element from the plural attribute.
- `[:alpha:]*`: an identifierish name binds to a stub subnamespace so the `{a.b.c.d}` in a format string can be replaced with itself to present the undefined name in full.
Method TagSetNamespace.__getitem__(self, *a, **kw)
If this node has a `._tag` then dereference its `.value`,
otherwise fall through to the superclass `__getitem__`.
Method TagSetNamespace.__str__(self)
A `TagSetNamespace` with a `._tag` renders `str(_tag.value)`,
otherwise `ExtendedNamespace.__str__` is used.
Method TagSetNamespace.from_tagset(*a, **kw)
Compute and return a presentation of this `TagSet` as a
nested `TagSetNamespace`.
Note that multiple dots in `Tag` names are collapsed;
for example `Tag`s named `'a.b'`, `'a..b'`, `'a.b.'` and
`'..a.b'` will all map to the namespace entry `a.b`.
`Tag`s are processed in reverse lexical order by name, which
dictates which of the conflicting multidot names takes
effect in the namespace - the first found is used.
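The dot collapsing and the first-found rule can be sketched as follows. Both helper names are hypothetical, and the sketch assumes that "reverse lexical order" means plain reverse string sort order.

```python
def sketch_collapse_dots(tag_name: str) -> str:
    """Drop empty components so 'a..b', 'a.b.' and '..a.b' all become 'a.b'."""
    return ".".join(part for part in tag_name.split(".") if part)

def sketch_namespace_keys(tag_names):
    """Map each collapsed name to the first tag name found for it,
    visiting the tag names in reverse lexical order."""
    chosen = {}
    for name in sorted(tag_names, reverse=True):
        chosen.setdefault(sketch_collapse_dots(name), name)
    return chosen

print(sketch_namespace_keys(["a.b", "a..b", "a.b.", "..a.b"]))  # {'a.b': 'a.b.'}
```

Under that assumption, `'a.b.'` sorts last among the four names, so it is visited first and wins the `a.b` namespace entry.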
Property TagSetNamespace.key
The key.
Property TagSetNamespace.ontology
The reference ontology.
Property TagSetNamespace.value
The value.
Class TagSets(cs.resources.MultiOpenMixin)
Base class for collections of `TagSet` instances
such as `cs.fstags.FSTags` and `cs.sqltags.SQLTags`.
Examples of this include:
- `cs.fstags.FSTags`: a mapping of filesystem paths to their associated `TagSet`s
- `cs.sqltags.SQLTags`: a mapping of names to `TagSet`s stored in an SQL database
Subclasses must implement:
- `default_factory(self,name,**kw)`: as with `defaultdict` this is called as from `__missing__` for missing names, and also from `add`. If set to `None` then `__getitem__` will raise `KeyError` for missing names. Unlike `defaultdict`, the factory is called with the key `name` and any additional keyword parameters.
- `get(name,default=None)`: return the `TagSet` associated with `name`, or `default`.
- `__setitem__(name,tagset)`: associate a `TagSet` with the key `name`; this is called by the `__missing__` method with a newly created `TagSet`.
Subclasses may reasonably want to define the following:
- `startup(self)`: allocate any needed resources such as database connections
- `shutdown(self)`: write pending changes to a backing store, release resources acquired during `startup`
- `keys(self)`: return an iterable of names
- `__len__(self)`: return the number of names
Method TagSets.__init__(self, *, ontology=None)
Initialise the collection.
TagSets.TagSetClass
Method TagSets.__contains__(self, name: str)
Test whether `name` is present in `self.te_mapping`.
Method TagSets.__getitem__(self, name: str)
Obtain the `TagSet` associated with `name`.
If `name` is not presently mapped, return `self.__missing__(name)`.
Method TagSets.__len__(self)
Return the length of `self.te_mapping`.
Method TagSets.__missing__(self, *a, **kw)
Like `dict`, the `__missing__` method may autocreate a new `TagSet`.
This is called from `__getitem__` if `name` is missing
and uses the factory `cls.default_factory`.
If that is `None` raise `KeyError`,
otherwise call `self.default_factory(name,**kw)`.
If that returns `None` raise `KeyError`,
otherwise save the entity under `name` and return the entity.
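The autocreation protocol described above can be sketched with an in-memory collection. This is an illustration only: `SketchTagSets` and `StrictTagSets` are hypothetical classes, a plain `dict` stands in for the backend, and another `dict` stands in for a `TagSet`.

```python
class SketchTagSets:
    """Hypothetical in-memory collection illustrating the __missing__ protocol."""

    def __init__(self):
        self._mapping = {}

    def default_factory(self, name, **kw):
        # Stand-in for creating a TagSet: a plain dict recording its name.
        return dict(name=name, **kw)

    def __setitem__(self, name, tagset):
        self._mapping[name] = tagset

    def __getitem__(self, name):
        try:
            return self._mapping[name]
        except KeyError:
            return self.__missing__(name)

    def __missing__(self, name, **kw):
        factory = self.default_factory
        if factory is None:
            raise KeyError(name)
        tagset = factory(name, **kw)
        if tagset is None:
            raise KeyError(name)
        # Save the new entity under the name before returning it.
        self[name] = tagset
        return tagset


class StrictTagSets(SketchTagSets):
    # Setting the factory to None disables autocreation.
    default_factory = None
```

Indexing a `SketchTagSets` with an unknown name creates and stores a new entry, while the same lookup on a `StrictTagSets` raises `KeyError`.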
Method TagSets.__setitem__(self, name, te)
Save `te` in the backend under the key `name`.
Method TagSets.add(self, name: str, **kw)
Return a new `TagSet` associated with `name`,
which should not already be in use.
Method TagSets.default_factory(self, name: str)
Create a new `TagSet` named `name`.
Method TagSets.get(self, name: str, default=None)
Return the `TagSet` associated with `name`,
or `default` if there is no such entity.
Method TagSets.shutdown(self)
Write any pending changes to a backing store,
release resources allocated during `startup`.
Method TagSets.startup(self)
Allocate any needed resources such as database connections.
Method TagSets.subdomain(self, subname)
Return a proxy for this `TagSets` for the `name`s
starting with `subname+'.'`.
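A prefix-restricted view of this kind might look like the following sketch. `PrefixView` and the free function `subdomain` are hypothetical names, and the sketch assumes the proxy exposes keys with the prefix stripped, which the text above does not state explicitly.

```python
class PrefixView:
    """Hypothetical view of a mapping restricted to keys starting with a prefix."""

    def __init__(self, mapping, prefix):
        self.mapping = mapping
        self.prefix = prefix

    def __getitem__(self, key):
        return self.mapping[self.prefix + key]

    def __setitem__(self, key, value):
        self.mapping[self.prefix + key] = value

    def keys(self):
        # Assumption: keys are presented without the prefix.
        cut = len(self.prefix)
        return [k[cut:] for k in self.mapping if k.startswith(self.prefix)]


def subdomain(mapping, subname):
    """Return a view of the names starting with subname + '.'."""
    return PrefixView(mapping, subname + '.')


store = {'meta.colour.blue': 1, 'type.colour': 2}
meta = subdomain(store, 'meta')
```

Reads and writes through `meta` go to the underlying `store` under the full prefixed key.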
Class TagSetsSubdomain(cs.obj.SingletonMixin,cs.mappings.PrefixedMappingProxy)
A view into a `TagSets` for keys commencing with a prefix.
Property TagSetsSubdomain.TAGGED_ENTITY_FACTORY
The entity factory comes from the parent collection.
Class TagsOntology(cs.obj.SingletonMixin,TagSets,cs.resources.MultiOpenMixin)
An ontology for tag names.
This is based around a mapping of names
to ontological information expressed as a `TagSet`.
A `cs.fstags.FSTags` uses ontologies initialised from `TagFile`s
containing ontology mappings.
There are two main categories of entries in an ontology:
- types: an entry named `type.{typename}` contains a `TagSet` defining the type named `typename`
- metadata: an entry named `meta.{typename}.{value_key}` contains a `TagSet` holding metadata for a value of type `{typename}`
Types:
The type of a `Tag` is nothing more than its `name`.
The basic types have their Python names:
`int`, `float`, `str`, `list`, `dict`, `date`, `datetime`.
You can define subtypes of these for your own purposes,
for example:
type.colour type=str description="A hue."
which subclasses `str`.
Subtypes of `list` include a `member_type`
specifying the type for members of a `Tag` value:
type.scene type=list member_type=str description="A movie scene."
Subtypes of `dict` include a `key_type` and a `member_type`
specifying the type for keys and members of a `Tag` value:
type.cast type=dict key_type=actor member_type=role description="Cast members and their roles."
type.actor type=person description="An actor's stage name."
type.person type=str description="A person."
type.role type=character description="A character role in a performance."
type.character type=str description="A person in a story."
Metadata:
Metadata are `Tag`s describing particular values of a type.
For example, the metadata for the `Tag` `colour=blue`:
meta.colour.blue url="https://en.wikipedia.org/wiki/Blue" wavelengths="450nm-495nm"
meta.actor.scarlett_johansson
meta.character.marvel.black_widow type=character names=["Natasha Romanov"]
Accessing type data and metadata:
A `TagSet` may have a reference to a `TagsOntology` as `.ontology`
and so also do any of its `Tag`s.
Method TagsOntology.__bool__(self)
Support easy `ontology or some_default` tests,
since ontologies are broadly optional.
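One way such a test can be made reliable is sketched below. It assumes, without confirmation from the text above, that an ontology should count as true even when it has no entries; `SketchOntology` and `pick_ontology` are hypothetical names.

```python
class SketchOntology:
    """Hypothetical stand-in: a sized container that is always truthy."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def __len__(self):
        return len(self.entries)

    def __bool__(self):
        # Without this, an empty ontology would be falsy because __len__ returns 0.
        return True


def pick_ontology(ontology, default):
    """Use the supplied ontology if there is one, otherwise the default."""
    return ontology or default


empty = SketchOntology()
fallback = SketchOntology({'type.colour': {}})
```

With this definition an empty but present ontology is still chosen over the fallback, and only `None` falls through to the default.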
Method TagsOntology.__setitem__(self, name, te)
Save `te` against the key `name`.
Method TagsOntology.basetype(self, typename)
Infer the base type name from a type name.
The default type is `'str'`,
but any type which resolves to one in `self.BASE_TYPES`
may be returned.
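Resolving a subtype to its base type amounts to following `type=` references until a basic type is reached. The sketch below is an assumption about how that could work, not the method's actual code: the `types` argument is a hypothetical plain mapping of type names to their defining tags, and the cycle guard is an addition of the sketch.

```python
BASE_TYPES = {'int', 'float', 'str', 'list', 'dict', 'date', 'datetime'}


def basetype(types, typename, default='str'):
    """Follow 'type' references until a base type is found, else return default."""
    seen = set()
    while typename not in BASE_TYPES:
        if typename in seen:
            # A definition loop: give up and use the default.
            return default
        seen.add(typename)
        typename = types.get(typename, {}).get('type')
        if typename is None:
            # Undefined type, or a definition with no 'type' tag.
            return default
    return typename


types = {
    'cast': {'type': 'dict', 'key_type': 'actor', 'member_type': 'role'},
    'actor': {'type': 'person'},
    'person': {'type': 'str'},
    'scene': {'type': 'list', 'member_type': 'str'},
}
```

Using the `cast` example from the introduction, `actor` resolves through `person` to `str`, and `scene` resolves directly to `list`.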
Method TagsOntology.convert_tag(self, tag)
Convert a `Tag`'s value according to the ontology.
Return a new `Tag` with the converted value
or the original `Tag` unchanged.
This is primarily aimed at things like regexp based autotagging,
where the matches are all strings
but various fields have special types,
commonly `int`s or `date`s.
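The autotagging use case can be sketched as follows. Everything here is illustrative: `SketchTag` is a hypothetical stand-in for `Tag`, `basetypes` is a hypothetical mapping from tag names to base type names, and the choice to leave unparseable values unchanged is an assumption of the sketch.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for a Tag: just a name and a value.
SketchTag = namedtuple('SketchTag', 'name value')

# Converters from a string to a value of each supported base type.
CONVERTERS = {'int': int, 'float': float, 'date': date.fromisoformat}


def convert_tag(tag, basetypes):
    """Return a tag with its string value converted, or the original tag."""
    convert = CONVERTERS.get(basetypes.get(tag.name))
    if convert is None or not isinstance(tag.value, str):
        return tag
    try:
        return tag._replace(value=convert(tag.value))
    except ValueError:
        # Assumption: an unparseable value leaves the tag unchanged.
        return tag


basetypes = {'episode': 'int', 'air_date': 'date'}
```

A regexp match such as `episode="03"` would then become an integer tag, while tags with no special type pass through untouched.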
Method TagsOntology.edit_indices(self, *a, **kw)
Edit the entries specified by indices.
Return `TagSet`s for the entries which were changed.
Method TagsOntology.get(self, name, default=None)
Proxy `.get` through to `self.te_mapping`.
Method TagsOntology.meta(self, type_name, value)
Return the metadata `TagSet` for `(type_name,value)`.
Method TagsOntology.meta_index(type_name=None, value=None)
Return the entry index for the metadata for `(type_name,value)`.
Method TagsOntology.meta_names(self, type_name=None)
Generator yielding defined metadata names.
If `type_name` is specified, yield only the value_names for that `type_name`.
For example, `meta_names('character')`
on an ontology with a `meta.character.marvel.black_widow`
would yield `'marvel.black_widow'`
i.e. only the suffix part for `character` metadata.
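The suffix extraction in that example can be sketched over a plain list of entry names. `meta_value_names` is a hypothetical helper covering only the case where `type_name` is given; it is not the method itself.

```python
def meta_value_names(entry_names, type_name):
    """Yield the value name suffix of each metadata entry for type_name."""
    prefix = 'meta.' + type_name + '.'
    for name in entry_names:
        if name.startswith(prefix):
            yield name[len(prefix):]


entry_names = [
    'type.character',
    'meta.character.marvel.black_widow',
    'meta.colour.blue',
]
character_names = list(meta_value_names(entry_names, 'character'))
```

Only the entry under `meta.character.` contributes, and only its suffix is yielded.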
Method TagsOntology.type(self, type_name)
Return the `TagSet` defining the type named `type_name`.
Method TagsOntology.type_index(type_name)
Return the entry index for the type `type_name`.
Method TagsOntology.type_names(self)
Generator yielding defined type names.
Method TagsOntology.types(self)
Generator yielding defined type names and their defining `TagSet`.
Method TagsOntology.value_metadata(self, *a, **kw)
Return a `ValueMetadata` for `type_name` and `value`.
This provides the mapping between a type's value and its semantics.
For example,
if a `TagSet` had a list of characters such as:
characters=["Captain America (Marvel)","Black Widow (Marvel)"]
then these values could be converted to the dotted identifiers
`characters.marvel.captain_america`
and `characters.marvel.black_widow` respectively,
ready for lookup in the ontology
to obtain the "metadata" `TagSet` for each specific value.
Method TagsOntology.value_to_tag_name(*a, **kw)
Convert a tag value to a tagnamelike dotted identifierish string
for use in ontology lookup.
Returns `None` for unconvertable values.
Nonnegative `int`s are converted to `str`.
Strings are converted as follows:
- a trailing `(.*)` is turned into a prefix with a dot, for example `"Captain America (Marvel)"` becomes `"Marvel.Captain America"`.
- the string is split into words (nonwhitespace), lowercased and joined with underscores, for example `"Marvel.Captain America"` becomes `"marvel.captain_america"`.
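The conversion steps listed above can be followed in a short sketch. `sketch_value_to_tag_name` is a hypothetical reimplementation based only on this description; the regular expression used to find the trailing parenthesised part is an assumption and the real method may treat edge cases differently.

```python
import re


def sketch_value_to_tag_name(value):
    """Convert a value to a dotted identifierish string, or None."""
    if isinstance(value, int):
        # Nonnegative ints become their decimal string.
        return str(value) if value >= 0 else None
    if not isinstance(value, str):
        return None
    # A trailing "(...)" becomes a dotted prefix.
    m = re.fullmatch(r'(.*\S)\s*\(([^()]*)\)\s*', value)
    if m:
        value = m.group(2).strip() + '.' + m.group(1)
    # Split into words, lowercase, join with underscores.
    return '_'.join(value.lower().split())
```

For example, `"Captain America (Marvel)"` first becomes `"Marvel.Captain America"` and then `"marvel.captain_america"`, matching the two steps described above.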
Class TagsOntologyCommand(cs.cmdutils.BaseCommand)
A command line for working with ontology types.
Command line usage:
Usage: TagsOntologyCommand subcommand [...]
Subcommands:
help [subcommand-names...]
Print the help for the named subcommands,
or for all subcommands if no names are specified.
type
With no arguments, list the defined types.
type type_name
With a type name, print its `Tag`s.
type type_name edit
Edit the tags defining a type.
type type_name edit meta_names_pattern...
Edit the tags for the metadata names matching the
meta_names_patterns.
type type_name list
List the metadata names for this type and their tags.
Method TagsOntologyCommand.cmd_type(self, argv)
Usage:
{cmd}
With no arguments, list the defined types.
{cmd} type_name
With a type name, print its `Tag`s.
{cmd} type_name edit
Edit the tags defining a type.
{cmd} type_name edit meta_names_pattern...
Edit the tags for the metadata names matching the
meta_names_patterns.
{cmd} type_name list
List the metadata names for this type and their tags.
Class ValueMetadataNamespace(TagSetNamespace,ExtendedNamespace,types.SimpleNamespace)
A subclass of `TagSetNamespace` for a `Tag`'s metadata.
The reference `TagSet` is the defining `TagSet`
for the metadata of a particular `Tag` value
as defined by a `ValueMetadata`
(the return value of `Tag.metadata`).
Method ValueMetadataNamespace.__format__(self, *a, **kw)
Format this node.
If there's a `Tag` on the node, format its value.
Otherwise use the superclass format.
Method ValueMetadataNamespace.from_metadata(*a, **kw)
Construct a new `ValueMetadataNamespace`
from `meta` (a `ValueMetadata`).
Release Log
Release 20210404: Bugfix TagBasedTest.COMPARISON_FUNCS["="]: if cmp_value is None, return true (the tag is present).
Release 20210306:
- ExtendedNamespace,TagSetNamespace: move the .[:alpha:]* attribute support from ExtendedNamespace to TagSetNamespace because it requires Tags.
- TagSetNamespace.__getattr__: new _i, _s, _f suffixes to return int, str or float tag values (or None); fold _lc in with these.
- Pull most of `TaggedEntity` out into `TaggedEntityMixin` for reuse by domain specific tagged entities.
- TaggedEntity: new .set and .discard methods.
- TaggedEntity: new as_editable_line, from_editable_line, edit and edit_entities methods to support editing entities using a text editor.
- ontologies: type entries are now prefixed with "type." and metadata entries are prefixed with "meta."; provide a worked ontology example in the introduction and improve related docstrings.
- TagsOntology: new .types(), .types_names(), .meta(type_name,value), .meta_names() methods.
- TagsOntology.__getitem__: create missing TagSets on demand.
- New TagsOntologyCommand, initially with a "type [type_name [{edit|list}]]" subcommand, ready for use as the cmd_ont subcommand of other tag related commands.
- TagSet: support initialisation like a dict including keywords, and move the `ontology` parameter to `_ontology`.
- TagSet: include AttrableMappingMixin to enable attribute access to values when there is no conflict with normal methods.
- UUID encode/decode support.
- Honour $TAGSET_EDITOR or $EDITOR as preferred interactive editor for tags.
- New TagSet.subtags(prefix) to extract a subset of the tags.
- TagsOntology.value_metadata: new optional convert parameter to override the default "convert human friendly name" algorithm, particularly to pass convert=str to things which are already the basic id.
- Rename TaggedEntity to TagSet.
- Rename TaggedEntities to TagSets.
- TagSet: new csvrow and from_csvrow methods imported from obsolete TaggedEntityMixin class.
- Move BaseTagFile from cs.fstags to TagFile in cs.tagset.
- TagSet: support access to the tag "c.x" via attributes provided there is no "c" tag in the way.
- TagSet.unixtime: implement the autoset-to-now semantics.
- New as_timestamp(): convert date, datetime, int or float to a UNIX timestamp.
- Assorted docstring updates and bugfixes.
Release 20200716:
- Update for changed cs.obj.SingletonMixin API.
- Pull in TaggedEntity from cs.sqltags and add the .csvrow property and the .from_csvrow factory.
Release 20200521.1: Fix DISTINFO.install_requires, drop debug import.
Release 20200521:
- New ValueDetail and KeyValueDetail classes for returning ontology information; TagInfo.detail now returns a ValueDetail for scalar types, a list of ValueDetails for sequence types and a list of KeyValueDetails for mapping types; drop various TagInfo mapping/iterable style methods, too confusing to use.
- Plumb ontology parameter throughout, always optional.
- Drop TypedTag, Tags now use ontologies for this.
- New TagsCommandMixin to support BaseCommands which manipulate Tags.
- Many improvements and bugfixes.
Release 20200318:
- Note that the TagsOntology stuff is in flux and totally alpha.
- Tag.prefix_name factory returning a new tag if prefix is not empty, otherwise self.
- TagSet.update: accept an optional prefix for inserting "foreign" tags with a distinguishing name prefix.
- Tag.as_json: turn sets and tuples into lists for encoding.
- Backport for Python < 3.7 (no fromisoformat functions).
- TagSet: drop unused and illplaced .titleify, .episode_title and .title methods.
- TagSet: remove "defaults", unused.
- Make TagSet a direct subclass of dict, adjust uses of .update etc.
- New ExtendedNamespace class which is a SimpleNamespace with some inferred attributes and a partial mapping API (keys and __getitem__).
- New TagSet.ns() returning the Tags as an ExtendedNamespace, which doubles as a mapping for str.format_map; TagSet.format_kwargs is now an alias for this.
- New Tag.from_string factory to parse a str into a Tag.
- New TagsOntology and TypedTag classes to provide type and value-detail information; very very alpha and subject to change.
Release 20200229.1: Initial release: pull TagSet, Tag, TagChoice from cs.fstags for independent use.