Tags and sets of tags with __format__ support and optional ontology information.
Latest release 20250528:
- Some refactors and small fixes.
- Tag: rename Tag.from_str2 to Tag.parse, drop offset parameter of Tag.from_str.
See `cs.fstags` for support for applying these to filesystem objects
such as directories and files.

See `cs.sqltags` for support for databases of entities with tags,
not directly associated with filesystem objects.
This is suited to both log entries (entities with no "name")
and large collections of named entities;
both accept `Tag`s and can be searched on that basis.

All of the available complexity is optional:
you can use `Tag`s without bothering with `TagSet`s or `TagsOntology`s.
This module contains the following main classes:
- `Tag`: an object with a `.name` and optional `.value` (default `None`) and also an optional reference `.ontology` for associating semantics with tag values. The `.value`, if not `None`, will often be a string, but may be any Python object. If you're using these via `cs.fstags`, the object will need to be JSON transcribeable.
- `TagSet`: a `dict` subclass representing a set of `Tag`s to associate with something; it also has setlike `.add` and `.discard` methods. As such it only supports a single `Tag` for a given tag name, but that tag value can of course be a sequence or mapping for more elaborate tag values.
- `TagsOntology`: a mapping of type names to `TagSet`s defining the type and also to entries for the metadata for specific per-type values.

Here's a simple example with some `Tag`s and a `TagSet`.
>>> tags = TagSet()
>>> # add a "bare" Tag named 'blue' with no value
>>> tags.add('blue')
>>> # add a "topic=tagging" Tag
>>> tags.set('topic', 'tagging')
>>> # make a "subtopic" Tag and add it
>>> subtopic = Tag('subtopic', 'ontologies')
>>> tags.add(subtopic)
>>> # Tags have nice repr() and str()
>>> subtopic
Tag(name='subtopic',value='ontologies')
>>> print(subtopic)
subtopic=ontologies
>>> # a TagSet also has a nice repr() and str()
>>> tags
TagSet:{'blue': None, 'topic': 'tagging', 'subtopic': 'ontologies'}
>>> print(tags)
blue subtopic=ontologies topic=tagging
>>> tags2 = TagSet({'a': 1}, b=3, c=[1,2,3], d='dee')
>>> tags2
TagSet:{'a': 1, 'b': 3, 'c': [1, 2, 3], 'd': 'dee'}
>>> print(tags2)
a=1 b=3 c=[1,2,3] d=dee
>>> # since you can print a TagSet to a file as a line of text
>>> # you can get it back from a line of text
>>> TagSet.from_str('a=1 b=3 c=[1,2,3] d=dee')
TagSet:{'a': 1, 'b': 3, 'c': [1, 2, 3], 'd': 'dee'}
>>> # because TagSets are dicts you can format strings with them
>>> print('topic:{topic} subtopic:{subtopic}'.format_map(tags))
topic:tagging subtopic:ontologies
>>> # TagSets have convenient membership tests
>>> # test for blueness
>>> 'blue' in tags
True
>>> # test for redness
>>> 'red' in tags
False
>>> # test for any "subtopic" tag
>>> 'subtopic' in tags
True
>>> # test for subtopic=ontologies
>>> print(subtopic)
subtopic=ontologies
>>> subtopic in tags
True
>>> # test for subtopic=libraries
>>> subtopic2 = Tag('subtopic', 'libraries')
>>> subtopic2 in tags
False
Ontologies
`Tag`s and `TagSet`s suffice to apply simple annotations to things.
However, an ontology brings meaning to those annotations.
See the `TagsOntology` class for implementation details,
access methods and more examples.

Consider a record about a movie, with these tags (a `TagSet`):
title="Avengers Assemble"
series="Avengers (Marvel)"
cast={"Scarlett Johansson":"Black Widow (Marvel)"}
where we have the movie title, a name for the series in which it resides, and a cast as an association of actors with roles.
An ontology lets us associate implied types and metadata with these values.
Here's an example ontology supporting the above `TagSet`:
type.cast type=dict key_type=person member_type=character description="members of a production"
type.character description="an identified member of a story"
type.series type=str
character.marvel.black_widow type=character names=["Natasha Romanov"]
person.scarlett_johansson fullname="Scarlett Johansson" bio="Known for Black Widow in the Marvel stories."
The type information for a `cast` is defined by the ontology entry named `type.cast`,
which tells us that a `cast` `Tag` is a `dict`,
whose keys are of type `person` and whose values are of type `character`.
(The default type is `str`.)

To find out the underlying type for a `character` we look that up in the ontology in turn;
because it does not have a specified `type` `Tag`, it is taken to be a `str`.

Having the types for a `cast`,
it is now possible to look up the metadata for the described cast members.

The key `"Scarlett Johansson"` is a `person` (from the type definition of `cast`).
The ontology entry for her is named `person.scarlett_johansson`
which is computed as:

- `person`: the type name
- `scarlett_johansson`: obtained by downcasing `"Scarlett Johansson"` and replacing whitespace with an underscore. The full conversion process is defined by the `TagsOntology.value_to_tag_name` function.

The value `"Black Widow (Marvel)"` is a `character` (again, from the type definition of `cast`).
The ontology entry for her is named `character.marvel.black_widow`
which is computed as:

- `character`: the type name
- `marvel.black_widow`: obtained by downcasing `"Black Widow (Marvel)"`, replacing whitespace with an underscore, and moving a bracketed suffix to the front as an unbracketed prefix. The full conversion process is defined by the `TagsOntology.value_to_tag_name` function.
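The name computation described above can be sketched in plain Python. This is only an illustration of the three stated steps (downcase, whitespace to underscore, bracketed suffix moved to the front); `sketch_value_to_tag_name` and `sketch_entry_name` are hypothetical names, and the real `TagsOntology.value_to_tag_name` may handle more cases.

```python
import re

def sketch_value_to_tag_name(value: str) -> str:
    """Hypothetical stand-in for the conversion described above."""
    value = value.strip().lower()
    prefix = ""
    # Move a trailing "(suffix)" to the front as an unbracketed prefix.
    m = re.fullmatch(r"(.*?)\s*\(([^()]*)\)", value)
    if m:
        value, prefix = m.group(1), m.group(2).strip()
    # Replace each run of whitespace with a single underscore.
    name = re.sub(r"\s+", "_", value)
    if prefix:
        name = re.sub(r"\s+", "_", prefix) + "." + name
    return name

def sketch_entry_name(type_name: str, value: str) -> str:
    """Join the type name and the converted value with a dot."""
    return type_name + "." + sketch_value_to_tag_name(value)

print(sketch_entry_name("person", "Scarlett Johansson"))
print(sketch_entry_name("character", "Black Widow (Marvel)"))
```

The two calls reproduce the entry names `person.scarlett_johansson` and `character.marvel.black_widow` used in the example ontology.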
Format Strings
You can just use `str.format_map` as shown above
for the direct values in a `TagSet`,
since it subclasses `dict`.

However, `TagSet`s also subclass `cs.lex.FormatableMixin`
and therefore have a richer `format_as` method which has an extended syntax
for the format component.
Command line tools like `fstags` use this for output format specifications.

An example:
>>> # an ontology specifying the type for a colour
>>> # and some information about the colour "blue"
>>> ont = TagsOntology(
... {
... 'type.colour':
... TagSet(description="a colour, a hue", type="str"),
... 'colour.blue':
... TagSet(
... url='https://en.wikipedia.org/wiki/Blue',
... wavelengths='450nm-495nm'
... ),
... }
... )
>>> # tag set with a "blue" tag, using the ontology above
>>> tags = TagSet(colour='blue', labels=['a', 'b', 'c'], size=9, _ontology=ont)
>>> tags.format_as('The colour is {colour}.')
'The colour is blue.'
>>> # format a string about the tags showing some metadata about the colour
>>> tags.format_as('Information about the colour may be found here: {colour:metadata.url}')
'Information about the colour may be found here: https://en.wikipedia.org/wiki/Blue'
Short summary:
- `as_unixtime`: Convert a tag value to a UNIX timestamp.
- `BaseTagSets`: The base class for collections of `TagSet` instances such as `cs.fstags.FSTags` and `cs.sqltags.SQLTags`.
- `jsonable`: Convert `obj` to a JSON encodable form. This returns `obj` for purely JSONable objects and a JSONable deep copy of `obj` if it or some subcomponent required conversion. `converted` is a dict mapping object ids to their converted forms to prevent loops.
- `MappingTagSets`: A `BaseTagSets` subclass using an arbitrary mapping.
- `RegexpTagRule`: A regular expression based `Tag` rule.
- `selftest`: Run some ad hoc self tests.
- `Tag`: A name/value pair. Each `Tag` has a `.name` (`str`), a `.value` and an optional `.ontology`. The `name` must be a dotted identifier.
- `tag_or_tag_value`: A decorator for functions or methods which may be called as `func(name[,value])` or as `func(Tag)`.
- `TagBasedTest`: A test based on a `Tag`.
- `TagFile`: A reference to a specific file containing tags.
- `TagsCommandMixin`: Utility methods for `cs.cmdutils.BaseCommand` classes working with tags.
- `TagSet`: A setlike class collection of `Tag`s.
- `TagSetCriterion`: A testable criterion for a `TagSet`.
- `TagSetPrefixView`: A view of a `TagSet` via a `prefix`.
- `TagSetsSubdomain`: A view into a `BaseTagSets` for keys commencing with a prefix being the subdomain plus a dot (`'.'`).
- `TagsOntology`: An ontology for tag names. This is based around a mapping of names to ontological information expressed as a `TagSet`.
- `TagsOntologyCommand`: A command line for working with ontology types.
Module contents:
- `as_unixtime(tag_value)`: Convert a tag value to a UNIX timestamp.
  This accepts `int`, `float` (already a timestamp) and `date` or `datetime` (use `datetime.timestamp()` for a nonnaive `datetime`, otherwise `time.mktime(tag_value.timetuple())`, which assumes the local time zone).
- Class `BaseTagSets(cs.resources.MultiOpenMixin, collections.abc.MutableMapping)`: The base class for collections of `TagSet` instances such as `cs.fstags.FSTags` and `cs.sqltags.SQLTags`.

  Examples of this include:
  - `cs.cdrip.MBSQLTags`: a mapping of MusicbrainsNG entities to their associated `TagSet`
  - `cs.fstags.FSTags`: a mapping of filesystem paths to their associated `TagSet`
  - `cs.sqltags.SQLTags`: a mapping of names to `TagSet`s stored in an SQL database

  Subclasses must implement:
  - `get(name,default=None)`: return the `TagSet` associated with `name`, or `default`.
  - `__setitem__(name,tagset)`: associate a `TagSet` with the key `name`; this is called by the `__missing__` method with a newly created `TagSet`.
  - `keys(self)`: return an iterable of names

  Subclasses may reasonably want to override the following:
  - `startup_shutdown(self)`: context manager to allocate and release any needed resources such as database connections

  Subclasses may implement:
  - `__len__(self)`: return the number of names

  The `TagSet` factory used to fetch or create a `TagSet` is named `TagSetClass`. The default implementation honours two class attributes:
  - `TAGSETCLASS_DEFAULT`: initially `TagSet`
  - `TAGSETCLASS_PREFIX_MAPPING`: a mapping of type names to `TagSet` subclasses

  The type name of a `TagSet` name is the first dotted component. For example, `artist.nick_cave` has the type name `artist`. A subclass of `BaseTagSets` could utilise an `ArtistTagSet` subclass of `TagSet` and provide:

      TAGSETCLASS_PREFIX_MAPPING = { 'artist': ArtistTagSet, }

  in its class definition. Accesses to `artist.*` entities would result in `ArtistTagSet` instances and access to other entities would result in ordinary `TagSet` instances.
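As a rough illustration of the prefix based class selection described above, here is a minimal standalone sketch. The `Sketch*` classes are hypothetical stand-ins written for this example, not the real `TagSet` or `BaseTagSets`, and the real `TagSetClass` factory may differ in detail.

```python
class SketchTagSet(dict):
    """Hypothetical stand-in for TagSet: a dict which remembers its name."""

    def __init__(self, *, name, **kw):
        super().__init__(**kw)
        self.name = name

class SketchArtistTagSet(SketchTagSet):
    """Hypothetical TagSet subclass for `artist.*` entities."""

class SketchTagSets:
    """Hypothetical stand-in for a BaseTagSets subclass."""

    TAGSETCLASS_DEFAULT = SketchTagSet
    TAGSETCLASS_PREFIX_MAPPING = {'artist': SketchArtistTagSet}

    def TagSetClass(self, *, name, **kw):
        # The type name is the first dotted component of the entity name.
        type_name = name.split('.', 1)[0]
        cls = self.TAGSETCLASS_PREFIX_MAPPING.get(
            type_name, self.TAGSETCLASS_DEFAULT
        )
        return cls(name=name, **kw)

tagsets = SketchTagSets()
print(type(tagsets.TagSetClass(name='artist.nick_cave')).__name__)
print(type(tagsets.TagSetClass(name='album.push_the_sky_away')).__name__)
```

Keeping the mapping as a class attribute means a subclass only needs to override one dictionary to change which `TagSet` subclass is used for a given type name.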
`BaseTagSets.__init__(self, *, ontology=None)`:
Initialise the collection.

`BaseTagSets.TAGSETCLASS_DEFAULT`

`BaseTagSets.TagSetClass(self, *, name, **kw)`:
Factory to create a new `TagSet` from `name`.

`BaseTagSets.__contains__(self, name: str)`:
Test whether `name` is present in the underlying mapping.

`BaseTagSets.__getitem__(self, name: str)`:
Obtain the `TagSet` associated with `name`.
If `name` is not presently mapped,
return `self.__missing__(name)`.

`BaseTagSets.__iter__(self)`:
Iteration returns the keys.

`BaseTagSets.__len__(self)`:
Return the length of the underlying mapping.

`BaseTagSets.__missing__(self, name: str, **kw)`:
Like `dict`, the `__missing__` method may autocreate a new `TagSet`.
This is called from `__getitem__` if `name` is missing
and uses the factory `cls.default_factory`.
If that is `None` raise `KeyError`,
otherwise call `self.default_factory(name,**kw)`.
If that returns `None` raise `KeyError`,
otherwise save the entity under `name`
and return the entity.
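The `__missing__` logic above follows the same pattern that a plain `dict` subclass can use. The sketch below is a minimal illustration under that assumption; `SketchAutoMapping` and `SketchTagSets` are hypothetical names and the stored entities are plain dicts rather than real `TagSet`s.

```python
class SketchAutoMapping(dict):
    """Hypothetical mapping which may autocreate missing entries."""

    # No factory by default: missing keys raise KeyError as usual.
    default_factory = None

    def __missing__(self, name):
        factory = self.default_factory
        if factory is None:
            raise KeyError(name)
        entity = factory(name)
        if entity is None:
            raise KeyError(name)
        # Save the new entity under `name` and return it.
        self[name] = entity
        return entity

class SketchTagSets(SketchAutoMapping):
    """A subclass supplying a factory method."""

    def default_factory(self, name):
        return {'name': name}

tagsets = SketchTagSets()
print(tagsets['artist.nick_cave'])    # autocreated on first access
print('artist.nick_cave' in tagsets)  # now stored in the mapping
```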
`BaseTagSets.__setitem__(self, name, te)`:
Save `te` in the backend under the key `name`.

`BaseTagSets.add(self, name: str, **kw)`:
Return a new `TagSet` associated with `name`,
which should not already be in use.

`BaseTagSets.default_factory(self, name: str)`:
Create a new `TagSet` named `name`.

`BaseTagSets.edit(self, *, select_tagset=None, **kw)`:
Edit the `TagSet`s.

Parameters:
- `select_tagset`: optional callable accepting a `TagSet` which tests whether it should be included in the `TagSet`s to be edited

Other keyword arguments are passed to `TagSet.edit_tagsets`.

`BaseTagSets.get(self, name: str, default=None)`:
Return the `TagSet` associated with `name`,
or `default` if there is no such entity.

`BaseTagSets.items(self, *, prefix=None)`:
Generator yielding `(key,value)` pairs,
optionally constrained to keys starting with `prefix+'.'`.

`BaseTagSets.keys(self, *, prefix=None)`:
Return the keys starting with `prefix+'.'`
or all keys if `prefix` is `None`.

`BaseTagSets.subdomain(self, subname: str)`:
Return a proxy for this `BaseTagSets` for the `name`s
starting with `subname+'.'`.

`BaseTagSets.values(self, *, prefix=None)`:
Generator yielding the mapping values (`TagSet`s),
optionally constrained to keys starting with `prefix+'.'`.
- `jsonable(obj, converted: dict)`: Convert `obj` to a JSON encodable form. This returns `obj` for purely JSONable objects and a JSONable deep copy of `obj` if it or some subcomponent required conversion. `converted` is a dict mapping object ids to their converted forms to prevent loops.
- Class `MappingTagSets(BaseTagSets)`: A `BaseTagSets` subclass using an arbitrary mapping.
  If no mapping is supplied, a `dict` is created for the purpose.

  Example:

>>> tagsets = MappingTagSets()
>>> list(tagsets.keys())
[]
>>> tagsets.get('foo')
>>> tagsets['foo'] = TagSet(bah=1, zot=2)
>>> list(tagsets.keys())
['foo']
>>> tagsets.get('foo')
TagSet:{'bah': 1, 'zot': 2}
>>> list(tagsets.keys(prefix='foo'))
['foo']
>>> list(tagsets.keys(prefix='bah'))
[]

`MappingTagSets.__delitem__(self, name: str)`:
Delete the `TagSet` named `name`.

`MappingTagSets.__setitem__(self, name: str, te)`:
Save `te` in the backend under the key `name`.

`MappingTagSets.get(self, name: str, default=None)`:
Get `name` or `default`.

`MappingTagSets.keys(self, *, prefix: Optional[str] = None) -> Iterable[str]`:
Return an iterable of the keys commencing with `prefix`
or all keys if `prefix` is `None`.

- Class `RegexpTagRule`: A regular expression based `Tag` rule.
  This applies a regular expression to a string and returns inferred `Tag`s.

`RegexpTagRule.infer_tags(self, s)`:
Apply the rule to the string `s`, return a list of `Tag`s.
- Class `Tag(Tag, cs.lex.FormatableMixin)`: A name/value pair. Each `Tag` has a `.name` (`str`), a `.value` and an optional `.ontology`. The `name` must be a dotted identifier.

  Terminology:
  - A "bare" `Tag` has a `value` of `None`.
  - A "naive" `Tag` has an `ontology` of `None`.

  The constructor for a `Tag` is unusual:
  - both the `value` and `ontology` are optional, defaulting to `None`
  - if `name` is a `str` then we always construct a new `Tag` with the supplied values
  - if `name` is not a `str` it should be a `Tag` like object to promote; it is an error if the `value` parameter is not `None` in this case
  - an optional `prefix` may be supplied which is prepended to `name` with a dot (`'.'`) if not empty

  The promotion process is as follows:
  - if `name` is a `Tag` subinstance then if the supplied `ontology` is not `None` and is not the ontology associated with `name` then a new `Tag` is made, otherwise the original `Tag` is returned unchanged
  - otherwise a new `Tag` is made from `name` using its `.value` and overriding its `.ontology` if the `ontology` parameter is not `None`

  Examples:

>>> ont = TagsOntology({'colour.blue': TagSet(wavelengths='450nm-495nm')})
>>> tag0 = Tag('colour', 'blue')
>>> tag0
Tag(name='colour',value='blue')
>>> tag = Tag(tag0)
>>> tag
Tag(name='colour',value='blue')
>>> tag = Tag(tag0, ontology=ont)
>>> tag # doctest: +ELLIPSIS
Tag(name='colour',value='blue',ontology=...)
>>> tag = Tag(tag0, prefix='surface')
>>> tag
Tag(name='surface.colour',value='blue')

`Tag.__init__(self, *a, **kw)`:
Dummy `__init__` to avoid `FormatableMixin.__init__`
because we subclass `namedtuple` which has no `__init__`.

`Tag.__hash__`

`Tag.__str__(self)`:
Encode `name` and `value`.
A "bare" `Tag` (`self.value is None`) is just its name.
Otherwise `{self.name}={self.transcribe_value(self.value)}`.

`Tag.alt_values(self, value_tag_name=None)`:
Return a list of alternative values for this `Tag`
on the premise that each has a metadata entry.

`Tag.basetype`:
The base type name for this tag.
Returns `None` if there is no ontology.
This calls `self.ontology.basetype(self.name)`.
The basetype is the endpoint of a cascade down the defined types.
For example, this might tell us that a `Tag` `role="Fred"`
has a basetype `"str"`
by cascading through a hypothetical chain `role`->`character`->`str`:
type.role type=character
type.character type=str
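The cascade can be pictured with a small sketch over a plain dict standing in for the ontology. The function name `sketch_basetype`, the set of base type names and the loop guard are assumptions made for this illustration; the real `TagsOntology.basetype` may behave differently at the edges.

```python
# Assumed set of terminal type names for this sketch.
SKETCH_BASE_TYPES = {'str', 'int', 'float', 'list', 'dict'}

def sketch_basetype(ontology: dict, tag_name: str) -> str:
    """Follow `type.<name>` entries until a base type name is reached."""
    type_name = tag_name
    seen = set()
    while type_name not in SKETCH_BASE_TYPES:
        if type_name in seen:
            raise ValueError(f'type loop at {type_name!r}')
        seen.add(type_name)
        typedef = ontology.get('type.' + type_name, {})
        # An entry with no "type" tag is taken to be a str.
        type_name = typedef.get('type', 'str')
    return type_name

ontology = {
    'type.role': {'type': 'character'},
    'type.character': {'type': 'str'},
}
print(sketch_basetype(ontology, 'role'))  # role -> character -> str
```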
`Tag.from_arg(arg, offset=0, ontology=None)`:
Parse a `Tag` from the string `arg` at `offset` (default `0`),
where `arg` is known to be entirely composed of the value,
such as a command line argument.
This calls the `from_str` method with `fallback_parse` set
to gather the entire tail of the supplied string `arg`.

`Tag.from_str(s, ontology=None, fallback_parse=None)`:
Parse `s` as a `Tag` definition.
This is the inverse of `Tag.__str__`, and a shim for `Tag.parse`
which checks that the entire string is consumed.

`Tag.from_str2(*a, **kw)`:
Obsolete name for `Tag.parse`.

`Tag.is_valid_name(name)`:
Test whether a tag name is valid: a dotted identifier.

`Tag.key_metadata(self, key)`:
Return the metadata definition for `key`.
The metadata `TagSet` is obtained from the ontology entry
*type*`.`*key_tag_name*
where *type* is the `Tag`'s `key_type`
and *key_tag_name* is the key converted
into a dotted identifier by `TagsOntology.value_to_tag_name`.

`Tag.key_type`:
The type name for keys of this tag.
This is required if `.value` is a mapping.

`Tag.key_typedef`:
The typedata definition for this `Tag`'s keys.
This is for `Tag`s which store mappings,
for example a movie cast, mapping actors to roles.
The name of the key type comes from
the `key_type` entry from `self.typedata`.
That name is then looked up in the ontology's types.

`Tag.matches(self, tag_name, value)`:
Test whether this `Tag` matches `(tag_name,value)`.

`Tag.member_metadata(self, member_key)`:
Return the metadata definition for `self[member_key]`.
The metadata `TagSet` is obtained from the ontology entry
*type*`.`*member_tag_name*
where *type* is the `Tag`'s `member_type`
and *member_tag_name* is the member value converted
into a dotted identifier by `TagsOntology.value_to_tag_name`.

`Tag.member_type`:
The type name for members of this tag.
This is required if `.value` is a sequence or mapping.

`Tag.member_typedef`:
The typedata definition for this `Tag`'s members.
This is for `Tag`s which store mappings or sequences,
for example a movie cast, mapping actors to roles,
or a list of scenes.
The name of the member type comes from
the `member_type` entry from `self.typedata`.
That name is then looked up in the ontology's types.

`Tag.meta`:
Shortcut property for the metadata `TagSet`.

`Tag.metadata(self, *, ontology=None, convert=None)`:
Fetch the metadata information about this specific tag value,
derived through the ontology
from the tag name and value.
The default `ontology` is `self.ontology`.
For a scalar type (`int`, `float`, `str`) this is the ontology `TagSet`
for `self.value`.
For a sequence (`list`) this is a list of the metadata
for each member.
For a mapping (`dict`) this is a mapping of `key`->`metadata`.
`Tag.parse(s, offset=0, *, ontology=None, **parse_value_kw)`:
Parse *tag_name*[`=`*value*] from `s` at `offset`, return `(Tag,post_offset)`.

Parameters:
- `s`: the string to parse
- `offset`: optional offset of the parse start, default `0`
- `ontology`: optional `TagsOntology` to associate with the `Tag`

Other keyword arguments are passed to `Tag.parse_value`.

`Tag.parse_name(s, offset=0)`:
Parse a tag name from `s` at `offset`: a dotted identifier.

`Tag.parse_value(s, offset=0, *, extra_types=None, fallback_parse=None)`:
Parse a value from `s` at `offset` (default `0`).
Return the value, or `None` on no data.

The optional `extra_types` parameter may be an iterable of
`(type,from_str,to_str)` tuples where `from_str` is a
function which takes a string and returns a Python object
(expected to be an instance of `type`).
The default comes from `cls.EXTRA_TYPES`.
This supports storage of nonJSONable values in text form.

The optional `fallback_parse` parameter
specifies a parse function accepting `(s,offset)`
and returning `(parsed,new_offset)`
where `parsed` is text from `s[offset:]`
and `new_offset` is where the parse stopped.
The default is `cs.lex.get_nonwhite`
to gather nonwhitespace characters,
intended to support *tag_name*`=`*bare_word*
in human edited tag files.

The core syntax for values is JSON;
value text commencing with any of `'"'`, `'['` or `'{'`
is treated as JSON and decoded directly,
leaving the offset at the end of the JSON parse.

Otherwise all the nonwhitespace at this point is collected
as the value text,
leaving the offset at the next whitespace character
or the end of the string.
The text so collected is then tried against the `from_str`
function of each `extra_types`;
the first successful parse is accepted as the value.
If no extra type matches,
the text is tried against `int()` and `float()`;
if one of these parses the text and `str()` of the result round trips
to the original text
then that value is used.
Otherwise the text itself is kept as the value.
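The value parsing rules above can be approximated with the standard library alone. This sketch is a simplification under stated assumptions: `sketch_parse_value` is a hypothetical name, it omits the `extra_types` and `fallback_parse` hooks, and it returns a `(value, new_offset)` pair so that the stopping point is visible.

```python
import json

def sketch_parse_value(s: str, offset: int = 0):
    """Parse one value from `s` at `offset`; return (value, new_offset)."""
    if offset >= len(s) or s[offset].isspace():
        # No data at this point.
        return None, offset
    if s[offset] in '"[{':
        # JSON syntax: decode directly, stopping at the end of the JSON text.
        return json.JSONDecoder().raw_decode(s, offset)
    # Otherwise gather the run of nonwhitespace characters.
    end = offset
    while end < len(s) and not s[end].isspace():
        end += 1
    text = s[offset:end]
    # Accept int or float only if str() round trips to the original text.
    for convert in (int, float):
        try:
            value = convert(text)
        except ValueError:
            continue
        if str(value) == text:
            return value, end
    return text, end

line = 'a=1 c=[1,2,3] d=dee e="foo bah" f=007'
print(sketch_parse_value(line, 2))
print(sketch_parse_value(line, 6))
print(sketch_parse_value(line, 16))
print(sketch_parse_value(line, 22))
print(sketch_parse_value(line, 34))
```

The round trip check is what keeps text such as `007` or `1.50` as a string instead of silently normalising it to a number.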
`Tag.transcribe_value(value, extra_types=None, json_options=None)`:
Transcribe `value` for use in `Tag` transcription.

The optional `extra_types` parameter may be an iterable of
`(type,from_str,to_str)` tuples where `to_str` is a
function which takes a Python object
(expected to be an instance of `type`)
and returns a string.
The default comes from `cls.EXTRA_TYPES`.

If `value` is an instance of `type`
then the `to_str` function is used to transcribe the value
as a `str`, which should not include any whitespace
(because of the implementation of `parse_value`).
If there is no matching `to_str` function,
`cls.JSON_ENCODER.encode` is used to transcribe `value`.
This supports storage of nonJSONable values in text form.
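A minimal sketch of the `extra_types` idea follows, using `datetime.date` as an example of a nonJSONable type. The names `SKETCH_EXTRA_TYPES` and `sketch_transcribe_value` are hypothetical, and the choice of `date` is an assumption for illustration only. The sketch always falls back to compact JSON, so strings come out quoted, whereas the examples earlier in this document show bare words such as `d=dee`.

```python
import json
from datetime import date

# Hypothetical (type, from_str, to_str) tuples for this sketch.
SKETCH_EXTRA_TYPES = [
    (date, date.fromisoformat, date.isoformat),
]

def sketch_transcribe_value(value, extra_types=None) -> str:
    """Transcribe `value` as text, trying the extra types before JSON."""
    if extra_types is None:
        extra_types = SKETCH_EXTRA_TYPES
    for type_, _from_str, to_str in extra_types:
        if isinstance(value, type_):
            # The result should contain no whitespace.
            return to_str(value)
    # Fall back to a compact JSON encoding.
    return json.dumps(value, separators=(',', ':'))

print(sketch_transcribe_value(date(2025, 5, 28)))
print(sketch_transcribe_value([1, 2, 3]))
print(sketch_transcribe_value('foo bah'))
```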
`Tag.typedef`:
The defining `TagSet` for this tag's name.
This is how its type is defined,
and is obtained from:
`self.ontology['type.'+self.name]`

Basic `Tag`s often do not need a type definition;
these are only needed for structured tag values
(example: a mapping of cast members)
or when a `Tag` name is an alias for another type
(example: a cast member name might be an `actor`
which in turn might be a `person`).

For example, a `Tag` `colour=blue`
gets its type information from the `type.colour` entry in an ontology;
that entry is just a `TagSet` with relevant information.
- `tag_or_tag_value(*da, **dkw)`: A decorator for functions or methods which may be called as:

      func(name[,value])

  or as:

      func(Tag)

  The optional decorator argument `no_self` (default `False`) should be supplied for plain functions as they have no leading `self` parameter to accommodate.

  Example:

      @tag_or_tag_value
      def add(self, tag_name, value, *, verbose=None):

  This defines a `.add()` method which can be called with `name` and `value` or with a single `Tag` like object (something with `.name` and `.value` attributes), for example:

      tags = TagSet()
      ....
      tags.add('colour', 'blue')
      ....
      tag = Tag('size', 9)
      tags.add(tag)
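To show the general shape of such a decorator, here is a simplified sketch for methods only. `sketch_tag_or_tag_value`, `SketchTag` and `SketchTags` are hypothetical names; the sketch does not implement the `no_self` option, and raising `ValueError` when both a `Tag` like object and a value are supplied is an assumption rather than documented behaviour.

```python
from collections import namedtuple
from functools import wraps

# Hypothetical Tag like object with .name and .value attributes.
SketchTag = namedtuple('SketchTag', 'name value')

def sketch_tag_or_tag_value(method):
    """Let `method(self, name, value)` also accept a Tag like object."""

    @wraps(method)
    def wrapper(self, name, value=None, **kw):
        if not isinstance(name, str):
            # Assumed behaviour: a separate value makes no sense here.
            if value is not None:
                raise ValueError('value must be None with a Tag like object')
            name, value = name.name, name.value
        return method(self, name, value, **kw)

    return wrapper

class SketchTags(dict):
    """Hypothetical stand-in for a TagSet."""

    @sketch_tag_or_tag_value
    def add(self, tag_name, value, *, verbose=None):
        self[tag_name] = value

tags = SketchTags()
tags.add('colour', 'blue')        # called with name and value
tags.add(SketchTag('size', 9))    # called with a Tag like object
tags.add('bare')                  # a "bare" tag, value None
print(tags)
```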
- Class `TagBasedTest(TagBasedTest, TagSetCriterion)`: A test based on a `Tag`.

  Attributes:
  - `spec`: the source text from which this choice was parsed, possibly `None`
  - `choice`: the apply/reject flag
  - `tag`: the `Tag` representing the criterion
  - `comparison`: an indication of the test comparison

  The following comparison values are recognised:
  - `None`: test for the presence of the `Tag`
  - `'='`: test that the tag value equals `tag.value`
  - `'<'`: test that the tag value is less than `tag.value`
  - `'<='`: test that the tag value is less than or equal to `tag.value`
  - `'>'`: test that the tag value is greater than `tag.value`
  - `'>='`: test that the tag value is greater than or equal to `tag.value`
  - `'~/'`: test if the tag value as a regexp is present in `tag.value`
  - `'~'`: test if a matching tag value is present in `tag.value`

`TagBasedTest.by_tag_value(tag_name, tag_value, *, choice=True, comparison='=')`:
Return a `TagBasedTest` based on a `Tag` or `tag_name,tag_value`.

`TagBasedTest.match_tagged_entity(self, te: 'TagSet') -> bool`:
Test against the `Tag`s in `tags`.
Note: comparisons when `self.tag.name` is not in `tags`
always return `False` (possibly inverted by `self.choice`).
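The ordering comparisons and the `choice` inversion described above can be sketched over a plain dict of tag values. `sketch_match` and `SKETCH_COMPARISONS` are hypothetical names, and the `'~/'` and `'~'` comparisons are deliberately left out because their exact semantics are not spelt out here.

```python
import operator

# Comparison symbols mapped to the matching operator functions.
SKETCH_COMPARISONS = {
    '=': operator.eq,
    '<': operator.lt,
    '<=': operator.le,
    '>': operator.gt,
    '>=': operator.ge,
}

def sketch_match(tags, tag_name, tag_value=None, *, comparison='=', choice=True):
    """Test `tags[tag_name]` against `tag_value`; invert if not `choice`."""
    if tag_name not in tags:
        # A missing tag never matches, whatever the comparison.
        matched = False
    elif comparison is None:
        # Presence test only.
        matched = True
    else:
        matched = SKETCH_COMPARISONS[comparison](tags[tag_name], tag_value)
    return matched if choice else not matched

tags = {'colour': 'blue', 'size': 9}
print(sketch_match(tags, 'size', 5, comparison='>'))
print(sketch_match(tags, 'colour', comparison=None))
print(sketch_match(tags, 'weight', 3, comparison='<'))
print(sketch_match(tags, 'weight', 3, comparison='<', choice=False))
```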
`TagBasedTest.parse(s, offset=0, delim=None)`:
Parse *tag_name*[{`<`|`<=`|`=`|`>=`|`>`|`~`}*value*]
and return `(dict,offset)`
where the `dict` contains the following keys and values:
- `tag`: a `Tag` embodying the tag name and value
- `comparison`: an indication of the test comparison
- Class `TagFile(cs.fs.FSPathBasedSingleton, BaseTagSets)`: A reference to a specific file containing tags.
  This manages a mapping of `name`=>`TagSet`, itself a mapping of tag name => tag value.

`TagFile.__setitem__(self, name, te)`:
Set item `name` to `te`.

`TagFile.get(self, name, default=None)`:
Get from the tagsets.

`TagFile.is_modified(self)`:
Test whether this `TagSet` has been modified.

`TagFile.keys(self, *, prefix=None)`:
`tagsets.keys`.
If the optional `prefix` is supplied,
yield only those keys starting with `prefix`.

`TagFile.load_tagsets(filepath, ontology, extra_types=None)`:
Load `filepath` and return `(tagsets,unparsed)`.
The returned `tagsets` are a mapping of `name`=>`tag_name`=>`value`.
The returned `unparsed` is a list of `(lineno,line)`
for lines which failed the parse (excluding the trailing newline).

`TagFile.names`:
The names from this `FSTagsTagFile` as a list.

`TagFile.parse_tags_line(line, ontology=None, verbose=None, extra_types=None) -> Tuple[str, cs.tagset.TagSet]`:
Parse a "name tags..." line as from a `.fstags` file,
return `(name,TagSet)`.

`TagFile.save(self, extra_types=None, prune=False)`:
Save the tag map to the tag file if modified.

`TagFile.save_tagsets(filepath, tagsets, unparsed, extra_types=None, prune=False)`:
Save `tagsets` and `unparsed` to `filepath`.
This method will create the required intermediate directories if missing.
This method does not clear the `.modified` attribute of the `TagSet`s
because it does not know it is saving to the `TagSet`'s primary location.

`TagFile.startup_shutdown(self)`:
Save the tagsets if modified.

`TagFile.tags_line(name, tags, extra_types=None, prune=False)`:
Transcribe a `name` and its `tags`
for use as a `.fstags` file line.

`TagFile.tagsets`:
The tag map from the tag file,
a mapping of `name`=>`TagSet`.
This is loaded on demand.

`TagFile.update(self, name, tags, *, prefix=None, verbose=None)`:
Update the tags for `name` from the supplied `tags`
as for `TagSet.update`.
- Class `TagsCommandMixin`: Utility methods for `cs.cmdutils.BaseCommand` classes working with tags.

  Optional subclass attributes:
  - `TAGSET_CRITERION_CLASS`: a `TagSetCriterion` duck class, default `TagSetCriterion`. For example, `cs.sqltags` has a subclass with an `.extend_query` method for computing an SQL JOIN used in searching for tagged entities.

`TagsCommandMixin.TagAddRemove`

`TagsCommandMixin.parse_tag_addremove(arg, offset=0)`:
Parse `arg` as an add/remove tag specification
of the form [`-`]*tag_name*[`=`*value*].
Return `(remove,Tag)`.

Examples:
Examples:
>>> TagsCommandMixin.parse_tag_addremove('a')
TagAddRemove(remove=False, tag=Tag(name='a',value=None))
>>> TagsCommandMixin.parse_tag_addremove('-a')
TagAddRemove(remove=True, tag=Tag(name='a',value=None))
>>> TagsCommandMixin.parse_tag_addremove('a=1')
TagAddRemove(remove=False, tag=Tag(name='a',value=1))
>>> TagsCommandMixin.parse_tag_addremove('-a=1')
TagAddRemove(remove=True, tag=Tag(name='a',value=1))
>>> TagsCommandMixin.parse_tag_addremove('-a="foo bah"')
TagAddRemove(remove=True, tag=Tag(name='a',value='foo bah'))
>>> TagsCommandMixin.parse_tag_addremove('-a=foo bah')
TagAddRemove(remove=True, tag=Tag(name='a',value='foo bah'))
`TagsCommandMixin.parse_tag_choices(argv)`:
Parse `argv` as an iterable of [`!`]*tag_name*[`=`*tag_value*]
`Tag` additions/deletions.

`TagsCommandMixin.parse_tagset_criteria(argv, tag_based_test_class=None)`:
Parse tag specifications from `argv` until an unparseable item is found.
Return `(criteria,argv)`
where `criteria` is a list of the parsed criteria
and `argv` is the remaining unparsed items.
Each item is parsed via
`cls.parse_tagset_criterion(item,tag_based_test_class)`.

`TagsCommandMixin.parse_tagset_criterion(arg, tag_based_test_class=None)`:
Parse `arg` as a tag specification
and return a `tag_based_test_class` instance
via its `.from_str` factory method.
Raises `ValueError` on a misparse.
The default `tag_based_test_class`
comes from `cls.TAGSET_CRITERION_CLASS`,
which itself defaults to class `TagSetCriterion`.

The default `TagSetCriterion.from_str` recognises:
- `-`*tag_name*: a negative requirement for *tag_name*
- *tag_name*[`=`*value*]: a positive requirement for a *tag_name* with optional *value*.
- Class `TagSet(builtins.dict, cs.dateutils.UNIXTimeMixin, cs.lex.FormatableMixin, cs.mappings.AttrableMappingMixin, cs.deco.Promotable)`: A setlike class collection of `Tag`s.

  This actually subclasses `dict`, so a `TagSet` is a direct mapping of tag names to values. It accepts attribute access to simple tag values when they do not conflict with the class methods; the reliable method is normal item access.

  NOTE: iteration yields `Tag`s, not dict keys.

  Also note that all the `Tag`s from a `TagSet` share its ontology.

  Subclasses should override the `set` and `discard` methods; the `dict` and mapping methods are defined in terms of these two basic operations.

  `TagSet`s have a few special properties:
  - `id`: a domain specific identifier; this may reasonably be `None` for entities not associated with database rows; the `cs.sqltags.SQLTags` class associates this with the database row id.
  - `name`: the entity's name; a read only alias for the `'name'` `Tag`. The `cs.sqltags.SQLTags` class defines "log entries" as `TagSet`s with no `name`.
  - `unixtime`: a UNIX timestamp, a `float` holding seconds since the UNIX epoch (midnight, 1 January 1970 UTC). This is typically the row creation time for entities associated with database rows, but usually the event time for `TagSet`s describing an event.

  Because `TagSet` subclasses `cs.mappings.AttrableMappingMixin` you can also access tag values as attributes provided that they do not conflict with instance attributes or class methods or properties.

`TagSet.__init__(self, *a, _id=None, _ontology=None, **kw)`:
Initialise the `TagSet`.

Parameters:
- positional parameters initialise the `dict` and are passed to `dict.__init__`
- `_id`: optional identity value for databaselike implementations
- `_ontology`: optional `TagsOntology` to use for this `TagSet`
- other alphabetic keyword parameters are also used to initialise the `dict` and are passed to `dict.__init__`
`TagSet.__contains__(self, tag)`:
Test for a tag being in this `TagSet`.

If the supplied `tag` is a `str` then this test
is for the presence of `tag` in the keys.

Otherwise,
for each tag `T` in the tagset
test `T.matches(tag)` and return `True` on success.
The default `Tag.matches` method compares the tag name
and if the same,
returns true if `tag.value` is `None`
(basic "is the tag present" test)
and otherwise true if `tag.value==T.value`
(basic "tag value equality" test).

Otherwise return `False`.
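The membership rules above can be mimicked with a plain dict and `(name, value)` pairs. `sketch_matches` and `sketch_contains` are hypothetical names for this illustration; the real implementation works with `Tag` objects and `Tag.matches`.

```python
def sketch_matches(t_name, t_value, tag_name, tag_value) -> bool:
    """Mirror the default matching rule: same name, then value test."""
    if t_name != tag_name:
        return False
    # A value of None means "is the tag present at all".
    return tag_value is None or tag_value == t_value

def sketch_contains(tags: dict, tag) -> bool:
    """Membership test for a tag name (str) or a (name, value) pair."""
    if isinstance(tag, str):
        return tag in tags
    tag_name, tag_value = tag
    return any(
        sketch_matches(t_name, t_value, tag_name, tag_value)
        for t_name, t_value in tags.items()
    )

tags = {'blue': None, 'topic': 'tagging', 'subtopic': 'ontologies'}
print(sketch_contains(tags, 'blue'))
print(sketch_contains(tags, 'red'))
print(sketch_contains(tags, ('subtopic', 'ontologies')))
print(sketch_contains(tags, ('subtopic', 'libraries')))
```

The four results line up with the `'blue' in tags`, `'red' in tags`, `subtopic in tags` and `subtopic2 in tags` tests in the introductory example.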
`TagSet.__getattr__(self, attr)`:
Support access to dotted name attributes.

The following attribute accesses are supported:
- an attribute from a superclass
- a `Tag` whose name is `attr`; return its value
- the value of `self.auto_infer(attr)` if that does not raise `ValueError`
- if `self.ontology`, try `{type}_{field}` and `{type}_{field}s`
- otherwise return `self.subtags(attr)` to allow access to dotted tags, provided any existing tags start with "attr."

If this `TagSet` has an ontology
and `attr` looks like *typename*`_`*fieldname* and *typename* is a key,
look up the metadata for the `Tag` value
and return the metadata's *fieldname* key.
This also works for plural values.

For example if a `TagSet` has the tag `artists=["fred","joe"]`
and `attr` is `artist_names`
then the metadata entries for `"fred"` and `"joe"` are looked up
and their `artist_name` tags are returned,
perhaps resulting in the list
`["Fred Thing","Joe Thang"]`.

If there are keys commencing with `attr+'.'`
then this returns a view of those keys
so that a subsequent attribute access can access one of those keys.

Otherwise, a superclass attribute access is performed.

Example of dotted access to tags like `c.x`:
>>> tags=TagSet(a=1,b=2)
>>> tags.a
1
>>> tags.c
Traceback (most recent call last):
...
AttributeError: TagSet.c
>>> tags['c.z']=9
>>> tags['c.x']=8
>>> tags
TagSet:{'a': 1, 'b': 2, 'c.z': 9, 'c.x': 8}
>>> tags.c
TagSetPrefixView:c.{'z': 9, 'x': 8}
>>> tags.c.z
9
However, this is not supported when there is a tag named `'c'`
because `tags.c` has to return the `'c'` tag value:
>>> tags=TagSet(a=1,b=2,c=3)
>>> tags.a
1
>>> tags.c
3
>>> tags['c.z']=9
>>> tags.c.z
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'z'
`TagSet.__iter__(self, prefix=None, ontology=None)`:
Yield the tag data as `Tag`s.

`TagSet.__setattr__(self, attr, value)`:
Attribute based `Tag` access.

If `attr` is private or is in `self.__dict__` then that is updated,
supporting "normal" attributes set on the instance.
Otherwise the `Tag` named `attr` is set to `value`.

The `__init__` methods of subclasses should do something like this
(from `TagSet.__init__`)
to set up the ordinary instance attributes
which are not to be treated as `Tag`s:
self.__dict__.update(id=_id, ontology=_ontology, modified=False)
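The routing rule above can be sketched with a plain dict subclass. This is an illustration only, not the cs.tagset implementation: MiniTagSet is a hypothetical class, and it assumes "private" means a leading underscore.

```python
class MiniTagSet(dict):
    """Hypothetical stand-in for a TagSet: attribute assignment becomes key assignment."""

    def __init__(self, *a, _id=None, _ontology=None, **kw):
        super().__init__(*a, **kw)
        # Register the ordinary instance attributes directly in __dict__
        # so that later assignments update them instead of creating tags.
        self.__dict__.update(id=_id, ontology=_ontology, modified=False)

    def __setattr__(self, attr, value):
        # Assumption: "private" means the name starts with an underscore.
        if attr.startswith('_') or attr in self.__dict__:
            self.__dict__[attr] = value
        else:
            self[attr] = value


tags = MiniTagSet(a=1)
tags.modified = True   # ordinary attribute, already present in __dict__
tags.colour = 'blue'   # not an instance attribute, so it becomes a key
```

Without the __dict__.update call in __init__, the assignment to modified would have created a key named 'modified' instead.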
TagSet.__str__(self):
Transcribe this TagSet as a string suitable for writing to a tag file.
TagSet.add(self, tag_name, value, **kw):
Adding a Tag calls the class set() method.
TagSet.as_dict(self):
Return a dict mapping tag name to value.
TagSet.as_tags(self, prefix=None, ontology=None):
Yield the tag data as Tags.
TagSet.auto:
The automatic namespace.
Here we can refer to dotted tag names directly as attributes.
TagSet.auto_infer(self, attr):
The default inference implementation.
This should return a value if attr is inferrable and raise ValueError if not.
The default implementation returns the direct tag value for attr if present.
TagSet.csvrow:
This TagSet as a list useful to a csv.writer.
The inverse of from_csvrow.
TagSet.discard(self, tag_name, value, *, verbose=None):
Discard the tag matching (tag_name,value).
Return a Tag with the old value, or None if there was no matching tag.
Note that if the tag value is None then the tag is unconditionally discarded.
Otherwise the tag is only discarded if its value matches.
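A minimal sketch of this matching rule over a plain dict, reading "the tag value is None" as referring to the value argument. The discard function below is a hypothetical helper, not the library method, and it returns a (name, old_value) pair where the real method returns a Tag.

```python
def discard(tags: dict, tag_name, value=None):
    """Remove tag_name from tags if it matches; return (name, old_value) or None."""
    if tag_name not in tags:
        return None
    old_value = tags[tag_name]
    # A value of None discards unconditionally; otherwise the values must match.
    if value is None or old_value == value:
        del tags[tag_name]
        return (tag_name, old_value)
    return None


tags = {'colour': 'blue', 'size': 3}
removed_mismatch = discard(tags, 'colour', 'red')   # value differs: tag is kept
removed_match = discard(tags, 'colour', 'blue')     # value matches: tag is removed
removed_any = discard(tags, 'size')                 # no value given: removed regardless
```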
TagSet.dump(self, keys=None, *, preindent=None, file=None, **pf_kwargs):
Dump a TagSet in multiline format.
Parameters:
- keys: optional iterable of Tag names to print
- file: optional keyword parameter specifying the output filelike object; the default is sys.stdout.
- preindent: optional leading indentation for the entire dump, either a str or an int indicating a number of spaces
Other keyword arguments are passed to pprint.pformat.
TagSet.edit(self, editor=None, verbose=None, comments=()):
Edit this TagSet.
TagSet.edit_tagsets(tes, editor=None, verbose=True):
Edit a collection of TagSets.
Return a list of (old_name,new_name,TagSet) for those which were modified.
This function supports modifying both name and Tags.
The Tags are updated directly.
The changed names are returned in the old_name,new_name above.
The collection tes may be either a mapping of name/key to TagSet or an iterable of TagSets.
If the latter, a mapping is made based on te.name or te.id for each item te in the iterable.
TagSet.from_csvrow(csvrow):
Construct a TagSet from a CSV row like that from TagSet.csvrow, being unixtime,id,name,tag[,tag,...].
TagSet.from_ini(f, section: str, missing_ok=False):
Load a TagSet from a section of a .ini file.
Parameters:
- f: the .ini format file to read; an iterable of lines (eg a file object) or the name of a file to open
- section: the name of the config section from which to load the TagSet
- missing_ok: optional flag, default False; if true a missing file will return an empty TagSet instead of raising FileNotFoundError
TagSet.from_line(s, offset: int, **from_str_kw):
Obsolete form of TagSet.from_str.
TagSet.from_str(tags_s, *, ontology=None, extra_types=None, verbose=None):
Create a new TagSet from a line of text.
The line consists of a whitespace separated list of Tags.
This is the inverse of TagSet.__str__.
TagSet.from_tags(tags, _id=None, _ontology=None):
Make a TagSet from an iterable of Tags.
TagSet.get_arg_name(self, field_name):
Override for FormattableMixin.get_arg_name: return the leading dotted identifier, which represents a tag or tag prefix.
TagSet.get_value(self, arg_name, a, kw):
Override for FormattableMixin.get_value: look up arg_name in kw, return a value.
The value is obtained as follows:
- kw[arg_name]: the Tag named arg_name if present
- kw.get_format_attribute(arg_name): a formattable attribute named arg_name
Otherwise raise KeyError if self.format_mode.strict, otherwise return the placeholder string '{'+arg_name+'}'.
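The strict versus placeholder behaviour can be illustrated with the standard library's string.Formatter. This is a sketch of the idea only: LenientFormatter is a hypothetical class, it omits the get_format_attribute step, and it is not how cs.tagset or cs.lex implement formatting.

```python
from string import Formatter


class LenientFormatter(Formatter):
    """Hypothetical formatter: unknown names become '{name}' unless strict."""

    def __init__(self, strict=False):
        super().__init__()
        self.strict = strict

    def get_value(self, key, args, kwargs):
        if isinstance(key, int):
            # Positional fields keep the standard behaviour.
            return super().get_value(key, args, kwargs)
        try:
            return kwargs[key]
        except KeyError:
            if self.strict:
                raise
            # Leave a placeholder so the gap is visible in the output.
            return '{' + key + '}'


tags = {'title': 'Dune'}
lenient = LenientFormatter().vformat('{title} ({year})', (), tags)
```

In this sketch the missing year field survives as literal text in lenient mode, while a strict formatter raises KeyError for the same format string.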
TagSet.is_stale(self, max_age=None):
Test whether this TagSet is stale, i.e. the time since the self.last_updated UNIX time exceeds max_age seconds (default from self.STALE_AGE).
This is a convenience function for TagSets which cache external data.
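The staleness test amounts to a single comparison, sketched below as a free function. The STALE_AGE value and the now parameter are assumptions made for illustration; the real method reads self.last_updated and self.STALE_AGE.

```python
import time

STALE_AGE = 3600  # hypothetical default maximum age in seconds


def is_stale(last_updated, max_age=None, now=None):
    """Return True if more than max_age seconds have passed since last_updated."""
    if max_age is None:
        max_age = STALE_AGE
    if now is None:
        now = time.time()
    return now - last_updated > max_age


fresh = is_stale(last_updated=1000.0, max_age=60, now=1030.0)   # 30s old
stale = is_stale(last_updated=1000.0, max_age=60, now=1100.0)   # 100s old
```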
TagSet.name:
Read only name property, None if there is no 'name' tag.
TagSet.save_as_ini(self, f, section: str, config=None):
Save this TagSet to the config file f as section.
If f is a string, read an existing config from that file and update the section.
TagSet.set(self, tag_name, value, *, verbose=None):
Set self[tag_name]=value.
If verbose, emit an info message if this changes the previous value.
TagSet.set_from(self, other, verbose=None):
Completely replace the values in self with the values from other, a TagSet or any other name=>value dict.
This has the feature of logging changes by calling .set and .discard to effect the changes.
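A sketch of replacing one mapping's contents with another's while reporting each individual change, in the spirit of the description above. The set_from function and its log parameter are hypothetical; the real method calls .set and .discard on the TagSet.

```python
def set_from(tags: dict, other: dict, log=print):
    """Make tags equal to other, reporting each individual change via log."""
    # Discard names which are absent from other.
    for name in list(tags):
        if name not in other:
            log(f'discard {name}={tags[name]!r}')
            del tags[name]
    # Set names which are new or whose value differs.
    for name, value in other.items():
        if name not in tags or tags[name] != value:
            log(f'set {name}={value!r}')
            tags[name] = value


changes = []
tags = {'a': 1, 'b': 2}
set_from(tags, {'b': 3, 'c': 4}, log=changes.append)
```

Applying the changes one at a time, rather than clearing and refilling the mapping, is what makes a per-tag change log possible.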
TagSet.subtags(self, prefix, as_tagset=False):
Return a TagSetPrefixView of the tags commencing with prefix+'.' with the key prefixes stripped off.
If as_tagset is true (default False) return a new standalone TagSet containing the prefixed keys.
Example:
>>> tags = TagSet({'a.b':1, 'a.d':2, 'c.e':3})
>>> tags.subtags('a')
TagSetPrefixView:a.{'b': 1, 'd': 2}
>>> tags.subtags('a', as_tagset=True)
TagSet:{'b': 1, 'd': 2}
TagSet.tag(self, tag_name, prefix=None, ontology=None):
Return a Tag for tag_name, or None if missing.
Parameters:
- tag_name: the name of the Tag to create
- prefix: optional prefix; if supplied, prepend prefix+'.' to the Tag name
- ontology: optional ontology for the Tag, default self.ontology
TagSet.tag_metadata(self, tag_name, prefix=None, ontology=None, convert=None):
Return a list of the metadata for the Tag named tag_name, or an empty list if the Tag is missing.
TagSet.unixtime:
unixtime property, autosets to time.time() if accessed and missing.
TagSet.update(self, other=None, *, prefix=None, verbose=None, **kw):
Update this TagSet from other, a dict of {name:value} or an iterable of Taglike or (name,value) things.
TagSet.uuid:
The TagSet's 'uuid' value as a UUID if present, otherwise None.
TagSetCriterion.TAG_BASED_TEST_CLASS
TagSetCriterion.from_any(o):
Convert some suitable object o into a TagSetCriterion.
Various possibilities for o are:
- TagSetCriterion: returned unchanged
- str: a string tests for the presence of a tag with that name and optional value
- an object with a .choice attribute; this is taken to be a TagSetCriterion ducktype and returned unchanged
- an object with .name and .value attributes; this is taken to be Tag-like and a positive test is constructed
- Tag: an object with a .name and .value is equivalent to a positive equality TagBasedTest
- (name,value): a 2 element sequence is equivalent to a positive equality TagBasedTest
TagSetCriterion.from_arg(arg, fallback_parse=None):
Prepare a TagSetCriterion from the string arg where arg is known to be entirely composed of the value, such as a command line argument.
This calls the from_str method with fallback_parse set to gather the entire tail of the supplied string arg.
TagSetCriterion.from_str(s: str, fallback_parse=None):
Prepare a TagSetCriterion from the string s.
TagSetCriterion.from_str2(s, offset=0, delim=None, fallback_parse=None):
Parse a criterion from s at offset and return (TagSetCriterion,offset).
This method recognises an optional leading '!' or '-' indicating negation of the test, followed by a criterion recognised by the .parse method of one of the classes in cls.CRITERION_PARSE_CLASSES.
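The negation prefix step can be sketched on its own. The parse_negation helper below is hypothetical and covers only the leading '!' or '-'; parsing the criterion itself is left to the classes named above.

```python
def parse_negation(s: str, offset: int = 0):
    """Return (negated, new_offset) after an optional leading '!' or '-'."""
    if s.startswith(('!', '-'), offset):
        return True, offset + 1
    return False, offset


negated, offset = parse_negation('!colour=blue')
plain, plain_offset = parse_negation('colour=blue')
```

The remainder of the string from the returned offset is what would be handed on to a criterion parser.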
TagSetCriterion.match_tagged_entity(self, te: 'TagSet') -> bool:
Apply this TagSetCriterion to a TagSet.
Class TagSetPrefixView(cs.lex.FormatableMixin): A view of a TagSet via a prefix.
Access to a key k accesses the TagSet with the key prefix+'.'+k.
This is a kind of funny hybrid of a Tag and a TagSet in that some things such as __format__ will format the Tag named prefix if it exists in preference to the subtags.
Example:
>>> tags = TagSet(a=1, b=2)
>>> tags
TagSet:{'a': 1, 'b': 2}
>>> tags['sub.x'] = 3
>>> tags['sub.y'] = 4
>>> tags
TagSet:{'a': 1, 'b': 2, 'sub.x': 3, 'sub.y': 4}
>>> sub = tags.sub
>>> sub
TagSetPrefixView:sub.{'x': 3, 'y': 4}
>>> sub.z = 5
>>> sub
TagSetPrefixView:sub.{'x': 3, 'y': 4, 'z': 5}
>>> tags
TagSet:{'a': 1, 'b': 2, 'sub.x': 3, 'sub.y': 4, 'sub.z': 5}
TagSetPrefixView.__getattr__(self, attr):
Proxy other attributes through to the TagSet.
TagSetPrefixView.__setattr__(self, attr, value):
Attribute based Tag access.
If attr is in self.__dict__ then that is updated, supporting "normal" attributes set on the instance.
Otherwise the Tag named attr is set to value.
The __init__ methods of subclasses should do something like this (from TagSet.__init__) to set up the ordinary instance attributes which are not to be treated as Tags:
self.__dict__.update(id=_id, ontology=_ontology, modified=False)
TagSetPrefixView.as_dict(self):
Return a dict representation of this view.
TagSetPrefixView.get(self, k, default=None):
Mapping get method.
TagSetPrefixView.get_format_attribute(self, attr):
Fetch a formatting attribute from the proxied object.
TagSetPrefixView.items(self):
Return an iterable of the items (Tag name, Tag).
TagSetPrefixView.keys(self):
The keys of the subtags.
TagSetPrefixView.ontology:
The ontology of the referenced TagSet.
TagSetPrefixView.setdefault(self, k, v=None):
Mapping setdefault method.
TagSetPrefixView.subtags(self, subprefix):
Return a deeper view of the TagSet.
TagSetPrefixView.tag:
The Tag for the prefix, or None if there is no such Tag.
TagSetPrefixView.update(self, mapping):
Update tags from a name->value mapping.
TagSetPrefixView.value:
Return the Tag value for the prefix, or None if there is no such Tag.
TagSetPrefixView.values(self):
Return an iterable of the values (Tags).
Class TagSetsSubdomain(cs.obj.SingletonMixin, cs.mappings.PrefixedMappingProxy): A view into a BaseTagSets for keys commencing with a prefix being the subdomain plus a dot ('.').
TagSetsSubdomain.TAGGED_ENTITY_FACTORY:
The entity factory comes from the parent collection.
Class TagsOntology(cs.obj.SingletonMixin, BaseTagSets): An ontology for tag names.
This is based around a mapping of names to ontological information expressed as a TagSet.
Normally an object's tags are not a self contained repository of all the information; instead a tag just names some information.
As an example, consider the tag colour=blue.
Meta information about blue is obtained via the ontology, which has an entry for the colour blue.
We adopt the convention that the type is just the tag name, so we obtain the metadata by calling ontology.metadata(tag) or alternatively ontology.metadata(tag.name,tag.value), being the type name and value respectively.
The ontology itself is based around TagSets and effectively the call ontology.metadata('colour','blue') would look up the TagSet named colour.blue in the ontology.
For a self contained dataset this means that it can be its own ontology.
For tags associated with arbitrary objects such as the filesystem tags maintained by cs.fstags the ontology would be a separate tags collection stored in a central place.
There are two main categories of entries in an ontology:
- metadata: other entries named typename.value_key contain a TagSet holding metadata for a value of type typename whose value is mapped to value_key
- types: an optional entry named type.typename contains a TagSet describing the type named typename; really this is just more metadata where the "type name" is type
Metadata are TagSet instances describing particular values of a type.
For example, the metadata TagSet for the Tag colour="blue":
colour.blue url="https://en.wikipedia.org/wiki/Blue" wavelengths="450nm-495nm"
Some metadata associated with the Tag actor="Scarlett Johansson":
actor.scarlett_johansson role=["Black Widow (Marvel)"]
character.marvel.black_widow fullname=["Natasha Romanov"]
The tag values are lists above because an actor might play many roles, etc.
There's a default convention for converting human descriptions such as the role string "Black Widow (Marvel)" to its metadata.
- the value "Black Widow (Marvel)" is converted to a key by the ontology method value_to_tag_name; it moves a bracket suffix such as (Marvel) to the front as a prefix marvel. and downcases the rest of the string and turns spaces into underscores. This yields the value key marvel.black_widow.
- the type is role, so the ontology entry for the metadata is role.marvel.black_widow
This requires type information about a role.
Here are some type definitions supporting the above metadata:
type.person type=str description="A person."
type.actor type=person description="An actor's stage name."
type.character type=str description="A person in a story."
type.role type_name=character description="A character role in a performance."
type.cast type=dict key_type=actor member_type=role description="Cast members and their roles."
The basic types have their Python names: int, float, str, list, dict, date, datetime.
You can define subtypes of these for your own purposes as illustrated above.
For example:
type.colour type=str description="A hue."
which subclasses str.
Subtypes of list include a member_type specifying the type for members of a Tag value:
type.scene type=list member_type=str description="A movie scene."
Subtypes of dict include a key_type and a member_type specifying the type for keys and members of a Tag value, as in the type.cast definition above.
Accessing type data and metadata:
A TagSet may have a reference to a TagsOntology as .ontology and so also do any of its Tags.
TagsOntology.__bool__(self):
Support easy ontology or some_default tests, since ontologies are broadly optional.
TagsOntology.__delitem__(self, name):
Delete the entity named name.
TagsOntology.__getitem__(self, name):
Fetch tags for the entity named name.
TagsOntology.__setitem__(self, name, tags):
Apply tags to the entity named name.
TagsOntology.add_tagsets(self, tagsets: cs.tagset.BaseTagSets, match, unmatch=None, index=0):
Insert a _TagsOntology_SubTagSets at index in the list of _TagsOntology_SubTagSets instances.
The new _TagsOntology_SubTagSets instance is initialised from the supplied tagsets, match, unmatch parameters.
TagsOntology.as_dict(self):
Return a dict containing a mapping of entry names to their TagSets.
TagsOntology.basetype(self, typename):
Infer the base type name from a type name.
The default type is 'str', but any type which resolves to one in self.BASE_TYPES may be returned.
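One way to picture this resolution is to follow each type entry's type tag until a basic type is reached. The sketch below uses a plain dict in place of an ontology; the contents of BASE_TYPES are taken from the list of basic types given earlier, and the fallback and cycle handling are assumptions, not the library's documented behaviour.

```python
# Assumed to mirror the basic type names listed in the class description.
BASE_TYPES = ('int', 'float', 'str', 'list', 'dict', 'date', 'datetime')


def basetype(ontology: dict, typename: str) -> str:
    """Follow 'type.<name>' entries until a base type is found; default 'str'."""
    seen = set()
    while typename not in BASE_TYPES:
        if typename in seen:
            return 'str'  # assumption: give up on a definition cycle
        seen.add(typename)
        typedef = ontology.get('type.' + typename)
        if not typedef or 'type' not in typedef:
            return 'str'  # undefined types fall back to the default
        typename = typedef['type']
    return typename


ontology = {
    'type.person': {'type': 'str'},
    'type.actor': {'type': 'person'},
    'type.cast': {'type': 'dict', 'key_type': 'actor', 'member_type': 'role'},
}
actor_base = basetype(ontology, 'actor')
cast_base = basetype(ontology, 'cast')
```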
TagsOntology.by_type(self, type_name, with_tagsets=False):
Yield keys or (key,tagset) of type type_name, i.e. all keys commencing with type_name+'.'.
TagsOntology.convert_tag(self, tag):
Convert a Tag's value according to the ontology.
Return a new Tag with the converted value or the original Tag unchanged.
This is primarily aimed at things like regexp based autotagging, where the matches are all strings but various fields have special types, commonly ints or dates.
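The idea of converting string matches according to a base type can be sketched as follows. CONVERTERS and convert_value are hypothetical names, and the choice to return the original value when conversion fails is an assumption made for this illustration.

```python
from datetime import date

# Hypothetical table of converters keyed by base type name.
CONVERTERS = {'int': int, 'float': float, 'date': date.fromisoformat}


def convert_value(basetype_name: str, value):
    """Convert a string value for the given base type, else return it unchanged."""
    convert = CONVERTERS.get(basetype_name)
    if convert is None or not isinstance(value, str):
        return value
    try:
        return convert(value)
    except ValueError:
        # Assumption: an unconvertible string is left as it was.
        return value


episode = convert_value('int', '3')
aired = convert_value('date', '2021-03-06')
untouched = convert_value('int', 'three')
```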
TagsOntology.edit_indices(self, indices, prefix=None):
Edit the entries specified by indices.
Return TagSets for the entries which were changed.
TagsOntology.from_match(tagsets, match, unmatch=None):
Initialise a SubTagSets from tagsets, match and optional unmatch.
Parameters:
- tagsets: a TagSets holding ontology information
- match: a match function used to choose entries based on a type name
- unmatch: an optional reverse for match, accepting a subtype name and returning its public name
If match is None then tagsets will always be chosen if no prior entry matched.
Otherwise, match is resolved to a function match_func(type_name) which returns a subtype name on a match and a false value on no match.
If match is a callable it is used as match_func directly.
If match is a list, tuple or set then this method calls itself with (tagsets,submatch) for each member submatch of match.
If match is a str, if it ends in a dot '.', dash '-' or underscore '_' then it is considered a prefix of type_name and the returned subtype name is the text from type_name after the prefix; otherwise it is considered a full match for the type_name and the returned subtype name is type_name unchanged.
The match string is a simplistic shell style glob supporting * but not ? or [seq].
The value of unmatch is constrained by match.
If match is None, unmatch must also be None; the type name is used unchanged.
If match is callable, unmatch must also be callable; it is expected to reverse match.
Examples:
>>> from cs.sqltags import SQLTags
>>> from os.path import expanduser as u
>>> # an initial empty ontology with a default in memory mapping
>>> ont = TagsOntology()
>>> # divert the types actor, role and series to my media ontology
>>> ont.add_tagsets(
... SQLTags(u('~/var/media-ontology.sqlite')),
... ['actor', 'role', 'series'])
>>> # divert type "musicbrainz.recording" to mbdb.sqlite
>>> # mapping to the type "recording"
>>> ont.add_tagsets(SQLTags(u('~/.cache/mbdb.sqlite')), 'musicbrainz.')
>>> # divert type "tvdb.actor" to tvdb.sqlite
>>> # mapping to the type "actor"
>>> ont.add_tagsets(SQLTags(u('~/.cache/tvdb.sqlite')), 'tvdb.')
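The string forms of match described above can be sketched as a small factory. This is a simplified illustration: match_func_from_str is a hypothetical name and the sketch deliberately ignores the * glob support.

```python
def match_func_from_str(match: str):
    """Build a match function from a plain string, without glob support."""
    if match.endswith(('.', '-', '_')):
        # Prefix form: the subtype name is the text after the prefix.
        def match_func(type_name):
            if type_name.startswith(match):
                return type_name[len(match):]
            return None
    else:
        # Full match form: the subtype name is the type name unchanged.
        def match_func(type_name):
            return type_name if type_name == match else None
    return match_func


mb = match_func_from_str('musicbrainz.')
actor = match_func_from_str('actor')
recording = mb('musicbrainz.recording')
```

With the prefix form, the type musicbrainz.recording maps to the subtype name recording, which is the mapping described in the comments of the example above.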
TagsOntology.get(self, name, default=None):
Fetch the entity named name or default.
TagsOntology.items(self):
Yield (entity_name,tags) for all the items in each subtagsets.
TagsOntology.keys(self):
Yield entity names for all the entities.
TagsOntology.metadata(self, type_name, value, *, convert=None):
Return the metadata TagSet for type_name and value.
This implements the mapping between a type's value and its semantics.
The optional parameter convert may specify a function to use to convert value to a tag name component to be used in place of self.value_to_tag_name (the default).
For example, if a TagSet had a list of characters such as:
character=["Captain America (Marvel)","Black Widow (Marvel)"]
then these values could be converted to the dotted identifiers character.marvel.captain_america and character.marvel.black_widow respectively, ready for lookup in the ontology to obtain the "metadata" TagSet for each specific value.
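The lookup itself reduces to building a dotted key from the type name and the converted value. The sketch below uses a plain dict as the ontology and a simplified default conversion; both are assumptions, and returning None for a missing entry is a choice made for this illustration.

```python
def metadata(ontology: dict, type_name: str, value, convert=None):
    """Return the metadata mapping for (type_name, value), or None if absent."""
    if convert is None:
        # Simplified stand-in for the default conversion:
        # lowercase the value and join its words with underscores.
        def convert(v):
            return '_'.join(str(v).lower().split())
    key = type_name + '.' + convert(value)
    return ontology.get(key)


ontology = {
    'colour.blue': {
        'url': 'https://en.wikipedia.org/wiki/Blue',
        'wavelengths': '450nm-495nm',
    },
    'actor.scarlett_johansson': {'role': ['Black Widow (Marvel)']},
}
blue = metadata(ontology, 'colour', 'blue')
actor = metadata(ontology, 'actor', 'Scarlett Johansson')
by_id = metadata(ontology, 'colour', 'blue', convert=str)
```

Passing convert=str corresponds to values which are already the basic identifier and need no conversion.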
TagsOntology.startup_shutdown(self):
Open all the sub TagSets and close on exit.
TagsOntology.subtype_name(self, type_name):
Return the type name for use within self.tagsets from type_name.
Returns None if this is not a supported type_name.
TagsOntology.type_name(self, subtype_name):
Return the external type name from the internal subtype_name which is used within self.tagsets.
TagsOntology.type_names(self):
Return defined type names, i.e. all entries starting with type. (the word type followed by a dot).
TagsOntology.type_values(self, type_name, value_tag_name=None):
Yield the various defined values for type_name.
This is useful for types with enumerated metadata entries.
For example, if metadata entries exist as foo.bah and foo.baz for the type_name 'foo' then this yields 'bah' and 'baz'.
Note that this looks for a Tag for the value, falling back to the entry suffix if the tag is not present.
That tag is normally named value (from DEFAULT_VALUE_TAG_NAME) but may be overridden by the value_tag_name parameter.
Also note that normally it is desirable that the value convert to the suffix via the value_to_tag_name method so that the metadata entry can be located from the value.
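A sketch of this enumeration over a plain dict of entries, including the fallback to the entry suffix. The type_values function here is a hypothetical free function and the sample ontology is made up for illustration.

```python
def type_values(ontology: dict, type_name: str, value_tag_name='value'):
    """Yield the value of each 'type_name.*' entry, falling back to its suffix."""
    prefix = type_name + '.'
    for key, tags in ontology.items():
        if key.startswith(prefix):
            suffix = key[len(prefix):]
            # Prefer the explicit value tag, else use the entry suffix.
            yield tags.get(value_tag_name, suffix)


ontology = {
    'foo.bah': {},
    'foo.baz': {'value': 'Baz!'},
    'other.thing': {},
}
values = list(type_values(ontology, 'foo'))
```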
TagsOntology.typedef(self, type_name):
Return the TagSet defining the type named type_name.
TagsOntology.types(self):
Generator yielding defined type names and their defining TagSet.
TagsOntology.value_to_tag_name(value):
Convert a tag value to a tagnamelike dotted identifierish string for use in ontology lookup.
Raises ValueError for unconvertable values.
We are allowing dashes in the result (UUIDs, MusicBrainz discids, etc).
ints are converted to str.
Strings are converted as follows:
- a trailing (.*) is turned into a prefix with a dot, for example "Captain America (Marvel)" becomes "Marvel.Captain America".
- the string is split into words (nonwhitespace), lowercased and joined with underscores, for example "Marvel.Captain America" becomes "marvel.captain_america".
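The two string steps can be sketched with a regular expression and a split. This is an approximation written from the description above, not the library's code; the exact regular expression and the set of rejected value types are assumptions.

```python
import re


def value_to_tag_name(value):
    """Approximate the conversion of a tag value to a dotted identifier."""
    if isinstance(value, int):
        return str(value)
    if not isinstance(value, str):
        raise ValueError(f'cannot convert {value!r}')
    value = value.strip()
    # Step 1: move a trailing "(...)" to the front as a dotted prefix.
    m = re.fullmatch(r'(.*\S)\s*\(([^()]*)\)', value)
    if m:
        value = m.group(2).strip() + '.' + m.group(1)
    # Step 2: split into words, lowercase and join with underscores.
    return '_'.join(value.lower().split())


captain = value_to_tag_name('Captain America (Marvel)')
widow = value_to_tag_name('Black Widow (Marvel)')
```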
Class TagsOntologyCommand(cs.cmdutils.BaseCommand): A command line for working with ontology types.
Usage summary:
Usage: tagsontology [common-options...] subcommand [options...]
  A command line for working with ontology types.
  Subcommands:
    edit [common-options...] [{/name-regexp | entity-name}]
      Edit entities.
      With no arguments, edit all the entities.
      With an argument starting with a slash, edit the entities
      whose names match the regexp.
      Otherwise the argument is expected to be an entity name;
      edit the tags of that entity.
    help [common-options...] [-l] [-s] [subcommand-names...]
      Print help for subcommands.
      This outputs the full help for the named subcommands,
      or the short help for all subcommands if no names are specified.
    info [common-options...] [field-names...]
      Recite general information.
      Explicit field names may be provided to override the default listing.
    meta [common-options...] tag=value
    repl [common-options...]
      Run a REPL (Read Evaluate Print Loop), an interactive Python prompt.
    shell [common-options...]
      Run a command prompt via cmd.Cmd using this command's subcommands.
    type [common-options...]
      With no arguments, list the defined types.
    type [common-options...] type_name
      With a type name, print its `Tag`s.
    type [common-options...] type_name edit
      Edit the tags defining a type.
    type [common-options...] type_name edit meta_names_pattern...
      Edit the tags for the metadata names matching the
      meta_names_patterns.
    type [common-options...] type_name list
    type [common-options...] type_name ls
      List the metadata names for this type and their tags.
    type [common-options...] type_name + entity_name [tags...]
      Create type_name.entity_name and apply the tags.
TagsOntologyCommand.cmd_edit(self, argv):
Usage: {cmd} [{{/name-regexp | entity-name}}]
  Edit entities.
  With no arguments, edit all the entities.
  With an argument starting with a slash, edit the entities
  whose names match the regexp.
  Otherwise the argument is expected to be an entity name;
  edit the tags of that entity.
TagsOntologyCommand.cmd_meta(self, argv):
Usage: {cmd} tag=value
TagsOntologyCommand.cmd_type(self, argv):
Usage:
  {cmd}
    With no arguments, list the defined types.
  {cmd} type_name
    With a type name, print its Tags.
  {cmd} type_name edit
    Edit the tags defining a type.
  {cmd} type_name edit meta_names_pattern...
    Edit the tags for the metadata names matching the
    meta_names_patterns.
  {cmd} type_name list
  {cmd} type_name ls
    List the metadata names for this type and their tags.
  {cmd} type_name + entity_name [tags...]
    Create type_name.entity_name and apply the tags.
TagsOntologyCommand.run_context(self):
Open self.options.ontology during commands.
Release Log
Release 20250528:
- Some refactors and small fixes.
- Tag: rename Tag.from_str2 to Tag.parse, drop offset parameter of Tag.from_str.
Release 20250306: TagFile.save_tagsets: use atomic_filename to write the tag file.
Release 20250103: Replace TagSet.from_line with TagSet.from_str, leave @OBSOLETE hook behind.
Release 20241007: Tagset._Auto: act much more like a mapping.
Release 20241005: _FormatStringTagProxy: give it a format method which formats the proxied tag's value.
Release 20240422.2: jsonable: use obj.for_json() if available.
Release 20240422.1: jsonable: convert pathlib.PurePath to str, hoping this isn't too open ended a can of worms.
Release 20240422:
- New jsonable(obj) function to return a deep copy of obj which can be transcribed as JSON.
- Tag.transcribe_value: pass jsonable(value) to the JSON encoder, drop special checks now done by jsonable().
- Tag.__str__: do not catch TypeError any more, was embedding Python repr()s in .fstags files - now Tag.transcribe_value() does the correct thing where that is possible.
Release 20240316: Fixed release upload artifacts.
Release 20240305:
- Tag.from_str2: make the ontology optional.
- TagSetPrefixView: provide len() and update().
Release 20240211:
- TagFile.parse_tag_line: recognise dotted_identifiers directly, avoids misparsing bare "nan" as float NaN.
- Tag.parse_value: BUGFIX parse - always to the primary types first (int, float) before trying any funny extra types.
Release 20240201: TagsOntology.metadata: actually call the .items() method!
Release 20231129:
- TagSet.__getattr__: rework the attribute lookup with greater precision.
- TagSetPrefixView.__getattr__: if the attribute is not there, raise AttributeError, do not try to fall back to something else.
- TagSet: drop ATTRABLE_MAPPING_DEFAULT=None, caused far more confusion than it was worth.
Release 20230612:
- TagFile.save_tagsets: catch and warn about exceptions from update_mapping[key].update, something is wrong with my SQLTags usage.
- TagFile.save_tagsets: update_mapping: do not swallow AttributeError.
Release 20230407: Move the (optional) ORM open/close from FSTags.startup_shutdown to TagFile.save, greatly shortens the ORM lock.
Release 20230212: Mark TagSetCriterion as Promotable.
Release 20230210:
- TagFile: new optional update_mapping secondary mapping to which to mirror file tags, for example to an SQLTags.
- New .uuid:UUID property returning the UUID for the tag named 'uuid' or None.
Release 20230126: New TagSet.is_stale() method based on .expiry attribute, intended for TagSets which are caches of other primary data.
Release 20221228:
- TagFile: drop _singleton_key, FSPathBasedSingleton provides a good default.
- TagFile.save_tagsets,tags_line: new optional prune=False parameter to drop empty top level dict/lists.
- TagFile.save: plumb prune=False parameter.
Release 20220806: New TagSetCriterion.promote(obj)->TagSetCriterion class method.
Release 20220606:
- Tag.parse_value: bugfix parse of float.
- TagSet.edit: accept optional comments parameter with additional header comment lines, be more tolerant of errors, avoid losing data on error.
Release 20220430:
- TagSetPrefixView: new as_dict() method.
- TagSetPrefixView.__str__: behave like TagSet.__str__.
- TagFile.save_tagsets: do not try to save if the file is missing and the tagsets are empty.
- New TagSet.from_tags(tags) factory to make a new TagSet from an iterable of tags.
- TagSetPrefixView: add .get and .setdefault mapping methods.
- RegexpTagRule: accept optional tag_prefix parameter.
- Tagset: new from_ini() and save_as_ini() methods to support cs.timeseries config files, probably handy elsewhere.
Release 20220311: Assorted internal changes.
Release 20211212:
- Tag: new fallback_parse parameter for value parsing, default get_nonwhite.
- Tag: new from_arg factory with fallback_parse grabbing the whole string for command line arguments, thus supporting unquoted strings for ease of use.
- TagSetCriterion: new optional fallback_parse parameter and from_arg method as for the Tag factories.
- Tag.transcribe_value: accept optional json_options to control the JSON encoder, used for human friendly multiline edits in cs.app.tagger.
- Rename edit_many to edit_tagsets for clarity.
- TagsOntology: new type_values method to return values for a type (derived from their metadata entries).
- Tag: new alt_values method returning its TagsOntology.type_values.
- (Internal) New _FormatStringTagProxy which proxies a Tag but uses str(self.__proxied.value) for __str__ to support format strings.
- (Internal) TagSet.get_value: if arg_name matches a Tag, return a _FormatStringTagProxy.
- Tag.__new__: accept (tag_name,value) or (Tag) as initialisation parameters.
Release 20210913:
- TagSet.get_value: raise KeyError in strict mode, leave placeholder otherwise.
- Other small changes.
Release 20210906: Many many updates; some semantics have changed.
Release 20210428: Bugfix TagSet.set: internal in place changes to a complex tag value were not noticed, causing TagFile to not update on shutdown.
Release 20210420:
- TagSet: also subclass cs.dateutils.UNIXTimeMixin.
- Various TagSetNamespace updates and bugfixes.
Release 20210404: Bugfix TagBasedTest.COMPARISON_FUNCS["="]: if cmp_value is None, return true (the tag is present).
Release 20210306:
- ExtendedNamespace,TagSetNamespace: move the .[:alpha:]* attribute support from ExtendedNamespace to TagSetNamespace because it requires Tags.
- TagSetNamespace.__getattr__: new _i, _s, _f suffixes to return int, str or float tag values (or None); fold _lc in with these.
- Pull most of TaggedEntity out into TaggedEntityMixin for reuse by domain specific tagged entities.
- TaggedEntity: new .set and .discard methods.
- TaggedEntity: new as_editable_line, from_editable_line, edit and edit_entities methods to support editing entities using a text editor.
- ontologies: type entries are now prefixed with "type." and metadata entries are prefixed with "meta."; provide a worked ontology example in the introduction and improve related docstrings.
- TagsOntology: new .types(), .types_names(), .meta(type_name,value), .meta_names() methods.
- TagsOntology.__getitem__: create missing TagSets on demand.
- New TagsOntologyCommand, initially with a "type [type_name [{edit|list}]]" subcommand, ready for use as the cmd_ont subcommand of other tag related commands.
- TagSet: support initialisation like a dict including keywords, and move the ontology parameter to _ontology.
- TagSet: include AttrableMappingMixin to enable attribute access to values when there is no conflict with normal methods.
- UUID encode/decode support.
- Honour $TAGSET_EDITOR or $EDITOR as preferred interactive editor for tags.
- New TagSet.subtags(prefix) to extract a subset of the tags.
- TagsOntology.value_metadata: new optional convert parameter to override the default "convert human friendly name" algorithm, particularly to pass convert=str to things which are already the basic id.
- Rename TaggedEntity to TagSet.
- Rename TaggedEntities to TagSets.
- TagSet: new csvrow and from_csvrow methods imported from obsolete TaggedEntityMixin class.
- Move BaseTagFile from cs.fstags to TagFile in cs.tagset.
- TagSet: support access to the tag "c.x" via attributes provided there is no "c" tag in the way.
- TagSet.unixtime: implement the autoset-to-now semantics.
- New as_timestamp(): convert date, datetime, int or float to a UNIX timestamp.
- Assorted docstring updates and bugfixes.
Release 20200716:
- Update for changed cs.obj.SingletonMixin API.
- Pull in TaggedEntity from cs.sqltags and add the .csvrow property and the .from_csvrow factory.
Release 20200521.1: Fix DISTINFO.install_requires, drop debug import.
Release 20200521:
- New ValueDetail and KeyValueDetail classes for returning ontology information; TagInfo.detail now returns a ValueDetail for scalar types, a list of ValueDetails for sequence types and a list of KeyValueDetails for mapping types; drop various TagInfo mapping/iterable style methods, too confusing to use.
- Plumb ontology parameter throughout, always optional.
- Drop TypedTag, Tags now use ontologies for this.
- New TagsCommandMixin to support BaseCommands which manipulate Tags.
- Many improvements and bugfixes.
Release 20200318:
- Note that the TagsOntology stuff is in flux and totally alpha.
- Tag.prefix_name factory returning a new tag if prefix is not empty, otherwise self.
- TagSet.update: accept an optional prefix for inserting "foreign" tags with a distinguishing name prefix.
- Tag.as_json: turn sets and tuples into lists for encoding.
- Backport for Python < 3.7 (no fromisoformat functions).
- TagSet: drop unused and illplaced .titleify, .episode_title and .title methods.
- TagSet: remove "defaults", unused.
- Make TagSet a direct subclass of dict, adjust uses of .update etc.
- New ExtendedNamespace class which is a SimpleNamespace with some inferred attributes and a partial mapping API (keys and __getitem__).
- New TagSet.ns() returning the Tags as an ExtendedNamespace, which doubles as a mapping for str.format_map; TagSet.format_kwargs is now an alias for this.
- New Tag.from_string factory to parse a str into a Tag.
- New TagsOntology and TypedTag classes to provide type and value-detail information; very very alpha and subject to change.
Release 20200229.1: Initial release: pull TagSet, Tag, TagChoice from cs.fstags for independent use.