Simple SQL based tagging and the associated `sqltags` command line script, supporting both tagged named objects and tagged timestamped log entries.
Latest release 20221228: SQLTagsCommand: update implementation of BaseCommand.run_context to use super().run_context().
Compared to `cs.fstags` and its associated `fstags` command,
this is oriented towards large numbers of items
not naturally associated with filesystem objects.
My initial use case is an activity log (unnamed timestamped tag sets) but I'm also using it for ontologies (named tag sets containing metadata).
Many basic tasks can be performed with the `sqltags` command line utility,
documented under the `SQLTagsCommand` class below.
See the `SQLTagsORM` documentation for details about how data
are stored in the database.
See the `SQLTagSet` documentation for details of how various
tag value types are supported.
Class BaseSQLTagsCommand(cs.cmdutils.BaseCommand, cs.tagset.TagsCommandMixin)
Common features for commands oriented around an `SQLTags` database.
Command line usage:
Usage: basesqltags [-f db_url] subcommand [...]
  -f db_url SQLAlchemy database URL or filename.
            Default from $SQLTAGS_DBURL (default '~/var/sqltags.sqlite').
  Subcommands:
    dbshell
      Start an interactive database shell.
    edit criteria...
      Edit the entities specified by criteria.
    export [-F format] [{tag[=value]|-tag}...]
      Export entities matching all the constraints.
      -F format Specify the export format, either CSV or FSTAGS.
    find [-o output_format] {tag[=value]|-tag}...
      List entities matching all the constraints.
      -o output_format
        Use output_format as a Python format string to lay out
        the listing.
        Default: {datetime} {headline}
    help [-l] [subcommand-names...]
      Print the full help for the named subcommands,
      or for all subcommands if no names are specified.
      -l Long help even if no subcommand-names provided.
    import [{-u|--update}] {-|srcpath}...
      Import CSV data in the format emitted by "export".
      Each argument is a file path or "-", indicating standard input.
      -u, --update If a named entity already exists then update its tags.
                   Otherwise this will be seen as a conflict
                   and the import aborted.
    init
      Initialise the database.
      This includes defining the schema and making the root metanode.
    log [-c category,...] [-d when] [-D strptime] {-|headline} [tags...]
      Record entries into the database.
      If headline is '-', read headlines from standard input.
      -c categories
        Specify the categories for this log entry.
        The default is to recognise a leading CAT,CAT,...: prefix.
      -d when
        Use when, an ISO8601 date, as the log entry timestamp.
      -D strptime
        Read the time from the start of the headline
        according to the provided strptime specification.
    tag {-|entity-name} {tag[=value]|-tag}...
      Tag an entity with multiple tags.
      With the form "-tag", remove that tag from the direct tags.
      A entity-name named "-" indicates that entity-names should
      be read from the standard input.
Function glob2like(glob: str) -> str
Convert a filename glob to an SQL LIKE pattern.
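The docstring does not spell out the translation rules, so the following is only a minimal sketch of how such a conversion could work, not the module's implementation. The name `glob_to_like_sketch`, the backslash escape character, and the decision to leave `[...]` character classes untranslated are all assumptions made for illustration.

```python
def glob_to_like_sketch(glob: str, esc: str = "\\") -> str:
    """Hypothetical sketch: translate a simple glob into an SQL LIKE pattern.

    `*` becomes `%` and `?` becomes `_`; characters which are special to
    LIKE (`%`, `_` and the escape character itself) are escaped so that
    they match literally. Character classes such as `[abc]` are not handled.
    """
    parts = []
    for ch in glob:
        if ch == "*":
            parts.append("%")
        elif ch == "?":
            parts.append("_")
        elif ch in ("%", "_", esc):
            parts.append(esc + ch)
        else:
            parts.append(ch)
    return "".join(parts)
```

A pattern produced this way would need a matching `ESCAPE` clause in the query which uses it.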
Function main(argv=None)
Command line mode.
Class PolyValue(PolyValue, builtins.tuple)
A `namedtuple` for the polyvalues used in an `SQLTagsORM`.
We express various types in SQL as one of 3 columns:
- `float_value`: for `float`s and `int`s which round trip with `float`
- `string_value`: for `str`
- `structured_value`: a JSON transcription of any other type
This allows SQL indexing of basic types.
Note that because `str` gets stored in `string_value`
this leaves us free to use "bare string" JSON to serialise
various nonJSONable types.
The `SQLTagSets` class has a `to_polyvalue` factory
which produces a `PolyValue` suitable for the SQL rows.
NonJSONable types such as `datetime`
are converted to a `str` but stored in the `structured_value` column.
This should be overridden by subclasses as necessary.
On retrieval from the database
the tag rows are converted to Python values
by the `SQLTagSets.from_polyvalue` method,
reversing the process above.
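The column selection described above can be sketched roughly as follows. This is an illustration only: `PolyValueSketch`, `to_polyvalue_sketch` and `from_polyvalue_sketch` are hypothetical names, the extended "type label" types are deliberately left out (they raise `TypeError` here), and returning an `int` for integral floats on retrieval is an assumption of this sketch rather than documented behaviour.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the real PolyValue namedtuple.
PolyValueSketch = namedtuple(
    "PolyValueSketch", "float_value string_value structured_value"
)


def to_polyvalue_sketch(value):
    """Choose the column for `value` using the three column rules above."""
    if value is None:
        return PolyValueSketch(None, None, None)
    if isinstance(value, float):
        return PolyValueSketch(value, None, None)
    if isinstance(value, int):
        try:
            as_float = float(value)
        except OverflowError:
            as_float = None
        if as_float is not None and int(as_float) == value:
            return PolyValueSketch(as_float, None, None)
        raise TypeError("int does not round trip with float; needs a type label")
    if isinstance(value, str):
        return PolyValueSketch(None, value, None)
    if isinstance(value, (list, tuple, dict)):
        json.dumps(value)  # raises TypeError if the contents are not JSONable
        return PolyValueSketch(None, None, value)
    raise TypeError(f"{type(value).__name__} needs a type label conversion")


def from_polyvalue_sketch(pv):
    """Reverse the split; all three columns NULL means None."""
    if pv.float_value is not None:
        f = pv.float_value
        # Assumption: integral floats are returned as int.
        return int(f) if f.is_integer() else f
    if pv.string_value is not None:
        return pv.string_value
    return pv.structured_value
```

Keeping numbers and strings in their own columns is what makes ordinary SQL indices and comparisons usable on them.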
Class PolyValueColumnMixin
A mixin for classes with `(float_value,string_value,structured_value)` columns.
This is used by the `Tags` and `TagMultiValues` relations inside `SQLTagsORM`.
Function prefix2like(prefix: str, esc='\\') -> str
Convert a prefix string to an SQL LIKE pattern.
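As a rough illustration of the idea, the sketch below escapes the LIKE metacharacters in a prefix, appends `%`, and uses the result in a query against an in-memory SQLite table. The helper names `prefix_to_like_sketch` and `names_with_prefix` and the one-column `entities` table are hypothetical; the real function may differ in its details.

```python
import sqlite3


def prefix_to_like_sketch(prefix: str, esc: str = "\\") -> str:
    """Hypothetical sketch: a LIKE pattern matching strings starting with `prefix`."""
    escaped = "".join(
        esc + ch if ch in ("%", "_", esc) else ch
        for ch in prefix
    )
    return escaped + "%"


def names_with_prefix(names, prefix):
    """Return the names starting with `prefix`, selected by SQLite's LIKE."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entities (name TEXT)")
    db.executemany("INSERT INTO entities (name) VALUES (?)", [(n,) for n in names])
    rows = db.execute(
        # The ESCAPE character must agree with the one used to build the pattern.
        "SELECT name FROM entities WHERE name LIKE ? ESCAPE '\\' ORDER BY name",
        (prefix_to_like_sketch(prefix),),
    ).fetchall()
    db.close()
    return [row[0] for row in rows]
```

Without the escaping step, a prefix such as `a_` would also match `axb`, because `_` matches any single character in LIKE.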
Class SQLParameters(SQLParameters, builtins.tuple)
The parameters required for constructing queries or extending queries with JOINs.
Attributes:
- `criterion`: the source criterion, usually an `SQTCriterion` subinstance
- `alias`: an alias of the source table for use in queries
- `entity_id_column`: the `entities` id column, `alias.id` if the alias is of `entities`, `alias.entity_id` if the alias is of `tags`
- `constraint`: a filter query based on `alias`
Class SQLTagBasedTest(cs.tagset.TagBasedTest, cs.tagset.TagBasedTest, builtins.tuple, SQTCriterion, cs.tagset.TagSetCriterion)
A `cs.tagset.TagBasedTest` extended with a `.sql_parameters` method.
Class SQLTagProxies
A proxy for the tags supporting Python comparison => `SQLParameters`.
Example:
sqltags.tags.dotted.name.here == 'foo'
Class SQLTagProxy
An object based on a `Tag` name which produces an `SQLParameters` when compared with some value.
Example:
>>> sqltags = SQLTags('sqlite://')
>>> sqltags.init()
>>> # make a SQLParameters for testing the tag 'name.thing'==5
>>> sqlp = sqltags.tags.name.thing == 5
>>> str(sqlp.constraint)
'tags_1.name = :name_1 AND tags_1.float_value = :float_value_1'
>>> sqlp = sqltags.tags.name.thing == 'foo'
>>> str(sqlp.constraint)
'tags_1.name = :name_1 AND tags_1.string_value = :string_value_1'
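The general technique behind such a proxy, attribute access building up a dotted tag name and `==` returning a query fragment instead of a boolean, can be sketched in plain Python. The classes below are hypothetical illustrations which return an `(sql, parameters)` tuple; the real `SQLTagProxy` returns an `SQLParameters` built with SQLAlchemy.

```python
class TagProxySketch:
    """Hypothetical proxy for one dotted tag name."""

    def __init__(self, tag_name):
        self._tag_name = tag_name

    def __getattr__(self, attr):
        # Only called for attributes not found normally: extend the dotted name.
        if attr.startswith("_"):
            raise AttributeError(attr)
        return TagProxySketch(f"{self._tag_name}.{attr}")

    def __eq__(self, other):
        # Numbers are compared against float_value, everything else
        # against string_value (a simplification for this sketch).
        if isinstance(other, (int, float)) and not isinstance(other, bool):
            column = "float_value"
        else:
            column = "string_value"
        return (f"tags.name = ? AND tags.{column} = ?", (self._tag_name, other))


class TagProxiesSketch:
    """Hypothetical root object: `.foo` yields a proxy for the tag 'foo'."""

    def __getattr__(self, attr):
        if attr.startswith("_"):
            raise AttributeError(attr)
        return TagProxySketch(attr)
```

With these classes, `TagProxiesSketch().name.thing == 5` evaluates to a constraint on the tag `name.thing` and the `float_value` column, mirroring the shape of the doctest above.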
Class SQLTags(cs.tagset.BaseTagSets, cs.resources.MultiOpenMixin, cs.context.ContextManagerMixin, collections.abc.MutableMapping, collections.abc.Mapping, collections.abc.Collection, collections.abc.Sized, collections.abc.Iterable, collections.abc.Container)
A class using an SQL database to store its `TagSet`s.
Class SQLTagsCommand(BaseSQLTagsCommand, cs.cmdutils.BaseCommand, cs.tagset.TagsCommandMixin)
`sqltags` main command line utility.
Command line usage:
Usage: sqltags [-f db_url] subcommand [...]
  -f db_url SQLAlchemy database URL or filename.
            Default from $SQLTAGS_DBURL (default '~/var/sqltags.sqlite').
  Subcommands:
    dbshell
      Start an interactive database shell.
    edit criteria...
      Edit the entities specified by criteria.
    export [-F format] [{tag[=value]|-tag}...]
      Export entities matching all the constraints.
      -F format Specify the export format, either CSV or FSTAGS.
    find [-o output_format] {tag[=value]|-tag}...
      List entities matching all the constraints.
      -o output_format
        Use output_format as a Python format string to lay out
        the listing.
        Default: {datetime} {headline}
    help [-l] [subcommand-names...]
      Print the full help for the named subcommands,
      or for all subcommands if no names are specified.
      -l Long help even if no subcommand-names provided.
    import [{-u|--update}] {-|srcpath}...
      Import CSV data in the format emitted by "export".
      Each argument is a file path or "-", indicating standard input.
      -u, --update If a named entity already exists then update its tags.
                   Otherwise this will be seen as a conflict
                   and the import aborted.
    init
      Initialise the database.
      This includes defining the schema and making the root metanode.
    list [entity-names...]
      List entities and their tags.
    log [-c category,...] [-d when] [-D strptime] {-|headline} [tags...]
      Record entries into the database.
      If headline is '-', read headlines from standard input.
      -c categories
        Specify the categories for this log entry.
        The default is to recognise a leading CAT,CAT,...: prefix.
      -d when
        Use when, an ISO8601 date, as the log entry timestamp.
      -D strptime
        Read the time from the start of the headline
        according to the provided strptime specification.
    ls [entity-names...]
      List entities and their tags.
    tag {-|entity-name} {tag[=value]|-tag}...
      Tag an entity with multiple tags.
      With the form "-tag", remove that tag from the direct tags.
      A entity-name named "-" indicates that entity-names should
      be read from the standard input.
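Two of the behaviours in the usage above lend themselves to a small sketch: recognising a leading `CAT,CAT,...:` prefix on a `log` headline, and laying out a `find` listing with a Python format string. The functions below are hypothetical; in particular the exact prefix syntax (upper case names here) and the use of plain `str.format` are assumptions, while the lower casing follows the release log's description of parsing `FOO,BAH` into `['foo','bah']`.

```python
import re
from datetime import datetime

# Assumption: categories are upper case names separated by commas,
# followed by a colon and optional whitespace.
_CATEGORIES_PREFIX = re.compile(r"([A-Z][A-Z0-9_]*(?:,[A-Z][A-Z0-9_]*)*):\s*")


def split_categories_sketch(headline):
    """Return (categories, remaining_headline) for a log headline."""
    m = _CATEGORIES_PREFIX.match(headline)
    if m is None:
        return [], headline
    categories = [cat.lower() for cat in m.group(1).split(",")]
    return categories, headline[m.end():]


def format_entry_sketch(output_format, when, headline):
    """Lay out one entry using a format string such as the default."""
    return output_format.format(datetime=when, headline=headline)


categories, headline = split_categories_sketch("WORK,EMAIL: reply to Bob")
line = format_entry_sketch(
    "{datetime} {headline}", datetime(2022, 12, 28, 9, 0), headline
)
```

Here `categories` is `['work', 'email']` and `line` is `'2022-12-28 09:00:00 reply to Bob'`.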
Class SQLTagSet(cs.obj.SingletonMixin, cs.tagset.TagSet, builtins.dict, cs.dateutils.UNIXTimeMixin, cs.lex.FormatableMixin, cs.lex.FormatableFormatter, string.Formatter, cs.mappings.AttrableMappingMixin)
A singleton `TagSet` attached to an `SQLTags` instance.
As with the `TagSet` superclass,
tag values can be any Python type.
However, because we are storing these values in an SQL database
it is necessary to provide a conversion facility
to prepare those values for storage.
The database schema is described in the `SQLTagsORM` class;
in short we directly support `None`, `float` and `str`,
`int`s which round trip with `float`,
and `list`, `tuple` and `dict` whose contents transcribe to JSON.
`int`s which are too large to round trip with `float`
are treated as an extended `"bigint"` type
using the scheme described below.
Because the ORM has distinct `float` and `str` columns to support indexing,
there will be no plain strings in the remaining JSON blob column.
Therefore we support other types by providing functions
to convert each type to a `str` and back,
and an associated "type label" which will be prefixed to the string;
the resulting string is stored in the JSON blob.
The default mechanism is based on the following class attributes and methods:
- `TYPE_JS_MAPPING`: a mapping of a type label string to a 3 tuple of `(type,to_str,from_str)` being the extended type, a function to convert an instance to `str` and a function to convert a `str` to an instance of this type
- `to_js_str`: a method accepting `(tag_name,tag_value)` and returning `tag_value` as a `str`; the default implementation looks up the type of `tag_value` in `TYPE_JS_MAPPING` to locate the corresponding `to_str` function
- `from_js_str`: a method accepting `(tag_name,js)` which uses the leading type label prefix from the `js` to look up the corresponding `from_str` function from `TYPE_JS_MAPPING` and use it on the tail of `js`
The default `TYPE_JS_MAPPING` has mappings for:
- `"bigint"`: conversions for `int`
- `"date"`: conversions for `datetime.date`
- `"datetime"`: conversions for `datetime.datetime`
Subclasses wanting to augment the `TYPE_JS_MAPPING`
should prepare their own with code such as:
class SubSQLTagSet(SQLTagSet,....):
    ....
    TYPE_JS_MAPPING=dict(SQLTagSet.TYPE_JS_MAPPING)
    TYPE_JS_MAPPING.update(
        typelabel=(type, to_str, from_str),
        ....
    )
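To make the type label scheme concrete, here is a standalone sketch of a label mapping with matching conversion functions. It does not use the real `SQLTagSet` class: the `_sketch` names are hypothetical, the `label:` separator is inferred from the `"datetime:"` example in the `SQLTagsORM` documentation, and the `fraction` label is an invented extension shown only to mirror the subclassing pattern.

```python
from datetime import date, datetime
from fractions import Fraction

# label -> (type, to_str, from_str), in the shape described for TYPE_JS_MAPPING.
TYPE_JS_MAPPING_SKETCH = {
    "bigint": (int, str, int),
    "date": (date, date.isoformat, date.fromisoformat),
    "datetime": (datetime, datetime.isoformat, datetime.fromisoformat),
}


def to_js_str_sketch(tag_name, tag_value, mapping=TYPE_JS_MAPPING_SKETCH):
    """Return `tag_value` as a type label prefixed string."""
    for label, (type_, to_str, _) in mapping.items():
        # Exact type match, so that a datetime is not labelled as a date.
        if type(tag_value) is type_:
            return f"{label}:{to_str(tag_value)}"
    raise TypeError(f"{tag_name}: no conversion for {type(tag_value).__name__}")


def from_js_str_sketch(tag_name, js, mapping=TYPE_JS_MAPPING_SKETCH):
    """Use the leading label of `js` to convert its tail back to a value."""
    label, sep, tail = js.partition(":")
    if not sep or label not in mapping:
        raise ValueError(f"{tag_name}: unrecognised type label in {js!r}")
    _, _, from_str = mapping[label]
    return from_str(tail)


# Extending a copy of the mapping, as a subclass would do.
EXTENDED_MAPPING_SKETCH = dict(TYPE_JS_MAPPING_SKETCH)
EXTENDED_MAPPING_SKETCH.update(fraction=(Fraction, str, Fraction))
```

Copying the mapping before updating it keeps the extension local, so that the base class mapping is left untouched.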
Class SQLTagsORM(cs.sqlalchemy_utils.ORM, cs.resources.MultiOpenMixin, cs.context.ContextManagerMixin, cs.dateutils.UNIXTimeMixin)
The ORM for an `SQLTags`.
The current implementation uses 3 tables:
- `entities`: this has a NULLable `name` and `unixtime` UNIX timestamp; this is unique per `name` if the name is not NULL
- `tags`: this has an `entity_id`, `name` and a value stored in one of three columns: `float_value`, `string_value` and `structured_value` which is a JSON blob; this is unique per `(entity_id,name)`
- `tag_subvalues`: this is a broken out version of `tags` when `structured_value` is a sequence or mapping, breaking out the values one per row; this exists to support "tag contains value" lookups
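The role of the broken out subvalues table is easiest to see with a toy schema. The sketch below uses the standard `sqlite3` module rather than the real SQLAlchemy ORM, and the column names of `tag_subvalues` (in particular `tag_id`) are assumptions made for illustration; only the three table names and the three value columns come from the description above.

```python
import json
import sqlite3

SCHEMA_SKETCH = """
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE,      -- NULLable: log entries are unnamed
    unixtime REAL
);
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entities(id),
    name TEXT NOT NULL,
    float_value REAL,
    string_value TEXT,
    structured_value TEXT, -- JSON text
    UNIQUE (entity_id, name)
);
CREATE TABLE tag_subvalues (
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    float_value REAL,
    string_value TEXT,
    structured_value TEXT
);
"""


def entities_with_category(category):
    """Store one log entry, then find entity ids whose categories contain `category`."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA_SKETCH)
    cur = db.execute(
        "INSERT INTO entities (name, unixtime) VALUES (NULL, ?)", (1672218000.0,)
    )
    entity_id = cur.lastrowid
    categories = ["work", "email"]
    cur = db.execute(
        "INSERT INTO tags (entity_id, name, structured_value) VALUES (?, ?, ?)",
        (entity_id, "categories", json.dumps(categories)),
    )
    tag_id = cur.lastrowid
    # One row per element, so "contains" becomes an ordinary equality test.
    db.executemany(
        "INSERT INTO tag_subvalues (tag_id, string_value) VALUES (?, ?)",
        [(tag_id, cat) for cat in categories],
    )
    rows = db.execute(
        "SELECT t.entity_id FROM tags t"
        " JOIN tag_subvalues s ON s.tag_id = t.id"
        " WHERE t.name = ? AND s.string_value = ?",
        ("categories", category),
    ).fetchall()
    db.close()
    return [row[0] for row in rows]
```

Without the subvalues rows, the same lookup would have to parse the JSON blob of every candidate tag.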
Tag values are stored as follows:
- `None`: all 3 columns are set to `NULL`
- `float`: stored in `float_value`
- `int`: if the `int` round trips to `float` then it is stored in `float_value`, otherwise it is stored in `structured_value` with the type label `"bigint"`
- `str`: stored in `string_value`
- `list`, `tuple`, `dict`: stored in `structured_value`; if these containers contain unJSONable content there will be trouble
- other types, such as `datetime`: these are converted to strings with identifying type label prefixes and stored in `structured_value`
The `float_value` and `string_value` columns
allow us to provide indices for these kinds of tag values.
The type label scheme takes advantage of the fact that actual `str`s
are stored in the `string_value` column.
Because of this, there will be no actual strings in `structured_value`.
Therefore, we can convert nonJSONable types to `str`
and store them here.
The scheme used is to provide conversion functions to convert types
to `str` and back, and an associated "type label" prefix.
For example, we store a `datetime`
as the ISO format of the `datetime` with `"datetime:"` prefixed to it.
The actual conversions are kept with the `SQLTagSet` class
(or any subclass).
This ORM receives the 3-tuples of SQL ready values
from that class as the `PolyValue` namedtuple
and does not perform any conversion itself.
The conversion process is described in `SQLTagSet`.
Class SQTCriterion(cs.tagset.TagSetCriterion)
Subclass of `TagSetCriterion` requiring an `.sql_parameters` method
which returns an `SQLParameters` providing the information required
to construct an sqlalchemy query.
It also resets `.CRITERION_PARSE_CLASSES`, which will pick up
the SQL capable criterion classes below.
Class SQTEntityIdTest(SQTCriterion, cs.tagset.TagSetCriterion)
A test on `entity.id`.
Function verbose(msg, *a)
Emit message if in verbose mode.
Release Log
Release 20221228: SQLTagsCommand: update implementation of BaseCommand.run_context to use super().run_context().
Release 20220806:
- Bugfix for SQLTagsORM.search(mode='entity').
- SQLTags.find: new _without_tags=False parameter to allow fast searches omitting the entity tags.
Release 20220606:
- New SQLTagsORM.Entities.add_new_tags method, use it in SQLTags.default_factory for bulk insert.
- SQTCriterion: new .from_equality(tag_name,tag_value) factory to make an equality criterion.
- SQLTags.find: accept criteria as positional parameters instead of a single iterable, accept new keyword parameters as equality criteria.
- SQLTags.__getitem__: accept a slice to index the .unixtime tag.
- SQLTagsORM: also turn on echo mode if "ECHO" in $SQLTAGS_MODES.
Release 20220311: Assorted updates.
Release 20211212:
- Rename edit_many to edit_tagsets for clarity.
- Small bugfixes.
Release 20210913:
- SQLTagsCommand: rename cmd_ns to cmd_list,cmd_ls.
- SQLTagsCommand.cmd_export: accept "-F export_format" for csv or fstags export, accept no criteria to mean all tagsets.
- Encoding schema for nonJSONable types.
- Rename the TagSets abstract base class to BaseTagSets.
- BaseSQLTagsCommand.cmd_edit: implement rename.
- Many other internal small changes.
Release 20210420:
- New PolyValueMixin pulled out of Tags for common support of the (float_value,string_value,structured_value).
- SQLTagsORM: new TagSubValues relation containing broken out values for values which are sequences, to support efficient lookup of sequence values such as log entry categories.
- New BaseSQLTagsCommand.parse_categories static method to parse FOO,BAH into ['foo','bah'].
- sqltags find: change default format to "{datetime} {headline}".
- Assorted small changes.
Release 20210404:
- SQLTags.__getitem__: when autocreating an entity, do it in a new session so that the entity is committed to the database before any further use.
- SQLTagsCommand: new cmd_dbshell to drop you into the database.
Release 20210321: Drop logic now merged with cs.sqlalchemy_utils, use the new default session stuff.
Release 20210306.1: Docstring updates.
Release 20210306: Initial release.