Declarative Python meta-model system and visitor utilities
Normalize
The normalize package is a class builder and toolkit most useful for writing "plain old data structures" to wrap data from network sources in python objects.
It is called "normalize", because it is focused on the first normal form of relational database modelling. This is the simplest and most straightforward level which defines what are normally called "records" (or rows). A record is a defined collection of properties/attributes (columns), where you know roughly what to expect in each property/attribute, and can access them by some kind of descriptor (i.e., the attribute name). You can also use it as a general purpose declarative meta-programming framework, as it ships with an official meta-object-protocol (MOP) API to describe this information, built on top of python's notion of classes/types and descriptors and extended where necessary.
Put simply, you write python classes to describe your assumptions
about the data structures you're dealing with, feed in input data and
you get regular python objects back which have attributes which you
can use naturally.
Or, you get an error and find you have to revisit your assumptions.
You can then perform basic operations with the objects, such as make
changes to them and convert them back, or compare them to another
version using the rich comparison API.
You can also construct the objects 'natively' using regular python
keyword/value constructors or by passing a dict as the first
argument.
It is very similar in scope to the remoteobjects and
schematics packages on PyPI, and may in time evolve to include all
the features of those packages.
While there is some notion of primary keys in the module, mainly for the purposes of recognizing objects in collections for comparison, higher levels of normalization are an exercise left to the implementer.
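The workflow described above (declare your assumptions as a class, feed in data, get typed attributes back or an error) can be sketched with plain Python descriptors. This is a standard-library illustration of the idea only, not how normalize is implemented; ``TypedField`` and this ``Star`` class are hypothetical names, and coercing by calling the type is an assumption made for brevity.

```python
class TypedField:
    """Hypothetical descriptor that type-checks (and coerces) on assignment."""

    def __init__(self, isa, required=False):
        self.isa = isa
        self.required = required

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, obj, value):
        if not isinstance(value, self.isa):
            value = self.isa(value)  # assumed coercion; raises if impossible
        obj.__dict__[self.name] = value


class Star:
    id = TypedField(int, required=True)
    name = TypedField(str)

    def __init__(self, data=None, **kwargs):
        # Accept a dict as the first argument, keyword arguments, or both.
        values = dict(data or {}, **kwargs)
        for key, field in vars(type(self)).items():
            if isinstance(field, TypedField):
                if key in values:
                    setattr(self, key, values[key])
                elif field.required:
                    raise ValueError(f"missing required field: {key}")


star = Star({"id": "70890", "name": "Proxima Centauri"})
print(star.id, star.name)
```

Input that contradicts the declared assumptions (a missing ``id``, or an ``id`` that cannot become an ``int``) raises an exception instead of producing a half-valid object.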
Features
- declarative API, which may optionally contain direct marshaling hints:

  ::

      class Star(Record):
          id = Property(isa=int, required=True)
          name = Property(isa=str)
          other_names = Property(json_name="otherNames")

  Type descriptions (``isa=``) are completely optional, but if given will be used for type checking and coercion.

- rich descriptor API (in ``normalize.property``), including the notions of not just 'required' and 'isa' type hints as shown above but also default functions, custom-type check functions, and coercion functions.

  It also sports an extensible attribute trait system, which adds more features via optional Property sub-classes, selected automatically, enabling:

  - lazy attributes which short-cut at the python core level once calculated (a somewhat underused python feature)
  - read-only attributes
  - type-safe attributes (i.e., that type-check on assign)
  - collection attributes (see below)

- coercion from regular python dictionaries or ``key=value`` (kwargs) constructor arguments

- conversion to and from JSON for all classes, regardless of whether they derive from ``normalize.record.json.JsonRecord``. Support for custom functions for JSON marshal in and out.

- conversion to primitive python types via the pickle API (``__getnewargs__``)

- New in 0.5: generic mechanism for marshalling to and from other forms. See the documentation for the new ``normalize.visitor.VisitorPattern`` API.

- typed collections with item coercion (currently lists and dicts only):

  ::

      class StarSystem(Record):
          components = ListProperty(Star)

      alpha_centauri = StarSystem(
          components=[{"id": 70890, "name": "Proxima Centauri"},
                      {"id": 71683, "name": "Alpha Centauri A"},
                      {"id": 71681, "name": "Alpha Centauri B"}]
      )

- "field selector" API which allows for specification of properties deep into nested data structures:

  ::

      name_selector = FieldSelector("components", 0, "name")
      print(name_selector.get(alpha_centauri))  # "Proxima Centauri"

- comparison API which returns differences between two Records of matching types. Ability to mark properties as "extraneous" to skip comparison (this also affects the ``==`` operator)

- ...and much more!
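The "lazy attributes which short-cut at the python core level" item refers to a general Python mechanism: a descriptor without ``__set__`` loses to an entry of the same name in the instance dictionary. The sketch below shows that mechanism with the standard library only; the ``lazy`` decorator and ``Report`` class are hypothetical names and this is not normalize's own lazy property code.

```python
class lazy:
    """Hypothetical non-data descriptor: compute once, then cache on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Store the result under the same name in the instance dict.
        # Because this descriptor defines no __set__, the instance dict
        # entry wins on later lookups and __get__ is never called again.
        obj.__dict__[self.name] = value
        return value


class Report:
    calls = 0  # counts how often the expensive function really runs

    @lazy
    def total(self):
        Report.calls += 1
        return sum(range(10))


r = Report()
first = r.total   # computed through the descriptor
second = r.total  # served straight from r.__dict__
print(first, second, Report.calls)
```

After the first access the interpreter finds ``total`` in the instance dictionary directly, so no Python-level function call is involved in later reads.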
Contributing
#. Fork the repo from `GitHub <https://github.com/hearsaycorp/normalize>`_.
#. Make your changes.
#. Add unittests for your changes.
#. Run `pep8 <https://pypi.python.org/pypi/pep8>`_, `pyflakes <https://pypi.python.org/pypi/pyflakes>`_, and `pylint <https://pypi.python.org/pypi/pylint>`_ to make sure your changes follow the Python style guide and don't have any errors.
#. Commit. Please write a commit message which explains the use case; see the commit log for examples.
#. Add yourself to the AUTHORS file (in alphabetical order).
#. Send a pull request from your fork to the main repo.
Normalize changelog and errata
3.1.0 4th May 2026
- Added support for Python 3.12, 3.13, 3.14
- Dropped support for Python < 3.10
- Moved the CI from CircleCI to GitHub Actions
- Migrated to Poetry for dependency management
3.0.1 26th August 2025
- Add support for binary JSON parsing to work as before 3.0.0
3.0.0 18th August 2025
- Fully dropped python 2 support
- Breaking change with string types
- Types that can be cast to string will be cast (None, int, float, etc.)
1.0.1 10th February 2016
- Added new base class for all exceptions to subclass. This will ensure that users of normalize will be able to catch all exceptions.
1.0.0 28th September 2015
As a hint to the stability of the code, I've decided to call this release 1.0.
But with a major version comes a major new feature. The 0.x approach was one of type safety and strictness. The 1.0 approach will be one of convenience and added pythonicity, layered on top of an inner strictness. To allow for backwards compatibility, in general you must specify the new behavior in the class declaration.
The details will be documented in the manual, tests and tutorial, but in a nutshell, the new features are:
- unset V1 attributes return something false (usually ``None``) instead of ``AttributeError``. You can override the type of ``None`` returned with ``v1_none=''``. This value can be assigned to the slot, and if it doesn't pass the type constraint, instead of raising ``normalize.exc.CoercionError`` it will behave the same as deleting the attribute.

- there's a new base class called ``AutoJsonRecord`` which allows you to access attributes of the input JSON, previously accessed via ``.unknown_json_keys['attribute']``, by regular attribute access. This feature is recursive, so you can quickly work with new APIs without having to pre-write a bunch of API definitions.

- Much more is available via a direct ``from normalize import Foo``, including all of the typed property declarations, the visitor API, and diff types.

- ``DatetimeProperty`` and ``DateProperty`` now ship with a ``json_out`` function which uses ``isoformat()`` to convert to a string as you'd expect them to.

- New type ``NumberProperty`` which will hold any numeric type (as decided by ``numbers.Number``)

- FieldSelector got a new function ``get_or_none`` which is like ``get`` but returns ``None`` instead of throwing a ``FieldSelectorException``.
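The difference between a strict lookup and a ``get_or_none``-style lookup can be illustrated on plain nested data. The functions ``select`` and ``select_or_none`` below are hypothetical stand-ins written against the standard library; they only mirror the behavior described above and do not use normalize's ``FieldSelector``.

```python
def select(obj, path):
    """Walk `path` (attribute names, dict keys, or list indices) into `obj`."""
    current = obj
    for step in path:
        if isinstance(step, int) or isinstance(current, dict):
            current = current[step]  # list index or dict key
        else:
            current = getattr(current, step)
    return current


def select_or_none(obj, path):
    """Like select(), but return None when any step along the path is missing."""
    try:
        return select(obj, path)
    except (AttributeError, KeyError, IndexError, TypeError):
        return None


system = {
    "components": [
        {"name": "Proxima Centauri"},
        {"name": "Alpha Centauri A"},
    ]
}

found = select(system, ["components", 0, "name"])
missing = select_or_none(system, ["components", 5, "name"])
print(found, missing)
```

The strict form is useful when a missing value indicates a bug; the forgiving form suits optional data where ``None`` is an acceptable answer.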
There are also some minor backwards incompatibilities:
- setting ``default=None`` (or any other false, immutable value) on a property will select a V1 property. The benefit of this is it makes the class instance dictionary lighter, for classes which specify a lot of ``default=None`` or ``default=''`` properties.

- ``DatetimeProperty`` now ships with default JSON IO functions which use ``datetime.datetime.strptime`` and ``datetime.datetime.isoformat()`` to convert to and from a string. This is an improvement, but technically an API change you might need to consider if you were expecting it to fail.

- ``DateProperty`` will now force the value type to be a date, and will truncate datetimes to dates as originally envisioned.

- ``StringProperty`` and ``UnicodeProperty`` no longer will convert anything you pass to them to a string or unicode string. This is actually a new feature, because before the declaration was unusable; just about everything in python can be converted to a string, so you'd end up with string representations of objects in the slots. Now you get type errors.

- The ``empty`` property parameter has been removed completely.
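The string round trip and the date truncation mentioned above rely on standard ``datetime`` calls. A minimal sketch follows; the helper names and the exact format string are assumptions for illustration (no fractional seconds or timezone), not the functions normalize ships.

```python
from datetime import datetime

# Assumed wire format: ISO 8601 without fractional seconds or timezone.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def datetime_json_out(value):
    """Serialize a datetime to an ISO 8601 string."""
    return value.isoformat()


def datetime_json_in(text):
    """Parse the string form back into a datetime."""
    return datetime.strptime(text, ISO_FORMAT)


stamp = datetime(2001, 9, 9, 1, 47, 22)
text = datetime_json_out(stamp)
restored = datetime_json_in(text)
truncated = stamp.date()  # dropping the time part yields a plain date
print(text, restored == stamp, truncated)
```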
0.10.0 21st August 2015
- Exceptions raised while marshalling JSON are now wrapped by a new exception which exposes the path within the input document that the problem occurred.

- Various structured exceptions had attribute names changed. They're now more consistent across varying exception types.

- Using ``JsonListProperty()`` makes the type of the inner collection a ``JsonRecordList`` subclass instead of a ``RecordList`` as before, enabling the context above. Beware that this has implications for input marshalling; previously skipped marshalling will now be called.

- When using ``JsonListProperty``, previously if it encountered a different type of collection (or even a string), it would build with just the keys. This now raises an exception. Similarly with ``JsonDictProperty`` if you pass something other than a mapping.

- Field selectors with upper case and digits in attribute names will be converted to paths via ``.path`` without using quoting if they are valid JavaScript/C tokens.
0.9.10 9th July 2015
- the implicit squashing of attributes which coerce to None now also works for subtype coerce functions
0.9.9 8th July 2015
- added a new, convenient API for creating type objects which check their values against a function: ``subtype``

  For example, if you want to say that a slot contains an ISO8601-formatted datetime string, you could declare that like this:

  ::

      import re
      import dateutil.parser
      import normalize

      # simplified for brevity
      iso8601_re = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.(\d+))?$')

      ISO8601 = normalize.subtype(
          "ISO8601", of=str,
          where=lambda x: re.match(iso8601_re, x),
          coerce=lambda s: dateutil.parser.parse(s).isoformat(),
      )

      class SomeClass(normalize.Record):
          created = normalize.Property(isa=ISO8601)
0.9.8 26th June 2015
- MultiFieldSelector.from_path(path) did not work if the 'path' did not end with ')' (ie, there was only one FieldSelector within).

- FieldSelector delete operations were updated to work with collection items: previously, you could not remove items from collections, or use 'None' at the end of a delete Field Selector. This now works for DictCollection and ListCollection.

- Some bugs with FieldSelector.post, .put and .delete on DictCollections were cleaned up.

- It is now possible to use FieldSelector.post(x, y) to create a new item in a collection or set a property specified as a record where 'x' is the only required property.
0.9.7 9th June 2015
- the fix delivered by 0.9.6 now also covers empty collections
0.9.6 9th June 2015
- fixed regression introduced in 0.9.4 with collections, which cleanly round trip using a non-specialized VisitorPattern again
0.9.5 9th June 2015
- FieldSelector and MultiFieldSelector's operations now work with DictCollection containers as well as native dicts
0.9.4 5th June 2015
- Fixed normalize.visitor for collections of non-Record types as well.
0.9.3 3rd June 2015
- Comparing simple collections will now return MODIFIED instead of ADDED/REMOVED if individual indexes/keys changed

- Comparing typed collections where the item type is not a Record type (eg ``list_of(str)``) now falls back to the appropriate 'simple' collection comparison function. This works recursively, so you can eg get meaningful results comparing ``dict_of(list_of(str))`` instances.

- New diff option 'moved' to return a new diff type MOVED for items in collections.

- the completely undocumented ``DiffOptions.id_args`` sub-class API method is now deprecated and will be removed in a future release.

- Specifying 'compare_filter' to diffs over collections where the field selector matches something other than the entire collection now works.
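To make the first item concrete: when a key exists on both sides but its value differs, reporting one MODIFIED entry is more useful than a REMOVED/ADDED pair. The function below is a hypothetical standard-library sketch of that classification for plain dicts. The labels are borrowed from the changelog text; the code is not normalize's diff implementation.

```python
def diff_mappings(old, new):
    """Return a sorted list of (change, key) tuples describing old -> new."""
    changes = []
    for key in old.keys() | new.keys():
        if key not in new:
            changes.append(("REMOVED", key))
        elif key not in old:
            changes.append(("ADDED", key))
        elif old[key] != new[key]:
            # Same key on both sides with a different value: one MODIFIED
            # entry rather than a REMOVED plus an ADDED.
            changes.append(("MODIFIED", key))
    return sorted(changes)


before = {"a": 1, "b": 2, "c": 3}
after = {"a": 1, "b": 20, "d": 4}
result = diff_mappings(before, after)
print(result)
```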
0.9.2 27th May 2015
- Another backwards compatibility accessor for ``RecordList.values`` allows assignment to proceed.

  ::

      class MyFoo(Record):
          bar = ListProperty(of=SomeRecord)

      foo = MyFoo(bar=[])

      # this will now warn instead of throwing Exception
      foo.bar.values = list_of_some_records

      # these forms will not warn:
      foo.bar = list_of_some_records
      foo.bar[:] = list_of_some_records
0.9.1 22nd May 2015
- the ``RecordList.values`` removal in 0.9.0 has been changed to be a deprecation with a warning instead of a hard error.
0.9.0 21st May 2015
- ``ListProperty`` attributes can now be treated like lists; they support almost all of the same methods the built-in ``list`` type does, and type-check values inserted into them with coercion.

  note: if you were using ``.values`` to access the internal array, this is now not present on ``RecordList`` instances. You should be able to just remove the ``.values``:

  ::

      class MyFoo(Record):
          bar = ListProperty(of=SomeRecord)

      foo = MyFoo(bar=[somerecord1, somerecord2])

      # before:
      foo.bar.values.extend(more_records)
      foo.bar.values[-1:] = even_more_records

      # now:
      foo.bar.extend(more_records)
      foo.bar[-1:] = even_more_records

- ``DictProperty`` can now be used, and these also support the important ``dict`` methods, with type-checking.

- You can now construct typed collections using ``list_of`` and ``dict_of``:

  ::

      from normalize.coll import list_of, dict_of

      complex = dict_of(list_of(int))()
      complex['foo'] = ["1"]  # ok
      complex['foo'].append("bar")  # raises a CoercionError

  Be warned if using ``str`` as a type constraint that just about anything will happily coerce to a string, but that might not be what you want. Consider using ``basestring`` instead, which will never coerce successfully.
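The general technique behind a list that "type-checks values inserted into it with coercion" can be sketched with ``collections.abc.MutableSequence``, which derives ``append``, ``extend`` and friends from a handful of primitive methods. ``TypedList`` is a hypothetical name, and raising ``TypeError`` on failed coercion is an assumption of this sketch; normalize's own collections raise their own exception types.

```python
from collections.abc import MutableSequence


class TypedList(MutableSequence):
    """Hypothetical list that coerces every inserted item to `item_type`."""

    def __init__(self, item_type, items=()):
        self.item_type = item_type
        self._items = [self._coerce(x) for x in items]

    def _coerce(self, value):
        if isinstance(value, self.item_type):
            return value
        try:
            return self.item_type(value)
        except (TypeError, ValueError) as exc:
            raise TypeError(
                f"cannot coerce {value!r} to {self.item_type.__name__}"
            ) from exc

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            self._items[index] = [self._coerce(v) for v in value]
        else:
            self._items[index] = self._coerce(value)

    def __delitem__(self, index):
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        # append() and extend() are inherited and funnel through insert().
        self._items.insert(index, self._coerce(value))


values = TypedList(int, ["1"])
values.append("2")       # the string is coerced to an int
values.extend([3, "4"])
print(list(values))
```

Because every mutation path ends in ``_coerce``, a value such as ``"bar"`` is rejected at insertion time rather than discovered later when the list is read.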
0.8.0 6th March 2015
- ``bool(record)`` was reverted to pre-0.7.x behavior: always True, unless a Collection in which case Falsy depending on the number of members in the collection.

- Empty pseudo-attributes now return ``normalize.empty.EmptyVal`` objects, which are always ``False`` and perform a limited amount of sanity checking/type inference, so that misspellings of sub-properties can sometimes be caught.
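An always-falsy placeholder that tolerates further attribute access is a form of the null-object pattern. The class below is a hypothetical, much-simplified sketch of that pattern using only the standard library; unlike the ``EmptyVal`` described above, it performs no sanity checking or type inference.

```python
class Empty:
    """Hypothetical falsy placeholder returned in place of an unset attribute."""

    def __bool__(self):
        return False

    def __getattr__(self, name):
        # Leave special (dunder) lookups alone so protocols such as
        # copying keep their default behavior.
        if name.startswith("__"):
            raise AttributeError(name)
        # Any other attribute access yields another Empty, so chains
        # like placeholder.sub_prop.foobar stay falsy instead of raising.
        return Empty()


placeholder = Empty()
chained = placeholder.sub_prop.foobar
print(bool(placeholder), bool(chained))
```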
0.7.4 5th March 2015
- A regression which introduced subtle bugs in 0.7.0, which became more significant with the new feature delivered in 0.7.3, was fixed.

- An exception with some forms of dereferencing MultiFieldSelectors was fixed.
0.7.3 4th March 2015
- Added a new option to diff to suppress diffs found when comparing lists of objects for which all populated fields are filtered.
0.7.2 27th February 2015
- Fixed a regression with the new 'json_out' behavior, which I decided was big enough to pull 0.7.1 from PyPI.
0.7.1 27th February 2015
- VisitorPattern.visit with visit_filter would not visit everything in the filter due to the changes in 0.7.0

- MultiFieldSelector subscripting, where the result is now a "complete" MultiFieldSelector (ie, matches all fields/values) is now more efficient by using a singleton

- the return of 'json_out' is no longer unconditionally passed to ``to_json``: call it explicitly if you desire this behavior:

  ::

      class Foo(Record):
          bar = Property(isa=Record, json_out=lambda x: {"bar": x})

  If you are using ``json_out`` like this, and expecting ``Record`` values or anything with a ``json_data`` method to have that called, then you can wrap the whole thing in ``to_json``:

  ::

      from normalize.record.json import to_json

      class Foo(Record):
          bar = Property(isa=Record, json_out=lambda x: to_json({"bar": x}))
0.7.0 18th February 2015
Lots of long awaited and behavior-changing features:
- empty pseudo-attributes are now available which return (usually falsy) values when the attribute is not set, instead of throwing AttributeError like the regular getters.

  The default is to call this the same as the regular attribute, but with a '0' appended:

  ::

      class Foo(Record):
          bar = Property()

      foo = Foo()
      foo.bar  # raises AttributeError
      foo.bar0  # None

  The default 'empty' value depends on the passed ``isa=`` type constraint, and can be set to ``None`` or the empty string, as desired, using ``empty=``:

  ::

      class Dated(Record):
          date = Property(isa=MyType, empty=None)

  It's also possible to disable this functionality for particular attributes using ``empty_attr=None``.

  Property uses which are not safe will see a new warning raised which includes instructions on the changes recommended.

- accordingly, bool(record) now also returns false if the record has no attributes defined; this allows you to use '0' in a chain with properties that are record types:

  ::

      if some_record.sub_prop0.foobar0:
          pass

  Instead of the previous:

  ::

      if hasattr(some_record, "sub_prop") and \
              getattr(some_record.sub_prop, "foobar", False):
          pass

  This currently involves creating a new (empty) instance of the object for each of the intermediate properties; but this may in the future be replaced by a proxy object for performance.

  The main side effect of this change is that this kind of code is no longer safe:

  ::

      try:
          foo = FooJsonRecord(json_data)
      except:
          foo = None

      if foo:
          #... doesn't imply an exception happened

- The mechanism by which ``empty=`` delivers pseudo-attributes is available via the ``aux_props`` sub-class API on Property.

- Various ambiguities around the way MultiFieldSelectors and their ``__getitem__`` and ``__contains__`` operators (ie, ``multi_field_selector[X]`` and ``X in multi_field_selector``) are defined have been updated based on findings from using them in real applications. See the function definitions for more.
0.6.6 16th January 2015
- Fix ``FieldSelector.delete`` and ``FieldSelector.get`` when some of the items in a collection are missing attributes
0.6.5 2nd January 2015
- lazy properties would fire extra times when using visitor APIs or other direct use of get on the meta-property (#50)
0.6.4 2nd January 2015
- The 'path' form of a multi field selector can now round-trip, using ``MultiFieldSelector.from_path``
- Two new operations on ``MultiFieldSelector``: ``delete`` and ``patch``
0.6.3 30th December 2014
- Add support in ``to_json`` for marshaling out a property of a record
- The 'path' form of a field selector can now round-trip, using ``FieldSelector.from_path``
0.6.2 24th September 2014
- A false positive match was fixed in the fuzzy matching code.
0.6.1 23rd September 2014
- Gracefully handle unknown keyword arguments to Property(); previously this would throw an awful internal exception.

- Be sure to emit NO_CHANGE diff events if deep, fuzzy matching found no differences
0.6.0 17th September 2014
- Diff will now attempt to do fuzzy matching when comparing collections. This should result in more fine-grained differences when comparing data where the values have to be matched by content. The implementation in this version can be slow (O(N²)) when comparing very large sets with few identical items.
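The changelog does not describe the matching algorithm itself, so the following is only a generic illustration of why content-based matching tends to be quadratic: each item on one side may have to be scored against every unmatched item on the other. ``fuzzy_pairs``, the greedy strategy, the 0.6 threshold, and the use of ``difflib.SequenceMatcher`` are all assumptions of this sketch and not normalize's code.

```python
from difflib import SequenceMatcher


def fuzzy_pairs(old_items, new_items, threshold=0.6):
    """Greedily pair each old string with its most similar unused new string."""
    pairs = []
    unused = list(range(len(new_items)))
    for old in old_items:                      # N iterations ...
        best_index, best_score = None, threshold
        for index in unused:                   # ... times up to N comparisons
            score = SequenceMatcher(None, old, new_items[index]).ratio()
            if score >= best_score:
                best_index, best_score = index, score
        if best_index is not None:
            unused.remove(best_index)
            pairs.append((old, new_items[best_index]))
    return pairs


old_names = ["Alpha Centauri A", "Proxima Centauri"]
new_names = ["Proxima Centuari", "Alpha Centauri B"]
matched = fuzzy_pairs(old_names, new_names)
print(matched)
```

Identical items can be paired cheaply by hashing before any scoring, which is consistent with the note that the slow case is large sets with few identical items.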
0.5.5 17th September 2014
- Lots of improvements to exceptions with the Visitor

- More records should now round-trip ('visit' and 'cast') cleanly with the default Visitor mappings; particularly ``RecordList`` types with extra, extraneous properties.

- ListProperties were allowing unsafe assignment; now all collections will always be safe (unless marked 'unsafe' or read-only)
0.5.4 20th August 2014
- values in attributes of type 'set' get serialized to JSON as lists by default now (Dale Hui)
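JSON has no set type, which is why a list is the natural output form. With the standard ``json`` module the same effect is available through the ``default`` hook, as in this sketch; ``encode_default`` is a hypothetical helper, and sorting the members is an extra assumption made here to get stable output (it requires orderable items).

```python
import json


def encode_default(value):
    """Fallback for json.dumps: turn sets into (sorted) lists."""
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError(f"not JSON serializable: {type(value).__name__}")


payload = json.dumps({"tags": {"b", "a"}}, default=encode_default)
print(payload)
```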
0.5.3 20th August 2014
- fixed a corner case with collection diff & filters (github issue #45)

- fixed ``Property(list_of=SomeRecordType)``, which should have worked like ``ListProperty(of=SomeRecordType)``, but didn't due to a bug in the metaclass.
0.5.2 5th August 2014
- You can now pass an object method to ``compare_as=`` on a property definition.

- New sub-class API hook in ``DiffOptions``: ``normalize_object_slot``, which receives the object as well as the value.

- passing methods to ``default=`` which do not call their first argument 'self' is now a warning.
0.5.1 29th July 2014
- Subscripting a MultiFieldSelector with an empty (zero-length) FieldSelector now works, and returns the original field selector. This fixed a bug in the diff code when the top level object was a collection.
0.5.0 23rd July 2014
- normalize.visitor overhaul. Visitor got split into a sub-class API, VisitorPattern, which is all class methods, and Visitor, the instance which travels with the operation to provide context. Hugely backwards incompatible, but the old API was undocumented and sucked anyway.
0.4.x Series, 19th June - 23rd July 2014
- added support for comparing filtered objects; ``__pk__()`` object method no longer honored. See ``tests/test_mfs_diff.py`` for examples

- MultiFieldSelector can now be traversed by indexing, and supports the ``in`` operator, with individual indices or FieldSelector objects as the member. See ``tests/test_selector.py`` for examples.

- ``extraneous`` diff option now customizable via the ``DiffOptions`` sub-class API.

- ``Diff``, ``JsonDiff`` and ``MultiFieldSelector`` now have more useful default stringification.

- The 'ignore_empty_slots' diff option is now capable of ignoring empty records as well as None-y values. This even works if the records are not actually None but all of the fields that have values are filtered by the DiffOptions compare_filter parameter.

- added Diffas property trait, so you can easily add 'compare_as=lambda x: scrub(x)' for field-specific clean-ups specific to comparison.

- errors thrown from property coerce functions are now wrapped in another exception to supply the extra context. For instance, the example in the intro will now print an error like:

  ::

      CoerceError: coerce to datetime for Comment.edited failed with value '2001-09-09T01:47:22': datetime constructor raised: an integer is required
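Wrapping a low-level failure in an exception that names the property and the offending value is plain exception chaining in Python. The sketch below shows the technique with hypothetical names (``WrappedCoerceError``, ``coerce_property``); it is not the exception class or code path normalize uses.

```python
class WrappedCoerceError(Exception):
    """Hypothetical wrapper carrying the property name and offending value."""


def coerce_property(owner, name, func, value):
    """Apply `func` to `value`, adding context to any failure."""
    try:
        return func(value)
    except Exception as exc:
        # "from exc" keeps the original error reachable as __cause__.
        raise WrappedCoerceError(
            f"coerce for {owner}.{name} failed with value {value!r}: {exc}"
        ) from exc


try:
    coerce_property("Comment", "edited", int, "not-a-number")
except WrappedCoerceError as err:
    message = str(err)
    cause = err.__cause__

print(message)
```

The caller sees which property failed and with what input, while the original traceback stays attached for debugging.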
0.3.0, 30th May 2014
- enhancement to diff to allow custom, per-field normalization of values before comparison

- Some inconsistencies in JSON marshalling in were fixed
0.2.x Series, 24th April - 27th May 2014
- the return value from ``coerce`` functions is now checked against the type constraints (``isa`` and ``check`` properties)

- added capability of Property constructor to dynamically mix variants as needed; almost everyone can now use plain ``Property()``, ``ListProperty()``, or a shorthand typed property declaration (like ``StringProperty()``); other properties like ``Safe`` and ``Lazy`` will be automatically added as needed. Property types such as ``LazySafeJsonProperty`` are no longer needed and were savagely expunged from the codebase.

- ``SafeProperty`` is now only a safe base class for ``Property`` sub-classes which have type constraints. Uses of ``make_property_type`` which did not add type constraints must be changed to ``Property`` type, or will raise ``exc.PropertyTypeMixNotFound``

- bug fix for pickling ``JsonRecord`` classes

- filtering objects via ``MultiFieldSelector.get(obj)`` now works for ``JsonRecord`` classes.

- The ``AttributeError`` raised when an attribute is not defined now includes the full name of the attribute (class + attribute)
0.1.x Series, 27th March - 8th April 2014
- much work on the diff mechanisms, results, and record identity

- records which set a tuple for ``isa`` now work properly on stringification

- semi-structured exceptions (``normalize.exc``)

- the collections 'tuple protocol' (which models all collections as a sequence of (K, V) tuples) was reworked and made to work with more cases, such as iterators and generators.

- Added ``DateProperty`` and ``DatetimeProperty``