E4X style embedded DSL for Python but without E and X

Project description

P4D is not Python but it is also fun

P4D (= Python for Data) is an EasyExtend langlet used to write tree-shaped textual data, a.k.a. XML.

P4D elements are written as statements in a notation familiar to Python programmers; they closely resemble Python statements. Here is one used to define an element:

elm philosophers:
    philosopher:
        name: "Hegel"
        books:
            book(first_edition = 1807):
                title: "Phänomenologie des Geistes"
                language: "german"
    philosopher:
        name: "Leibniz"
        books:
            book(first_edition = 1714):
                title: "Monadologie"
                language: "french"

The elm keyword is new, and it is one of the few places where P4D breaks existing Python code. However, I did not find elm used anywhere in Python's stdlib, so in practice it may break nothing at all.

The subobjects of philosophers can be accessed in E4X style:

books = philosophers.philosopher.(name == "Hegel").books
books.book.(@first_edition < 1810).title.text()
# -> "Phänomenologie des Geistes"
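
For readers who do not have the langlet at hand, the two filter expressions above can be approximated in plain Python. This is only a sketch using the standard library's xml.etree.ElementTree, not P4D itself; the helper name titles_before and the XML rendering of the philosophers element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed XML rendering of the `philosophers` element defined above.
XML = """
<philosophers>
  <philosopher>
    <name>Hegel</name>
    <books>
      <book first_edition="1807">
        <title>Phänomenologie des Geistes</title>
        <language>german</language>
      </book>
    </books>
  </philosopher>
  <philosopher>
    <name>Leibniz</name>
    <books>
      <book first_edition="1714">
        <title>Monadologie</title>
        <language>french</language>
      </book>
    </books>
  </philosopher>
</philosophers>
"""


def titles_before(root, author, year):
    """Titles of books by `author` whose first_edition is earlier than `year`."""
    titles = []
    for philosopher in root.findall("philosopher"):
        # Plays the role of the filter .(name == "Hegel")
        if philosopher.findtext("name") != author:
            continue
        for book in philosopher.findall("books/book"):
            # Plays the role of the filter .(@first_edition < 1810)
            if int(book.get("first_edition")) < year:
                titles.append(book.findtext("title"))
    return titles


philosophers = ET.fromstring(XML)
print(titles_before(philosophers, "Hegel", 1810))
```

The E4X-style notation expresses the same selection in two lines, which is the point of the langlet.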

P4D can also be used as a template language, in the sense that Python expressions can be embedded in P4D elements:

L = [1,2,3]
elm A:
    B: &L

elm X:
    & A

The elements of the list object L are distributed over elements of type B, one B element per list item:

assert len(A.B) == 3
assert X.A.B[0].text() == '1'
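
The distribution of a list over child elements can be mimicked with the standard library. The sketch below uses xml.etree.ElementTree rather than P4D, and the helper name distribute is hypothetical; it only illustrates the resulting tree shape.

```python
import xml.etree.ElementTree as ET


def distribute(parent_tag, child_tag, values):
    """Create <parent_tag> with one <child_tag> child per value."""
    parent = ET.Element(parent_tag)
    for value in values:
        child = ET.SubElement(parent, child_tag)
        child.text = str(value)
    return parent


L = [1, 2, 3]
A = distribute("A", "B", L)   # corresponds to:  elm A:  B: &L
X = ET.Element("X")
X.append(A)                   # corresponds to:  elm X:  & A

print(ET.tostring(X, encoding="unicode"))
```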

A P4D element is easily converted into an XML element with the xmlstr() method. Conversely, XML can be converted to P4D with P4D.from_xml(). This is easy because both share the same internal data structure: building a P4D element and parsing an XML document lead to the same internal representation. The same representation can also store quite different kinds of objects, including ones that are not textual but binary.

Bytelets

Unlike XML and the P4D elements seen above, Bytelets are used to handle binary data in a flexible manner.

Suppose you want to serialize a string without null-termination. Then you have to send the length of the string along with the string itself, plus a type or tag that lets the receiver identify the hexcode as a string:

elm bl:text:
    Tag: 0x50
    Len: &LEN
    Text: "{obamania}"

This P4D element produces a Bytelet. Bytelets are generally marked with the bl namespace prefix. The LEN object is a so-called Flow object, a kind of dataflow binding. If you bind LEN to a field, it computes the sum of the lengths of the values of all subsequent fields. If you update the Bytelet, the value of the Len field is recomputed using LEN.

One can check this:

assert text.Len.hex() == 8

Here 8 is just the length of the text.

new_text = text.clone()
new_text.Text = "{the merkel}"
assert new_text.Len.hex() == 10

One can also fully evaluate the text Bytelet:

assert text.hex() == 0x50 0x08 0x6F 0x62 0x61 0x6D 0x61 0x6E 0x69 0x61

Notice that writing 0x50 0x08 0x6F ... without quotes is valid in P4D and yields a Hex object, not a number. So P4D supports enhanced hexadecimal literals whose semantics differ from those of Python's.

Now we want to turn the Hex object back into a Bytelet, for which it has to be parsed. This is done by a Schema:
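
The behaviour of the Len field can be modelled in plain Python. The class below is a hypothetical stand-in, not the Bytelet implementation: it assumes a one-byte tag, a one-byte length and ASCII text, and it omits the braces of the literal because they do not appear in the serialized bytes shown above.

```python
class TextBytelet:
    """Hypothetical model of the bl:text element with fields Tag, Len, Text."""

    def __init__(self, tag, text):
        self.tag = tag
        self.text = text

    @property
    def length(self):
        # Recomputed on every access, like a field bound to LEN.
        return len(self.text.encode("ascii"))

    def to_bytes(self):
        payload = self.text.encode("ascii")
        # Assumption: the length fits into one byte (at most 255).
        return bytes([self.tag, len(payload)]) + payload

    def hex(self):
        # Space-separated rendering in the style of the hex literal above.
        return " ".join("0x%02X" % byte for byte in self.to_bytes())


text = TextBytelet(0x50, "obamania")
print(text.length)
print(text.hex())

new_text = TextBytelet(text.tag, "the merkel")
print(new_text.length)
```

Computing the length on access, rather than storing it, is what keeps Len and Text consistent after an update.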

elm bl-schema:TextSchema:
    Tag: 1
    Len: &LEN
    Text: &VAL

parsed = TextSchema.parse(text.hex())
assert parsed.hex()  == text.hex()

A Schema is characterized by the namespace prefix bl-schema. Otherwise it is just another Bytelet with a parse() method. Here the Schema has exactly the same structure as the original Bytelet but different field values. The new value VAL is also a dataflow binding; it binds to the Len field. Once the value 0x08 has been assigned to Len during parsing, the VAL binding uses it to chop off 8 bytes from the input stream and assign them to Text.
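
The parsing step described here, where the length field decides how many bytes go into the value, can be sketched in plain Python. The function parse_tlv is hypothetical and assumes a one-byte tag followed by a one-byte length; it is not the Schema implementation.

```python
def parse_tlv(data):
    """Parse one Tag-Length-Value record; return (tag, length, value, rest)."""
    if len(data) < 2:
        raise ValueError("need at least a tag byte and a length byte")
    tag, length = data[0], data[1]
    # The role played by VAL: take `length` bytes from the input stream.
    value = data[2:2 + length]
    if len(value) != length:
        raise ValueError("input ends before the announced length")
    return tag, length, value, data[2 + length:]


raw = bytes([0x50, 0x08]) + b"obamania"
tag, length, value, rest = parse_tlv(raw)
print(hex(tag), length, value, rest)
```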

More features of Bytelets:

  • Specifications of Bytelets can be refined by setting individual bits and specifying bit array widths.

  • Schemas can be used to parse arbitrary sequences of T(ag)L(ength)V(alue) structures like the one in our example. The order of the TLVs need not be fixed.

  • Simple arithmetic is defined for Flow objects like LEN and VAL. So we can write Len: &LEN + 1 but also Text: &VAL - VAL["IHL"]*4, where VAL["IHL"] refers not to Len but to another field, IHL.
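
Two of the features listed above can be illustrated in plain Python. Both functions are hypothetical sketches, not P4D code. The first assumes one-byte tags and lengths, and the tag 0x51 is made up for the example. The second assumes that the field name IHL alludes to the IPv4 Internet Header Length, which counts 32-bit words, so the payload size is the total length minus IHL * 4.

```python
def parse_tlv_sequence(data):
    """Split a byte string into TLV records keyed by tag, in any order."""
    records = {}
    pos = 0
    while pos < len(data):
        if pos + 2 > len(data):
            raise ValueError("truncated TLV header")
        tag, length = data[pos], data[pos + 1]
        value = data[pos + 2:pos + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        records[tag] = value
        pos += 2 + length
    return records


def ipv4_payload(packet):
    """Return the payload, whose size is total length minus IHL * 4."""
    ihl = packet[0] & 0x0F                      # header length in 32-bit words
    total = int.from_bytes(packet[2:4], "big")  # total length in bytes
    return packet[ihl * 4:total]


# Two TLVs; the parser does not depend on their order.
stream = bytes([0x51, 0x02]) + b"hi" + bytes([0x50, 0x08]) + b"obamania"
records = parse_tlv_sequence(stream)

# Minimal 20-byte header (IHL = 5, total length = 28) followed by 8 payload bytes.
packet = bytes([0x45, 0x00, 0x00, 0x1C]) + bytes(16) + b"obamania"
print(records, ipv4_payload(packet))
```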

For more information, see the P4D documents:

http://www.fiber-space.de/EasyExtend/doc/p4d/p4d.html

http://www.fiber-space.de/EasyExtend/doc/p4d/bytelets.html

Download files

Source Distributions

P4D Langlet-1.2.4-py2.5.zip (620.3 kB)

P4D Langlet-1.2.4-py2.5.tar.gz (482.6 kB)

Built Distribution

P4D Langlet-1.2.4.win32-py2.5.exe (637.7 kB)
