Python client library for the FitLayout REST API

FitLayout - Python Client

(c) 2025 Radek Burget (burgetr@fit.vut.cz)

The FitLayout client connects to the FitLayout REST API server and lets you obtain artifact data, run SPARQL queries on the artifact repository, modify the repository content, or run remote FitLayout services. It is primarily intended for implementing web page analysis algorithms in Python, where FitLayout serves as a source of data about rendered web pages, including details about the appearance and layout of individual content elements.

Usage

FitLayout uses the RDF data model to represent pages and other artifacts, and it defines a set of ontologies for describing them. The Python client is built on RDFLib, which provides the RDF API for Python. The main FitLayoutClient class provides several methods for obtaining artifact data.

A simple usage of the API:

from flclient import FitLayoutClient, BOX

fl = FitLayoutClient("http://localhost:8400/api", "repositoryId")

# Get the IRIs of all rendered pages in the repository
pageIris = fl.artifacts(BOX.Page)

# Get the first page IRI (as an example)
pageIri = next(pageIris)

# Get the RDF graph of the first page (an rdflib Graph object)
pageGraph = fl.get_artifact(pageIri)

# Print all properties of the page itself (excluding "pngImage" which is too large)
# This uses the RDFLib API for filtering the RDF triples.
for s, p, o in pageGraph.triples((pageIri, None, None)):
    prop = p.fragment # Omit the namespace from the property IRI
    if prop == "pngImage":
        continue
    print(f"{prop}: {o}")

# Get all content boxes from the page
for s, p, o in pageGraph.triples((None, BOX.belongsTo, pageIri)):
    boxIri = s
    # Get the box text (the box:text property)
    # See the ontology documentation for other box properties
    for bs, bp, bo in pageGraph.triples((boxIri, BOX.text, None)):
        print(f"Box: {boxIri}, text: {bo}")
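The loop above shortens property IRIs with p.fragment, which is empty whenever an IRI has no '#' part. The following is a minimal sketch of a fallback for such cases, using only the standard library. The helper name local_name and the example.org IRIs are illustrative assumptions, not part of flclient or the FitLayout ontologies.

```python
from urllib.parse import urldefrag


def local_name(iri: str) -> str:
    """Return the part of an IRI after '#', or after the last '/' if there is no fragment.

    Hypothetical helper for illustration; it is not provided by flclient.
    """
    base, fragment = urldefrag(str(iri))
    if fragment:
        return fragment
    # Slash-terminated namespaces carry no fragment; use the last path segment instead.
    return base.rstrip("/").rsplit("/", 1)[-1]


# Placeholder IRIs only; real FitLayout property IRIs come from the page graph.
print(local_name("http://example.org/ontology/render#fontSize"))  # fontSize
print(local_name("http://example.org/ontology/render/fontSize"))  # fontSize
```

Because rdflib IRIs are string subclasses, such a helper could be called as local_name(p) in place of p.fragment.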

The most efficient way to retrieve data for further analysis (e.g., machine learning) is to use SPARQL queries over the artifact repository. The following example uses one SPARQL query to retrieve all AreaTree artifacts and another SPARQL query to retrieve information about all distinguishable visual areas contained in a given area tree:

from flclient import FitLayoutClient, default_prefix_string

fl = FitLayoutClient("http://localhost:8400/api", "repositoryId")

# Find all AreaTree IRIs in the repository.
# It returns a CSV with a single column 'areaTreeIri'.
listQuery = default_prefix_string() + """
    SELECT ?areaTreeIri
    WHERE {
        ?areaTreeIri rdf:type segm:AreaTree
    }
"""

# Execute the SPARQL query.
area_tree_rows = fl.sparql(listQuery)

# Get the first AreaTree IRI (as an example)
area_tree_iri = next(area_tree_rows)['areaTreeIri']
print("Area tree IRI:", area_tree_iri)

# A SPARQL query for finding area properties within the specified area tree
# For each area, retrieve its background color, text color, font size, position, dimensions, and text (if any)
# See the FitLayout ontology documentation for more details on the properties and relationships used.
areaQuery = default_prefix_string() + """
    SELECT (?c AS ?uri) ?backgroundColor ?color ?fontSize ?x ?y ?w ?h ?text
    WHERE {
        ?c rdf:type segm:Area .
        ?c segm:belongsTo <""" + str(area_tree_iri) + """> .
        ?c segm:containsBox ?box .
        OPTIONAL { ?c box:backgroundColor ?backgroundColor } .
        OPTIONAL { ?box box:color ?color } .
        ?c box:fontSize ?fontSize .
        OPTIONAL { ?c segm:text ?text } .
        ?c box:bounds ?b . 

        ?b box:positionX ?x .
        ?b box:positionY ?y .
        ?b box:width ?w .
        ?b box:height ?h .
    }
"""

# Execute the SPARQL query
results = fl.sparql(areaQuery)

# Process the returned results.
for row in results:
    print(row)
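For further analysis, the printed rows usually need to be turned into typed records first. The sketch below shows one way to do that. It assumes each row behaves like a mapping from the SELECT variable names (uri, x, y, w, h, fontSize, text, color, backgroundColor) to string-like values, as the row['areaTreeIri'] access above suggests. The AreaFeatures class and rows_to_features function are hypothetical names, not part of flclient.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Optional


@dataclass
class AreaFeatures:
    """One visual area with numeric geometry and optional style values."""
    uri: str
    x: float
    y: float
    w: float
    h: float
    font_size: float
    text: Optional[str] = None
    color: Optional[str] = None
    background_color: Optional[str] = None


def _optional(row: Mapping, key: str) -> Optional[str]:
    """Return row[key] as a string, or None if the OPTIONAL variable is missing or empty."""
    value = row[key] if key in row else None
    return None if value is None or str(value) == "" else str(value)


def rows_to_features(rows: Iterable[Mapping]) -> List[AreaFeatures]:
    """Convert query result rows (assumed to be mappings) into typed records."""
    features = []
    for row in rows:
        features.append(AreaFeatures(
            uri=str(row["uri"]),
            x=float(row["x"]),
            y=float(row["y"]),
            w=float(row["w"]),
            h=float(row["h"]),
            font_size=float(row["fontSize"]),
            text=_optional(row, "text"),
            color=_optional(row, "color"),
            background_color=_optional(row, "backgroundColor"),
        ))
    return features


# Made-up sample row standing in for one result of the area query.
sample = [{"uri": "http://example.org/area1", "x": "10", "y": "20.5", "w": "100",
           "h": "30", "fontSize": "12.0", "text": "Hello", "color": "#000000",
           "backgroundColor": ""}]
print(rows_to_features(sample)[0])
```

With a live server, rows_to_features(fl.sparql(areaQuery)) would be the corresponding call, provided the rows really are mapping-like.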

Server setup

The Python client requires a running, reachable FitLayout server. The easiest way to set one up is to use the fitlayout-local Docker image, which provides both the server and a web GUI for adding new web pages to the repository and post-processing them.
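Since the client cannot work without a reachable server, it can be useful to check connectivity before creating a FitLayoutClient. The following sketch uses only the standard library and treats any HTTP response, including an error status, as a sign that a server is listening. The function name server_reachable is a hypothetical helper, and the check makes no assumption about which paths the FitLayout REST API actually serves.

```python
import urllib.error
import urllib.request


def server_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at base_url, False otherwise.

    Hypothetical helper for illustration; it is not provided by flclient.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so it is still reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, name resolution failure, timeout, and similar.
        return False
```

For the address used in the examples above, the call would be server_reachable("http://localhost:8400/api"). A True result only shows that something responds over HTTP, not that the repository ID is valid.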

