Python client library for the FitLayout REST API

FitLayout - Python Client

(c) 2025 Radek Burget (burgetr@fit.vut.cz)

The FitLayout client connects to the FitLayout REST API server and allows obtaining artifact data, performing SPARQL queries on the artifact repository, modifying the repository content, or running remote FitLayout services. Its primary purpose is the implementation of algorithms for analyzing web pages in Python, where FitLayout serves as a source of data about rendered web pages, including details about the appearance and layout of individual content elements.

Usage

FitLayout uses the RDF data model to represent pages and other artifacts, and it defines a set of ontologies that describe the individual artifact types. The Python client is built on RDFLib, which provides the RDF API for Python. The main FitLayoutClient class provides several methods for obtaining artifact data.

A simple example of using the API:

from flclient import FitLayoutClient, BOX

fl = FitLayoutClient("http://localhost:8400/api", "repositoryId")

# Get the IRIs of all rendered pages in the repository
pageIris = fl.artifacts(BOX.Page)

# Get the first page IRI (as an example)
pageIri = next(pageIris)

# Get the RDF graph of the first page (an rdflib Graph object)
pageGraph = fl.get_artifact(pageIri)

# Print all properties of the page itself (excluding "pngImage" which is too large)
# This uses the RDFLib API for filtering the RDF triples.
for s, p, o in pageGraph.triples((pageIri, None, None)):
    prop = p.fragment # Omit the namespace from the property IRI
    if prop == "pngImage":
        continue
    print(f"{prop}: {o}")

# Get all content boxes from the page
for s, p, o in pageGraph.triples((None, BOX.belongsTo, pageIri)):
    boxIri = s
    # Get the box text (the box:text property)
    # See the ontology documentation for other box properties
    for bs, bp, bo in pageGraph.triples((boxIri, BOX.text, None)):
        print(f"Box: {boxIri}, text: {bo}")
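
The nested loop above can be wrapped in a small helper that returns the box texts as a dictionary. The following is only a sketch: box_texts and TinyGraph are hypothetical names that are not part of flclient, and TinyGraph is a minimal stand-in that mimics the triples((s, p, o)) pattern matching of an rdflib Graph so that the example runs without a server. The example.org IRIs are placeholders as well.

```python
class TinyGraph:
    """Hypothetical stand-in for rdflib.Graph (illustration only).

    It stores plain tuples and supports triples(pattern), where None
    in the pattern acts as a wildcard, as in rdflib.
    """

    def __init__(self, triples):
        self._triples = list(triples)

    def triples(self, pattern):
        for triple in self._triples:
            if all(want is None or want == got
                   for want, got in zip(pattern, triple)):
                yield triple


def box_texts(graph, page_iri, belongs_to, text_prop):
    """Map each box IRI on the page to its text; boxes without text are skipped."""
    texts = {}
    for box_iri, _, _ in graph.triples((None, belongs_to, page_iri)):
        for _, _, text in graph.triples((box_iri, text_prop, None)):
            texts[box_iri] = str(text)
    return texts


# Placeholder IRIs, assumed for the demonstration only
PAGE = "http://example.org/page1"
BELONGS_TO = "http://example.org/ns#belongsTo"
TEXT = "http://example.org/ns#text"

graph = TinyGraph([
    ("http://example.org/box1", BELONGS_TO, PAGE),
    ("http://example.org/box1", TEXT, "Hello"),
    ("http://example.org/box2", BELONGS_TO, PAGE),  # box without text
    ("http://example.org/box3", BELONGS_TO, "http://example.org/page2"),
    ("http://example.org/box3", TEXT, "Other page"),
])

texts = box_texts(graph, PAGE, BELONGS_TO, TEXT)
print(texts)
```

With a real page graph, the same helper would presumably be called as box_texts(pageGraph, pageIri, BOX.belongsTo, BOX.text), since it relies only on the triples() method used in the example above.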

The most efficient way to retrieve data for further analysis (e.g., machine learning) is to use SPARQL queries over the artifact repository. The following example uses one SPARQL query to retrieve all AreaTree artifacts and another SPARQL query to retrieve information about all distinguishable visual areas contained in a given area tree:

from flclient import FitLayoutClient, default_prefix_string

fl = FitLayoutClient("http://localhost:8400/api", "repositoryId")

# Find all AreaTree IRIs in the repository.
# It returns a CSV with a single column 'areaTreeIri'.
listQuery = default_prefix_string() + """
    SELECT ?areaTreeIri
    WHERE {
        ?areaTreeIri rdf:type segm:AreaTree
    }
"""

# Execute the SPARQL query.
area_tree_rows = fl.sparql(listQuery)

# Get the first AreaTree IRI (as an example)
area_tree_iri = next(area_tree_rows)['areaTreeIri']
print("Area tree IRI:", area_tree_iri)

# A SPARQL query for finding area properties within the specified area tree
# For each area, retrieve its background color, text color, font size, position, dimensions, and text (if any)
# See the FitLayout ontology documentation for more details on the properties and relationships used.
areaQuery = default_prefix_string() + """
    SELECT (?c AS ?uri) ?backgroundColor ?color ?fontSize ?x ?y ?w ?h ?text
    WHERE {
        ?c rdf:type segm:Area .
        ?c segm:belongsTo <""" + str(area_tree_iri) + """> .
        ?c segm:containsBox ?box .
        OPTIONAL { ?c box:backgroundColor ?backgroundColor } .
        OPTIONAL { ?box box:color ?color } .
        ?c box:fontSize ?fontSize .
        OPTIONAL { ?c segm:text ?text } .
        ?c box:bounds ?b . 

        ?b box:positionX ?x .
        ?b box:positionY ?y .
        ?b box:width ?w .
        ?b box:height ?h .
    }
"""

# Execute the SPARQL query
results = fl.sparql(areaQuery)

# Process the returned results.
for row in results:
    print(row)
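
The area query above embeds the area tree IRI by plain string concatenation. When the IRI comes from a less trusted source, it may be worth validating it first. The following is a minimal sketch: iri_ref is a hypothetical helper that is not part of flclient. It rejects the characters that the SPARQL grammar does not allow inside an IRI reference and wraps the rest in angle brackets. The example.org IRI is a placeholder, and the segm: prefix declaration is assumed to be prepended separately.

```python
import re

# Characters not allowed between '<' and '>' in a SPARQL IRI reference:
# < > " { } | ^ ` \ and all characters from U+0000 to U+0020
_FORBIDDEN = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def iri_ref(iri):
    """Return the IRI wrapped in angle brackets, or raise ValueError.

    Empty strings are rejected too, as a design choice of this sketch.
    """
    text = str(iri)
    if not text or _FORBIDDEN.search(text):
        raise ValueError(f"not a valid SPARQL IRI reference: {text!r}")
    return f"<{text}>"


# Build a query fragment from a placeholder IRI
area_tree_ref = iri_ref("http://example.org/art1")
query_body = """
    SELECT ?c
    WHERE {
        ?c segm:belongsTo """ + area_tree_ref + """ .
    }
"""
print(query_body)
```

The helper only checks the syntax of the reference. It does not check that the IRI identifies an existing artifact.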

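For further analysis, the rows usually need to be converted to numeric feature vectors. The sketch below assumes that each row behaves like a mapping from column names to values, which is what the row['areaTreeIri'] access above suggests. It is not a documented guarantee of the client. rows_to_features and the sample rows are hypothetical. Because the area query joins each area with its boxes, an area that contains several boxes yields several rows, so the helper keeps only the first usable row per area.

```python
NUMERIC_COLUMNS = ("x", "y", "w", "h", "fontSize")


def rows_to_features(rows):
    """Map each area URI to a list of floats built from NUMERIC_COLUMNS.

    Rows with missing or non-numeric values are skipped, and only the
    first usable row for each URI is kept.
    """
    features = {}
    for row in rows:
        try:
            uri = str(row["uri"])
            vector = [float(row[name]) for name in NUMERIC_COLUMNS]
        except (KeyError, TypeError, ValueError):
            continue
        features.setdefault(uri, vector)
    return features


# Made-up rows in the assumed shape (all values given as strings)
sample_rows = [
    {"uri": "http://example.org/a1", "color": "#000000",
     "x": "10", "y": "20", "w": "300", "h": "40", "fontSize": "12.5"},
    {"uri": "http://example.org/a1", "color": "#ff0000",
     "x": "10", "y": "20", "w": "300", "h": "40", "fontSize": "12.5"},
    {"uri": "http://example.org/a2", "color": "#000000",
     "x": "10", "y": "n/a", "w": "300", "h": "40", "fontSize": "12"},
]

features = rows_to_features(sample_rows)
print(features)
```
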
Server setup

The Python client requires a running, accessible FitLayout server. The easiest way to set one up is to use the fitlayout-local Docker image, which provides both the server and a web GUI for adding new web pages to the repository and post-processing them.
