Python client library for the FitLayout REST API
FitLayout - Python Client
(c) 2025 Radek Burget (burgetr@fit.vut.cz)
The FitLayout client connects to the FitLayout REST API server and allows obtaining artifact data, performing SPARQL queries on the artifact repository, modifying the repository content, or running remote FitLayout services. Its primary purpose is the implementation of algorithms for analyzing web pages in Python, where FitLayout serves as a source of data about rendered web pages, including details about the appearance and layout of individual content elements.
Usage
FitLayout uses the RDF data model to represent pages and other artifacts, and it defines a set of ontologies for describing the individual artifacts. The Python client is built on RDFLib, which provides an RDF API for Python. The main FitLayoutClient class provides several methods for obtaining the artifact data.
A simple usage example:
from flclient import FitLayoutClient, BOX

fl = FitLayoutClient("http://localhost:8400/api", "repositoryId")

# Get the IRIs of all rendered pages in the repository
pageIris = fl.artifacts(BOX.Page)
# Get the first page IRI (as an example)
pageIri = next(pageIris)
# Get the RDF graph of the first page (an rdflib Graph object)
pageGraph = fl.get_artifact(pageIri)

# Print all properties of the page itself (excluding "pngImage", which is too large)
# This uses the RDFLib API for filtering the RDF triples.
for s, p, o in pageGraph.triples((pageIri, None, None)):
    prop = p.fragment  # Omit the namespace from the property IRI
    if prop == "pngImage":
        continue
    print(f"{prop}: {o}")

# Get all content boxes from the page
for s, p, o in pageGraph.triples((None, BOX.belongsTo, pageIri)):
    boxIri = s
    # Get the box text (the box:text property)
    # See the ontology documentation for other box properties
    for bs, bp, bo in pageGraph.triples((boxIri, BOX.text, None)):
        print(f"Box: {boxIri}, text: {bo}")
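The loop above uses `p.fragment` to shorten property IRIs, which works for namespaces that end in `#`. For vocabularies whose namespaces end in `/`, the fragment is typically empty. The following is a minimal sketch of a fallback, using only the standard library; `local_name` is a hypothetical helper, not part of flclient or RDFLib, and the example IRIs are placeholders.

```python
def local_name(iri) -> str:
    """Return the part of an IRI after the last '#', or else after the last '/'.

    Hypothetical helper: accepts anything that converts to a string with str().
    """
    text = str(iri)
    if "#" in text:
        # Hash namespace, e.g. http://example.org/onto#pngImage
        return text.rsplit("#", 1)[1]
    # Slash namespace, e.g. http://purl.org/dc/terms/title
    return text.rstrip("/").rsplit("/", 1)[-1]


if __name__ == "__main__":
    for example in ("http://example.org/onto#pngImage",
                    "http://purl.org/dc/terms/title"):
        print(example, "->", local_name(example))
```

In the loop above, `local_name(p)` could stand in for `p.fragment` when a page graph mixes vocabularies of both kinds.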
The most efficient way to retrieve data for further analysis (e.g., machine learning) is to use SPARQL queries over the artifact repository. The following example uses one SPARQL query to retrieve all AreaTree artifacts and another SPARQL query to retrieve information about all distinguishable visual areas contained in a given area tree:
from flclient import FitLayoutClient, default_prefix_string

fl = FitLayoutClient("http://localhost:8400/api", "repositoryId")

# Find all AreaTree IRIs in the repository.
# The query returns CSV data with a single column 'areaTreeIri'.
listQuery = default_prefix_string() + """
SELECT ?areaTreeIri
WHERE {
    ?areaTreeIri rdf:type segm:AreaTree
}
"""

# Execute the SPARQL query.
area_tree_rows = fl.sparql(listQuery)
# Get the first AreaTree IRI (as an example)
area_tree_iri = next(area_tree_rows)['areaTreeIri']
print("Area tree IRI:", area_tree_iri)

# A SPARQL query for finding area properties within the specified area tree.
# For each area, retrieve its background color, text color, font size, position, dimensions, and text (if any).
# See the FitLayout ontology documentation for more details on the properties and relationships used.
areaQuery = default_prefix_string() + """
SELECT (?c AS ?uri) ?backgroundColor ?color ?fontSize ?x ?y ?w ?h ?text
WHERE {
    ?c rdf:type segm:Area .
    ?c segm:belongsTo <""" + str(area_tree_iri) + """> .
    ?c segm:containsBox ?box .
    OPTIONAL { ?c box:backgroundColor ?backgroundColor } .
    OPTIONAL { ?box box:color ?color } .
    ?c box:fontSize ?fontSize .
    OPTIONAL { ?c segm:text ?text } .
    ?c box:bounds ?b .
    ?b box:positionX ?x .
    ?b box:positionY ?y .
    ?b box:width ?w .
    ?b box:height ?h .
}
"""

# Execute the SPARQL query
results = fl.sparql(areaQuery)
# Process the returned results.
for row in results:
    print(row)
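The area query above splices the area tree IRI into the query text by string concatenation. When the IRI comes from elsewhere (user input, a file), it may be worth validating it first so that a malformed value cannot break or alter the query. The following is a minimal sketch under that assumption; `iri_ref` and `areas_in_tree_query` are hypothetical helpers, not part of flclient, and the simplified query only illustrates the technique.

```python
import re

# Characters that the SPARQL 1.1 grammar does not allow inside <...> (IRIREF):
# control characters and space, plus < > " { } | ^ ` and backslash.
_FORBIDDEN_IN_IRIREF = re.compile(r'[\x00-\x20<>"{}|^`\\]')


def iri_ref(iri) -> str:
    """Wrap an IRI in angle brackets, rejecting characters invalid in an IRIREF."""
    text = str(iri)
    if not text or _FORBIDDEN_IN_IRIREF.search(text):
        raise ValueError(f"Not a valid IRI for use in a SPARQL query: {text!r}")
    return f"<{text}>"


def areas_in_tree_query(area_tree_iri, prefixes: str = "") -> str:
    """Build a simplified area query for one area tree.

    `prefixes` is expected to hold the PREFIX declarations, for example the
    string returned by default_prefix_string() in the example above.
    """
    return prefixes + f"""
SELECT ?area ?text
WHERE {{
    ?area rdf:type segm:Area .
    ?area segm:belongsTo {iri_ref(area_tree_iri)} .
    OPTIONAL {{ ?area segm:text ?text }}
}}
"""
```

With the names from the example above, the query string could then be built as `areas_in_tree_query(area_tree_iri, default_prefix_string())` and passed to `fl.sparql`.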
Server setup
The Python client requires a running, accessible FitLayout server. The easiest way to set one up is to use the fitlayout-local Docker image, which provides both the server and a web GUI for adding new web pages to the repository and post-processing them.
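Before creating a client, a script can check whether anything is listening at the server address. The following is a minimal sketch using only the standard library; `server_is_reachable` is a hypothetical helper, not part of flclient, and it only tests that a TCP connection can be opened, not that the FitLayout API itself responds correctly.

```python
import socket
from urllib.parse import urlsplit


def server_is_reachable(api_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port of api_url succeeds."""
    try:
        parts = urlsplit(api_url)
        host = parts.hostname
        # Fall back to the default port of the scheme when none is given
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError:
        # Raised by urlsplit or .port for malformed URLs
        return False
    if host is None:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    # The URL matches the one used in the examples above
    url = "http://localhost:8400/api"
    print(url, "reachable:", server_is_reachable(url))
```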
Download files
File details
Details for the file flclient-0.0.1.tar.gz.
File metadata
- Download URL: flclient-0.0.1.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 111a6ac3f4596a6b55a38f7dafbb6b216b107b5dd2f8db178ed093559f6e0c15 |
| MD5 | d5ea981c88de27e9ee7a6e6b3211cddc |
| BLAKE2b-256 | 8d413028860389d7a10a5a0ae385f8986c923dae510cb4a2b0cd3aafab2961ab |
File details
Details for the file flclient-0.0.1-py3-none-any.whl.
File metadata
- Download URL: flclient-0.0.1-py3-none-any.whl
- Upload date:
- Size: 11.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c889f1b8546070a495d12191b6d2588c42c40ef20044b806e69457396ee26b9e |
| MD5 | c8a502230ec34213200d0d3179f1479a |
| BLAKE2b-256 | 16a68769006198642cafa751caa2df32c08fe2ed8d33a30b5469f1a20457464f |