Process OpenStreetMap tile logs

Tilelog

Tilelog is used to generate tile logs for the OSMF Standard map layer.

Requirements

  • Access to Athena on the OSMF AWS account with the logs.
  • Python 3.6+

Install

For local development, install uv and run:

uv venv
uv run tilelog --help

Usage

Usage: tilelog [OPTIONS]

Options:
  --date [%Y-%m-%d]   Date to generate logs for. Defaults to yesterday.
  --staging TEXT      AWS s3 location for Athena results
  --generate-success  Create logs of successful requests in Parquet
  --region TEXT       Region for Athena
  --tile FILENAME     File to output tile usage logs to
  --host FILENAME     File to output host usage logs to
  --app FILENAME      File to output app usage logs to
  --help              Show this message and exit.

For example, to generate all three logs for yesterday:

DATE=$(date -u -d "1 day ago" "+%Y-%m-%d")
tilelog --date ${DATE} --tile tiles-${DATE}.txt.xz --host hosts-${DATE}.csv --app apps-${DATE}.csv
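
The shell example above relies on GNU date. As a rough portable alternative, the same invocation can be assembled in Python. This is only a sketch: the helper names yesterday_utc and build_command are made up for illustration and are not part of tilelog, and only the options listed above are used.

```python
from datetime import datetime, timedelta, timezone


def yesterday_utc(now=None):
    """Return yesterday's UTC date formatted as YYYY-MM-DD."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=1)).strftime("%Y-%m-%d")


def build_command(date):
    """Build the tilelog argument list for one day's logs (hypothetical helper)."""
    return [
        "tilelog",
        "--date", date,
        "--tile", f"tiles-{date}.txt.xz",
        "--host", f"hosts-{date}.csv",
        "--app", f"apps-{date}.csv",
    ]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run() to actually execute tilelog.
    print(" ".join(build_command(yesterday_utc())))
```

Building the argument list separately from running it makes it easy to inspect the command before it touches Athena.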

--generate-success can only be run once for each day, so it should generally not be used during development, as it will interfere with production.

Format documentation

Tile logs

Tile logs contain the number of requests per tile in a given 24-hour UTC day. For privacy reasons, only tiles with at least 10 requests from at least 3 unique IPs are included. Requests that were blocked, invalid, or unable to be served due to server load (4xx and 5xx errors) are not included.
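
The privacy rule above can be illustrated with a small in-memory filter. This is a sketch only: tilelog itself works on logs queried through Athena, and the function name publishable_tiles and the (tile, ip) input shape are assumptions made for this example.

```python
from collections import defaultdict

MIN_REQUESTS = 10   # minimum requests per tile, as stated above
MIN_UNIQUE_IPS = 3  # minimum distinct IPs per tile, as stated above


def publishable_tiles(requests):
    """Apply both thresholds to successful requests.

    requests: iterable of (tile, ip) pairs (hypothetical input shape).
    Returns {tile: request_count} for tiles that meet both thresholds.
    """
    counts = defaultdict(int)
    ips = defaultdict(set)
    for tile, ip in requests:
        counts[tile] += 1
        ips[tile].add(ip)
    return {
        tile: n
        for tile, n in counts.items()
        if n >= MIN_REQUESTS and len(ips[tile]) >= MIN_UNIQUE_IPS
    }
```

A tile with many requests from only two addresses is dropped, as is a tile with many distinct addresses but fewer than 10 requests.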

Each line describes one tile in the form z/x/y N, where z/x/y is the conventional tile coordinate and N is the number of requests.

No particular sorting order of lines is guaranteed.
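
A minimal reader for this format might look like the following. It assumes the file is xz-compressed, based only on the .txt.xz name used in the usage example; the function names are illustrative and not part of tilelog.

```python
import lzma


def parse_tile_line(line):
    """Parse one 'z/x/y N' line into (z, x, y, requests)."""
    coord, count = line.split()
    z, x, y = (int(part) for part in coord.split("/"))
    return z, x, y, int(count)


def read_tile_log(path):
    """Yield parsed tiles from an xz-compressed tile log (assumed format)."""
    with lzma.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            if line.strip():  # ignore blank lines
                yield parse_tile_line(line)
```

Because no line order is guaranteed, callers that need sorted output should sort the parsed tuples themselves, for example with sorted(read_tile_log(path)).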

Host logs

Host logs contain the website host of sites using tile.openstreetmap.org, their average requests/second, and their average requests/second that were cache misses in a given 24-hour UTC day. For privacy reasons, only sites with at least 432000 requests per day (an average of 5 requests/second) are included. Requests that were blocked, invalid, or unable to be served due to server load (4xx and 5xx errors) are not included.

The host will normally be a valid domain name, but as this data comes from user requests it may contain other text.

The format is one host per line, in the CSV format "HOST",N,M where HOST is the host name, with special characters escaped, N is the requests/second, and M is the requests/second that were cache misses. Hosts are ordered by requests/second.

The following may change in the future:

  • Additional fields added at the end
  • The definition of "cache miss"
  • The threshold for requests/day to be included
  • Handling of invalid domains and empty referers
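
The host format can be read with Python's csv module, as in the sketch below. It assumes standard CSV quoting (the exact escaping scheme is not specified here), reads only the first three columns so that fields added at the end would be ignored, and uses invented sample rows. The name parse_usage_csv is hypothetical; the app logs described below use the same three-column layout.

```python
import csv
import io

SECONDS_PER_DAY = 86400  # 432000 requests/day / 86400 s = 5 requests/second


def parse_usage_csv(text):
    """Yield (name, requests_per_second, miss_per_second) tuples.

    Only the first three columns are read, so extra trailing
    fields are ignored rather than treated as errors.
    """
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        yield row[0], float(row[1]), float(row[2])


# Invented sample data; the second row has an empty host name.
sample = '"www.example.com",5.0,1.25\n"",7.5,2.0\n'

for name, rps, miss_rps in parse_usage_csv(sample):
    # Convert the average rate back to an approximate daily total.
    print(name or "(empty)", round(rps * SECONDS_PER_DAY), miss_rps)
```

Since the host value comes from user requests, consumers should treat it as untrusted text rather than assume it is a valid domain name.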

App logs

App logs cover non-website usage, primarily from stand-alone mobile and desktop programs. The app name is derived from a combination of User-Agent, X-Requested-With, and non-website Referer.

Multiple app versions are combined and indicated with *.

The format is one app per line, in the CSV format "APP",N,M where APP is the app name, with special characters escaped, N is the requests/second, and M is the requests/second that were cache misses. Apps are ordered by requests/second.

The following may change in the future:

  • Additional fields added at the end
  • The definition of "cache miss"
  • The threshold for requests/day to be included
  • Combining of app versions

Contributing

Unfortunately, testing tilelog requires access to private logs, which makes changes difficult to test.

Code style follows flake8.

Licence

Copyright (C) 2021-2022 Paul Norman

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
