Skip to main content

Process OpenStreetMap tile logs

Project description

Tilelog

Tilelog is used to generate tile logs for the OSMF Standard map layer.

Requirements

  • Access to Athena on the OSMF AWS account with the logs.
  • Python 3.6+

Install

For local development install uv and

uv venv
uv run tilelog --help

Usage

Usage: tilelog [OPTIONS]

Options:
  --date [%Y-%m-%d]   Date to generate logs for. Defaults to yesterday.
  --staging TEXT      AWS s3 location for Athena results
  --generate-success  Create logs of successful requests in Parquet
  --region TEXT       Region for Athena
  --tile FILENAME     File to output tile usage logs to
  --host FILENAME     File to output host usage logs to
  --app FILENAME      File to output app usage logs to
  --help              Show this message and exit.

e.g.

DATE=$(date -u -d "1 day ago" "+%Y-%m-%d")
tilelog --date ${DATE} --tile tiles-${DATE}.txt.xz --host hosts-${DATE}.csv --app apps-${DATE}.csv

--generate-success can only be run once for each day, so if doing development, should not generally be run as it will interfere with production.

Format documentation

Tile logs

Tile logs contain the number of requests per tile in a given 24 hour UTC day. Only tiles where at least 10 requests were made and requests came from at least 3 unique IPs are included for privacy reasons. Requests that were blocked, invalid, or unable to be served due to server load are not included (4xx and 5xx errors).

The format is one tile per line, in the format z/x/y N where z/x/y is the conventional tile coordinate and N is the number of requests.

No particular sorting order of lines is guaranteed.

Host logs

Host logs contain the website host of sites using tile.openstreetmap.org, their average requests/second, and their average requests/second that were cache misses in a given 24 hour UTC day. For privacy reasons, only sites with at least 432000 requests per day (5 requests/second average) coming from the site are included. Requests that were blocked, invalid, or unable to be served due to server load are not included (4xx and 5xx errors).

The host will normally be a valid domain name, but as this data comes from user requests it may contain other text.

The format is one host per line, in the CSV format "HOST",N,M where HOST is the host name, with special characters escaped, N is the requests/second, and M is the requests/second that were cache misses. Hosts are ordered by requests/second

The following may change in the future

  • Additional fields added at the end
  • The definition of "cache miss"
  • The theshold for requests/day to be included
  • Handling of invalid domains and empty referers

App logs

App logs contain the referer of non-website usage, primarily from stand-alone mobile and desktop programs. App name is derived from a combination of User-Agent, X-Requested-With, and non-website Referer.

Multiple app versions are combined and indicated with *.

The format is app host per line, in the CSV format "APP",N,M where APP is the app name, with special characters escaped, N is the requests/second, and M is the requests/second that were cache misses. Apps are ordered by requests/second

The following may change in the future

  • Additional fields added at the end
  • The definition of "cache miss"
  • The theshold for requests/day to be included
  • Combining of app versions

Contributing

Unfortunately, testing tilelogs requires access to private logs, making it difficult to test.

Style is flake8

Licence

Copyright (C) 2021-2022 Paul Norman

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tilelog-1.8.0.tar.gz (19.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

tilelog-1.8.0-py3-none-any.whl (23.0 kB view details)

Uploaded Python 3

File details

Details for the file tilelog-1.8.0.tar.gz.

File metadata

  • Download URL: tilelog-1.8.0.tar.gz
  • Upload date:
  • Size: 19.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tilelog-1.8.0.tar.gz
Algorithm Hash digest
SHA256 daa74e0bde9506eac8537474bada911d5faec98ca5d85bda0baf2810173821c3
MD5 5bb692e4bec1d5e9948cc84ff8c4089a
BLAKE2b-256 fdcc75847507c6f12af685bd89d20d050c6b28a69ac5e4c22fccdd7d621dfd91

See more details on using hashes here.

Provenance

The following attestation bundles were made for tilelog-1.8.0.tar.gz:

Publisher: package.yml on openstreetmap/tilelog

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tilelog-1.8.0-py3-none-any.whl.

File metadata

  • Download URL: tilelog-1.8.0-py3-none-any.whl
  • Upload date:
  • Size: 23.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tilelog-1.8.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5ec094cb8965b5417a5db6f6386c1cf3ce3e8ddc0270a3ad812cb069c5a06377
MD5 d46ad4fed45c26d16169c66ef9e6cf25
BLAKE2b-256 27bceebc124a0066e37b54d4d21ab8d9b0774eca8816a8e185fc5b35d52d931b

See more details on using hashes here.

Provenance

The following attestation bundles were made for tilelog-1.8.0-py3-none-any.whl:

Publisher: package.yml on openstreetmap/tilelog

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page