Archive user data from Mapillary

🗺️ Mapillary Downloader

Download your Mapillary data before it's gone.

▶️ Installation

Installation is optional: you can prefix the command with uvx or pipx run to download and run it in one go. Or if you're oldskool you can do:

pip install mapillary-downloader

❓ Usage

First, get your Mapillary API access token from the developer dashboard.

# Set token via environment variable (recommended)
export MAPILLARY_TOKEN=YOUR_TOKEN
mapillary-downloader USERNAME1 USERNAME2 USERNAME3

# Or pass token directly, and have it in your shell history 💩👀
mapillary-downloader --token YOUR_TOKEN USERNAME1 USERNAME2

# Download to specific directory
mapillary-downloader --output ./downloads USERNAME1

option         because                                       default
usernames      One or more Mapillary usernames               (required)
--token        Mapillary API token (or env var)              $MAPILLARY_TOKEN
--output       Output directory                              ./mapillary_data
--quality      256, 1024, 2048 or original                   original
--bbox         west,south,east,north                         None
--no-webp      Don't convert to WebP                         False
--max-workers  Maximum number of parallel download workers   CPU count
--no-tar       Don't tar bucket directories                  False
--no-check-ia  Don't check if exists on Internet Archive     False
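
For a feel of how the options above fit together, here is a rough argparse sketch that mirrors the table. It is not the tool's actual code: build_parser and parse_bbox are hypothetical names, and the bbox range checks are an assumption about what a sensible validator might do.

```python
import argparse
import os


def parse_bbox(text):
    """Parse 'west,south,east,north' into a tuple of four floats (hypothetical helper)."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise argparse.ArgumentTypeError("bbox needs west,south,east,north")
    west, south, east, north = parts
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise argparse.ArgumentTypeError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise argparse.ArgumentTypeError("latitudes must satisfy -90 <= south <= north <= 90")
    return (west, south, east, north)


def build_parser():
    """Declare the options from the table; defaults follow the 'default' column."""
    parser = argparse.ArgumentParser(prog="mapillary-downloader")
    parser.add_argument("usernames", nargs="+", help="One or more Mapillary usernames")
    # The token falls back to the environment variable when --token is not given.
    parser.add_argument("--token", default=os.environ.get("MAPILLARY_TOKEN"))
    parser.add_argument("--output", default="./mapillary_data")
    parser.add_argument("--quality", choices=["256", "1024", "2048", "original"], default="original")
    parser.add_argument("--bbox", type=parse_bbox, default=None)
    parser.add_argument("--no-webp", action="store_true")
    parser.add_argument("--max-workers", type=int, default=os.cpu_count())
    parser.add_argument("--no-tar", action="store_true")
    parser.add_argument("--no-check-ia", action="store_true")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

In this sketch a bbox with a negative west value has to be written as --bbox=-0.5,51.3,0.3,51.7, because argparse would otherwise read the leading minus as the start of another option.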

The downloader will:

  • 🏛️ Check Internet Archive to avoid duplicate downloads
  • 📷 Download multiple users' images organized by sequence
  • 📜 Inject EXIF metadata (GPS coordinates, camera info, timestamps, compass direction) and XMP data for panoramas.
  • 🗜️ Convert to WebP (by default) to save ~70% disk space
  • 🛟 Save progress every 5 minutes so you can safely resume if interrupted
  • 📦 Tar sequence directories (by default) for faster uploads to Internet Archive
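
The "save progress every 5 minutes" idea from the list above can be sketched roughly like this. It is only an illustration under stated assumptions: ProgressLog is a hypothetical class, and the JSON list it writes is not necessarily the format the downloader uses on disk.

```python
import json
import os
import time


class ProgressLog:
    """Tracks finished image IDs and writes them to disk at most once per interval."""

    def __init__(self, path, interval_seconds=300, clock=time.monotonic):
        self.path = path
        self.interval_seconds = interval_seconds
        self.clock = clock
        self.done = set()
        # Resume: load whatever an earlier run managed to save.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self.done = set(json.load(fh))
        self._last_save = clock()

    def mark_done(self, image_id):
        """Record one finished image and save if the interval has elapsed."""
        self.done.add(image_id)
        if self.clock() - self._last_save >= self.interval_seconds:
            self.save()

    def save(self):
        """Write to a temp file, then rename, so an interrupt never leaves a half-written file."""
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(sorted(self.done), fh)
        os.replace(tmp, self.path)
        self._last_save = self.clock()
```

A download loop would skip any image whose ID is already in log.done and call log.mark_done(image_id) after each successful download, plus one final log.save() at exit.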

🖼️ WebP Conversion

You'll need the cwebp binary installed:

# Debian/Ubuntu
sudo apt install webp

# macOS
brew install webp

To disable WebP conversion and keep original JPEGs, use --no-webp.
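
If you're curious what a cwebp conversion step looks like from Python, here is a minimal sketch. The function names, the quality value of 80 and the choice to pass -metadata all (so EXIF/XMP survive the conversion) are assumptions for illustration, not necessarily what the downloader does.

```python
import shutil
import subprocess
from pathlib import Path


def cwebp_command(src, quality=80):
    """Build the cwebp argument list for converting one JPEG to a .webp next to it."""
    src = Path(src)
    dst = src.with_suffix(".webp")
    return [
        "cwebp", "-quiet",
        "-q", str(quality),       # lossy quality factor, 0-100
        "-metadata", "all",       # copy EXIF, ICC and XMP from the input
        str(src), "-o", str(dst),
    ]


def convert_to_webp(src, quality=80):
    """Run cwebp on src; return the output path, or None if cwebp is not installed."""
    if shutil.which("cwebp") is None:
        return None
    cmd = cwebp_command(src, quality)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if cwebp fails
    return Path(cmd[-1])
```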

📦 Tarballs

Images are organized by capture date (YYYY-MM-DD) for incremental archiving:

mapillary-username-quality/
  2024-01-15/
    abc123/
      image1.webp
      image2.webp
    bcd456/
      image3.webp
  2024-01-16/
    def789/
      image4.webp

By default, these date directories are automatically tarred after download (2024-01-15.tar, 2024-01-16.tar, etc.). Reasons:

  • ⤴️ Incremental uploads. Add more to a collection. Well, eventually anyway. This won't work yet unless you delete the jsonl file and start again.
  • 📂 Fewer files - ~365 days/year × 10 years = 3,650 tars max. IA only wants 5k items per collection.
  • 🧨 Avoids blowing up IA's derive workers. We don't want Brewster's computers to create thumbs for 2 billion images.
  • 💾 I like to have a few inodes available for things other than this. I'm sure you do too.

To keep individual files instead of creating tars, use the --no-tar flag.
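
The tarring step described above boils down to something like the following sketch. tar_date_directories is a hypothetical helper, and using an uncompressed tar is an assumption based on the .tar names shown earlier.

```python
import shutil
import tarfile
from pathlib import Path


def tar_date_directories(collection_dir):
    """Pack each date directory into <date>.tar, then remove the directory."""
    collection_dir = Path(collection_dir)
    created = []
    for date_dir in sorted(p for p in collection_dir.iterdir() if p.is_dir()):
        tar_path = collection_dir / f"{date_dir.name}.tar"
        # Mode "w" writes a plain, uncompressed tar archive.
        with tarfile.open(str(tar_path), "w") as tar:
            tar.add(str(date_dir), arcname=date_dir.name)
        # Only delete the originals once the archive has been closed cleanly.
        shutil.rmtree(date_dir)
        created.append(tar_path)
    return created
```

Calling tar_date_directories("mapillary-username-quality") on the layout above would leave 2024-01-15.tar and 2024-01-16.tar in place of the two date directories.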

🏛️ Internet Archive upload

I've written a bash tool to rip media then tag, queue, and upload to The Internet Archive. The metadata is in the same format. If you symlink your ./mapillary_data dir to rip's 4.ship dir, your downloads will be queued for upload.

See inlay for details.

📊 Stats

To see overall project progress, or an estimate, use --stats.

🚧 Development

make dev      # Setup dev environment
make test     # Run tests. Note: requires `exiftool`
make dist     # Build the distribution
make help     # See other make options

⚖️ License

WTFPL with one additional clause

  1. Don't blame me

Do wtf you want, but don't blame me if it makes jokes about the size of your disk.

Download files

Source Distribution

mapillary_downloader-0.9.1.tar.gz (27.9 kB)

Uploaded Source

Built Distribution

mapillary_downloader-0.9.1-py3-none-any.whl (33.2 kB)

Uploaded Python 3

File details

Details for the file mapillary_downloader-0.9.1.tar.gz.

File metadata

  • Download URL: mapillary_downloader-0.9.1.tar.gz
  • Size: 27.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for mapillary_downloader-0.9.1.tar.gz
Algorithm    Hash digest
SHA256       71a7416a55cfe45c6011862fa00ff0a8e90c6d411caefd5ad61ba50346109002
MD5          d3a8466cb9979b7da6b14f8ff7c2501b
BLAKE2b-256  5704d66d8ce4a0383790102cbf2244681ea88c3961d840da792ea66e1e0d2d0b
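
If you download the source distribution by hand, the SHA256 digest listed above can be checked with a few lines of standard-library Python. This is a minimal sketch; sha256_of_file and matches_digest are hypothetical helper names, and the file path in the usage line is whatever you saved the download as.

```python
import hashlib

# SHA256 of mapillary_downloader-0.9.1.tar.gz, copied from the listing above.
EXPECTED_SDIST_SHA256 = "71a7416a55cfe45c6011862fa00ff0a8e90c6d411caefd5ad61ba50346109002"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Usage would be matches_digest("mapillary_downloader-0.9.1.tar.gz", EXPECTED_SDIST_SHA256), which should return True for an intact download.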

File details

Details for the file mapillary_downloader-0.9.1-py3-none-any.whl.

File hashes

Hashes for mapillary_downloader-0.9.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       ead3032925e27e9f6edf12b207e2f7d883558a421ebe36b3abf85fed2d3dadd0
MD5          e5719ad8d5970ac38ce778e6f7f6453d
BLAKE2b-256  cca5793c3dc97c1162623d90e3d608832d6cd8450c1f27f456e37776fcaea80f
