Wrangle EXIF Data in Python

A Python package for wrangling EXIF data extracted from images using Phil Harvey's ExifTool.

Set-up

Install pyexifwrangle with pip

$ pip install pyexifwrangle

Install Phil Harvey's ExifTool

Install Phil Harvey's ExifTool from https://exiftool.org/. The site includes installation instructions for each platform.

Usage

Get EXIF data

After installing ExifTool, use get_exif() to extract EXIF data from every image in a folder and its subdirectories, save the results to a CSV file, and return them as a Pandas DataFrame.

import pyexifwrangle.wrangle as wr

df = wr.get_exif(input_dir='path/to/images', output_csv='path/to/output.csv')
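For orientation, the extraction step corresponds roughly to running ExifTool recursively with CSV output. The sketch below is not pyexifwrangle's implementation: build_exiftool_command and run_exiftool_to_csv are hypothetical helper names, and the code assumes the exiftool executable is on your PATH. It relies only on ExifTool's -csv and -r options.

```python
import subprocess


def build_exiftool_command(input_dir):
    """Build the ExifTool command line (hypothetical helper, not part of pyexifwrangle)."""
    # -csv: write tabular CSV output whose first column is SourceFile
    # -r:   recurse into subdirectories
    return ["exiftool", "-csv", "-r", str(input_dir)]


def run_exiftool_to_csv(input_dir, output_csv):
    """Run ExifTool and redirect its CSV output to output_csv.

    Assumes exiftool is installed. Returns ExifTool's exit code, which can be
    non-zero when individual files could not be read.
    """
    with open(output_csv, "w", encoding="utf-8") as f:
        result = subprocess.run(build_exiftool_command(input_dir), stdout=f)
    return result.returncode
```

The resulting file has one row per image and a SourceFile column, which is the column name used for filename_col in the examples that follow.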

Load the EXIF data

If you already used get_exif() to save the EXIF data to a CSV file, you can use read_exif() to load that file into a Pandas data frame. In this case, read_exif() returns the same data frame that get_exif() returned.

import pyexifwrangle.wrangle as wr

df = wr.read_exif('path/to/output.csv', filename_col='SourceFile')

The function wrangle.read_exif uses Pandas to load the CSV file into a data frame. The filename_col parameter is the name of the column that contains the image filenames; each entry in that column includes the file's absolute path. After loading the data, the function removes any images whose filename starts with '.'.
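The filtering step described above can be sketched with plain Pandas. This is an illustration, not the package's code: drop_hidden_files is a hypothetical name, the sample paths are made up, and it assumes that "filename" means the last component of the path.

```python
import io
import os

import pandas as pd


def drop_hidden_files(df, filename_col="SourceFile"):
    """Remove rows whose filename (last path component) starts with '.'."""
    basenames = df[filename_col].map(os.path.basename)
    return df[~basenames.str.startswith(".")].reset_index(drop=True)


# Made-up CSV content standing in for ExifTool output
csv_text = (
    "SourceFile,Aperture\n"
    "/data/s21/image1.jpg,1.8\n"
    "/data/s21/.DS_Store,\n"
)
df_demo = drop_hidden_files(pd.read_csv(io.StringIO(csv_text)))
print(df_demo["SourceFile"].tolist())
```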

Make columns from folder names

I often organize my images into folders and sub-folders. For example, one of my projects has the following folder tree:

├── Samsung_phones  # main directory
│   ├── s21  # model
│   │   ├── s21_1  # phone name
│   │   │   ├── blank  # scene type
│   │   │   │   ├── front  # camera
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   │   ├── telephoto
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   │   ├── ultra
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   │   ├── wide
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   ├── natural
│   │   │   │   ├── front
│   │   │   │   ├── telephoto
│   │   │   │   ├── ultra
│   │   │   │   ├── wide
│   │   ├── s21_2
│   │   │   ├── blank
│   │   │   │   ├── front
│   │   │   │   ├── telephoto
│   │   │   │   ├── ultra
│   │   │   │   ├── wide
│   │   │   ├── natural
│   │   │   │   ├── front
│   │   │   │   ├── telephoto
│   │   │   │   ├── ultra
│   │   │   │   ├── wide

Extract the folder names from the images' absolute filepaths and make a new column for each folder.

df = wr.filename2columns(df=df, filename_col='SourceFile', columns=['model', 'phone', 'scene_type', 'camera', 'image'])
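To make the mapping concrete, here is a minimal sketch of how the trailing components of one path line up with the column names. It assumes the last five path components correspond to the five names in order; split_path_into_fields and the /home/user prefix are hypothetical and used only for illustration.

```python
from pathlib import PurePosixPath


def split_path_into_fields(path, columns):
    """Map the last len(columns) path components onto the given column names."""
    parts = PurePosixPath(path).parts[-len(columns):]
    return dict(zip(columns, parts))


fields = split_path_into_fields(
    "/home/user/Samsung_phones/s21/s21_1/blank/front/image1.jpg",
    ["model", "phone", "scene_type", "camera", "image"],
)
print(fields)
# {'model': 's21', 'phone': 's21_1', 'scene_type': 'blank', 'camera': 'front', 'image': 'image1.jpg'}
```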

Search for missing EXIF data

Find images that are missing EXIF data. For example, search the data frame for images that don't have an Aperture value.

missing = wr.check_missing_exif(df=df, column='Aperture')
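A check like this can be approximated in plain Pandas by selecting rows where the column is empty. The sketch below assumes "missing" means NaN or None in the data frame; rows_missing_value is a hypothetical stand-in, not the package's function, and the sample data is made up.

```python
import pandas as pd


def rows_missing_value(df, column):
    """Return the rows where `column` is NaN or None."""
    return df[df[column].isna()]


# Made-up data: the second image has no Aperture value
df_demo = pd.DataFrame(
    {
        "SourceFile": ["a/image1.jpg", "a/image2.jpg", "a/image3.jpg"],
        "Aperture": [1.8, None, 2.2],
    }
)
missing_demo = rows_missing_value(df_demo, "Aperture")
print(missing_demo["SourceFile"].tolist())
```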

Count images per group(s)

Group images by column(s) and count the number of images per group.

counts = wr.count_images_by_columns(df=df, columns=['model', 'phone', 'scene_type', 'camera'])

Optionally, you can sort the output of count_images_by_columns:

counts_sorted = wr.count_images_by_columns(
    df=df,
    columns=['model', 'phone', 'scene_type', 'camera'],
    sorted=['phone', 'camera', 'Aperture'],
)
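For comparison, grouping and counting can be written directly with Pandas groupby. This is a sketch on made-up data, not the package's implementation, and the output column name "count" is an assumption.

```python
import pandas as pd

# Made-up data: three images from s21_1 and one from s21_2
df_demo = pd.DataFrame(
    {
        "phone": ["s21_1", "s21_1", "s21_1", "s21_2"],
        "camera": ["front", "front", "wide", "front"],
    }
)

counts_demo = (
    df_demo.groupby(["phone", "camera"])
    .size()                             # number of rows (images) per group
    .reset_index(name="count")          # turn the group keys back into columns
    .sort_values(["phone", "camera"])   # optional explicit sort
    .reset_index(drop=True)
)
print(counts_demo)
```

Here the sort columns are a subset of the grouping columns, so every sort key is present in the counted output.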
