A helper package for wrangling image EXIF data

Wrangle EXIF Data in Python

A Python package for wrangling EXIF data extracted from images using Phil Harvey's ExifTool.

Set-up

Install pyexifwrangle with pip

$ pip install pyexifwrangle

Install Phil Harvey's ExifTool

Download ExifTool from https://exiftool.org/. The site also provides installation instructions if you need them.

Usage

Get EXIF data using ExifTool

After installing ExifTool, you can use the terminal to extract EXIF data from every image in a folder and save the results in a CSV file. Open a terminal and change to the directory that contains ExifTool. On my computer, this step looks like:

$ cd ~/Documents/Image-ExifTool-12.49

Extract the EXIF data and save it in a CSV file:

$ exiftool -csv -r absolute_path/to/image_directory > path/to/output.csv

The image_directory can contain subdirectories. The -r flag tells ExifTool to recurse into them and include their images in the output.
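If you would rather run this step from a script, the same command can be launched from Python. The following is a minimal sketch and not part of pyexifwrangle: the helper names build_exiftool_command and export_exif_csv are hypothetical, and the sketch assumes the exiftool executable is on your PATH. It uses only the -csv and -r flags shown above and writes the tool's standard output to the CSV file in place of the shell redirect.

```python
import subprocess
from pathlib import Path


def build_exiftool_command(image_dir):
    """Return the argument list for `exiftool -csv -r <image_dir>`."""
    return ["exiftool", "-csv", "-r", str(image_dir)]


def export_exif_csv(image_dir, output_csv):
    """Run ExifTool and write its CSV output to output_csv.

    Hypothetical helper: assumes the exiftool executable is on PATH.
    """
    output_csv = Path(output_csv)
    with output_csv.open("w", encoding="utf-8") as handle:
        # stdout=handle plays the role of the shell's `>` redirect.
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(build_exiftool_command(image_dir), stdout=handle, check=True)
    return output_csv
```

Calling export_exif_csv('absolute_path/to/image_directory', 'path/to/output.csv') would then stand in for the terminal command.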

Wrangle the EXIF data in Python

Load the CSV file into Python.

import pyexifwrangle.wrangle as wr

df = wr.read_exif('path/to/output.csv', filename_col='SourceFile')

The function wrangle.read_exif uses the pandas package to load the CSV file into a data frame. The parameter filename_col is the name of the column that holds each image's filename, including its absolute file path. After loading the data, the function drops every image whose filename starts with '.' (hidden files).
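To make the behavior described above concrete, here is a rough pandas equivalent. It is a sketch under assumptions, not the package's actual implementation: read_and_filter is a hypothetical name, the sample CSV is made up, and the sketch assumes paths use forward slashes.

```python
import io

import pandas as pd


def read_and_filter(csv_source, filename_col="SourceFile"):
    """Load an ExifTool CSV and drop rows whose file name starts with '.'."""
    df = pd.read_csv(csv_source)
    # Look only at the final path component, e.g. '.DS_Store'
    # (assumes forward-slash separators).
    basenames = df[filename_col].str.split("/").str[-1]
    return df[~basenames.str.startswith(".")].reset_index(drop=True)


# Tiny made-up CSV standing in for real ExifTool output
sample = io.StringIO(
    "SourceFile,Aperture\n"
    "/data/s21/image1.jpg,1.8\n"
    "/data/s21/.DS_Store,\n"
)
filtered = read_and_filter(sample)
```

Here filtered keeps only the row for image1.jpg, because the hidden file's basename begins with a dot.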

I often organize my images into folders and sub-folders. For example, one of my projects has the following folder tree:

├── Samsung_phones  # main directory
│   ├── s21  # model
│   │   ├── s21_1  # phone name
│   │   │   ├── blank  # scene type
│   │   │   │   ├── front  # camera
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   │   ├── telephoto
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   │   ├── ultra
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   │   ├── wide
│   │   │   │   │   ├── image1.jpg
│   │   │   │   │   ├── image2.jpg
│   │   │   │   │   ├── ...
│   │   │   ├── natural
│   │   │   │   ├── front
│   │   │   │   ├── telephoto
│   │   │   │   ├── ultra
│   │   │   │   ├── wide
│   │   ├── s21_2
│   │   │   ├── blank
│   │   │   │   ├── front
│   │   │   │   ├── telephoto
│   │   │   │   ├── ultra
│   │   │   │   ├── wide
│   │   │   ├── natural
│   │   │   │   ├── front
│   │   │   │   ├── telephoto
│   │   │   │   ├── ultra
│   │   │   │   ├── wide

Extract the folder names from the images' absolute filepaths and make a new column for each folder.

df = wr.filename2columns(df=df, filename_col='SourceFile', columns=['model', 'phone', 'scene_type', 'camera', 'image'])
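The idea behind this step can be illustrated with the standard library alone. The sketch below is an assumption about the mapping, not the package's code: it pairs the column names with the trailing components of the path, so the last name receives the image filename. The function name path_to_columns and the /home/user prefix are hypothetical.

```python
from pathlib import PurePosixPath


def path_to_columns(filepath, columns):
    """Map the last len(columns) components of filepath onto the column names."""
    parts = PurePosixPath(filepath).parts
    if len(parts) < len(columns):
        raise ValueError(f"{filepath!r} has fewer than {len(columns)} components")
    return dict(zip(columns, parts[-len(columns):]))


# Made-up absolute path that follows the folder tree shown earlier
row = path_to_columns(
    "/home/user/Samsung_phones/s21/s21_1/blank/front/image1.jpg",
    ["model", "phone", "scene_type", "camera", "image"],
)
```

With this path, row maps model to s21, phone to s21_1, scene_type to blank, camera to front, and image to image1.jpg.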

Find images missing EXIF data. For example, search the data frame for images that don't have an Aperture.

missing = wr.check_missing_exif(df=df, column='Aperture')
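In plain pandas, a check like this amounts to selecting the rows where the column is empty. The following sketch assumes that "missing" means a NaN or None value. rows_missing_value is a hypothetical name and the data is made up, so treat it as an illustration and not as the package's implementation.

```python
import pandas as pd


def rows_missing_value(df, column):
    """Return the rows of df where column is NaN or None."""
    return df[df[column].isna()]


# Made-up EXIF table: the second image has no Aperture value
exif = pd.DataFrame(
    {
        "SourceFile": ["a/image1.jpg", "a/image2.jpg", "a/image3.jpg"],
        "Aperture": [1.8, None, 2.2],
    }
)
missing_rows = rows_missing_value(exif, "Aperture")
```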

Group images by column(s) and count the number of images per group.

counts = wr.count_images_by_columns(df=df, columns=['model', 'phone', 'scene_type', 'camera'])
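For readers who want to see what such a count looks like, here is a sketch using a pandas groupby. It is an assumption that the package does something similar; count_by, the count column name, and the sample data are hypothetical.

```python
import pandas as pd


def count_by(df, columns):
    """Count rows per unique combination of the given columns."""
    return df.groupby(columns).size().reset_index(name="count")


# Made-up table with one row per image
images = pd.DataFrame(
    {
        "phone": ["s21_1", "s21_1", "s21_1", "s21_2"],
        "camera": ["front", "front", "wide", "front"],
    }
)
per_group = count_by(images, ["phone", "camera"])
```

The result has one row per (phone, camera) pair, with the number of images in the count column.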

Optionally, you can sort the output of count_images_by_columns:

counts_sorted = wr.count_images_by_columns(df=df, columns=['model', 'phone', 'scene_type', 'camera'], sorted=['phone', 'camera', 'Aperture'])
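Sorting a table of counts can be sketched with pandas sort_values. Whether the sorted parameter works this way internally is an assumption, and the data below is made up. Note that in plain pandas the sort keys must be columns of the frame being sorted, otherwise sort_values raises a KeyError.

```python
import pandas as pd

# Made-up counts table, deliberately out of order
counts = pd.DataFrame(
    {
        "phone": ["s21_2", "s21_1", "s21_1"],
        "camera": ["front", "wide", "front"],
        "count": [1, 1, 2],
    }
)
# Sort by phone first, then by camera within each phone
ordered = counts.sort_values(["phone", "camera"]).reset_index(drop=True)
```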
