Wrangle EXIF Data in Python
A Python package for wrangling EXIF data extracted from images using Phil Harvey's EXIFTool.
Set-up
Install pyexifwrangle with pip
$ pip install pyexifwrangle
Install Phil Harvey's EXIFTool
Install Phil Harvey's EXIFTool from https://exiftool.org/. This site has installation instructions if you need them.
Usage
Get EXIF data
After installing EXIFTool, use get_exif() to extract EXIF data from every image in a folder (including its subdirectories), save the results to a CSV file, and return them as a Pandas DataFrame.
import pyexifwrangle.wrangle as wr
df = wr.get_exif(input_dir='path/to/images', output_csv='path/to/output.csv')
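For orientation, a similar extraction can be done by calling EXIFTool directly. The sketch below is not pyexifwrangle's implementation. It assumes the exiftool executable is on your PATH and relies on EXIFTool's documented -csv (CSV output) and -r (recurse into subdirectories) options. The helper names build_exiftool_command and run_exiftool_to_csv are hypothetical.

```python
import shutil
import subprocess


def build_exiftool_command(input_dir):
    """Return the EXIFTool argument list for a recursive CSV dump."""
    # -csv: emit one CSV row per file; -r: recurse into subdirectories
    return ["exiftool", "-csv", "-r", input_dir]


def run_exiftool_to_csv(input_dir, output_csv):
    """Run EXIFTool and write its CSV output to output_csv."""
    if shutil.which("exiftool") is None:
        raise FileNotFoundError("exiftool was not found on PATH")
    with open(output_csv, "w", encoding="utf-8", newline="") as handle:
        # EXIFTool prints the CSV to stdout, so redirect stdout to the file.
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(build_exiftool_command(input_dir), stdout=handle, check=True)
```

Building the command as a list rather than a shell string avoids quoting problems when the image folder contains spaces.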
Wrangle the EXIF data in Python
If you have already used get_exif() to save the EXIF data to a CSV file, you can use read_exif() to load that file into a Pandas data frame. read_exif() returns the same data frame that get_exif() returned.
import pyexifwrangle.wrangle as wr
df = wr.read_exif('path/to/output.csv', filename_col='SourceFile')
The function wrangle.read_exif uses the Pandas package to load the CSV file into a data frame. The parameter filename_col is the name of the column that contains the image filenames. Each value in that column is the image's absolute file path, including the filename. After reading the EXIF data into a data frame, the function removes the rows for any files whose filename starts with '.'.
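To make the filename rule concrete, here is a plain-Python sketch of the check: a file counts as hidden when the last component of its path starts with '.'. This only illustrates the rule and is not the package's code. The helper name is_hidden and the /data prefix in the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def is_hidden(source_file):
    """Return True if the filename (last path component) starts with '.'."""
    return PurePosixPath(source_file).name.startswith(".")


paths = [
    "/data/Samsung_phones/s21/s21_1/blank/front/image1.jpg",
    "/data/Samsung_phones/s21/s21_1/blank/front/.DS_Store",
    "/data/Samsung_phones/s21/s21_1/blank/front/._image1.jpg",
]
# Keep only the entries whose filename does not start with '.'
kept = [p for p in paths if not is_hidden(p)]
```

Files such as .DS_Store and ._image1.jpg are created by macOS, which is one common way such entries end up in an image folder.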
I often organize my images into folders and sub-folders. For example, one of my projects has the following folder tree:
├── Samsung_phones # main directory
│ ├── s21 # model
│ │ ├── s21_1 # phone name
│ │ │ ├── blank # scene type
│ │ │ │ ├── front # camera
│ │ │ │ │ ├──image1.jpg
│ │ │ │ │ ├──image2.jpg
│ │ │ │ │ ├──...
│ │ │ │ ├── telephoto
│ │ │ │ │ ├──image1.jpg
│ │ │ │ │ ├──image2.jpg
│ │ │ │ │ ├──...
│ │ │ │ ├── ultra
│ │ │ │ │ ├──image1.jpg
│ │ │ │ │ ├──image2.jpg
│ │ │ │ │ ├──...
│ │ │ │ ├── wide
│ │ │ │ │ ├──image1.jpg
│ │ │ │ │ ├──image2.jpg
│ │ │ │ │ ├──...
│ │ │ ├── natural
│ │ │ │ ├── front
│ │ │ │ ├── telephoto
│ │ │ │ ├── ultra
│ │ │ │ ├── wide
│ │ ├── s21_2
│ │ │ ├── blank
│ │ │ │ ├── front
│ │ │ │ ├── telephoto
│ │ │ │ ├── ultra
│ │ │ │ ├── wide
│ │ │ ├── natural
│ │ │ │ ├── front
│ │ │ │ ├── telephoto
│ │ │ │ ├── ultra
│ │ │ │ ├── wide
Extract the folder names from the images' absolute filepaths and make a new column for each folder.
df = wr.filename2columns(df=df, filename_col='SourceFile', columns=['model', 'phone', 'scene_type', 'camera', 'image'])
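The idea can be sketched for a single path in plain Python. This assumes the column names are matched, in order, against the last components of the path, which is consistent with the folder tree above but is an assumption about the package's behaviour. The helper name path_to_columns and the /data prefix are hypothetical.

```python
from pathlib import PurePosixPath


def path_to_columns(source_file, columns):
    """Map the last len(columns) path components onto the given names."""
    parts = PurePosixPath(source_file).parts
    return dict(zip(columns, parts[-len(columns):]))


row = path_to_columns(
    "/data/Samsung_phones/s21/s21_1/blank/front/image1.jpg",
    ["model", "phone", "scene_type", "camera", "image"],
)
# row == {'model': 's21', 'phone': 's21_1', 'scene_type': 'blank',
#         'camera': 'front', 'image': 'image1.jpg'}
```

Counting components from the end of the path means the result does not depend on where the main directory sits on disk.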
Find images missing EXIF data. For example, search the data frame for images that don't have an Aperture.
missing = wr.check_missing_exif(df=df, column='Aperture')
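As an illustration of what such a check looks for, the sketch below scans a small CSV with the standard library and lists the files whose Aperture cell is empty. It is not the package's implementation. The sample rows are made up, and treating an empty cell as missing is an assumption.

```python
import csv
import io

# Made-up sample in the shape of an EXIF CSV export
csv_text = (
    "SourceFile,Aperture\n"
    "s21/s21_1/blank/front/image1.jpg,2.2\n"
    "s21/s21_1/blank/front/image2.jpg,\n"
)
rows = list(csv.DictReader(io.StringIO(csv_text)))
# Treat an empty (or whitespace-only) Aperture cell as missing
missing_files = [row["SourceFile"] for row in rows if not row["Aperture"].strip()]
```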
Group images by column(s) and count the number of images per group.
counts = wr.count_images_by_columns(df=df, columns=['model', 'phone', 'scene_type', 'camera'])
Optionally, you can sort the output of count_images_by_columns by passing a list of column names to the sorted parameter:
counts_sorted = wr.count_images_by_columns(df=df, columns=['model', 'phone', 'scene_type', 'camera'], sorted=['phone', 'camera', 'Aperture'])
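Grouping and counting can be pictured with a small plain-Python example. This is only a sketch of the idea, using collections.Counter instead of a data frame. The records are made up, and the sort shown here orders by the group keys, which may differ from how the package's sorted parameter behaves.

```python
from collections import Counter

# Made-up records, one per image
records = [
    {"phone": "s21_1", "camera": "front"},
    {"phone": "s21_1", "camera": "front"},
    {"phone": "s21_1", "camera": "wide"},
    {"phone": "s21_2", "camera": "front"},
]
group_by = ["phone", "camera"]
# Count images per unique (phone, camera) combination
counts = Counter(tuple(r[c] for c in group_by) for r in records)
# Sort the groups by their keys: phone first, then camera
counts_sorted = sorted(counts.items())
```

A group with an unexpectedly low count is a quick sign that images are missing from a folder.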