Analyzes the entire history of a macOS Messages conversation

iMessage Conversation Analyzer

Copyright 2020-2026 Caleb Evans
Released under the MIT license

iMessage Conversation Analyzer (ICA) is a fully-typed Python library (and CLI utility) that will read the contents of an iMessage conversation via the Messages database on macOS. You can then gather various metrics of interest from the messages in that conversation.

Much of this program was inspired by and built using findings from this blog post by Yorgos Askalidis.

Installation

Open a Terminal and run the following:

pip3 install imessage-conversation-analyzer

You can also install ICA via uv:

uv tool install imessage-conversation-analyzer

Usage

The package includes both a Command Line API for simplicity/convenience, as well as a Python API for developers who want maximum flexibility.

Command Line API

To use ICA from the command line, run the ica command from the Terminal. The minimum required arguments are:

  1. A path to an analyzer file to run, or the name of a built-in analyzer
  2. The first and last name of the contact(s), via the --contact / -c flag
    1. If the contact has no last name on record, you can just pass the first name
    2. You can also pass any phone number or email address associated with the contact; keep in mind that analysis will still run on all phone numbers / email addresses associated with the contact, not just the one you specify
    3. For group chats, simply pass multiple --contact / -c flags

Example

ica message_totals -c 'Thomas Riverstone' -c 'Daniel Brightingale'

The above command outputs a table like:

Metric               Total
Messages             20036
Messages From Me      7000
Messages From Daniel  6501
Messages From Thomas  6535
Reactions             4880
Reactions From Me     1700
Reactions From Daniel 1675
Reactions From Thomas 1505
Days Messaged          115
Days Missed              0
Days With No Reply       0

Built-in analyzers

ICA includes several built-in analyzers out of the box:

  1. message_totals: a summary of message and reaction counts, by person and in total, as well as other insightful metrics
  2. attachment_totals: lists count data by attachment type, including number of Spotify links shared, YouTube videos, Apple Music, etc.
  3. most_frequent_emojis: count data for the top 10 most frequently used emojis across the entire conversation
  4. totals_by_day: a comprehensive breakdown of message totals for every day you and the other participants have been messaging in the conversation
  5. transcript: a full, unedited transcript of every message, including reactions, between you and the other participants (attachment files not included)
  6. count_phrases: count the number of case-insensitive occurrences of any arbitrary strings across all messages in a conversation (excluding reactions); use the -s / --case-sensitive option for case-sensitive counts, and the -r / --use-regex option to enable regular expression mode for all phrases you specify
  7. from_sql: execute an arbitrary SQL query against the conversation data (messages and attachments), using an in-memory SQLite database
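
To make the count_phrases options concrete, here is a minimal sketch of how literal, case-sensitive, and regular-expression counting differ. It is not ICA's implementation: the count_phrase function and its parameters are hypothetical names chosen for illustration, and it operates on a plain list of strings rather than a real conversation.

```python
import re


def count_phrase(messages, phrase, case_sensitive=False, use_regex=False):
    """Count occurrences of a phrase across a list of message strings.

    Hypothetical helper for illustration only; not part of the ICA API.
    """
    # Treat the phrase as literal text unless regex mode is requested
    pattern = phrase if use_regex else re.escape(phrase)
    # Counts are case-insensitive by default, mirroring the CLI description
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)
    return sum(1 for text in messages for _ in compiled.finditer(text))


messages = ["Hello there", "hello again, HELLO!", "Goodbye"]

print(count_phrase(messages, "hello"))  # 3
print(count_phrase(messages, "hello", case_sensitive=True))  # 1
print(count_phrase(messages, r"good(bye|night)", use_regex=True))  # 1
```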

Filtering

There are several built-in flags you can use to filter messages and attachments.

  • --from-date: A start date to filter messages by (inclusive); the format must be ISO 8601-compliant, e.g. YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS
  • --to-date: An end date to filter messages by (exclusive); the format must be ISO 8601-compliant, e.g. YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS
  • --from-person / -p: A reference to the person by whom to filter messages; accepted values can be me, them, all, or another participant; you can specify another participant using their first name, full name, phone number, or email address (defaults to all); to filter by multiple people, pass this flag multiple times (e.g. -p Thomas -p Daniel)

ica message_totals -c 'Thomas Riverstone' --from-date 2024-12-01 --to-date 2025-01-01 --from-person 'Thomas'
# Filtering by more than one person
ica message_totals -c 'Thomas Riverstone' -c 'Daniel Brightingale' --from-date 2024-12-01 --to-date 2025-01-01 --from-person 'Thomas' --from-person 'Daniel'
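
Because --from-date is inclusive and --to-date is exclusive, the December example above keeps a message sent at midnight on 2024-12-01 but drops one sent at midnight on 2025-01-01. The following sketch illustrates that boundary behavior with the standard library only. The in_date_range function is a hypothetical helper, not ICA code, and it uses naive datetimes for simplicity.

```python
from datetime import datetime


def in_date_range(timestamp, from_date=None, to_date=None):
    """Return True if timestamp falls within [from_date, to_date).

    Hypothetical helper for illustration; dates are ISO 8601 strings.
    """
    start = datetime.fromisoformat(from_date) if from_date else None
    end = datetime.fromisoformat(to_date) if to_date else None
    # The start date is inclusive
    if start is not None and timestamp < start:
        return False
    # The end date is exclusive
    if end is not None and timestamp >= end:
        return False
    return True


timestamps = [
    datetime(2024, 11, 30, 23, 59),
    datetime(2024, 12, 1, 0, 0),
    datetime(2024, 12, 31, 23, 59),
    datetime(2025, 1, 1, 0, 0),
]
kept = [
    ts for ts in timestamps if in_date_range(ts, "2024-12-01", "2025-01-01")
]
# Only the two December timestamps remain
print(kept)
```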

Other formats

You can optionally pass the -f/--format flag to output to a specific format like CSV (supported formats include csv, excel/xlsx, markdown/md, and json).

ica message_totals -c 'Thomas Riverstone' -f csv
ica ./my_custom_analyzer.py -c 'Thomas Riverstone' -f csv

Writing to a file

Finally, there is an optional -o/--output flag if you want to output to a specified file. ICA will do its best to infer the format from the file extension, although you could also pass --format if you have special filename requirements.

ica transcript -c 'Thomas Riverstone' -o ./my_transcript.xlsx
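
As a rough illustration of what inferring a format from a file extension can look like, here is a small sketch. The EXTENSION_TO_FORMAT mapping and the infer_format function are assumptions made for this example, based only on the format names listed above; ICA's actual inference logic may differ.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a format name listed above
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".md": "markdown",
    ".json": "json",
}


def infer_format(output_path, explicit_format=None):
    """Prefer an explicitly given format, else fall back to the extension."""
    if explicit_format:
        return explicit_format
    # Returns None when the extension is not recognized
    return EXTENSION_TO_FORMAT.get(Path(output_path).suffix.lower())


print(infer_format("./my_transcript.xlsx"))  # excel
print(infer_format("./my_transcript.txt", explicit_format="csv"))  # csv
print(infer_format("./my_transcript.txt"))  # None
```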

Python API

The Python API is much more powerful, allowing you to integrate ICA into any type of Python project that can run on macOS. All of the built-in analyzers (under the ica/analyzers directory) actually use this API.

Here's a complete example that shows how to retrieve the transcript of an entire iMessage conversation with one or more other people.

# get_my_transcript.py

import pandas as pd

import ica


# Export a transcript of the entire conversation
def main() -> None:
    # Allow your program to accept all the same CLI arguments as the `ica`
    # command; you can skip calling this if you have other means of specifying
    # the contact name and output format; you can also add your own arguments
    # this way (see the count_phrases analyzer for an example of this)
    cli_args = ica.get_cli_parser().parse_args(
        namespace=ica.TypedCLIArguments()
    )
    # Retrieve the dataframes corresponding to the processed contents of the
    # database; dataframes include `messages` and `attachments`
    dfs = ica.get_dataframes(
        contacts=cli_args.contacts,
        timezone=cli_args.timezone,
        from_date=cli_args.from_date,
        to_date=cli_args.to_date,
        from_people=cli_args.from_people,
    )
    # Send the results to stdout (or to file) in the given format
    ica.output_results(
        pd.DataFrame(
            {
                "timestamp": dfs.messages["datetime"],
                "is_from_me": dfs.messages["is_from_me"],
                "is_reaction": dfs.messages["is_reaction"],
                # U+FFFC is the object replacement character, which appears as
                # the textual message for every attachment
                "message": dfs.messages["text"].replace(
                    r"\ufffc", "(attachment)", regex=True
                ),
            }
        ),
        # The default format (None) corresponds to the pandas default dataframe
        # table format
        format=cli_args.format,
        # When output is None (the default), ICA will print to stdout
        output=cli_args.output,
        # Make certain column labels more human-friendly with
        # prettified_label_overrides
        prettified_label_overrides={
            'is_from_me': 'Is from Me?',
            'is_reaction': 'Is Reaction?'
        }
    )


if __name__ == "__main__":
    main()

You can run the above program using the ica command, or execute it directly like any other Python program.

ica ./get_my_transcript.py -c 'Thomas Riverstone'
python ./get_my_transcript.py -c 'Thomas Riverstone'
python -m get_my_transcript -c 'Thomas Riverstone'

You're not limited to writing a command line program, though! The ica.get_dataframes() function is the only function you will need in any analyzer program. But beyond that, feel free to import other modules, send your results to other processes, or whatever you need to do!

Errors and exceptions

  • BaseAnalyzerException: the base exception class for all library-related errors and exceptions
  • ContactNotFoundError: raised if the specified contact was not found
  • ConversationNotFoundError: raised if the specified conversation was not found
  • FormatNotSupportedError: raised if the specified format is not supported by the library
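
Since BaseAnalyzerException is described as the base class for all library errors, you can handle a specific error first and fall back to the base class for everything else. The sketch below shows that pattern. The class definitions are local stand-ins that mirror the names listed above so the example runs on its own; in a real program you would use the library's own classes instead. The describe_failure and fail_with functions are hypothetical helpers.

```python
# Stand-in classes mirroring the documented names (assumed hierarchy);
# these are NOT the library's own definitions
class BaseAnalyzerException(Exception):
    pass


class ContactNotFoundError(BaseAnalyzerException):
    pass


class FormatNotSupportedError(BaseAnalyzerException):
    pass


def describe_failure(action):
    """Run a callable and summarize any analyzer-related error."""
    try:
        action()
    except ContactNotFoundError as error:
        # Handle the most specific case first
        return f"contact problem: {error}"
    except BaseAnalyzerException as error:
        # Any other library error falls through to the base class
        return f"other analyzer problem: {error}"
    return "ok"


def fail_with(error):
    """Build a callable that raises the given error (for demonstration)."""
    def action():
        raise error
    return action


print(describe_failure(fail_with(ContactNotFoundError("no such contact"))))
print(describe_failure(fail_with(FormatNotSupportedError("bad format"))))
print(describe_failure(lambda: None))
```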

Using a specific timezone

By default, all dates and times are in the local timezone of the system on which ICA is run. If you'd like to change this, you can pass the --timezone / -t option to the CLI with an IANA timezone name.

ica totals_by_day -c 'Daniel Brightingale' -t UTC
ica totals_by_day -c 'Daniel Brightingale' -t America/New_York

The equivalent option for the Python API is the timezone parameter to ica.get_dataframes:

dfs = ica.get_dataframes(contacts=[my_contact], timezone='UTC')
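
The choice of timezone matters most for per-day metrics, because the same instant can fall on different calendar days. This standalone sketch uses only Python's zoneinfo module (Python 3.9+) and does not call ICA; the timestamp is made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A made-up message timestamp, expressed in UTC
sent_at = datetime(2025, 1, 1, 2, 30, tzinfo=timezone.utc)

# The same instant viewed in two IANA timezones
in_utc = sent_at.astimezone(ZoneInfo("UTC"))
in_new_york = sent_at.astimezone(ZoneInfo("America/New_York"))

print(in_utc.date())  # 2025-01-01
print(in_new_york.date())  # 2024-12-31
```

A message like this one would be counted under January 1 with -t UTC, but under December 31 with -t America/New_York.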

Data Schema

All analyzers (including the built-in from_sql analyzer and any custom analyzers you write) have access to the following dataframes/tables. An object with these dataframes is returned by the ica.get_dataframes() function in the Python API.

messages

A list of all messages in the conversation, including text messages and reactions.

Column               Type               Description
ROWID                int                The unique identifier of the message
text                 str                The content of the message
datetime             datetime.datetime  The timestamp of the message whose timezone is based on the timezone parameter you pass to get_dataframes() (defaults to the system's local timezone)
sender_display_name  str                A display name representing the sender of the message; can be a first name, full name, phone number, email address, or "Me" if is_from_me is true for that message
sender_handle        str                The specific handle (phone number or email address) from which the sender sent the message
is_from_me           bool               Whether the message was sent by you (True) or another participant (False)
is_reaction          bool               Whether the message is a reaction (e.g. "Loved ...")

attachments

A list of all attachments in the conversation, including images, videos, audio, and any other types of files. Please note that no content is included, only metadata.

Column         Type               Description
ROWID          int                The unique identifier of the attachment
filename       str                The filename of the attachment
mime_type      str                The MIME type of the attachment (e.g. image/jpeg)
message_id     int                The ROWID of the associated message
datetime       datetime.datetime  The localized timestamp of the message
is_from_me     bool               Whether the attachment was sent by you (True) or another participant (False)
sender_handle  str                The specific handle (phone number or email address) from which the sender sent the attachment

handles

A list of all handles (phone numbers and email addresses) associated with the participants of the conversation (other than the host user / "me"). This allows for easy joining with the messages dataframe.

Column        Type  Description
handle_id     int   The unique numeric ID of the handle
name          str   The full name of the contact associated with the handle
first_name    str   The first name of the participant (as found on their contact record)
last_name     str   The last name of the participant (as found on their contact record)
identifier    str   The specific handle (phone number or email address) belonging to the participant
contact_id    str   The unique identifier of the contact record
display_name  str   A unique display name for the participant; can be a first name, full name, phone number, or email address (to ensure uniqueness)
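
To illustrate how handles can be joined with messages, here is a self-contained sketch using Python's built-in sqlite3 module with made-up rows. It assumes that messages.sender_handle matches handles.identifier, which the column descriptions above suggest but do not state outright. The tables contain only a subset of the documented columns, and all phone numbers and email addresses are fictional.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Simplified stand-ins for the documented tables (subset of columns)
con.execute("CREATE TABLE messages (text TEXT, sender_handle TEXT)")
con.execute("CREATE TABLE handles (name TEXT, identifier TEXT)")

con.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        ("Hi!", "+15555550100"),
        ("Hello", "thomas@example.com"),
        ("Lunch?", "+15555550199"),
    ],
)
con.executemany(
    "INSERT INTO handles VALUES (?, ?)",
    [
        ("Thomas Riverstone", "+15555550100"),
        ("Thomas Riverstone", "thomas@example.com"),
        ("Daniel Brightingale", "+15555550199"),
    ],
)

# Count messages per contact, merging all handles that belong to one contact
# (assumed join key: messages.sender_handle = handles.identifier)
rows = con.execute(
    """
    SELECT handles.name, COUNT(*) AS message_count
    FROM messages
    JOIN handles ON messages.sender_handle = handles.identifier
    GROUP BY handles.name
    ORDER BY handles.name
    """
).fetchall()
con.close()

print(rows)  # [('Daniel Brightingale', 1), ('Thomas Riverstone', 2)]
```

Under the same assumption, a query of this shape could be passed to the from_sql analyzer, since it reads from the same tables.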

SQL Functions

The Python API also exposes several powerful functions that allow you to query your conversation data using SQL. This is powered by an in-memory SQLite database that is automatically populated with the available iMessage dataframes. Please refer to the Data Schema section above for details on the available tables and their columns.

  • get_sql_connection(dfs): A context manager which creates a temporary in-memory SQLite database from your ICA dataframes, allowing you to operate on them with the ica.execute_sql_query() function (documented below)
  • execute_sql_query(query, con): Executes a SQL query against the connection provided by get_sql_connection; returns a pandas dataframe with the results

import ica

def main() -> None:
    # Retrieve conversation data
    dfs = ica.get_dataframes(contacts=["Jane Doe"])

    # Run SQL queries against the data
    with ica.get_sql_connection(dfs) as con:
        results = ica.execute_sql_query(
            "SELECT * FROM messages WHERE is_from_me = 1",
            con
        )
        ica.output_results(results)

if __name__ == "__main__":
    main()

Developer Setup

The following instructions are written for developers who want to run the package locally or write their own analyzers.

We recommend using the uv package manager for easier environment and dependency management.

1. Install uv

curl -LsSf https://astral.sh/uv/install.sh | sh

2. Create virtual environment and install dependencies

uv sync

3. Run CLI like normal

When you install ICA with uv, an editable installation of the package gets installed into the virtual environment that uv creates for you. This allows you to make changes to the source code and continue to invoke ica like normal:

ica message_totals -c 'Thomas Riverstone'
