Analyzes the entire history of a macOS Messages conversation
iMessage Conversation Analyzer
Copyright 2020-2026 Caleb Evans
Released under the MIT license
iMessage Conversation Analyzer (ICA) is a fully-typed Python library (and CLI utility) that will read the contents of an iMessage conversation via the Messages database on macOS. You can then gather various metrics of interest from the messages in that conversation.
Much of this program was inspired by and built using findings from this blog post by Yorgos Askalidis.
Installation
Open a Terminal and run the following:
pip3 install imessage-conversation-analyzer
You can also install ICA via uv:
uv tool install imessage-conversation-analyzer
Usage
The package includes both a Command Line API for simplicity and convenience and a Python API for developers who want maximum flexibility.
Command Line API
To use ICA from the command line, run the ica command from the Terminal. The
minimum required arguments are:
- A path to an analyzer file to run, or the name of a built-in analyzer
- The first and last name of the contact(s), via the --contact/-c flag
  - If the contact has no last name on record, you can just pass the first name
  - You can also pass any phone number or email address associated with the contact; keep in mind that analysis will still run on all phone numbers / email addresses associated with the contact, not just the one you specify
  - For group chats, simply pass multiple --contact/-c flags
Example
ica message_totals -c 'Thomas Riverstone' -c 'Daniel Brightingale'
The above command outputs a table like:
Metric Total
Messages 20036
Messages From Me 7000
Messages From Daniel 6501
Messages From Thomas 6535
Reactions 4880
Reactions From Me 1700
Reactions From Daniel 1675
Reactions From Thomas 1505
Days Messaged 115
Days Missed 0
Days With No Reply 0
Built-in analyzers
ICA includes several built-in analyzers out of the box:
- message_totals: a summary of message and reaction counts, by person and in total, as well as other insightful metrics
- attachment_totals: lists count data by attachment type, including number of Spotify links shared, YouTube videos, Apple Music, etc.
- most_frequent_emojis: count data for the top 10 most frequently used emojis across the entire conversation
- totals_by_day: a comprehensive breakdown of message totals for every day you and the other participants have been messaging in the conversation
- transcript: a full, unedited transcript of every message, including reactions, between you and the other participants (attachment files not included)
- count_phrases: count the number of case-insensitive occurrences of any arbitrary strings across all messages in a conversation (excluding reactions); use the -s/--case-sensitive option for case-sensitive counts, and the -r/--use-regex option to enable regular expression mode for all phrases you specify
- from_sql: execute an arbitrary SQL query against the conversation data (messages and attachments), using an in-memory SQLite database
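To make the count_phrases options more concrete, here is a minimal sketch of the counting behavior described above, run over a few made-up messages. The count_phrase helper and the sample rows are hypothetical and are not ICA's implementation; they only mirror the documented behavior (case-insensitive by default, optional case-sensitive and regex modes, reactions excluded).

```python
import re

# Hypothetical sample rows: (text, is_reaction)
messages = [
    ("Love you!", False),
    ("love that song", False),
    ('Loved "love that song"', True),
    ("See you at 5pm or 6pm", False),
]


def count_phrase(phrase, rows, case_sensitive=False, use_regex=False):
    """Count occurrences of a phrase across all non-reaction messages."""
    # Treat the phrase literally unless regex mode is requested
    pattern = phrase if use_regex else re.escape(phrase)
    flags = 0 if case_sensitive else re.IGNORECASE
    return sum(
        len(re.findall(pattern, text, flags))
        for text, is_reaction in rows
        if not is_reaction
    )


print(count_phrase("love", messages))  # 2
print(count_phrase("love", messages, case_sensitive=True))  # 1
print(count_phrase(r"\d+pm", messages, use_regex=True))  # 2
```

Escaping the phrase in the default mode keeps characters such as "." or "?" from being read as regex syntax, which is why regex mode is a separate opt-in here.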
Filtering
There are several built-in flags you can use to filter messages and attachments.
- --from-date: A start date to filter messages by (inclusive); the format must be ISO 8601-compliant, e.g. YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS
- --to-date: An end date to filter messages by (exclusive); the format must be ISO 8601-compliant, e.g. YYYY-MM-DD or YYYY-MM-DDTHH:MM:SS
- --from-person/-p: A reference to the person by whom to filter messages; accepted values are me, them, all, or another participant; you can specify another participant using their first name, full name, phone number, or email address (defaults to all); to filter by multiple people, pass this flag multiple times (e.g. -p Thomas -p Daniel)
ica message_totals -c 'Thomas Riverstone' --from-date 2024-12-01 --to-date 2025-01-01 --from-person 'Thomas'
# Filtering by more than one person
ica message_totals -c 'Thomas Riverstone' -c 'Daniel Brightingale' --from-date 2024-12-01 --to-date 2025-01-01 --from-person 'Thomas' --from-person 'Daniel'
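Because --from-date is inclusive and --to-date is exclusive, the two bounds form a half-open interval. The sketch below illustrates that rule with the standard library only; the timestamps are made up, and this is an illustration of the documented semantics rather than ICA's own filtering code.

```python
from datetime import datetime

# Hypothetical message timestamps around the boundaries
timestamps = [
    datetime(2024, 11, 30, 23, 59, 59),
    datetime(2024, 12, 1, 0, 0, 0),
    datetime(2024, 12, 31, 23, 59, 59),
    datetime(2025, 1, 1, 0, 0, 0),
]

# Both ISO 8601 forms mentioned above can be parsed by fromisoformat
from_date = datetime.fromisoformat("2024-12-01")
to_date = datetime.fromisoformat("2025-01-01T00:00:00")

# Half-open interval: from_date <= timestamp < to_date
selected = [ts for ts in timestamps if from_date <= ts < to_date]

for ts in selected:
    print(ts.isoformat())
```

With these bounds, a message sent at exactly midnight on 2024-12-01 is kept, while one sent at exactly midnight on 2025-01-01 is dropped, so consecutive ranges never count a message twice.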
Other formats
You can optionally pass the -f/--format flag to output to a specific format
like CSV (supported formats include csv, excel/xlsx, markdown/md, and json).
ica message_totals -c 'Thomas Riverstone' -f csv
ica ./my_custom_analyzer.py -c 'Thomas Riverstone' -f csv
Writing to a file
Finally, there is an optional -o/--output flag if you want to output to a
specified file. ICA will do its best to infer the format from the file
extension, although you could also pass --format if you have special filename
requirements.
ica transcript -c 'Thomas Riverstone' -o ./my_transcript.xlsx
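The README does not show how ICA resolves the output format, so the following is only a sketch of one plausible approach: prefer an explicit format when one is given, and otherwise guess from the file extension. The infer_format helper and the EXTENSION_TO_FORMAT mapping are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the format names listed above
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".md": "markdown",
    ".json": "json",
}


def infer_format(output_path, explicit_format=None):
    """Return the explicit format if given, else guess from the extension."""
    if explicit_format:
        return explicit_format
    suffix = Path(output_path).suffix.lower()
    # None stands for "no recognized format"
    return EXTENSION_TO_FORMAT.get(suffix)


print(infer_format("./my_transcript.xlsx"))  # excel
print(infer_format("./export.dat", explicit_format="csv"))  # csv
print(infer_format("./notes.txt"))  # None
```

The second call corresponds to the "special filename requirements" case: the extension says nothing useful, so the explicit format wins.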
Python API
The Python API is much more powerful, allowing you to integrate ICA into any
type of Python project that can run on macOS. All of the built-in analyzers
(under the ica/analyzers directory) actually use this API.
Here's a complete example that shows how to retrieve the transcript of an entire iMessage conversation with one or more other people.
# get_my_transcript.py

import pandas as pd

import ica


# Export a transcript of the entire conversation
def main() -> None:
    # Allow your program to accept all the same CLI arguments as the `ica`
    # command; you can skip calling this if you have other means of specifying
    # the contact name and output format; you can also add your own arguments
    # this way (see the count_phrases analyzer for an example of this)
    cli_args = ica.get_cli_parser().parse_args(
        namespace=ica.TypedCLIArguments()
    )
    # Retrieve the dataframes corresponding to the processed contents of the
    # database; dataframes include `messages` and `attachments`
    dfs = ica.get_dataframes(
        contacts=cli_args.contacts,
        timezone=cli_args.timezone,
        from_date=cli_args.from_date,
        to_date=cli_args.to_date,
        from_people=cli_args.from_people,
    )
    # Send the results to stdout (or to file) in the given format
    ica.output_results(
        pd.DataFrame(
            {
                "timestamp": dfs.messages["datetime"],
                "is_from_me": dfs.messages["is_from_me"],
                "is_reaction": dfs.messages["is_reaction"],
                # U+FFFC is the object replacement character, which appears as
                # the textual message for every attachment
                "message": dfs.messages["text"].replace(
                    r"\ufffc", "(attachment)", regex=True
                ),
            }
        ),
        # The default format (None) corresponds to the pandas default dataframe
        # table format
        format=cli_args.format,
        # When output is None (the default), ICA will print to stdout
        output=cli_args.output,
        # Make certain column labels more human-friendly with
        # prettified_label_overrides
        prettified_label_overrides={
            "is_from_me": "Is from Me?",
            "is_reaction": "Is Reaction?",
        },
    )


if __name__ == "__main__":
    main()
You can run the above program using the ica command, or execute it directly
like any other Python program.
ica ./get_my_transcript.py -c 'Thomas Riverstone'
python ./get_my_transcript.py -c 'Thomas Riverstone'
python -m get_my_transcript -c 'Thomas Riverstone'
You're not limited to writing a command line program, though! The
ica.get_dataframes() function is the only function you will need in any
analyzer program. But beyond that, feel free to import other modules, send your
results to other processes, or whatever you need to do!
Errors and exceptions
- BaseAnalyzerException: the base exception class for all library-related errors and exceptions
- ContactNotFoundError: raised if the specified contact was not found
- ConversationNotFoundError: raised if the specified conversation was not found
- FormatNotSupportedError: raised if the specified format is not supported by the library
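Since BaseAnalyzerException is described as the base class for all library errors, a caller can handle specific cases first and fall back to the base class. The sketch below defines local stand-in classes with the same names so that it runs without ICA or a Messages database; in a real program you would import the library's own classes instead (the exact import path is not stated here, so treat it as something to verify). The find_contact and describe_lookup helpers are hypothetical.

```python
# Local stand-ins that mirror the documented hierarchy (not ICA's own classes)
class BaseAnalyzerException(Exception):
    pass


class ContactNotFoundError(BaseAnalyzerException):
    pass


class FormatNotSupportedError(BaseAnalyzerException):
    pass


def find_contact(name, known_contacts):
    """Hypothetical lookup that raises when the contact is unknown."""
    if name not in known_contacts:
        raise ContactNotFoundError(f"No contact found for {name!r}")
    return name


def describe_lookup(name, known_contacts):
    try:
        return f"Found {find_contact(name, known_contacts)}"
    except ContactNotFoundError as error:
        # Handle the specific case first
        return f"Contact problem: {error}"
    except BaseAnalyzerException as error:
        # Catch-all for any other library-related error
        return f"Analyzer problem: {error}"


print(describe_lookup("Thomas Riverstone", {"Thomas Riverstone"}))
print(describe_lookup("Jane Doe", {"Thomas Riverstone"}))
```

Ordering the except clauses from most specific to most general matters: if the base class came first, the more specific handler would never run.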
Using a specific timezone
By default, all dates and times are in the local timezone of the system on which
ICA is run. If you'd like to change this, you can pass the --timezone / -t
option to the CLI with an IANA timezone name.
ica totals_by_day -c 'Daniel Brightingale' -t UTC
ica totals_by_day -c 'Daniel Brightingale' -t America/New_York
The equivalent option for the Python API is the timezone parameter to
ica.get_dataframes:
dfs = ica.get_dataframes(contacts=[my_contact], timezone="UTC")
Data Schema
All analyzers (including the built-in from_sql analyzer and any custom
analyzers you write) have access to the following dataframes/tables. An object
containing these dataframes is returned by the ica.get_dataframes() function in
the Python API.
messages
A list of all messages in the conversation, including text messages and reactions.
| Column | Type | Description |
|---|---|---|
| ROWID | int | The unique identifier of the message |
| text | str | The content of the message |
| datetime | datetime.datetime | The timestamp of the message whose timezone is based on the timezone parameter you pass to get_dataframes() (defaults to the system's local timezone) |
| sender_display_name | str | A display name representing the sender of the message; can be a first name, full name, phone number, email address, or "Me" if is_from_me is true for that message |
| sender_handle | str | The specific handle (phone number or email address) from which the sender sent the message |
| is_from_me | bool | Whether the message was sent by you (True) or another participant (False) |
| is_reaction | bool | Whether the message is a reaction (e.g. "Loved ...") |
attachments
A list of all attachments in the conversation, including images, videos, audio, and any other types of files. Please note that no content is included, only metadata.
| Column | Type | Description |
|---|---|---|
| ROWID | int | The unique identifier of the attachment |
| filename | str | The filename of the attachment |
| mime_type | str | The MIME type of the attachment (e.g. image/jpeg) |
| message_id | int | The ROWID of the associated message |
| datetime | datetime.datetime | The localized timestamp of the message |
| is_from_me | bool | Whether the attachment was sent by you (True) or another participant (False) |
| sender_handle | str | The specific handle (phone number or email address) from which the sender sent the attachment |
handles
A list of all handles (phone numbers and email addresses) associated with the
participants of the conversation (other than the host user / "me"). This allows
for easy joining with the messages dataframe.
| Column | Type | Description |
|---|---|---|
| handle_id | int | The unique numeric ID of the handle |
| name | str | The full name of the contact associated with the handle |
| first_name | str | The first name of the participant (as found on their contact record) |
| last_name | str | The last name of the participant (as found on their contact record) |
| identifier | str | The specific handle (phone number or email address) belonging to the participant |
| contact_id | str | The unique identifier of the contact record |
| display_name | str | A unique display name for the participant; can be a first name, full name, phone number, or email address (to ensure uniqueness) |
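As an illustration of joining handles with messages, the sketch below builds two tiny tables with the standard sqlite3 module and counts messages per participant. It assumes that messages.sender_handle matches handles.identifier, which is what the column descriptions suggest but is not stated outright; the rows, phone numbers, and email address are made up, and only a subset of the columns is included.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE messages (text TEXT, sender_handle TEXT);
    CREATE TABLE handles (handle_id INTEGER, identifier TEXT, display_name TEXT);
    """
)

# One participant (Thomas) writes from both a phone number and an email address
con.executemany(
    "INSERT INTO handles VALUES (?, ?, ?)",
    [
        (1, "+15550100", "Thomas"),
        (2, "thomas@example.com", "Thomas"),
        (3, "+15550101", "Daniel"),
    ],
)
# Only messages from other participants are included in this sample
con.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        ("Hi", "+15550100"),
        ("On my laptop now", "thomas@example.com"),
        ("Hey", "+15550101"),
        ("Lunch?", "+15550100"),
    ],
)

# Assumed join key: the handle a message was sent from
totals = con.execute(
    """
    SELECT h.display_name, COUNT(*) AS total
    FROM messages AS m
    JOIN handles AS h ON h.identifier = m.sender_handle
    GROUP BY h.display_name
    ORDER BY total DESC, h.display_name
    """
).fetchall()
con.close()

print(totals)  # [('Thomas', 3), ('Daniel', 1)]
```

Grouping by display_name rather than by handle folds a participant's phone number and email address into a single total.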
SQL Functions
The Python API also exposes several powerful functions that allow you to query your conversation data using SQL. This is powered by an in-memory SQLite database that is automatically populated with the available iMessage dataframes. Please refer to the Data Schema section above for details on the available tables and their columns.
- get_sql_connection(dfs): A context manager which creates a temporary in-memory SQLite database from your ICA dataframes, allowing you to operate on them with the ica.execute_sql_query() function (documented below)
- execute_sql_query(query, con): Executes a SQL query against the connection provided by get_sql_connection; returns a pandas dataframe with the results
import ica


def main() -> None:
    # Retrieve conversation data
    dfs = ica.get_dataframes(contacts=["Jane Doe"])
    # Run SQL queries against the data
    with ica.get_sql_connection(dfs) as con:
        results = ica.execute_sql_query(
            "SELECT * FROM messages WHERE is_from_me = 1",
            con,
        )
        ica.output_results(results)


if __name__ == "__main__":
    main()
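For readers curious about the general pattern behind a temporary in-memory database, here is a standalone sketch using only sqlite3 and contextlib. It is not ICA's implementation: the in_memory_db helper is a hypothetical name, and it loads plain row tuples rather than pandas dataframes. It also shows why the query above compares is_from_me to 1, since SQLite stores booleans as integers.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def in_memory_db(tables):
    """Yield a temporary SQLite connection populated from plain rows.

    `tables` maps a table name to (column names, list of row tuples).
    """
    con = sqlite3.connect(":memory:")
    try:
        for name, (columns, rows) in tables.items():
            column_list = ", ".join(columns)
            placeholders = ", ".join("?" for _ in columns)
            con.execute(f"CREATE TABLE {name} ({column_list})")
            con.executemany(
                f"INSERT INTO {name} VALUES ({placeholders})", rows
            )
        yield con
    finally:
        # The in-memory database is discarded once the connection closes
        con.close()


# Hypothetical sample data using a few of the documented column names
sample = {
    "messages": (
        ["ROWID", "text", "is_from_me"],
        [(1, "Hello", True), (2, "Hi there", False), (3, "How are you?", True)],
    )
}

with in_memory_db(sample) as con:
    # Python's True/False are stored as 1/0, hence `is_from_me = 1`
    rows = con.execute(
        "SELECT text FROM messages WHERE is_from_me = 1 ORDER BY ROWID"
    ).fetchall()

print(rows)  # [('Hello',), ('How are you?',)]
```

Wrapping the connection in a context manager guarantees it is closed even if a query raises, which is the main reason to prefer the with-statement form.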
Developer Setup
The following instructions are written for developers who want to run the package locally or write their own analyzers.
We recommend using the uv package manager for easier environment and dependency management.
1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
2. Create virtual environment and install dependencies
uv sync
3. Run CLI like normal
When you install ICA with uv, an editable installation of the package gets
installed into the virtual environment that uv creates for you. This allows you
to make changes to the source code and continue to invoke ica like normal:
ica message_totals -c 'Thomas Riverstone'