Skip to main content

Tool to Process Smart Search Results and Identify Top Senders

Project description

Proofpoint Sender Analyzer

This tool helps identify the top senders based on smart search outbound message exports or CSV data.

Requirements:

  • Python 3.9+

Installing the Package

You can install the tool using the following command directly from Github.

pip install git+https://github.com/pfptcommunity/senderstats.git

or can install the tool using pip.

pip install senderstats

Use Cases:

Outbound message volumes and data transferred by:

  • Envelope sender
  • Header From:
  • Return-Path:
  • Envelope header: From:, MessageID Host, MessageID Domain (helpful to identify original sender)
  • Envelope sender and header From: for SPF alignment purposes
  • Random subject line sampling to help understand the type of traffic

Summarize message volume information:

  • Estimated application email traffic based on sender volume threshold:
    • Estimated application data
    • Estimated application messages
    • Estimated application average size
    • Estimated application peak hourly volume
  • Total outbound data
    • Total outbound data
    • Total outbound messages
    • Total outbound average size
    • Total outbound peak hourly volume

Usage Options:

usage: senderstats [-h] -i <file> [<file> ...] -o <xlsx> [--mfrom MFrom] [--hfrom HFrom] [--rpath RPath]
                   [--msgid MsgID] [--subject Subject] [--size MsgSz] [--date Date] [--gen-hfrom] [--gen-rpath]
                   [--gen-alignment] [--gen-msgid] [-t N] [--no-display-name] [--remove-prvs] [--decode-srs]
                   [--no-empty-hfrom] [--sample-subject] [--excluded-domains <domain> [<domain> ...]]
                   [--restrict-domains <domain> [<domain> ...]] [--excluded-senders <sender> [<sender> ...]]
                   [--date-format DateFmt] [--show-skip-detail]

This tool helps identify the top senders based on smart search outbound message exports.

Input / Output arguments (required):
  -i <file> [<file> ...], --input <file> [<file> ...]  Smart search files to read.
  -o <xlsx>, --output <xlsx>                           Output file

Field mapping arguments (optional):
  --mfrom MFrom                                        CSV field of the envelope sender address. (default=Sender)
  --hfrom HFrom                                        CSV field of the header From: address. (default=Header_From)
  --rpath RPath                                        CSV field of the Return-Path: address. (default=Header_Return-
                                                       Path)
  --msgid MsgID                                        CSV field of the message ID. (default=Message_ID)
  --subject Subject                                    CSV field of the Subject, only used if --sample-subject is
                                                       specified. (default=Subject)
  --size MsgSz                                         CSV field of message size. (default=Message_Size)
  --date Date                                          CSV field of message date/time. (default=Date)

Reporting control arguments (optional):
  --gen-hfrom                                          Generate report showing the header From: data for messages
                                                       being sent.
  --gen-rpath                                          Generate report showing return path for messages being sent.
  --gen-alignment                                      Generate report showing envelope sender and header From:
                                                       alignment
  --gen-msgid                                          Generate report showing parsed Message ID. Helps determine the
                                                       sending system
  -t N, --threshold N                                  Adjust summary report threshold for messages per day to be
                                                       considered application traffic. (default=100)

Parsing behavior arguments (optional):
  --no-display-name                                    Remove display and use address only. Converts 'Display Name
                                                       <user@domain.com>' to 'user@domain.com'
  --remove-prvs                                        Remove return path verification strings e.g.
                                                       prvs=tag=sender@domain.com
  --decode-srs                                         Convert sender rewrite scheme,
                                                       forwardmailbox+srs=hash=tt=domain.com=user to user@domain.com
  --no-empty-hfrom                                     If the header From: is empty the envelope sender address is
                                                       used
  --sample-subject                                     Enable probabilistic random sampling of subject lines found
                                                       during processing
  --excluded-domains <domain> [<domain> ...]           Exclude domains from processing.
  --restrict-domains <domain> [<domain> ...]           Constrain domains for processing.
  --excluded-senders <sender> [<sender> ...]           Exclude senders from processing.
  --date-format DateFmt                                Date format used to parse the timestamps.
                                                       (default=%Y-%m-%dT%H:%M:%S.%f%z)

Extended processing controls (optional):
  --show-skip-detail                                   Show skipped details

Usage:
  -h, --help                                           Show this help message and exit

Using the Tool with Proofpoint Smart Search

Export all outbound message traffic as a smart search CSV. You may need to export multiple CSVs if the data per time window exceeds 1M records. The tool can ingest multiple CSVs files at once.

smart_search_outbound

Once the files are downlaoded to a target folder, you can run the following command with the path to the files you downloaded and specify a wildard.

senderstats -i C:\path\to\downloaded\files\smart_search_results_custer_hosted_2024_03_04_*.csv -o C:\path\to\output\file\my_cluster_hosted.xlsx

Sample Output

The execution results should look similar to the following depending the options you select.

Files to be processed:
C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173552.csv
C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173855.csv
C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173656.csv
C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173754.csv
C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173834.csv

Domains excluded from processing:
knowledgefront.com
pphosted.com
ppops.net

Processing:  C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173552.csv
Processing:  C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173855.csv
Processing:  C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173656.csv
Processing:  C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173754.csv
Processing:  C:\Users\ljerabek\Downloads\smart_search_results_cluster_hosted_2024_03_04_173834.csv

File Processing Summary
Total Records:  4409754
Skipped Records:  2237796

Records by Day
2024-02-03: 43926
2024-02-04: 48567
2024-02-05: 82679
2024-02-06: 100960
2024-02-07: 97990
2024-02-08: 100370
2024-02-09: 85954
2024-02-10: 19740
2024-02-11: 15595
2024-02-12: 94800
2024-02-13: 99043
2024-02-14: 96919
2024-02-15: 95478
2024-02-16: 88463
2024-02-17: 19021
2024-02-18: 16961
2024-02-19: 81489
2024-02-20: 96920
2024-02-21: 103170
2024-02-22: 104562
2024-02-23: 81652
2024-02-24: 17902
2024-02-25: 16311
2024-02-26: 97154
2024-02-27: 99578
2024-02-28: 109633
2024-02-29: 104672
2024-03-01: 117695
2024-03-02: 20002
2024-03-03: 14752

Please see report: C:\Users\ljerabek\Downloads\my_cluster_hosted.xlsx

Sample Summary Statistics

image

Sample Details (Sender + From by Volume):

image

Sample Details (Message ID) Inferencing:

image

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

senderstats-1.2.3.tar.gz (19.5 kB view details)

Uploaded Source

Built Distribution

senderstats-1.2.3-py3-none-any.whl (17.7 kB view details)

Uploaded Python 3

File details

Details for the file senderstats-1.2.3.tar.gz.

File metadata

  • Download URL: senderstats-1.2.3.tar.gz
  • Upload date:
  • Size: 19.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for senderstats-1.2.3.tar.gz
Algorithm Hash digest
SHA256 24f5493acf828d8556cefd888a4722c275a0abf8869edb5203dae29f3955c8dc
MD5 7214136529aae7cdc4a780a5c0c51088
BLAKE2b-256 13a7610a31eb1220b0794f84fe541d045396425fc520ed936ce9969cb7bd0cd6

See more details on using hashes here.

File details

Details for the file senderstats-1.2.3-py3-none-any.whl.

File metadata

  • Download URL: senderstats-1.2.3-py3-none-any.whl
  • Upload date:
  • Size: 17.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.19

File hashes

Hashes for senderstats-1.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 59e58f899dece9c10de5eb9dbbbd1ff10b5bebdcd68a286a40672c31511a0cd7
MD5 eaa17afb129bf18a19a85d6796f434a8
BLAKE2b-256 2371f19742b332930b2bd9d07641a288fd8b6a56bfdf6f4b99007e89effaea76

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page