
Code for the Master of Applied Data Science course Data Analysis and Visualization

Project description

This is the repository for the Master of Applied Data Science course "Data Analysis & Visualisation", previously known as "Data Mining & Exploration". All instructions assume a UNIX machine. You should have received an invite link for a VM; if not, contact your teacher. On the VM, everything you need, including rye, is already installed.

Set up the virtual environment

  1. First, make sure you have Python >= 3.11. You can check the version with python --version.
  2. Make sure rye is installed. Alternatively, use pip to install your environment.
    • check whether it is installed by executing rye --help
    • if not, run curl -sSf https://rye.astral.sh/get | bash (not necessary on the VM)
    • watch the intro video for rye at https://rye.astral.sh/guide/
  3. Install the dependencies by navigating to the MADS-DAV folder, where the pyproject.toml is located, and running rye sync.
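
The checks in steps 1 and 2 can also be scripted. The following is a minimal sketch and not part of the repository: the helper names python_ok and rye_available are hypothetical, and only the standard library is used.

```python
import shutil
import sys

MIN_VERSION = (3, 11)  # minimum Python version stated in step 1


def python_ok(version=None):
    """Return True if the given (or the running) interpreter meets MIN_VERSION."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_VERSION


def rye_available():
    """Return True if a `rye` executable is found on PATH."""
    return shutil.which("rye") is not None


if __name__ == "__main__":
    print(f"python >= 3.11: {python_ok()}")
    print(f"rye on PATH:    {rye_available()}")
```

If rye is reported as missing, the install command from step 2 applies.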

Run the preprocessor

Download a chat from WhatsApp and put it in the data/raw folder. Rename the file to `chat.txt` and run the following command:

source .venv/bin/activate

This will activate your virtual environment. You can check which python is being used by running:

which python

After this, you can run the preprocessor with the following command:

analyzer --device ios

Change ios to android if you have an Android device. This runs the src/wa_analyzer.py:main method, which processes the chat and saves the results in the data/processed folder.
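
To give an idea of why the device matters, here is a rough sketch of device-specific line parsing. The patterns below are assumptions about typical export formats, not the regexes used in src/wa_analyzer.py; real exports vary with locale and app version. The names PATTERNS and parse_line are hypothetical.

```python
import re

# Hypothetical patterns, NOT taken from src/wa_analyzer.py.
# Assumed iOS line:     [11/02/2024, 16:07:19] Alice: Hello
# Assumed Android line: 11/02/2024, 16:07 - Alice: Hello
PATTERNS = {
    "ios": re.compile(
        r"^\[(?P<timestamp>[^\]]+)\] (?P<author>[^:]+): (?P<message>.*)$"
    ),
    "android": re.compile(
        r"^(?P<timestamp>[^-]+?) - (?P<author>[^:]+): (?P<message>.*)$"
    ),
}


def parse_line(line, device):
    """Return a dict with timestamp, author and message, or None if no match."""
    match = PATTERNS[device].match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_line("[11/02/2024, 16:07:19] Alice: Hello there", "ios"))
    print(parse_line("11/02/2024, 16:07 - Bob: Hi", "android"))
```

Lines that match neither pattern, for example the second line of a multi-line message, return None in this sketch.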

You should see some logs, like this:

2024-02-11 16:07:19.191 | INFO     | __main__:main:71 - Using iOS regexes
2024-02-11 16:07:19.201 | INFO     | __main__:process:61 - Found 1779 records
2024-02-11 16:07:19.201 | INFO     | __main__:process:62 - Appended 152 records
2024-02-11 16:07:19.202 | INFO     | __main__:save:30 - Writing to data/processed/whatsapp-20240211-160719.csv
2024-02-11 16:07:19.206 | SUCCESS  | __main__:save:32 - Done!
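
The file name in the log above appears to embed the time of the run (2024-02-11 16:07:19 becomes 20240211-160719). Assuming that naming scheme holds, the newest output can be picked up by sorting on the name. This is a sketch with hypothetical helper names (output_name, latest_output), not code from the package.

```python
from datetime import datetime
from pathlib import Path


def output_name(now):
    """Build a file name in the style seen in the log (assumed scheme)."""
    return f"whatsapp-{now:%Y%m%d-%H%M%S}.csv"


def latest_output(folder="data/processed"):
    """Return the newest whatsapp-*.csv by name, or None if there is none."""
    # Timestamps of the form YYYYMMDD-HHMMSS sort chronologically as text.
    files = sorted(Path(folder).glob("whatsapp-*.csv"))
    return files[-1] if files else None


if __name__ == "__main__":
    print(output_name(datetime.now()))
    print(latest_output())
```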

Inside the log folder you will find a logfile, which has some additional information that might be useful for debugging.

