Code for the Master of Applied Data Science course Data Analysis and Visualization
This is the repository for the Master of Applied Data Science course "Data Analysis & Visualisation", previously known as "Data Mining & Exploration". All instructions assume a UNIX machine. You should have received an invite link for a VM; if not, contact your teacher. On the VM, all required tools (such as rye) are already installed.
Set up the virtual environment
- First, make sure you have python >= 3.11. You can check the version with `python --version`.
- Make sure `rye` is installed. Alternatively, use `pip` to install your environment.
  - check if it is installed by executing `rye --help`
  - if not, run `curl -sSf https://rye.astral.sh/get | bash` (not necessary on the VM)
  - watch the intro video for rye at https://rye.astral.sh/guide/
- Install the dependencies by navigating to the MADS-DAV folder where the `pyproject.toml` is located and running `rye sync`.
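The two checks in the list above (Python version and the presence of rye) can also be scripted. The following is a minimal sketch, not part of the repository; the helper names `python_ok` and `tool_available` are hypothetical and only the 3.11 minimum and the `rye` tool name come from the steps above.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in the setup steps


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print(f"python >= 3.11: {python_ok()}")
    print(f"rye on PATH:    {tool_available('rye')}")
```

If the second line reports `False`, install rye as described above or fall back to `pip`.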
Run the preprocessor
Download a chat from WhatsApp and put it in the `data/raw` folder. Rename the file to `chat.txt` and run the following command:
source .venv/bin/activate
This will activate your virtual environment. You can check which python is being used by running:
which python
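If you prefer to script the step of placing the export in the data/raw folder, a minimal sketch could look like this. The function name `place_chat_export` and its parameters are hypothetical; only the `data/raw` folder and the `chat.txt` filename come from the instructions above.

```python
import shutil
from pathlib import Path


def place_chat_export(source: Path, project_root: Path = Path(".")) -> Path:
    """Copy a WhatsApp export to <project_root>/data/raw/chat.txt."""
    raw_dir = project_root / "data" / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)  # create data/raw if missing
    target = raw_dir / "chat.txt"
    shutil.copyfile(source, target)  # copy, so the original export is kept
    return target
```

Call it from the MADS-DAV folder with the path to your downloaded export, for example `place_chat_export(Path("my_export.txt"))`, where `my_export.txt` is a placeholder for your own file.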
After this, you can run the preprocessor with the following command:
analyzer --device ios
Change `ios` to `android` if you have an Android device.
This will run the `src/wa_analyzer.py:main` method, which will process the chat and save the results in the `data/processed` folder.
You should see some logs, like this:
2024-02-11 16:07:19.191 | INFO | __main__:main:71 - Using iOS regexes
2024-02-11 16:07:19.201 | INFO | __main__:process:61 - Found 1779 records
2024-02-11 16:07:19.201 | INFO | __main__:process:62 - Appended 152 records
2024-02-11 16:07:19.202 | INFO | __main__:save:30 - Writing to data/processed/whatsapp-20240211-160719.csv
2024-02-11 16:07:19.206 | SUCCESS | __main__:save:32 - Done!
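When debugging, it can help to split such log lines into fields. The sketch below assumes every line follows the layout of the sample above (timestamp, level, `module:function:line`, then the message); the regular expression and the name `parse_log_line` are illustrative and not part of the package.

```python
import re
from datetime import datetime
from typing import Optional

# Layout assumed from the sample: "<timestamp> | <LEVEL> | <module>:<function>:<line> - <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s*\|\s*"
    r"(?P<level>\w+)\s*\|\s*"
    r"(?P<module>[^:\s]+):(?P<function>[^:\s]+):(?P<line>\d+)\s+-\s+"
    r"(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Convert the text fields that have an obvious richer type.
    record["timestamp"] = datetime.strptime(record["timestamp"], "%Y-%m-%d %H:%M:%S.%f")
    record["line"] = int(record["line"])
    return record
```

Lines that do not fit the assumed layout return `None`, so they can be skipped or inspected separately.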
Inside the `log` folder you will find a logfile, which has some additional information that might be useful for debugging.
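Each run writes a new timestamped file, so a follow-up script usually needs the most recent one. The sketch below assumes that all output files follow the `whatsapp-YYYYMMDD-HHMMSS.csv` pattern seen in the sample log; that pattern is inferred from a single log line, and the helper name `latest_processed_csv` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_processed_csv(processed_dir: Path = Path("data/processed")) -> Optional[Path]:
    """Return the most recent whatsapp-*.csv file, or None if there is none."""
    # A YYYYMMDD-HHMMSS timestamp in the name sorts chronologically as plain text.
    candidates = sorted(processed_dir.glob("whatsapp-*.csv"), key=lambda p: p.name)
    return candidates[-1] if candidates else None
```

Sorting by filename rather than by modification time keeps the result stable if files are copied or restored later.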
File details
Details for the file wa_analyzer-0.4.0.tar.gz.
File metadata
- Download URL: wa_analyzer-0.4.0.tar.gz
- Upload date:
- Size: 33.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2c6ef89e14fd9b6dc816da57cdc5fa231fd6583428a3d9ac532d7c0b1541d3d0
MD5 | 4d5d5ed8c6da6dcb6d2f31ba9a83bdf2
BLAKE2b-256 | 17b6654f12777cf1cc72ffc85e4f83a45b959ddd8889505380e264ca9ca0b3d8
File details
Details for the file wa_analyzer-0.4.0-py3-none-any.whl.
File metadata
- Download URL: wa_analyzer-0.4.0-py3-none-any.whl
- Upload date:
- Size: 9.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 44bd7ee8651efb7bea22bd4ad5418d53bee0bf4aa6c08a607a79cb8a7cd4b5fb
MD5 | 0c0689f59823761135e7407fad75b364
BLAKE2b-256 | fc21b18ae80b81e350fc81526f837a0a1b7de1dd5721e6fc745d8916812ce3a7