An object-oriented approach to mining personal Facebook data.
Facebook Data Miner
Facebook-Data-Miner provides a set of tools that can help you analyze the data that Facebook has on you.
The vision is to support data extraction, data analysis, and data visualization through any of the interfaces.
All computation happens on your machine, so no data gets sent to remote computers or third parties.
Prerequisites
So far the package has only been tested on Linux; however, with pipenv it should be easy to set the application up on Windows.
Python
The application was tested on Debian 10 with Python v3.8.3. You will need Python 3.8, because some features of that version are used.
To download Python refer to the official Python distribution page.
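A quick way to confirm that the interpreter in your environment is recent enough is sketched below. It is only an illustration: the helper name `python_is_supported` is made up for this example, and it assumes that any version from 3.8 upwards is acceptable, which the text above does not state explicitly.

```python
import sys

# Minimum version taken from the text above; treating newer versions
# as acceptable is an assumption of this sketch.
REQUIRED = (3, 8)


def python_is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= required


if __name__ == "__main__":
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise SystemExit(f"Python 3.8 is required, found {found}")
    print("Python version OK")
```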
Your Facebook data
This package works by analyzing your Facebook data, so you have to download it.
Please refer to the following link in order to do so.
IMPORTANT NOTE: you have to set Facebook's language to English (US) for the time you request your data. This change can of course be reverted later.
To use this software you will only need the zip file's absolute path later: set the DATA_PATH variable in the configuration.yml file to it.
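Before writing the path into configuration.yml, it can help to check that it is absolute and really points to a zip archive. The following is a minimal sketch using only the standard library; `check_data_path` is a hypothetical helper and not part of facebook-data-miner, and the path in the usage comment is a placeholder.

```python
import zipfile
from pathlib import Path


def check_data_path(data_path):
    """Return a list of problems found with the given archive path."""
    path = Path(data_path)
    problems = []
    if not path.is_absolute():
        problems.append("path is not absolute")
    if not path.is_file():
        problems.append("path does not point to an existing file")
    elif not zipfile.is_zipfile(path):
        problems.append("file is not a valid zip archive")
    return problems


# Usage (placeholder path, replace it with your own download):
# for problem in check_data_path("/home/user/facebook-data.zip"):
#     print(problem)
```

An empty list means the path passed all three checks.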
NOTE: facebook-data-miner will extract your zip file in the same directory. For this you may need several GBs of free space, depending on the volume of the original data.
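To get a rough idea of whether there is enough room, you can compare the uncompressed size recorded in the archive with the free space in the directory that holds it. This is a sketch under the assumption stated above, namely that extraction happens next to the zip file; both function names are invented for the example.

```python
import shutil
import zipfile
from pathlib import Path


def uncompressed_size(zip_path):
    """Sum the uncompressed sizes of all members recorded in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def has_room_to_extract(zip_path):
    """Compare the extracted size with the free space next to the archive."""
    needed = uncompressed_size(zip_path)
    free = shutil.disk_usage(Path(zip_path).resolve().parent).free
    return needed <= free
```

The check is only an estimate, since file system overhead is not taken into account.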
This repository
Clone this repository by either using SSH:
git clone git@github.com:tardigrde/facebook-data-miner.git
or HTTPS:
git clone https://github.com/tardigrde/facebook-data-miner.git
Dependencies
This project uses pipenv for dependency and virtual environment management.
Install it by typing:
pip install --user pipenv
In the project root (where Pipfile is) run:
pipenv install --dev
Make sure you run the application in this environment.
Lint
With the makefile:
make lint
Run tests
With the makefile:
make test
Usage
The app has both a CLI and an API. For now, the API is the preferred way to run the app, since there is no database yet that would hold your Facebook data in memory. The CLI works, but it is slow.
Jupyter notebook
I wrote two Jupyter notebooks to showcase the capabilities and features of the API and the CLI. The notebooks contain lots of comments to help you understand how the app is built, what kind of information you can access, and how.
For this you have to start a Jupyter server.
As mentioned in the notebooks, you have to set the $PYTHONPATH env var before starting the server.
export PYTHONPATH="$PWD"
Then type the following in your terminal if you want to use jupyter notebook:
jupyter notebook
or for jupyter lab:
jupyter lab
Select notebooks/API.ipynb (or notebooks/CLI.ipynb) and start executing the cells.
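If the server was started without exporting PYTHONPATH, a similar effect can usually be had from inside a notebook cell by extending `sys.path`. This is a generic Python technique rather than something the notebooks themselves do, and `add_project_root` is a hypothetical helper. Pass the repository root explicitly, because the working directory of a notebook may be the notebooks folder rather than the project root.

```python
import sys
from pathlib import Path


def add_project_root(root=None):
    """Put the project root at the front of sys.path if it is not there yet."""
    root_path = str(Path(root or Path.cwd()).resolve())
    if root_path not in sys.path:
        sys.path.insert(0, root_path)
    return root_path


# Usage from a notebook stored in notebooks/ (assumed layout):
# add_project_root("..")
```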
The API
As already described in the notebook, the entry point is the App class in miner/app.py. For now the docstring is the only documentation.
Call it from a script (after you set the data path) like this:
from miner.app import App
app = App()
The CLI
The command-line interface has a lot of depth, as shown in notebooks/CLI.ipynb, but it is slow because the data that gets read in does not stay in RAM.
To run the CLI:
export PYTHONPATH="$PWD"
python ./miner/app.py --help
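The same call can be made from a Python script with `subprocess`, which is handy when you want to capture the output. The sketch below mirrors the two shell lines above; `build_cli_call` and `run_cli` are illustrative names, and `--help` is the only argument taken from this document.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_cli_call(project_root, *cli_args):
    """Return the command list and environment for one CLI invocation."""
    root = Path(project_root).resolve()
    env = dict(os.environ)
    env["PYTHONPATH"] = str(root)  # same effect as: export PYTHONPATH="$PWD"
    command = [sys.executable, str(root / "miner" / "app.py"), *cli_args]
    return command, env


def run_cli(project_root, *cli_args):
    """Run the CLI from the project root and return its standard output."""
    command, env = build_cli_call(project_root, *cli_args)
    result = subprocess.run(
        command, cwd=project_root, env=env,
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Usage, run from the cloned repository:
# print(run_cli(".", "--help"))
```

Keep in mind that every call starts a new process and reads the data again, so the slowness described above applies here as well.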
Contribution
Help is more than welcome. There is still a long way to go until v1.0.0.
Ideas are welcome too. Feel free to open a new issue.