GraphiPy

A Universal Social Data Extractor

GraphiPy simplifies the extraction of data from different social media websites. Instead of having to study the different APIs of each website, just provide the API keys and use GraphiPy!

Currently, GraphiPy supports 7 websites: Facebook, LinkedIn, Pinterest, Reddit, Tumblr, Twitter, and YouTube.

Installation

GraphiPy is available on PyPI.

To install GraphiPy, run pip install GraphiPy

Please note that GraphiPy does not support Python 2 and only works on Python 3.

Video Demonstration

GraphiPy Video

Data Structure

GraphiPy acts like a graph: each piece of information is stored as a node, and connections between nodes are stored as edges.

Currently, there are 3 graph types, all based on a class called BaseGraph:

  • Dictionary Graph: To provide easy access, the types of the nodes and edges are stored as keys, and the rows of data are stored as values. Each set of rows is itself a dictionary, with the _id of each node or edge as the key (to avoid duplicate data) and the node or edge object as the value.

  • Pandas Graph: Similar to the Dictionary Graph, the types of nodes and edges are stored as keys and the dataframes are stored as values. Since inserting rows one by one into a dataframe is slow (each append copies the existing data), the implementation first buffers rows in Python dictionaries. Once the dictionaries hold a certain number of elements, they are converted into dataframes and appended to the existing dataframes.

  • Neo4j Graph: GraphiPy connects directly to your Neo4j database and inserts into it. To avoid duplicate data, MERGE is used instead of CREATE. Thus, whenever a node with an existing _id is inserted, its attributes are updated instead of a completely new node being created.
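
The dictionary layout described above can be illustrated with a small sketch. This is not GraphiPy's actual implementation: the class name DictGraphSketch, its methods, and the sample node and edge types are all hypothetical, chosen only to show how keying by type and then by _id prevents duplicates.

```python
from collections import defaultdict


class DictGraphSketch:
    """Toy stand-in for a dictionary-backed graph (hypothetical, not GraphiPy's class)."""

    def __init__(self):
        # Outer key: node/edge type; inner key: _id; value: the attribute dict.
        self.nodes = defaultdict(dict)
        self.edges = defaultdict(dict)

    def add_node(self, node_type, _id, **attrs):
        # Re-inserting an existing _id updates the stored row instead of duplicating it.
        row = self.nodes[node_type].setdefault(_id, {"_id": _id})
        row.update(attrs)

    def add_edge(self, edge_type, _id, source, target, **attrs):
        # Edges are deduplicated by _id in the same way as nodes.
        row = self.edges[edge_type].setdefault(_id, {"_id": _id})
        row.update(source=source, target=target, **attrs)


graph = DictGraphSketch()
graph.add_node("video", "v1", title="Dota 2 highlights")
graph.add_node("channel", "c1", name="Some channel")
graph.add_node("video", "v1", views=100)  # same _id: merged, not duplicated
graph.add_edge("uploaded_by", "v1-c1", source="v1", target="c1")

print(len(graph.nodes["video"]))
print(graph.nodes["video"]["v1"])
```

Looking up all rows of one type is a single dictionary access, and looking up one row by _id is a second one, which is the "easy access" the layout is meant to provide.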

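To show why MERGE avoids duplicates where CREATE would not, here is a minimal sketch that only builds a parameterized Cypher statement and does not connect to a database. The helper build_merge_node and the sample label are hypothetical; the query GraphiPy itself sends may differ.

```python
def build_merge_node(label, _id, attrs):
    """Return a (query, parameters) pair that upserts one node by its _id."""
    # Labels cannot be passed as Cypher parameters, so validate before interpolating.
    if not label.isidentifier():
        raise ValueError(f"unsafe label: {label!r}")
    query = (
        f"MERGE (n:{label} {{_id: $_id}}) "  # match on _id, create only if absent
        "SET n += $attrs"  # then add or overwrite the given attributes
    )
    return query, {"_id": _id, "attrs": attrs}


query, params = build_merge_node("video", "v1", {"title": "Dota 2 highlights"})
print(query)
print(params)
```

A pair like this could be handed to a Neo4j client, for example via session.run(query, params) in the official Neo4j Python driver. Running it twice with the same _id leaves one node whose attributes reflect the latest call.
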
API Demos

For more information on how to use GraphiPy, please see the Jupyter notebooks in the demo folder, one for each supported website.

Data Exportation and Visualization with NetworkX

GraphiPy can also export data as CSV files and visualize the graphs using NetworkX. It is also possible to convert from one graph type to another (e.g. from Pandas to Neo4j and vice versa). For more information, see the DataExportDemo.ipynb notebook in the demo folder.

  • Gephi Support: Gephi is open-source software for network visualization and analysis. It helps data analysts intuitively reveal patterns and trends, highlight outliers, and tell stories with their data. The CSV files exported from GraphiPy can be imported directly into Gephi. Figure: Gephi visualization of data for 20 YouTube videos matching the keyword "dota2", extracted via GraphiPy.
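
As a rough illustration of what a Gephi-friendly CSV export looks like, the sketch below writes a node table and an edge table with Python's csv module. It does not use GraphiPy's export functions, whose names and column layout are not shown here; the helpers write_nodes_csv and write_edges_csv and the sample data are hypothetical. The Id/Label and Source/Target headers are the ones Gephi's spreadsheet import recognizes.

```python
import csv
import io


def write_nodes_csv(nodes, out):
    """Write an {_id: label} mapping as a node table with Id/Label headers."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Id", "Label"])
    for _id, label in nodes.items():
        writer.writerow([_id, label])


def write_edges_csv(edges, out):
    """Write (source, target) pairs as an edge table with Source/Target headers."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)


nodes = {"v1": "Dota 2 highlights", "c1": "Some channel"}
edges = [("v1", "c1")]

# In-memory buffers stand in for files opened with open(path, "w", newline="").
nodes_buf, edges_buf = io.StringIO(), io.StringIO()
write_nodes_csv(nodes, nodes_buf)
write_edges_csv(edges, edges_buf)

print(nodes_buf.getvalue())
print(edges_buf.getvalue())
```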

Folder Structure

.
├── demo
│   ├── DataExportDemo.ipynb
│   ├── FacebookDemo.ipynb
│   ├── LinkedinDemo.ipynb
│   ├── PinterestDemo.ipynb
│   ├── RedditDemo.ipynb
│   ├── TumblrDemo.ipynb
│   ├── TwitterDemo.ipynb
│   └── YoutubeDemo.ipynb
├── graphipy
│   ├── api
│   │   ├── __init__.py
│   │   ├── facebook_api.py
│   │   ├── linkedin_api.py
│   │   ├── pinterest_api.py
│   │   ├── reddit_api.py
│   │   ├── tumblr_api.py
│   │   ├── twitter_api.py
│   │   └── youtube_api.py
│   ├── graph
│   │   ├── __init__.py
│   │   ├── graph_base.py
│   │   ├── graph_dict.py
│   │   ├── graph_neo4j.py
│   │   └── graph_pandas.py
│   ├── __init__.py
│   ├── exportnx.py
│   └── graphipy.py
├── .gitignore
├── README.md
└── requirements.txt

Folder/Filename | Description
--- | ---
demo | Jupyter notebooks explaining how to use the library in detail
graphipy | The main directory of the library, containing classes for all social media platforms, the graph data structures, and the export functionality
graphipy/api | Class definitions for all social media platforms, including fetch functions and customized nodes and edges
graphipy/graph | Definitions of the graph data structure, implemented with dictionaries, Pandas, and Neo4j
requirements.txt | All dependencies
