Serves up Pandas dataframes via the Django REST Framework for client-side (e.g. d3.js) visualizations
Django REST Framework + Pandas = A Model-driven Visualization API
Django REST Pandas (DRP) provides a simple way to generate and serve Pandas DataFrames via the Django REST Framework. The resulting API can serve up CSV (and a number of other formats) for consumption by a client-side visualization tool like d3.js. The actual client implementation is left to the user - giving full flexibility for whatever visualizations you want to come up with. (That said, if you want some out of the box d3-powered charts for use with DRP, you may be interested in wq.app’s chart.js and/or wq.db’s chart module.)
Usage
Getting Started
pip install rest-pandas
Usage Example
# views.py
from rest_pandas import PandasView
from .models import TimeSeries

class TimeSeriesView(PandasView):
    model = TimeSeries

    # In response to get(), the underlying Django REST Framework ListAPIView
    # will load the default queryset (self.model.objects.all()) and then pass
    # it to the following function.
    def filter_queryset(self, qs):
        # At this point, you can filter queryset based on self.request or other
        # settings (useful for limiting memory usage)
        return qs

    # Then, the included PandasSerializer will serialize the queryset into a
    # simple list of dicts (using the DRF ModelSerializer). To customize
    # which fields to include, subclass PandasSerializer and set the
    # appropriate ModelSerializer options. Then, set the serializer_class
    # property on the view to your PandasSerializer subclass.

    # Next, the PandasSerializer will load the ModelSerializer result into a
    # DataFrame and pass it to the following function on the view.
    def transform_dataframe(self, dataframe):
        # Here you can transform the dataframe based on self.request
        # (useful for pivoting or computing statistics)
        return dataframe

    # Finally, the included Renderers will process the dataframe into one of
    # the output formats below.
# urls.py
from django.conf.urls import patterns, include, url
from rest_framework.urlpatterns import format_suffix_patterns
from .views import TimeSeriesView

urlpatterns = patterns('',
    url(r'^data', TimeSeriesView.as_view()),
)
urlpatterns = format_suffix_patterns(urlpatterns)
The default PandasView will serve up all of the available data from the provided model in a simple tabular form. You can also use a PandasViewSet if you are using Django REST Framework’s ViewSets and Routers, or a PandasSimpleView if you would just like to serve up some data without a Django model as the source.
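As a rough illustration of the kind of pivoting that transform_dataframe is meant for, the sketch below reshapes long-format rows into one column per series. The column names date, series and value, and the helper name pivot_time_series, are hypothetical; they are not part of Django REST Pandas and would need to match the fields of your own model.

```python
import pandas as pd


def pivot_time_series(dataframe):
    """Pivot long-format rows into one column per series.

    Assumes hypothetical columns named 'date', 'series' and 'value'.
    Duplicate (date, series) pairs are averaged, which is the
    pivot_table default.
    """
    return dataframe.pivot_table(index="date", columns="series", values="value")


# A small long-format frame, similar to what a serialized queryset
# might look like once it has been loaded into a DataFrame.
long_format = pd.DataFrame(
    [
        {"date": "2014-01-01", "series": "a", "value": 1.0},
        {"date": "2014-01-01", "series": "b", "value": 2.0},
        {"date": "2014-01-02", "series": "a", "value": 3.0},
        {"date": "2014-01-02", "series": "b", "value": 4.0},
    ]
)

wide_format = pivot_time_series(long_format)
print(wide_format)
```

Inside a view, transform_dataframe could simply return pivot_time_series(dataframe), optionally choosing the columns based on self.request.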
Implementation
The underlying implementation is a set of serializers that take the normal serializer result and put it into a dataframe. Then, the included renderers generate the output using the built-in Pandas functionality.
Formats
The following output formats are provided by default. These are provided as renderer classes in order to leverage the content negotiation built into Django REST Framework. This means clients can specify a format via the Accept: text/csv request header or by appending .csv to the URL (if the above urls.py is followed).
Format | Content Type | Pandas Dataframe Function | Notes |
---|---|---|---|
CSV | text/csv | to_csv() | |
TXT | text/plain | to_csv() | Useful for testing, as most browsers will download a CSV file instead of displaying it |
JSON | application/json | to_json() | |
XLSX | application/vnd.openxml...sheet | to_excel() | |
XLS | application/vnd.ms-excel | to_excel() | |
PNG | image/png | plot() | Currently not very customizable, but a simple way to view the data as an image. |
SVG | image/svg | plot() | Eventually these could become a fallback for clients that can’t handle d3.js |
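To make the two ways of selecting a format more concrete, here is a simplified sketch of the idea. It is not Django REST Framework's actual negotiation code (which also handles quality values and wildcards); choose_format and the FORMATS mapping are hypothetical names, and the XLSX entry is omitted because its content type is abbreviated in the table.

```python
# Content types copied from the formats table (XLSX omitted, see above).
FORMATS = {
    "csv": "text/csv",
    "txt": "text/plain",
    "json": "application/json",
    "xls": "application/vnd.ms-excel",
    "png": "image/png",
    "svg": "image/svg",
}
DEFAULT_FORMAT = "csv"


def choose_format(path, accept=None):
    """Pick a format from the URL suffix, then the Accept header, then the default."""
    _, dot, suffix = path.rpartition(".")
    if dot and suffix.lower() in FORMATS:
        return suffix.lower()
    if accept:
        # Walk the Accept header in order, ignoring parameters such as q=0.8.
        for item in accept.split(","):
            media_type = item.split(";")[0].strip().lower()
            for name, content_type in FORMATS.items():
                if media_type == content_type:
                    return name
    return DEFAULT_FORMAT


print(choose_format("/data.csv"))
print(choose_format("/data", accept="application/json"))
```

A URL suffix takes priority over the header in this sketch, so a request for /data.json with Accept: text/csv would resolve to JSON.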
Perhaps counterintuitively, the CSV renderer is the default in Django REST Pandas, as it is the most stable and useful for API building. While the Pandas JSON serializer is improving, the primary reason for making CSV the default is the compactness it provides over JSON when serializing time series data. This is particularly valuable for Pandas dataframes, in which:
each record has the same keys, and
there are (usually) no nested objects
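The compactness argument can be checked with a small standard-library experiment. The sketch below serializes the same flat records as CSV and as record-oriented JSON and compares the lengths; the sample records are made up, and the JSON layout is only one of several that Pandas can produce.

```python
import csv
import io
import json

# Made-up time series records: every record has the same keys and no nesting.
records = [
    {"date": "2014-01-01", "value": 0.5},
    {"date": "2014-01-02", "value": 0.6},
    {"date": "2014-01-03", "value": 0.7},
]


def to_csv_text(rows):
    """Write the rows as CSV with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(records)
json_text = json.dumps(records)

# CSV states the keys once; record-oriented JSON repeats them in every record.
print(len(csv_text), len(json_text))
```

The gap grows with the number of records, since each additional JSON record repeats every key.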
While a normal CSV file only has a single row of column headers, Pandas can produce files with nested columns. This is a useful way to provide metadata about time series that is difficult to represent in a plain CSV file. However, it also makes the resulting CSV more difficult to parse. For this reason, you may be interested in wq/pandas.js, a d3 extension for loading the complex CSV generated by Pandas Dataframes.
// mychart.js
define(['d3', 'wq/pandas'], function(d3, pandas) {
    d3.csv("/data.csv", render);
    // Or
    pandas.get('/data.csv', render);

    function render(error, data) {
        d3.select('svg')
            .selectAll('rect')
            .data(data)
            // ...
    }
});
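To show why nested column headers complicate parsing, the sketch below reads a CSV with two header rows using only the standard library. The sample layout, the header names and the read_nested_csv helper are hypothetical and may not match the exact output Pandas generates; the point is only that each column is identified by a tuple of header values rather than a single name.

```python
import csv
import io

# Hypothetical file with two header rows: a site name and a unit per column.
SAMPLE = """\
site,North,North,South
units,C,mm,C
2014-01-01,0.5,1.2,0.7
2014-01-02,0.6,0.0,0.9
"""


def read_nested_csv(text, header_rows=2):
    """Return one dict per data row, keyed by tuples of header values."""
    rows = list(csv.reader(io.StringIO(text)))
    headers, body = rows[:header_rows], rows[header_rows:]
    # Transpose the header rows into one tuple per column, skipping the
    # first column, which holds the row labels.
    columns = list(zip(*headers))[1:]
    records = []
    for row in body:
        record = {"index": row[0]}
        for key, cell in zip(columns, row[1:]):
            record[key] = float(cell)
        records.append(record)
    return records


records = read_nested_csv(SAMPLE)
print(records[0][("North", "C")])
```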
You can override the default renderers by setting PANDAS_RENDERERS in your settings.py, or by overriding renderer_classes in your PandasView subclass.
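A settings fragment along these lines could restrict the API to CSV and JSON. This is a sketch under two assumptions that should be checked against the installed version: that PANDAS_RENDERERS accepts dotted import paths, and that the renderer classes are named PandasCSVRenderer and PandasJSONRenderer in rest_pandas.renderers.

```python
# settings.py
# Assumption: the setting takes dotted import paths to renderer classes,
# and these two class names exist in the installed rest_pandas version.
PANDAS_RENDERERS = (
    "rest_pandas.renderers.PandasCSVRenderer",
    "rest_pandas.renderers.PandasJSONRenderer",
)
```

Overriding renderer_classes on a single PandasView subclass is the narrower option when only one endpoint needs a different set of formats.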