Flask + D3.js - Data Dashboard - Part 1
Analysing data is a critical step in any machine learning process. Before building a prediction or classification model, one must gain a deeper insight into what the dataset represents.
An analyst might want to convey this interpretation to readers through data visualizations. A good data visualization presents the gist of the data clearly, even to a reader who is a novice in the domain.
Data dashboards, or simply dashboards, are an interactive and engaging way to present data. Their interactivity keeps users interested, and they let a reader explore several views of the same data in one place.
In this article, our goal is to create one such data dashboard. We will use Flask as the back end to serve data and D3.js as the JavaScript library to render charts on our dashboard from the served data.
Flask is a pretty robust framework for building web applications in Python. D3.js, on the other hand, helps you bring data to life using HTML, SVG and CSS. It binds data to DOM elements, so the visualizations can be viewed in any modern browser.
Let’s get started. You can get the whole project code at this GitHub link. We will be using the following stack:
- Virtualenv as Environment
- Flask to serve data and pages
- D3.js to render charts
- Bootstrap for CSS
Install virtualenv
pip install virtualenv
Create new environment
virtualenv venv
Activate environment
source venv/bin/activate
Install Flask and the other libraries that run.py imports
pip install flask pandas numpy
Generate requirements.txt
pip freeze > requirements.txt
run.py
templates/base.html
templates/home.html
static/css/main.css
static/js/main.js
static/js/pieChart.js
static/js/barChart.js
static/js/updateChart.js
static/data/train.csv
from flask import Flask, jsonify, render_template
import pandas as pd
import numpy as np

app = Flask(__name__)

# Load the Titanic training data once, when the application starts
titanic_df = pd.read_csv("static/data/train.csv")
# Keep only the passengers who survived and whose age is known
survived = titanic_df[(titanic_df['Survived']==1) & (titanic_df["Age"].notnull())]

@app.route('/')
def index():
    return render_template('home.html')

if __name__ == '__main__':
    app.run(debug=True)
We start by importing the libraries we will need in our application.
Next, with app = Flask(__name__), we create an instance of our application.
Then, using the pandas read_csv method, we read the train.csv file and store it as a dataframe in titanic_df. From it, we keep only the passengers who survived and whose age is known, in survived.
Finally, we create a route for our page using app.route('/'). For now, the function renders the home.html page, which we will create later.
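To see the route mechanism in isolation, here is a minimal sketch that uses Flask's built-in test client instead of a running server. The placeholder string returned by index() is an assumption made for this illustration only; the article's version renders home.html, which needs the templates folder to exist.

```python
from flask import Flask

app = Flask(__name__)


@app.route('/')
def index():
    # Placeholder body (assumption): the real app returns render_template('home.html')
    return 'dashboard placeholder'


# The test client sends a request to the app without starting a server
client = app.test_client()
response = client.get('/')
print(response.status_code)
print(response.get_data(as_text=True))
```

This is only a quick way to check that a route is registered and responds; it does not replace opening the page in a browser once the templates are in place.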
def calculate_percentage(val, total):
    """Calculates the fraction of a total that a value represents (callers multiply by 100)"""
    percent = np.divide(val, total)
    return percent
@app.route('/get_piechart_data')
def get_piechart_data():
    class_labels = ['Class I', 'Class II', 'Class III']
    # Share of survivors in each passenger class, as a percentage
    pclass_percent = calculate_percentage(survived.groupby('Pclass').size().values, survived['PassengerId'].count())*100
    pieChartData = []
    for index, item in enumerate(pclass_percent):
        eachData = {}
        eachData['category'] = class_labels[index]
        eachData['measure'] = round(item,1)
        pieChartData.append(eachData)
    return jsonify(pieChartData)
@app.route('/get_barchart_data')
def get_barchart_data():
    age_labels = ['0-9', '10-19', '20-29', '30-39', '40-49', '50-59', '60-69', '70-79']
    # Add the age group on a local copy so the module-level dataframe is not modified on every request
    survivors = survived.assign(age_group=pd.cut(survived.Age, range(0, 81, 10), right=False, labels=age_labels))
    survivorFirstClass = survivors[survivors['Pclass']==1]
    survivorSecondClass = survivors[survivors['Pclass']==2]
    survivorThirdClass = survivors[survivors['Pclass']==3]
    # observed=False keeps empty age groups, so every result lines up with age_labels
    survivorAllclassPercent = calculate_percentage(survivors.groupby('age_group', observed=False).size().values,survivors['PassengerId'].count())*100
    survivorFirstclassPercent = calculate_percentage(survivorFirstClass.groupby('age_group', observed=False).size().values,survivorFirstClass['PassengerId'].count())*100
    survivorSecondclassPercent = calculate_percentage(survivorSecondClass.groupby('age_group', observed=False).size().values,survivorSecondClass['PassengerId'].count())*100
    survivorThirdclassPercent = calculate_percentage(survivorThirdClass.groupby('age_group', observed=False).size().values,survivorThirdClass['PassengerId'].count())*100
    barChartData = []
    for index, item in enumerate(survivorAllclassPercent):
        eachBarChart = {}
        eachBarChart['group'] = "All"
        eachBarChart['category'] = age_labels[index]
        eachBarChart['measure'] = round(item,1)
        barChartData.append(eachBarChart)
    for index, item in enumerate(survivorFirstclassPercent):
        eachBarChart = {}
        eachBarChart['group'] = "Class I"
        eachBarChart['category'] = age_labels[index]
        eachBarChart['measure'] = round(item,1)
        barChartData.append(eachBarChart)
    for index, item in enumerate(survivorSecondclassPercent):
        eachBarChart = {}
        eachBarChart['group'] = "Class II"
        eachBarChart['category'] = age_labels[index]
        eachBarChart['measure'] = round(item,1)
        barChartData.append(eachBarChart)
    for index, item in enumerate(survivorThirdclassPercent):
        eachBarChart = {}
        eachBarChart['group'] = "Class III"
        eachBarChart['category'] = age_labels[index]
        eachBarChart['measure'] = round(item,1)
        barChartData.append(eachBarChart)
    return jsonify(barChartData)
On the Titanic, passenger cabins were divided into 3 classes based on cabin fare. The classes are labeled Class I (the most expensive cabins), Class II (moderately priced cabins for middle-class passengers) and Class III (the cheapest cabins).
get_piechart_data() counts the survivors in each class, converts the counts into percentages using the calculate_percentage function, and finally returns the data in JSON format.
For deeper insight, we also group passengers by age. We create 8 ten-year age groups and assign each passenger to one of them. In the get_barchart_data() function, we perform this data manipulation, for all survivors and for each class separately, before returning the result in JSON format.
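The calculation inside get_piechart_data() can be tried out without Flask or the Kaggle file. The sketch below runs the same group-count-and-divide steps on a small made-up dataframe; toy_survivors and its ten rows are hypothetical values chosen so the arithmetic is easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical survivor records: 5 in first class, 2 in second, 3 in third
toy_survivors = pd.DataFrame({
    "PassengerId": range(1, 11),
    "Pclass": [1, 1, 1, 1, 1, 2, 2, 3, 3, 3],
})

class_labels = ["Class I", "Class II", "Class III"]

# Number of survivors per class, ordered by class number
counts = toy_survivors.groupby("Pclass").size().values
total = toy_survivors["PassengerId"].count()

# Element-wise division gives fractions; multiply by 100 for percentages
percentages = np.divide(counts, total) * 100

pie_chart_data = [
    {"category": label, "measure": round(float(share), 1)}
    for label, share in zip(class_labels, percentages)
]
print(pie_chart_data)
```

Each dictionary in the list is one slice of the pie chart, which is the shape the endpoint hands to jsonify.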
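The age grouping relies on pandas' cut function. As a small sketch on hypothetical ages (the toy dataframe below is not part of the project), this shows how right=False assigns boundary values and why empty groups need to be kept when the counts are later matched against the labels by position.

```python
import pandas as pd

age_labels = ["0-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79"]

# Hypothetical ages, including values that sit exactly on bin edges
toy = pd.DataFrame({"Age": [4, 9, 10, 25, 29, 30, 79, 80]})

# right=False makes each bin closed on the left: [0, 10), [10, 20), ... [70, 80)
toy["age_group"] = pd.cut(
    toy["Age"], list(range(0, 81, 10)), right=False, labels=age_labels
)

# observed=False keeps groups with no passengers, so counts align with age_labels
counts = toy.groupby("age_group", observed=False).size().tolist()
print(dict(zip(age_labels, counts)))
```

Note that with these bins an age of exactly 80 falls outside the last interval, so it gets no age group and is left out of the counts.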
Open base.html and add the following HTML code:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta name="description" content="">
<meta name="author" content="">
<title>Home page</title>
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css" type="text/css" />
<link href="/static/css/main.css" rel="stylesheet">
</head>
<body>
<div class="container-fluid">
<div class="row">
<div class="col-sm-10 mx-auto">
{% block content %}{% endblock %}
</div>
</div>
</div>
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.1.2/js/bootstrap.min.js"></script>
<script src="https://d3js.org/d3.v4.min.js"></script>
<script src="https://d3js.org/queue.v1.min.js"></script>
<script>
var piechartDataUrl = "{{ url_for('get_piechart_data') }}";
var barchartDataUrl = "{{ url_for('get_barchart_data') }}";
</script>
<script src="{{ url_for('static', filename='js/pieChart.js') }}"></script>
<script src="{{ url_for('static', filename='js/barChart.js') }}"></script>
<script src="{{ url_for('static', filename='js/updateChart.js') }}"></script>
<script src="{{ url_for('static', filename='js/main.js') }}"></script>
</body>
</html>
In our base.html we added the Bootstrap CSS and JS files, along with d3.js and queue.js, which we will use to process and render data.
Then we have the JavaScript files that hold our custom scripts: pieChart.js will contain the code specific to the pie chart on our dashboard, barChart.js will render the bar chart, and updateChart.js will handle the interactive events between the pie chart and the bar chart. main.js will hold the code common to the whole dashboard.
We also have 2 variables, piechartDataUrl and barchartDataUrl, which hold the URLs of the endpoints that return our JSON data.
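To check what url_for produces and what such an endpoint returns, one option is a small stand-alone sketch like the following. The hard-coded payload is an assumption for illustration; in the article the data comes from the Titanic dataframe.

```python
from flask import Flask, jsonify, url_for

app = Flask(__name__)


@app.route('/get_piechart_data')
def get_piechart_data():
    # Hypothetical hard-coded payload standing in for the computed percentages
    return jsonify([{'category': 'Class I', 'measure': 50.0}])


# url_for needs a request context; it resolves the view function name to its URL
with app.test_request_context():
    piechart_data_url = url_for('get_piechart_data')

# Fetch the endpoint the same way the browser script would, and parse the JSON body
payload = app.test_client().get(piechart_data_url).get_json()
print(piechart_data_url)
print(payload)
```

In the template, the same url_for call is evaluated on the server, so the browser only ever sees the resolved URL string.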
Next, open home.html, which will extend this base file. Add the following code:
{% extends "base.html" %}
{% block content %}
<div id="pieChart"></div>
<div id="barChart"></div>
{% endblock content %}
The first line extends base.html. Then we have the div elements where we will add the chart elements for our dashboard. Simple and neat.
Next, we add styling to our page. Open static/css/main.css and add the following code:
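Template inheritance can be observed directly with Jinja2, the template engine Flask uses. The sketch below keeps trimmed-down stand-ins for the two templates in a dictionary (an assumption made so the example needs no files on disk) and renders the child template.

```python
from jinja2 import DictLoader, Environment

# Trimmed-down stand-ins for templates/base.html and templates/home.html
templates = {
    "base.html": "<body>{% block content %}{% endblock %}</body>",
    "home.html": (
        '{% extends "base.html" %}'
        '{% block content %}'
        '<div id="pieChart"></div><div id="barChart"></div>'
        '{% endblock content %}'
    ),
}

env = Environment(loader=DictLoader(templates))

# Rendering the child fills the parent's content block with the child's markup
html = env.get_template("home.html").render()
print(html)
```

The rendered string contains the layout from the base template with the two chart containers placed inside its content block.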
#pieChart {
    position: relative;
    top: 40%;
    left: 10px;
    width: 400px;
    height: 400px;
    display: inline-block;
    font-size: 12px;
}
#barChart {
    position: relative;
    top: 40%;
    height: 400px;
    display: inline-block;
    font-size: 12px;
}
#pieChart .title, #barChart .title {
    font-weight: bold;
}
.slice {
    font-size: 13px;
    font-family: Verdana;
    fill: white;
    font-weight: bold;
    cursor: pointer;
}
Takeaways
Phew!! That is a lot of code to digest. In part 1 of this series, we accomplished the following tasks:
- initial setup of environment and project
- back-end python code to serve page and data
- html pages to present our content
- basic CSS styling
In part 2, we will work on rendering the pie chart from the formatted data.
- Titanic Data available at Kaggle
- Dashboard inspired from Diethard Steiner’s Block
- D3.js library used for data dashboards