Flask + D3.js - Data Dashboard - Part 1

Analysing data is a critical step in any machine learning process. Before building a prediction or classification model, one must understand what the dataset represents.

An analyst might want to convey this interpretation to readers through data visualizations. A good data visualization presents the gist of the data clearly, even to a reader who is new to the domain.

Data dashboards, or simply dashboards, are an interactive and engaging way to present data. Their interactivity keeps users interested, and a single dashboard can surface several views of the same data.

In this article, our goal is to create one such data dashboard. We will use Flask as the back end to serve data and D3.js as the JavaScript library to render charts on our dashboard from the served data.

Flask is a robust framework for building web applications in Python. D3.js, on the other hand, helps you bring data to life using HTML, SVG and CSS. It binds data to DOM elements, so the resulting visualizations can be viewed in any modern browser.

Let’s get started. You can get the whole project code at this GitHub link. We will be using the following stack:

  • Virtualenv as Environment
  • Flask to serve data and pages
  • D3.js to render charts
  • Bootstrap for CSS

0. Goal & Prerequisite

We will be working on the famous Titanic dataset. Our goal is to analyze the survivors in this dataset. Download train.csv, as it contains labeled data for each passenger.

1. Setup environment

Install virtualenv


        pip install virtualenv
    

Create new environment


        virtualenv venv
    

Activate environment


        source venv/bin/activate
    

Install Flask, plus pandas and NumPy, which run.py imports


        pip install flask pandas numpy
    

Generate requirements.txt


        pip freeze > requirements.txt
    

2. Setup file structure

Create a new folder flask-d3. Inside it, create the following files and folders:

    
        run.py
        templates/base.html
        templates/home.html
        static/css/main.css
        static/js/main.js
        static/js/pieChart.js
        static/js/barChart.js
        static/js/updateChart.js
        static/data/train.csv
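
The layout above can be created by hand. As a convenience, here is a minimal sketch that creates it with Python's pathlib. The helper name create_skeleton is our own and not part of the project. It creates empty placeholder files and the static/data folder only; train.csv still has to be downloaded and copied there.

```python
from pathlib import Path

# Files from the listing above, relative to the project root.
# train.csv is left out on purpose: it is downloaded, not created.
PROJECT_FILES = [
    "run.py",
    "templates/base.html",
    "templates/home.html",
    "static/css/main.css",
    "static/js/main.js",
    "static/js/pieChart.js",
    "static/js/barChart.js",
    "static/js/updateChart.js",
]


def create_skeleton(root):
    """Create empty placeholder files under `root`; existing files are left untouched."""
    root = Path(root)
    created = []
    for relative in PROJECT_FILES:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(relative)
    # Folder that will hold the downloaded train.csv
    (root / "static" / "data").mkdir(parents=True, exist_ok=True)
    return created
```

Calling create_skeleton("flask-d3") from the parent directory builds the folder used in this article. Running it a second time creates nothing new and returns an empty list.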
    

3. Creating project entry point

Open run.py in your code editor and write the following code:

    
    from flask import Flask, jsonify, render_template
    import pandas as pd
    import numpy as np

    app = Flask(__name__)

    titanic_df = pd.read_csv("static/data/train.csv")
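    # Survivors with a known age; .copy() lets us add columns later without a SettingWithCopyWarning
    survived = titanic_df[(titanic_df['Survived']==1) & (titanic_df["Age"].notnull())].copy()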

    @app.route('/')
    def index():
        return render_template('home.html')
    
    if __name__ == '__main__':
        app.run(debug=True)
    

We start by importing the libraries we need. Next, app = Flask(__name__) creates an instance of our application. Then, using the pandas read_csv method, we read train.csv and store it as a DataFrame in titanic_df. From it we derive survived, which holds only the passengers who survived and whose age is known.

Finally, we create a route for our page using app.route('/'). For now, the function renders the home.html page, which we will create later.

4. Creating JSON response containing data

In this step, we will create a few functions that return data in JSON format to the front end. These functions work as an API: their only role is to manipulate the data and return it as JSON. Add the following code to run.py, above the if __name__ == '__main__': block so that the routes are registered before the server starts:

        
    def calculate_percentage(val, total):
        """Calculates the percentage of a value over a total"""
        percent = np.divide(val, total)

        return percent

    @app.route('/get_piechart_data')
    def get_piechart_data():
        class_labels = ['Class I', 'Class II', 'Class III']
        pclass_percent = calculate_percentage(survived.groupby('Pclass').size().values, survived['PassengerId'].count())*100

        pieChartData = []
        for index, item in enumerate(pclass_percent):
            eachData = {}
            eachData['category'] = class_labels[index]
            eachData['measure'] =  round(item,1)
            pieChartData.append(eachData)

        return jsonify(pieChartData)

    @app.route('/get_barchart_data')
    def get_barchart_data():
        age_labels = ['0-9', '10-19', '20-29', '30-39', '40-49', '50-59', '60-69', '70-79']
        # Half-open bins [0, 10), [10, 20), ..., [70, 80): one bin per label
        survived["age_group"] = pd.cut(survived.Age, range(0, 81, 10), right=False, labels=age_labels)

        survivorFirstClass = survived[survived['Pclass']==1]
        survivorSecondClass = survived[survived['Pclass']==2]
        survivorThirdClass = survived[survived['Pclass']==3]

        # observed=False keeps empty age groups, so each result stays aligned with age_labels
        survivorAllclassPercent = calculate_percentage(survived.groupby('age_group', observed=False).size().values,survived['PassengerId'].count())*100
        survivorFirstclassPercent = calculate_percentage(survivorFirstClass.groupby('age_group', observed=False).size().values,survivorFirstClass['PassengerId'].count())*100
        survivorSecondclassPercent = calculate_percentage(survivorSecondClass.groupby('age_group', observed=False).size().values,survivorSecondClass['PassengerId'].count())*100
        survivorThirdclassPercent = calculate_percentage(survivorThirdClass.groupby('age_group', observed=False).size().values,survivorThirdClass['PassengerId'].count())*100

        barChartData = []
        for index, item in enumerate(survivorAllclassPercent):
            eachBarChart = {}
            eachBarChart['group'] = "All"
            eachBarChart['category'] = age_labels[index]
            eachBarChart['measure'] = round(item,1)
            barChartData.append(eachBarChart)


        for index, item in enumerate(survivorFirstclassPercent):
            eachBarChart = {}
            eachBarChart['group'] = "Class I"
            eachBarChart['category'] = age_labels[index]
            eachBarChart['measure'] = round(item,1)
            barChartData.append(eachBarChart)

        for index, item in enumerate(survivorSecondclassPercent):
            eachBarChart = {}
            eachBarChart['group'] = "Class II"
            eachBarChart['category'] = age_labels[index]
            eachBarChart['measure'] = round(item,1)
            barChartData.append(eachBarChart)

        for index, item in enumerate(survivorThirdclassPercent):
            eachBarChart = {}
            eachBarChart['group'] = "Class III"
            eachBarChart['category'] = age_labels[index]
            eachBarChart['measure'] = round(item,1)
            barChartData.append(eachBarChart)

        return jsonify(barChartData)
        
    

On the Titanic, passenger cabins were divided into three classes based on fare. The classes are labeled Class I (the most expensive cabins), Class II (mid-priced cabins) and Class III (the cheapest cabins).

get_piechart_data() counts the survivors in each class, converts the counts into percentages using the calculate_percentage function, and returns the data in JSON format.
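
To make the shape of that response concrete, here is a plain-Python sketch of the same calculation without pandas or Flask. The function name piechart_records and the sample values are ours, chosen for illustration; they are not real Titanic figures.

```python
from collections import Counter

# Same labels as class_labels in run.py, keyed by the Pclass value
CLASS_LABELS = {1: "Class I", 2: "Class II", 3: "Class III"}


def piechart_records(pclass_values):
    """Return one {'category', 'measure'} record per class, with measure as a percentage."""
    counts = Counter(pclass_values)
    total = sum(counts.values())
    records = []
    for pclass in sorted(CLASS_LABELS):
        percent = 100 * counts[pclass] / total
        records.append({"category": CLASS_LABELS[pclass], "measure": round(percent, 1)})
    return records


# Hypothetical sample: the Pclass of eight survivors
sample = [1, 1, 1, 1, 2, 2, 3, 3]
print(piechart_records(sample))
# [{'category': 'Class I', 'measure': 50.0}, {'category': 'Class II', 'measure': 25.0}, {'category': 'Class III', 'measure': 25.0}]
```

The list of category/measure records has the same structure as the JSON that the /get_piechart_data route returns.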

For deeper insight, we group passengers by age. We create 8 age groups, each spanning ten years, and assign one to each passenger. The get_barchart_data() function performs this data manipulation and then returns, in JSON format, the percentage of survivors in each age group, both overall and per class.
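
The binning rule is easy to miss in the pandas call: pd.cut with range(0, 81, 10) and right=False produces bins that include their lower edge and exclude their upper edge, so an age of exactly 80 falls outside the last bin. The sketch below reproduces that rule in plain Python. The names age_group and barchart_records and the sample ages are ours, for illustration only.

```python
AGE_LABELS = ["0-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79"]


def age_group(age):
    """Map an age to its ten-year label; bins are closed on the left and open on the right."""
    if age is None or not 0 <= age < 80:
        return None  # outside every bin, like the NaN that pd.cut produces
    return AGE_LABELS[int(age // 10)]


def barchart_records(ages, group):
    """Return one {'group', 'category', 'measure'} record per age group, as percentages."""
    counts = {label: 0 for label in AGE_LABELS}
    for age in ages:
        label = age_group(age)
        if label is not None:
            counts[label] += 1
    total = len(ages)  # like run.py, the total includes ages that fall outside the bins
    return [
        {"group": group, "category": label, "measure": round(100 * counts[label] / total, 1)}
        for label in AGE_LABELS
    ]


# Hypothetical ages, including both edge cases (10 and 80)
sample_ages = [4, 9, 10, 22, 35, 35.5, 79, 80]
for record in barchart_records(sample_ages, "All"):
    print(record)
```

Here the age 10 lands in "10-19" rather than "0-9", and the age 80 is counted in the total but not in any bar, so the measures add up to less than 100.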

5. Creating basic html templates

In our templates folder, we have base.html and home.html. base.html contains the basic layout, which is common to all pages. home.html extends base.html and adds the HTML specific to the home page.

Open base.html and add the following HTML code:

    
    <!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <meta name="description" content="">
    <meta name="author" content="">
    <title>Home page</title>
    
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css" type="text/css" />
    
    <link href="/static/css/main.css" rel="stylesheet">
    </head>
    <body>
        <div class="container-fluid">
        <div class="row">
        <div class="col-sm-10 mx-auto">
        
        {% block content %}{% endblock %}
        </div>
        </div>
        </div>
        
        <script src="https://code.jquery.com/jquery-3.3.1.slim.min.js"></script>
        <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.1.2/js/bootstrap.min.js"></script>
        <script src="https://d3js.org/d3.v4.min.js"></script>
        <script src="https://d3js.org/queue.v1.min.js"></script>
        <script>
        var piechartDataUrl = "{{ url_for('get_piechart_data') }}";
        var barchartDataUrl = "{{ url_for('get_barchart_data') }}";
        </script>
        <script src="{{ url_for('static', filename='js/pieChart.js') }}"></script>
        <script src="{{ url_for('static', filename='js/barChart.js') }}"></script>
        <script src="{{ url_for('static', filename='js/updateChart.js') }}"></script>
        <script src="{{ url_for('static', filename='js/main.js') }}"></script>
    </body>
    </html>
    
    

In base.html we added the Bootstrap CSS and JS files, along with d3.js and queue.js, which we will use to load and render data.

Then we have the JavaScript files that hold our custom scripts: pieChart.js will contain the script for the pie chart on our dashboard, barChart.js will render the bar chart, and updateChart.js will handle the interactive events between the pie chart and the bar chart. main.js will contain script common to the whole dashboard.

We also define two variables, piechartDataUrl and barchartDataUrl, which hold the URLs of the two JSON endpoints. Our scripts will request the chart data from these URLs.

Next, open home.html and let's extend the base template. Add the following code:


        {% extends "base.html" %}
        {% block content %}
        <div id="pieChart"></div>
        <div id="barChart"></div>
        {% endblock content %}
          
    

First line extends base.html. Then we have div elements where we will add chart elements for our dashboard. Simple and neat.

Next, we add styling to our page. Open static/css/main.css and add the following code:

        
    #pieChart {
        position:relative;
        top:40%;
        left:10px;
        width:400px;
        height: 400px;
        display: inline-block;
        font-size: 12px;
    }
    
    
    #barChart {
        position:relative;
        top:40%;
        height: 400px;
        display: inline-block;
        font-size: 12px;
    }

    #pieChart .title, #barChart .title{
        font-weight: bold;
    }
    
    .slice {
        font-size: 13px;
        font-family: Verdana;
        fill: white;
        font-weight: bold;   
        cursor: pointer;
    }
        
    



Takeaways

Phew! That is a lot of code to digest. In part 1 of this series, we accomplished the following tasks:

  • initial setup of environment and project
  • back-end python code to serve page and data
  • html pages to present our content
  • basic CSS styling

In part 2, we will work on rendering the pie chart from the formatted data.

