Deploying Keras Model in Production using Flask

Upasana | December 07, 2019 | 7 min read | Flask - Python micro web framework


In this article, we are going to discuss the process of building a REST API over a saved Keras model and deploying it to production using Flask and Gunicorn/WSGI.

If you are looking for TensorFlow 2.0 support, then refer to the follow-up article Deploying Keras Model in Production with TensorFlow 2.0.

Introduction

We are going to take the example of a mood detection model built in Python using NLTK and Keras. When we train a deep learning model in Keras, we always need some other component to test its results: for a demo we cannot show raw probabilities (the model's output) directly, and have to present interactive results so that someone who is not from this background can also understand them.

Keras

Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, Theano, and PlaidML. It is designed to enable fast experimentation with deep neural networks, and focuses on being user-friendly, modular, and extensible.

NLTK

NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength NLP libraries.

NLTK has now added support for Indian languages as well.

Flask

Flask is a micro web framework written in Python, which is frequently used by developers to create simple REST endpoints.

Flask homepage

http://flask.pocoo.org

We will be creating one Python script that exposes the REST endpoints through a Flask application, and will be keeping the supporting classes in a services folder.

Mood detection model

This model was built on 182,689 observations covering the emotion categories anger, disgust, joy, sadness, shame, guilt and fear. The model is based on a bi-directional LSTM and was trained for only 50 epochs. Since the data was not normalized beforehand (so as to retain its pattern), a BatchNormalization layer was also used in the model. Below are the recall scores of the model on test data:

  • Anger : 0.72

  • Disgust : 0.68

  • Fear : 0.96

  • Guilt : 0.63

  • Joy : 0.92

  • Sadness : 0.94

  • Shame : 0.81

Directory structure

Our directory structure is going to look like this (a reconstructed sketch follows the list below):

In the src folder, we have two directories and main.py to start the Flask app.

  1. Directory mood-saved-models contains the saved Keras models and the saved tokenizer in pickle format.

  2. Directory service contains the service scripts as .py files.
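Based on the file paths referenced in the code later in this article (model5_ver1.json, model5_ver1.h5, tokenizer.pickle and the two service classes), the layout looks roughly like:

src/
├── main.py
├── mood-saved-models/
│   ├── model5_ver1.json
│   ├── model5_ver1.h5
│   └── tokenizer.pickle
└── service/
    ├── TextPreprocessing.py
    └── SentimentService.py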

Text pre-processing

Before training deep learning models on the textual data we have, we usually perform a few transformations to clean the data and convert it into vector format. This process is generally known as text pre-processing.

Since we perform these transformations on the training data, we must apply the same ones to the test data (and to any text sent for prediction) as well.

Now, we are going to build a service for this purpose, which will pre-process the text before sending it to the model for prediction.

TextPreprocessing Method
def tweet_preprocessing(self, text):
    # regex fragments for emoticon parts, e.g. ":-)" or "8)"
    eyes = r"[8:=;]"
    nose = r"['`-]?"

    def re_sub(pattern, repl):
        return re.sub(pattern, repl, text, flags=self.FLAGS)

    text = re_sub(r"https?:\/\/\S+\b|www\.(\w+\.)+\S*", " ")   # strip URLs
    text = re_sub(r"@\w+", "user")                             # @mentions -> "user"
    text = re_sub(r"{}{}[)dD]+|[)dD]+{}{}".format(eyes, nose, nose, eyes), "smile")
    text = re_sub(r"{}{}p+".format(eyes, nose), "laugh")
    text = re_sub(r"{}{}\(+|\)+{}{}".format(eyes, nose, nose, eyes), "sad")
    text = re_sub(r"{}{}[\/|l*]".format(eyes, nose), "neutral")
    text = re_sub(r"/", " / ")
    text = re_sub(r"<3", "love")
    text = re_sub(r"[-+]?[.\d]*[\d]+[:,.\d]*", " ")            # strip numbers
    text = re_sub(r"#\S+", self.hashtag)                       # expand hashtags
    text = re_sub(r"([!?.]){2,}", r"\1 repeat")                # "!!!" -> "! repeat"
    text = re_sub(r"\b(\S*?)(.)\2{2,}\b", r"\1\2 <elong>")     # elongated words
    text = re_sub(r"([A-Z]){2,}", self.allcaps)                # ALLCAPS -> lowercase

    return text.lower()

We will be using this method to clean the text. It involves:

  • Removing repetitive words

  • Converting smileys to text

  • Extracting text from hashtags

We can also add a spell corrector so that it can take care of typos. There is a library named Enchant which can be used to correct the spelling of words. Try installing and using it via pip install pyenchant. This should work on Mac OS X and Ubuntu; not sure about Windows.
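A minimal sketch of such a corrector, assuming pyenchant installs cleanly and using its standard Dict API (the naive pick-the-first-suggestion strategy here is only for illustration):

import enchant

d = enchant.Dict("en_US")

def correct_spelling(text):
    # naive corrector: replace each unknown word with Enchant's first suggestion
    fixed = []
    for word in text.split():
        if word.isalpha() and not d.check(word):
            suggestions = d.suggest(word)
            fixed.append(suggestions[0] if suggestions else word)
        else:
            fixed.append(word)
    return " ".join(fixed)

print(correct_spelling("I am feeling wondrful today"))  # e.g. "I am feeling wonderful today"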

So now, the whole class is going to look like below:

TextPreprocessing.py
import re


class TextPreprocessing(object):

    def __init__(self):
        # apply the regexes across multi-line text as well
        self.FLAGS = re.MULTILINE | re.DOTALL

    def hashtag(self, text):
        # "#ALLCAPS" -> " allcaps ", "#CamelCase" -> " Camel Case"
        text = text.group()
        hashtag_body = text[1:]
        if hashtag_body.isupper():
            result = " {} ".format(hashtag_body.lower())
        else:
            result = " ".join([""] + [re.sub(r"([A-Z])", r" \1", hashtag_body, flags=self.FLAGS)])
        return result

    def allcaps(self, text):
        text = text.group()
        return text.lower() + " "

    def re_sub(self, pattern, repl, text):
        return re.sub(pattern, repl, text, flags=self.FLAGS)

    def tweet_preprocessing(self, text):
        eyes = r"[8:=;]"
        nose = r"['`-]?"

        def re_sub(pattern, repl):
            return re.sub(pattern, repl, text, flags=self.FLAGS)

        text = re_sub(r"https?:\/\/\S+\b|www\.(\w+\.)+\S*", " ")
        text = re_sub(r"@\w+", "user")
        text = re_sub(r"{}{}[)dD]+|[)dD]+{}{}".format(eyes, nose, nose, eyes), "smile")
        text = re_sub(r"{}{}p+".format(eyes, nose), "laugh")
        text = re_sub(r"{}{}\(+|\)+{}{}".format(eyes, nose, nose, eyes), "sad")
        text = re_sub(r"{}{}[\/|l*]".format(eyes, nose), "neutral")
        text = re_sub(r"/", " / ")
        text = re_sub(r"<3", "love")
        text = re_sub(r"[-+]?[.\d]*[\d]+[:,.\d]*", " ")
        text = re_sub(r"#\S+", self.hashtag)
        text = re_sub(r"([!?.]){2,}", r"\1 repeat")
        text = re_sub(r"\b(\S*?)(.)\2{2,}\b", r"\1\2 <elong>")
        text = re_sub(r"([A-Z]){2,}", self.allcaps)

        return text.lower()
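A quick usage check with a hypothetical input (the exact whitespace in the output may vary):

tp = TextPreprocessing()
print(tp.tweet_preprocessing("I LOVED it!!! :) #GoodVibes http://example.com"))
# -> "i loved  it! repeat smile   good vibes" (approximately)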

Now we need to make a service for loading the saved Keras model and exposing a predict function as well. But saved deep learning models are usually big in size, and some of them even take noticeable time to load. We shall implement the service in such a way that we won't have to load the model at every call of the endpoint.

To avoid this problem, we will be using the singleton design pattern: the model and the tokenizer are loaded lazily on first use and then cached on the class.

SentimentService.py
from keras.models import model_from_json
import pickle


class SentimentService(object):
    model1 = None
    tokenizer = None

    @classmethod
    def load_deep_model(cls, model):
        # rebuild the architecture from JSON, then load the trained weights
        with open('./src/mood-saved-models/' + model + '.json', 'r') as json_file:
            loaded_model_json = json_file.read()
        loaded_model = model_from_json(loaded_model_json)
        loaded_model.load_weights("./src/mood-saved-models/" + model + ".h5")

        # build the predict function up front so later calls are safe
        loaded_model._make_predict_function()
        return loaded_model

    @classmethod
    def get_model1(cls):
        if cls.model1 is None:
            cls.model1 = cls.load_deep_model('model5_ver1')
        return cls.model1

    @classmethod
    def load_tokenizer(cls):
        if cls.tokenizer is None:
            with open('./src/mood-saved-models/tokenizer.pickle', 'rb') as handle:
                cls.tokenizer = pickle.load(handle)
        return cls.tokenizer

load_tokenizer loads the saved tokenizer the same lazy way: it is read from disk on the first call, and the cached instance is returned afterwards.
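Both accessors can be called freely from the endpoint code; only the very first call pays the loading cost:

model = SentimentService.get_model1()          # loads model5_ver1 once, then cached
tokenizer = SentimentService.load_tokenizer()  # same lazy-loading behaviour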

Now, we need to build the endpoints which will be using these services. We will be building three endpoints:

  • Health check, to verify whether the Flask service is up and running

  • Get the structure & parameters of the saved model

  • Get a prediction from the model

Health Check
@app.route("/health", methods=["GET"])
def health():
    return Response(json.dumps({"status": "UP"}), status=200, mimetype='application/json')
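Assuming the app runs on the default Flask port 5000, a quick check looks like:

$ curl http://0.0.0.0:5000/health
{"status": "UP"}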
Get structure of model
@app.route("/show_model", methods=["GET"])
def show_model():
    model = request.args.get("model", default=None,type=str)
    model_format = json.loads(open('mood-saved-models/' + model + '.json').read())
    return Response(json.dumps(model_format), status=200, mimetype='application/json')
Detect mood from text
@app.route('/mood-detect', methods=['POST'])
def model_predict():

    if not request.json or 'text' not in request.json:
        abort(400)

    tp = TextPreprocessing()

    # apply the same pre-processing that was used on the training data
    sent = pd.Series(request.json['text'])
    new_sent = [tp.tweet_preprocessing(i) for i in sent]

    # tokenize and pad to the model's fixed input length of 256
    seq = SentimentService.load_tokenizer().texts_to_sequences(pd.Series(''.join(new_sent)))
    test = pad_sequences(seq, maxlen=256)

    # run the prediction inside the graph the model was loaded into
    with backend.get_session().graph.as_default():
        model = SentimentService.get_model1()
        res = model.predict_proba(test, batch_size=32, verbose=0)

    lab_list = ['anger', 'disgust', 'fear', 'guilt', 'joy', 'sadness', 'shame']
    moods = {}
    for label, probability in zip(lab_list, res[0]):
        # cast numpy float32 to plain float so json.dumps can serialize it
        moods[label] = float(100 * probability)

    return Response(json.dumps(moods), status=200, mimetype='application/json')
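To make the tokenizer step concrete, here is a hypothetical illustration (the actual integer ids depend entirely on the fitted tokenizer):

seq = tokenizer.texts_to_sequences(["great i am liking it"])  # e.g. [[87, 2, 14, 950, 9]]
test = pad_sequences(seq, maxlen=256)                         # zero-padded to shape (1, 256)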

Now, we are ready to use this service to detect mood from a text.

Run main.py and get results after calling endpoints.

$ python src/main.py
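main.py itself is not listed in this article; here is a minimal sketch of what it could look like, pulling together the imports that the endpoint snippets above rely on (module paths are assumptions based on the directory layout; the actual file is in the github repository):

import json

import pandas as pd
from flask import Flask, Response, abort, request
from keras import backend
from keras.preprocessing.sequence import pad_sequences

from service.TextPreprocessing import TextPreprocessing
from service.SentimentService import SentimentService

app = Flask(__name__)

# ... the /health, /show_model and /mood-detect endpoints shown above ...

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)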

To get structure of model [GET]

GET http://0.0.0.0:5000/show_model?model=model5_ver1
Output
{
    "class_name": "Sequential",
    "config": [
        {
            "class_name": "Embedding",
            "config": {
                "name": "embedding_2",
                "trainable": false,
                "batch_input_shape": [
                    null,
                    256
                ],
                "dtype": "float32",
                "input_dim": 57888,
                "output_dim": 100,
                "embeddings_initializer": {
                    "class_name": "RandomUniform",
                    "config": {
                        "minval": -0.05,
                        "maxval": 0.05,
                        "seed": null
                    }
                },
                "embeddings_regularizer": null,
                "activity_regularizer": null,
                "embeddings_constraint": null,
                "mask_zero": false,
                "input_length": 256
            }
        },
        {
            "class_name": "SpatialDropout1D",
            "config": {
                "name": "spatial_dropout1d_4",
                "trainable": true,
                "rate": 0.2,
                "noise_shape": null,
                "seed": null
            }
        },
        {
            "class_name": "Bidirectional",
            "config": {
                "name": "bidirectional_7",
                "trainable": true,
                "layer": {
                    "class_name": "LSTM",
                    "config": {
                        "name": "lstm_13",
                        "trainable": true,
                        "return_sequences": true,
                        "return_state": false,
                        "go_backwards": false,
                        "stateful": false,
                        "unroll": false,
                        "units": 128,
                        "activation": "tanh",
                        "recurrent_activation": "hard_sigmoid",
                        "use_bias": true,
                        "kernel_initializer": {
                            "class_name": "VarianceScaling",
                            "config": {
                                "scale": 1,
                                "mode": "fan_avg",
                                "distribution": "uniform",
                                "seed": null
                            }
                        },
                        "recurrent_initializer": {
                            "class_name": "Orthogonal",
                            "config": {
                                "gain": 1,
                                "seed": null
                            }
                        },
                        "bias_initializer": {
                            "class_name": "Zeros",
                            "config": {}
                        },
                        "unit_forget_bias": true,
                        "kernel_regularizer": null,
                        "recurrent_regularizer": null,
                        "bias_regularizer": null,
                        "activity_regularizer": null,
                        "kernel_constraint": null,
                        "recurrent_constraint": null,
                        "bias_constraint": null,
                        "dropout": 0.2,
                        "recurrent_dropout": 0.2,
                        "implementation": 1
                    }
                },
                "merge_mode": "concat"
            }
        },
        {
            "class_name": "BatchNormalization",
            "config": {
                "name": "batch_normalization_10",
                "trainable": true,
                "axis": -1,
                "momentum": 0.99,
                "epsilon": 0.001,
                "center": true,
                "scale": true,
                "beta_initializer": {
                    "class_name": "Zeros",
                    "config": {}
                },
                "gamma_initializer": {
                    "class_name": "Ones",
                    "config": {}
                },
                "moving_mean_initializer": {
                    "class_name": "Zeros",
                    "config": {}
                },
                "moving_variance_initializer": {
                    "class_name": "Ones",
                    "config": {}
                },
                "beta_regularizer": null,
                "gamma_regularizer": null,
                "beta_constraint": null,
                "gamma_constraint": null
            }
        },
        {
            "class_name": "Bidirectional",
            "config": {
                "name": "bidirectional_8",
                "trainable": true,
                "layer": {
                    "class_name": "LSTM",
                    "config": {
                        "name": "lstm_14",
                        "trainable": true,
                        "return_sequences": false,
                        "return_state": false,
                        "go_backwards": false,
                        "stateful": false,
                        "unroll": false,
                        "units": 128,
                        "activation": "tanh",
                        "recurrent_activation": "hard_sigmoid",
                        "use_bias": true,
                        "kernel_initializer": {
                            "class_name": "VarianceScaling",
                            "config": {
                                "scale": 1,
                                "mode": "fan_avg",
                                "distribution": "uniform",
                                "seed": null
                            }
                        },
                        "recurrent_initializer": {
                            "class_name": "Orthogonal",
                            "config": {
                                "gain": 1,
                                "seed": null
                            }
                        },
                        "bias_initializer": {
                            "class_name": "Zeros",
                            "config": {}
                        },
                        "unit_forget_bias": true,
                        "kernel_regularizer": null,
                        "recurrent_regularizer": null,
                        "bias_regularizer": null,
                        "activity_regularizer": null,
                        "kernel_constraint": null,
                        "recurrent_constraint": null,
                        "bias_constraint": null,
                        "dropout": 0.2,
                        "recurrent_dropout": 0.2,
                        "implementation": 1
                    }
                },
                "merge_mode": "concat"
            }
        },
        {
            "class_name": "Dense",
            "config": {
                "name": "dense_10",
                "trainable": true,
                "units": 7,
                "activation": "sigmoid",
                "use_bias": true,
                "kernel_initializer": {
                    "class_name": "VarianceScaling",
                    "config": {
                        "scale": 1,
                        "mode": "fan_avg",
                        "distribution": "uniform",
                        "seed": null
                    }
                },
                "bias_initializer": {
                    "class_name": "Zeros",
                    "config": {}
                },
                "kernel_regularizer": null,
                "bias_regularizer": null,
                "activity_regularizer": null,
                "kernel_constraint": null,
                "bias_constraint": null
            }
        }
    ],
    "keras_version": "2.2.2",
    "backend": "tensorflow"
}

To get prediction [POST]

POST http://0.0.0.0:5000/mood-detect
Request body
{
	"text": "great i am liking it"
}
Response
{
    "anger": 7.112710922956467,
    "disgust": 3.1775277107954025,
    "fear": 12.434638291597366,
    "guilt": 2.8116755187511444,
    "joy": 56.977683305740356,
    "sadness": 13.96680623292923,
    "shame": 3.2702498137950897
}
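For reference, the same request via curl (default Flask port assumed):

$ curl -X POST -H "Content-Type: application/json" \
       -d '{"text": "great i am liking it"}' \
       http://0.0.0.0:5000/mood-detect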

GitHub repository

Source code is available in the GitHub repository. You can clone the project from GitHub and run it on your system.

Production deployment using WSGI

You can check out the follow-up articles in this series for the production deployment of Flask endpoints: Part 2: Deploy Flask API in production using WSGI gunicorn with nginx reverse proxy, and Part 3: Dockerize Flask application and build CI/CD pipeline in Jenkins.
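As a quick preview, a typical Gunicorn invocation for this app would look something like the following (the module path src.main:app is an assumption based on the directory layout above):

$ gunicorn --workers 2 --bind 0.0.0.0:5000 src.main:app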

Thanks for reading this article.


