Named Entity Recognition using spaCy and Flask

Upasana | October 19, 2019 | 8 min read | Flask - Python micro web framework


In this tutorial you will learn how to build and deploy a Flask-based REST endpoint for Named Entity Recognition using spaCy’s statistical models.

Tutorial outline

  1. Setting up the environment

  2. Named Entity recognition using spaCy

  3. Creating Flask endpoint for NER using spaCy

  4. Production deployment

    • Configure WSGI production server

    • Nginx configuration

    • Creating System Service on Ubuntu

  5. API Benchmarking

  6. Writing integration testcase for API endpoints

We will follow the steps in the GitHub project https://github.com/upasana-mittal/ner_spacy_app, so you can clone it to get started.

Step 1. Setting up the virtual environment

Why do we need virtual environment?

Creating a virtual environment has the following benefits:

  1. A virtual environment lets you run your project in isolation, without affecting the environment of other projects.

  2. You can freeze the exact dependency versions using requirements.txt, which enables you to replicate the exact environment later on another machine.

  3. You can use a different version of Python and of dependent packages for each project without affecting the system version.

You must have Python 3.6 installed on your system.

Install the virtual environment
$ pip install virtualenv
Create the virtual environment
$ virtualenv -p python3.6 venv
Activate virtual environment
$ source venv/bin/activate
Install the requirements by using the below command:
$ pip install -r requirements.txt
Afterwards, install the spaCy models using the commands below:
$ python -m spacy download en_core_web_sm
$ python -m spacy download en_core_web_md
$ python -m spacy download en_core_web_lg
Run program
$ python src/main.py
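
Before running the program, it can help to confirm that the three model packages were actually installed into the active virtual environment. The following is a minimal sketch using only the standard library; the helper name missing_models is our own and is not part of spaCy or the project.

```python
import importlib.util

# Model packages installed by the download commands above.
MODELS = ("en_core_web_sm", "en_core_web_md", "en_core_web_lg")


def missing_models(names=MODELS):
    """Return the names of packages that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_models()
if missing:
    print("Missing spaCy models:", ", ".join(missing))
else:
    print("All spaCy models are installed.")
```

If a model is reported as missing, re-run the corresponding download command with the virtual environment activated.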

Step 2. Named Entity recognition using spaCy

What is Named Entity Recognition?

Named Entity Recognition (NER), also known as entity extraction, is an information extraction task that locates named entities mentioned in unstructured text and classifies them into pre-defined categories such as PERSON, ORGANISATION, LOCATION, DATE, TIME etc.

Essential info about entities:

Code   Value
GEO    Geographical Entity
ORG    Organization
PER    Person
GPE    Geopolitical Entity
TIM    Time indicator
ART    Artifact
EVE    Event
NAT    Natural Phenomenon

In NER, entities are tagged based on the BILUO scheme:

 B (Begin) - First token of a multi-token entity
 I (In)    - Inner token of a multi-token entity
 L (Last)  - Last token of a multi-token entity
 U (Unit)  - A single-token entity
 O (Out)   - A non-entity token

Input : Elon Musk is the founder of Tesla

Output : Elon Musk(PER) is the founder of Tesla(ORG)
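
To make the BILUO scheme concrete, here is a small pure-Python sketch that tags the example sentence above. The function biluo_tags and the (start, end, label) span format are illustrative assumptions of ours, not a spaCy API.

```python
def biluo_tags(tokens, spans):
    """Tag tokens with BILUO labels.

    spans is a list of (start, end, label) token indices, end exclusive.
    """
    tags = ["O"] * len(tokens)  # every token starts as a non-entity token
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"U-{label}"  # single-token entity
        else:
            tags[start] = f"B-{label}"  # first token
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"  # inner tokens
            tags[end - 1] = f"L-{label}"  # last token
    return tags


tokens = "Elon Musk is the founder of Tesla".split()
spans = [(0, 2, "PER"), (6, 7, "ORG")]
tags = biluo_tags(tokens, spans)
print(list(zip(tokens, tags)))
```

Here "Elon Musk" is a two-token entity, so it gets B-PER and L-PER, while "Tesla" is a single-token entity and gets U-ORG.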

What is spaCy?

spaCy is a free, open-source library for natural language processing in Python. It features Named Entity Recognition (NER), Part-of-Speech (POS) tagging, word vectors etc. Using spaCy, one can easily create linguistically sophisticated statistical models for a variety of NLP problems.

For more knowledge, visit https://spacy.io/

Here, we will be using spaCy’s statistical models for detecting entities in raw text. spaCy provides models for entity detection in many languages, such as:

  • English

  • German

  • French

  • Spanish

  • Portuguese

  • Italian

  • Dutch

  • Greek

  • Multi-language support for only PER, LOC, ORG and MISC entities detection

In this tutorial, we will develop the API for English only. For English, spaCy provides three models:

Three versions of spaCy models

  • en_core_web_sm : A multi-task CNN trained on OntoNotes. It assigns context-specific token vectors, POS tags, dependency parse and named entities.

  • en_core_web_md : A multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. It assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.

  • en_core_web_lg : A multi-task CNN trained on OntoNotes, with GloVe vectors trained on Common Crawl. It assigns word vectors, context-specific token vectors, POS tags, dependency parse and named entities.

Demo spaCy application
import spacy

nlp = spacy.load('en_core_web_lg')  # Load the large English model

# Named `text` rather than `input` to avoid shadowing the Python builtin
text = "Neeta Goyal owned Jet Airways to shut operations India today"

# Token-level view: (token, IOB tag, entity type) for every token
tag_entities = [(x, x.ent_iob_, x.ent_type_) for x in nlp(text)]
print(tag_entities)
Output
[(Neeta, 'B', 'PERSON'), (Goyal, 'I', 'PERSON'), (owned, 'O', ''), (Jet, 'B', 'ORG'), (Airways, 'I', 'ORG'), (to, 'O', ''), (shut, 'O', ''), (operations, 'O', ''), (India, 'B', 'GPE'), (today, 'O', '')]
Input
# Entity-level view: map each entity span to its label
entities = dict([(str(x), x.label_) for x in nlp(text).ents])
print(entities)
Output
{'Neeta Goyal': 'PERSON', 'Jet Airways': 'ORG', 'India': 'GPE'}
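
The two outputs above are related: the entity-level dictionary is what you get by merging consecutive B and I tokens from the token-level list. The sketch below reproduces that grouping in plain Python on the token-level output, with tokens written as strings. The helper group_entities is our own illustration and not how spaCy builds entities internally.

```python
def group_entities(tagged):
    """Merge (token, iob, label) triples into {entity text: label}."""
    entities = {}
    current, current_label = [], None
    for token, iob, label in tagged:
        if iob == "I" and current:
            current.append(token)  # continue the open entity
            continue
        if current:  # any other tag closes the open entity
            entities[" ".join(current)] = current_label
            current, current_label = [], None
        if iob == "B":  # start a new entity
            current, current_label = [token], label
    if current:  # flush an entity that ends the sentence
        entities[" ".join(current)] = current_label
    return entities


tagged = [
    ("Neeta", "B", "PERSON"), ("Goyal", "I", "PERSON"), ("owned", "O", ""),
    ("Jet", "B", "ORG"), ("Airways", "I", "ORG"), ("to", "O", ""),
    ("shut", "O", ""), ("operations", "O", ""), ("India", "B", "GPE"),
    ("today", "O", ""),
]
grouped = group_entities(tagged)
print(grouped)
```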

Step 3. Creating Flask endpoint for NER using spaCy

Flask is a micro web framework written in Python, frequently used to create simple REST endpoints.

Flask homepage

http://flask.pocoo.org

We will create two python scripts for our flask application:

NamedEntity.py

This class keeps the loaded models in memory so that they are not reloaded on every request. It exposes a method get_entities() that fetches the requested model and returns the entities found in the input text.

/src/NamedEntity.py
import en_core_web_lg, en_core_web_sm, en_core_web_md

class NamedEntityService(object):
    model1 = None  # Where we keep the model when it's loaded
    model2 = None  # Where we keep the model when it's loaded
    model3 = None  # Where we keep the model when it's loaded

    @classmethod
    def get_model1(cls):
        """Get the model object for this instance, loading it if it's not already loaded."""
        if cls.model1 is None:
            cls.model1 = en_core_web_sm.load()
        return cls.model1

    @classmethod
    def get_model2(cls):
        """Get the model object for this instance, loading it if it's not already loaded."""
        if cls.model2 is None:
            cls.model2 = en_core_web_md.load()
        return cls.model2

    @classmethod
    def get_model3(cls):
        """Get the model object for this instance, loading it if it's not already loaded."""
        if cls.model3 is None:
            cls.model3 = en_core_web_lg.load()
        return cls.model3

    @classmethod
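    def get_entities(cls, text, mod):
        """Return a dict mapping each entity found in the text to its label."""
        # Map model names to loader methods (not to loaded models), so that
        # only the requested model is loaded into memory.
        switcher = {
            "en_core_web_sm": cls.get_model1,
            "en_core_web_md": cls.get_model2,
            "en_core_web_lg": cls.get_model3
        }
        # Fall back to the small model for unknown model names.
        clf = switcher.get(mod, cls.get_model1)()
        return dict([(str(x), x.label_) for x in clf(text).ents])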
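
Because the Flask app is started with threaded=True, two requests may try to load the same model at the same moment. The following is a minimal sketch of a load-once cache guarded by a lock. The class ModelCache and the stand-in fake_loader are hypothetical names of ours; in the real service the loader would be something like spacy.load.

```python
import threading


class ModelCache:
    """Load each model at most once, even when called from several threads."""

    def __init__(self, loader):
        self._loader = loader  # callable taking a model name
        self._models = {}
        self._lock = threading.Lock()

    def get(self, name):
        model = self._models.get(name)
        if model is None:
            with self._lock:
                # Re-check inside the lock: another thread may have
                # finished loading while we were waiting.
                model = self._models.get(name)
                if model is None:
                    model = self._loader(name)
                    self._models[name] = model
        return model


load_calls = []


def fake_loader(name):
    """Stand-in for an expensive model load; records each call."""
    load_calls.append(name)
    return object()


cache = ModelCache(fake_loader)
first = cache.get("en_core_web_sm")
second = cache.get("en_core_web_sm")
print(first is second, load_calls)
```

The second call returns the cached object without invoking the loader again, which is the same behaviour the get_model methods above aim for.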
main.py

This is the main script that creates the Flask API. It serves incoming HTTP requests carrying a text payload and returns the recognized entities in JSON format. Under the hood, NamedEntity.py is used to fetch the named entities. The model is selected per request through the model field of the JSON body; the examples below use en_core_web_sm.

/src/main.py
@app.route("/predict_entity", methods=["POST"])
def predict_entity():
    text = request.json["text"]
    model = request.json["model"]
    entity_dict = NamedEntityService.get_entities(text, model)
    return Response(json.dumps(entity_dict), status=200, mimetype='application/json')
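
Note that the handler reads request.json["text"] and request.json["model"] directly, so a request body without one of these fields would fail with a KeyError. One possible way to harden it is to validate the payload first. The sketch below is framework-independent; parse_payload and SUPPORTED_MODELS are names we introduce for illustration and are not part of the project.

```python
SUPPORTED_MODELS = ("en_core_web_sm", "en_core_web_md", "en_core_web_lg")


def parse_payload(payload, default_model="en_core_web_sm"):
    """Validate a decoded JSON body and return (text, model).

    Raises ValueError with a readable message on bad input.
    """
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' must be a non-empty string")
    model = payload.get("model", default_model)
    if model not in SUPPORTED_MODELS:
        raise ValueError("unsupported model: %r" % (model,))
    return text, model


print(parse_payload({"text": "Sunny Singh is CEO of Edifecs"}))
```

Inside the Flask handler, one could call parse_payload(request.get_json(silent=True)), catch ValueError, and return the message with status 400 instead of letting the request fail.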

Here is the main entry point for the application that runs the application on port 5000 with debug flag on.

main entry point for flask app
if __name__ == "__main__":
    app.run(port = 5000, debug=True, threaded=True)

By default, the Flask app runs on the Werkzeug development server, which is fine for development. For production deployment, gunicorn or another production WSGI server should be used.

The complete main.py script looks like this:

from flask import Flask, Response, request
import json
from src.NamedEntity import *

app = Flask(__name__)

@app.route("/health", methods=['GET'])
def get_status():
    return Response(json.dumps({"status":"UP"}), status=200, mimetype='application/json')

@app.route("/", methods=['GET'])
def get_help():
    return Response(json.dumps(
        {"languages supported": ["English"],
         "models": {"en_core_web_sm": {"Language": "English",
                                       "Description": "This model is multi-task CNN trained on OntoNotes. "
                                                      "Assigns context-specific token vectors, POS tags, "
                                                      "dependency parse and named entities.",
                                       "Accuracy Metrics": {"F-Score": 85.86,
                                                            "Precision": 86.33,
                                                            "Recall": 85.39}},

                    "en_core_web_md": {"Language": "English",
                                       "Description": "This model is multi-task CNN trained on OntoNotes, "
                                                      "with GloVe vectors trained on Common Crawl. Assigns word vectors, "
                                                      "context-specific token vectors, POS tags, dependency parse and named entities.",
                                       "Accuracy Metrics": {"F-Score": 85.56,
                                                            "Precision": 86.88,
                                                            "Recall": 86.25}},
                    "en_core_web_lg": {"Language": "English",
                                       "Description": "This model is multi-task CNN trained on OntoNotes, with GloVe vectors"
                                                      " trained on Common Crawl. Assigns word vectors, context-specific token vectors,"
                                                      " POS tags, dependency parse and named entities.",
                                       "Accuracy Metrics": {"F-Score": 85.56,
                                                            "Precision": 86.88,
                                                            "Recall": 86.25}}
                    },
         "sample input body": {
             "text": "i will fire you raman prasad. Return my hyundai verna and hyundai creta",
             "model": "en_core_web_sm"
         }
         }
    ), status=200, mimetype='application/json')

@app.route("/predict_entity", methods=["POST"])
def predict_entity():
    text = request.json["text"]
    model = request.json["model"]
    entity_dict = NamedEntityService.get_entities(text, model)
    return Response(json.dumps(entity_dict), status=200, mimetype='application/json')

@app.route("/predict_entity_sm", methods=["GET"])
def predict_entity_sm():
    """GET variant of predict_entity, used for h2load testing."""
    text = request.args.get("text", type=str)
    model = "en_core_web_sm"
    entity_dict = NamedEntityService.get_entities(text, model)
    return Response(json.dumps(entity_dict), status=200, mimetype='application/json')


if __name__ == "__main__":
    app.run(port = 5000, debug=True, threaded=True)

Invoking the endpoint

We can use either Postman or curl to invoke the REST endpoint created above.

Using Postman to invoke the endpoint

You can use Postman to invoke the REST endpoint.

Request URI
POST http://localhost:5000/predict_entity
Request body
{
    "text": "Raman Prasad, return my Hyundai Verna and Hyundai Creta",
    "model": "en_core_web_sm"
}
Response body
{
    "Raman Prasad": "PERSON",
    "Hyundai": "ORG",
    "Verna": "PRODUCT"
}

Using curl to invoke the endpoint

We can use command line utility curl to invoke the predict_entity endpoint:

Request
$ curl --header "Content-Type: application/json" \
    --request POST \
    --data '{"text":"Return my Hyundai Verna and Hyundai Creta","model":"en_core_web_sm"}' \
    http://localhost:5000/predict_entity
Response
{"Hyundai": "ORG", "Verna": "PRODUCT", "Creta": "PRODUCT"}
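
The same request can also be sent from Python. Below is a minimal sketch using only the standard library, assuming the app is running locally on port 5000 as above. The helper names build_request and predict_entities are our own.

```python
import json
import urllib.request


def build_request(text, model="en_core_web_sm", base_url="http://localhost:5000"):
    """Build the POST request for the predict_entity endpoint."""
    body = json.dumps({"text": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/predict_entity",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_entities(text, model="en_core_web_sm"):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(text, model)) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_request("Return my Hyundai Verna and Hyundai Creta")
print(req.get_method(), req.full_url)
```

Calling predict_entities("Return my Hyundai Verna and Hyundai Creta") with the server running should return the same dictionary as the curl call.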

Step 4. Production deployment

To run this application in a production environment, we need to make the following changes:

  1. Use a production WSGI server instead of the built-in development server (Werkzeug).

  2. Use a reverse proxy to protect, speed up and load balance the backend server.

When running publicly rather than in development, you should not use the built-in development server (flask run). The development server is provided by Werkzeug for convenience, but is not designed to be particularly efficient, stable, or secure.

[Figure: Typical Flask setup in a production environment]

We can choose gunicorn as the production WSGI server, which can be installed using below pip command:

Install gunicorn
$ pip install gunicorn

Afterwards, start the app with gunicorn using the command below:

Start gunicorn with 2 worker processes
$ gunicorn --workers 2 --bind 127.0.0.1:5000 app.src.wsgi:app
Program output
[2019-04-16 19:30:33 +0530] [94095] [INFO] Starting gunicorn 19.9.0
[2019-04-16 19:30:33 +0530] [94095] [INFO] Listening at: http://0.0.0.0:5000 (94095)
[2019-04-16 19:30:33 +0530] [94095] [INFO] Using worker: sync
[2019-04-16 19:30:33 +0530] [94100] [INFO] Booting worker with pid: 94100
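
Instead of passing options on the command line, gunicorn can read them from a Python configuration file. The following is a sketch of such a file, assumed to be saved as gunicorn.conf.py next to the WSGI module. The values mirror the commands used in this tutorial and are not tuned recommendations.

```python
# gunicorn.conf.py (assumed file name and location)

# Address and port the server listens on.
bind = "127.0.0.1:5000"

# Number of worker processes. Each worker loads its own copy of the
# spaCy models, so memory use grows with this number.
workers = 2

# Seconds a worker may stay silent before it is restarted. Loading the
# large model on the first request can be slow, hence the generous value.
timeout = 120
```

The file would then be used as gunicorn -c gunicorn.conf.py followed by the same module path as in the command above.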

Creating Service

Create a systemd service on your Ubuntu box with the configuration below.

/etc/systemd/system/carvia-spacy.service
[Unit]
Description=Carvia Spacy NER
After=syslog.target

[Service]
User=munish
WorkingDirectory=/home/munish/build/ner_spacy_tutorial/src
Environment="PATH=/home/munish/build/ner_spacy_tutorial/venv/bin"
ExecStart=/home/munish/build/ner_spacy_tutorial/venv/bin/gunicorn --workers=1 --timeout=120 --bind 127.0.0.1:8060 wsgi:app
SuccessExitStatus=143

[Install]
WantedBy=multi-user.target

Reload the systemd configuration from disk, then start the service. The last command stops it again when needed:

$ sudo systemctl daemon-reload
$ sudo systemctl start carvia-spacy.service
$ sudo systemctl stop carvia-spacy.service

The server is now managed by the systemd service.

Step 5. API Benchmarking

h2load is a modern API benchmarking tool that we can use for performance testing of our recently created endpoint.

h2load for benchmarking
$ h2load -n1000 -c8 --h1 "http://localhost:5000/predict_entity_sm?text=Sunny+Singh+is+CEO+of+Edifecs"
Benchmarking stats
finished in 2.79s, 358.00 req/s, 4.20KB/s
requests: 1000 total, 1000 started, 1000 done, 1000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 1000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 11.73KB (12008) total, 110.35KB (113000) headers (space savings 0.00%), 24.41KB (25000) data
                     min         max         mean         sd        +/- sd
time for request:     7.79ms     29.17ms     21.94ms      2.60ms    71.90%
time for connect:       89us       633us       257us       232us    75.00%
time to 1st byte:     7.91ms     24.09ms     20.21ms      5.92ms    87.50%
req/s           :      44.75       45.15       44.90        0.15    62.50%
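
As a rough sanity check of these numbers: with 8 clients that each wait for one response before sending the next request, throughput should be close to the number of clients divided by the mean request time. The short calculation below uses only the figures reported above; the variable names are ours.

```python
total_requests = 1000
elapsed_s = 2.79          # "finished in 2.79s"
clients = 8               # -c8
mean_latency_s = 21.94e-3  # mean "time for request"

# Throughput measured over the whole run.
measured = total_requests / elapsed_s

# Throughput expected if every client is always waiting on one request.
estimated = clients / mean_latency_s

print(f"measured:  {measured:.1f} req/s")   # measured:  358.4 req/s
print(f"estimated: {estimated:.1f} req/s")  # estimated: 364.6 req/s
```

The two values agree to within about 2%, which suggests the server was the bottleneck for the whole run and the clients were never idle for long.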
