Deployment of a Containerized Machine Learning Model Application on AWS Elastic Container Service (ECS)

By Srinivas Pratapgiri (Goanna student)

A machine learning engineer has to build, train, and deploy a machine learning model using the data provided, so that end users around the world can use the trained model to make predictions. My objective here is to build a model using the Logistic Regression algorithm to predict whether a patient is diagnosed as Benign or Malignant on a breast cancer examination, convert the trained model into a web application, and containerize and deploy it on Amazon Elastic Container Service (ECS).

In this blog, I will explain the following:

  1. The process of training the machine learning model on the local machine and saving it as a pickle file, so that we do not need to retrain the model every time we make predictions.
  2. Converting the trained model into a web application using the Flask web framework.
  3. Uploading the folder to GitHub, then building and pushing a Docker image to AWS Elastic Container Registry (ECR).
  4. Finally, deploying the containerized web app on Amazon Elastic Container Service (ECS).

1) Building, Training and Testing the Machine Learning Model

The process of building, training, and testing the Logistic Regression model on the Wisconsin Breast Cancer data set has already been explained in my previous blog, Logistic Regression — Breast Cancer Prediction (link provided below); refer to it to understand this step better. The prediction is based on the 30 variables present in the data set.
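For readers who want a self-contained starting point, here is a minimal sketch of that training step. It assumes the scikit-learn copy of the Wisconsin data set and a StandardScaler for the 30 features (consistent with the standardized values used in the example URL later in this post); the exact details may differ from the original blog.

# A minimal sketch of the training step; see the earlier blog for the full version.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # 569 samples, 30 predictor variables
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

scaler = StandardScaler().fit(X_train)       # standardize the 30 features
model = LogisticRegression(max_iter=1000)
model.fit(scaler.transform(X_train), y_train)
print(model.score(scaler.transform(X_test), y_test))  # hold-out accuracy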

2) Pickle file

Once properly trained, the model is ready to make predictions on unseen data. Without saving it, we would have to rerun the entire training process every time we want a prediction, which is cumbersome. If we convert the trained model and save it as a pickle file, we do not have to go through the training process each time we predict on new data. This is done with pickle, a Python module that serializes Python objects into byte streams that can be stored on the local disk. Here, model.pkl is the pickle file of the trained model:

import pickle
with open('./model.pkl', 'wb') as model_pkl:
    pickle.dump(model, model_pkl)

3) Flask API

The next step in the process is to build a web application for predicting breast cancer. To create a web application, we need a framework that can create both web apps and APIs. One of the most popular web frameworks for Python is Flask, and we will use it to build our web application.

The first step in building the API is to load the pickle file that was saved in the previous step:

# Load the pickle file 
with open('./model.pkl', 'rb') as model_pkl:
    model = pickle.load(model_pkl)

Next, import the Flask class and the request object from the flask library, then instantiate a Flask object named app.

# Import Flask and request for creating the API
from flask import Flask, request
    
# Initialise a Flask object
app = Flask(__name__)

Create a function called predict_cancer that takes input from a web browser and makes predictions. The code is given below:

# Create an API endpoint for predicting
@app.route('/predict')
def predict_cancer():
    # Read all 30 request parameters s1..s30
    features = [request.args.get(f's{i}') for i in range(1, 31)]

    # get the prediction for the unseen data
    unseen_data = np.array([features]).astype(np.float64)
    
    result = model.predict(unseen_data)
    
    # return the result
    return 'Predicted result for observation ' + str(unseen_data) + ' is: ' + str(result)

The last step will be to specify the host and port on which the app should run.

# running the server
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

All the previous code blocks are put together in a single file named app.py:

import pickle
import sys
import numpy as np
from sklearn.linear_model import LogisticRegression

# Load the pickle file 
with open('./model.pkl', 'rb') as model_pkl:
    model = pickle.load(model_pkl)
    
# Import Flask for creating API
from flask import Flask, request
    
# Initialise a Flask object
app = Flask(__name__)

# Create an API endpoint for predicting
@app.route('/predict')
def predict_cancer():
    # Read all 30 request parameters s1..s30
    features = [request.args.get(f's{i}') for i in range(1, 31)]

    # prediction for the unseen data
    unseen_data = np.array([features]).astype(np.float64)
    
    result = model.predict(unseen_data)
    
    # return the result
    return 'Predicted result for observation ' + str(unseen_data) + ' is: ' + str(result)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Finally, we can get predictions from the web app in any web browser by opening the URL shown below. The URL is long, as it contains all 30 variables:

http://localhost:5000/predict?s1=-0.96666522&s2=0.32786912&s3=-0.93579507&s4=-0.91104225&s5=0.60962671&s6=0.36569592&s7=-0.10914833&s8=-0.62181482&s9=-0.63860111&s10=0.53651178&s11=-0.46379509&s12=0.5132434&s13=-0.45632075&s14=-0.59189989&s15=0.67370318&s16=1.26928541&s17=2.17185315&s18=1.12535098&s19=0.64821758&s20=1.09244461&s21=-0.96440581&s22=-0.08750638&s23=-0.94145109&s24=-0.84547739&s25=-0.07511418&s26=-0.01862761&s27=-0.10400188&s28=-0.47718048&s29=-0.5634723&s30=0.05526303

The result for the above URL with the given data is '0', which means the patient is diagnosed as Benign for breast cancer.

Predicted result for observation [[-0.96666522 0.32786912 -0.93579507 -0.91104225 0.60962671 0.36569592 -0.10914833 -0.62181482 -0.63860111 0.53651178 -0.46379509 0.5132434 -0.45632075 -0.59189989 0.67370318 1.26928541 2.17185315 1.12535098 0.64821758 1.09244461 -0.96440581 -0.08750638 -0.94145109 -0.84547739 -0.07511418 -0.01862761 -0.10400188 -0.47718048 -0.5634723 0.05526303]] is: ['0']
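If you prefer to call the endpoint programmatically rather than from a browser, a small sketch using the requests library is shown below; this is hypothetical client code, not part of the original app.

# Hypothetical client-side call to the /predict endpoint using requests
import requests

# The 30 standardized feature values from the example URL above
# (only the first three are shown; extend the list to all 30 values)
features = [-0.96666522, 0.32786912, -0.93579507]
params = {'s{}'.format(i): v for i, v in enumerate(features, start=1)}

resp = requests.get('http://localhost:5000/predict', params=params)
print(resp.text)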


4) Docker Containerization

Now the machine learning web application runs on the local machine. A web app that runs on the local machine may not run on another machine, for the following reasons:

1) OS and environment-variable compatibility issues.

2) Python and its libraries, such as scikit-learn, Pandas, NumPy, Matplotlib, and Seaborn, may not be installed on it. Even if they are installed, the versions could differ.

All these platform dependency problems can be solved by containerization. What is a container?

A container is a unit of software that packages up code and all its dependencies, so the application runs quickly and reliably from one computing environment to another.

How are these containers created?

They are created with Docker.

What is Docker?

Docker is a tool designed to make it easier to create, deploy, and run applications using containers.

My working directory is my_docker. It contains the web application file app.py, the pickle file model.pkl, the text file requirements.txt, and a Dockerfile.

requirements.txt

Contains all the Python packages, with versions, required to run the model:

flask
sklearn==0.0
scikit-learn==0.24.1
pandas==0.25.1
numpy==1.16.5
matplotlib==3.1.1
seaborn==0.9.0

Dockerfile

FROM python:3.7
COPY ./requirements.txt  /docker_app/requirements.txt
WORKDIR /docker_app
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
COPY . /docker_app
EXPOSE 5000
ENTRYPOINT ["python"]
CMD ["app.py"] 

Each instruction in the Dockerfile is explained as follows:

FROM python:3.7: the base Docker image is Python 3.7; we will build our image on top of it.

COPY ./requirements.txt /docker_app/requirements.txt: copies the requirements.txt file from the current folder to a folder called /docker_app within the image we are building.

WORKDIR /docker_app: changes the working directory to /docker_app within the image.

RUN pip install --upgrade pip: upgrades the pip package.

RUN pip install -r requirements.txt: installs all the packages listed in the requirements file.

COPY . /docker_app: copies everything else into the image.

EXPOSE 5000: exposes port 5000 so the web app can be reached from outside the container.

ENTRYPOINT ["python"] and CMD ["app.py"]: together, these start the app.py application with Python whenever the container boots up.

5) Uploading the folder to GitHub

The my_docker folder containing all the above files has been uploaded to GitHub. You can access it through the link below:

https://github.com/spratapa/my_docker.git

The next step is to create a Docker image and push it to Amazon ECR.

6) Creating a Docker Image and Pushing it to Amazon ECR

In this section, I will walk you through the steps of building, tagging, and pushing a Docker image to Amazon ECR.

What is Amazon ECR?

Amazon Elastic Container Registry (ECR) is a fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts anywhere. Amazon ECR eliminates the need to operate your own container repositories or worry about scaling the underlying infrastructure. Amazon ECR hosts your images in a highly available and high-performance architecture, allowing you to reliably deploy images for your container applications.

Log into your AWS Console and open the Amazon ECR service.

Create a new repository named cancer-repository.

cancer-repository will be created with no Docker image in it.

The commands below become available once you open the View push commands tab.

You can build, tag, and push Docker images to an Amazon ECR repository in many ways. We have chosen to use an EC2 instance: create an EC2 instance, SSH into it, install all the required packages, and start building the Docker image.

The commands used to install the packages, clone the repository from GitHub, authenticate Docker with Amazon ECR, and then build, tag, and push the Docker image to the ECR repository are shown below:

sudo yum install -y awscli # Install awscli
sudo yum install -y git # Install git
sudo yum install -y docker # Install docker
sudo service docker start # start docker
sudo git clone https://github.com/spratapa/my_docker.git # get the my_docker file from git
aws ecr get-login-password --region ap-southeast-2 | docker login --username AWS --password-stdin <AWS Account Number>.dkr.ecr.ap-southeast-2.amazonaws.com # authenticate docker with Amazon ECR
docker build -t cancer-proj . #Build docker Image with name cancer-proj
docker tag cancer-proj:latest <AWS Account Number>.dkr.ecr.ap-southeast-2.amazonaws.com/cancer-repository:latest
docker push <AWS Account Number>.dkr.ecr.ap-southeast-2.amazonaws.com/cancer-repository:latest # push the Image to Amazon ECR Repository

The Docker image with the tag latest is now available for use in cancer-repository.

Now we can create and deploy containers from the Docker image in Amazon ECR. The number of containers to deploy depends on the number of end users across the globe. There are many container orchestration tools available to do this job for us; Kubernetes, Amazon ECS, and Amazon EKS are some of the most widely used. For this application, we used Amazon ECS to deploy and manage the Docker containers.

7) Deploying the CancerApp on Amazon ECS

Before going through the deployment process, let us ask: what is Amazon ECS?

Amazon ECS is a fully managed container orchestration service. Amazon ECS supports Docker and enables you to run and manage Docker containers. Applications you package as a container locally will deploy and run on Amazon ECS without the need for any configuration changes.

The following are the steps involved in this section.

a) Creating an Amazon ECS Cluster

A cluster is a logical grouping of tasks or services. Your tasks and services run on infrastructure that is registered to a cluster. The infrastructure chosen for this web app's cluster is an EC2 instance.

The steps involved in creating a cluster named CancerAppCluster are shown below.

Select EC2 as the launch type for the cluster.

We selected t2.micro as the EC2 instance type for the application, as it is part of the free tier.

Select the default VPC, subnets, and security group, and enable auto-assign public IP.

Finally, a cluster named CancerAppCluster has been created with EC2 as the launch type.

The next step in the process is to create a Task Definition. What is a Task Definition?

b) Creating a new Task Definition

Amazon ECS allows you to define tasks through a declarative JSON template called a Task Definition. Within a Task Definition you can specify one or more containers that are required for your task, including the Docker repository and image, memory and CPU requirements, shared data volumes, and how the containers are linked to each other. You can launch as many tasks as you want from a single Task Definition file that you can register with the service. Task Definition files also allow you to have version control over your application specification.
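To make this concrete, the sketch below shows roughly what such a task definition could look like for this app, written as the Python dict you might pass to boto3's ECS register_task_definition call. The family, container name, image, and port mapping follow the steps described next; the memory and CPU values are illustrative assumptions.

# Illustrative sketch of a task definition for this app (not the author's
# actual file). The memory/cpu values are assumptions; the port mapping
# matches the host port 8888 and container port 5000 used later in the post.
task_definition = {
    'family': 'CancerAppTaskDef',
    'requiresCompatibilities': ['EC2'],
    'containerDefinitions': [{
        'name': 'CancerAppContainer',
        'image': '<AWS Account Number>.dkr.ecr.ap-southeast-2.amazonaws.com/cancer-repository:latest',
        'memory': 512,   # assumed hard memory limit (MiB)
        'cpu': 256,      # assumed CPU units
        'portMappings': [
            {'hostPort': 8888, 'containerPort': 5000, 'protocol': 'tcp'},
        ],
    }],
}

# It could then be registered with:
#   import boto3
#   boto3.client('ecs').register_task_definition(**task_definition)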

The following are the steps involved in creating a new Task Definition:

Click Create new Task Definition.

Select EC2 as the launch type compatibility.

Name the Task Definition CancerAppTaskDef.

Create a container named CancerAppContainer by pulling the Docker image from ECR, and specify the host and container ports.

c) Running Tasks

In this step, we will run the Task that was configured in the cluster's Task Definition.

Click Run new Task and specify the Task Definition and cluster names.

The Task is now running.

The final step in the whole process is to test the app with unseen data.

8) Testing the CancerApp

Now that my CancerApp is running on Amazon ECS, it is time to request a prediction from the app with unseen variables, to learn whether the patient is diagnosed as Benign or Malignant.

Open the EC2 instance that is running the app, copy its public IPv4 address, and open it on port 8888 (the host port mapped to the container's port 5000 in the Task Definition) to predict. The public address is 54.253.8.196.

Copy the URL given below into any web browser to see the result of the prediction.

http://54.253.8.196:8888/predict?s1=-0.96666522&s2=0.32786912&s3=-0.93579507&s4=-0.91104225&s5=0.60962671&s6=0.36569592&s7=-0.10914833&s8=-0.62181482&s9=-0.63860111&s10=0.53651178&s11=-0.46379509&s12=0.5132434&s13=-0.45632075&s14=-0.59189989&s15=0.67370318&s16=1.26928541&s17=2.17185315&s18=1.12535098&s19=0.64821758&s20=1.09244461&s21=-0.96440581&s22=-0.08750638&s23=-0.94145109&s24=-0.84547739&s25=-0.07511418&s26=-0.01862761&s27=-0.10400188&s28=-0.47718048&s29=-0.5634723&s30=0.05526303

The CancerApp is not accessible from the above public address on port 8888.

Why is my app not running?

To troubleshoot this issue:

Open the security group attached to the EC2 instance and open Edit inbound rules. We can see that port 80 is accessible from the internet, but we hosted the app on port 8888.

So, we should add port 8888 and make it accessible to all internet users.
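The rule can be added in the console as described, or scripted; below is a hedged sketch using boto3 (the security group ID is a placeholder, and opening a port to 0.0.0.0/0 is only acceptable for a demo like this one).

# Sketch: open port 8888 on the instance's security group with boto3
import boto3

ec2 = boto3.client('ec2', region_name='ap-southeast-2')
ec2.authorize_security_group_ingress(
    GroupId='sg-0123456789abcdef0',  # placeholder: use your own security group ID
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 8888,
        'ToPort': 8888,
        'IpRanges': [{'CidrIp': '0.0.0.0/0'}],  # accessible to all internet users
    }],
)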

Hurray! Finally, my CancerApp is in operation. Now, when we copy the URL into the browser, it predicts the result as '0', which means the patient is diagnosed as Benign for breast cancer.


Reproduced with permission
Author: Srinivas Pratapgiri (Goanna student)
Source: https://srinipratapgiri.medium.com/deployment-of-containerized-machine-learning-model-application-on-aws-elastic-container-cbd1464643b3

Sentiment Analysis on Customer Data — Microservice deployed in a Conda environment on a Docker container

By Srinivas Pratapgiri (Goanna student)

Microservice architecture is used to build large-scale, complex applications composed of small, independent, loosely coupled services. Docker containers are used to deploy microservices.

The main objectives of this blog are to:

  1. Develop a sentiment analysis classifier that predicts the sentiment of the given customer data as either Positive or Negative, using the PyTorch-backed Transformers library
  2. Convert the classifier into a web application using Flask, with a REST API for making model inferences against arbitrary user data
  3. Build a Dockerfile with Ubuntu as the base image and create a Conda environment to manage the Python dependencies
  4. Build and run the Docker image in a container using Docker Compose
  5. Use a bash script containing cURL commands to get inferences from the classifier as either Positive or Negative

1) Sentiment Analysis Classifier

The model for predicting the sentiment of the customer data is built using the Transformers library, which contains pre-trained models for Natural Language Processing (NLP). The library is backed by the three most popular deep learning libraries: PyTorch, TensorFlow, and JAX. The model uses the pipeline object in Transformers, which performs all the preprocessing and post-processing steps on the input text and generates inferences from it. Next, in your Python application, import the pickle module to store the model object in a file and load it back later.

import transformers
from transformers import pipeline

# Download a default pre-trained sentiment-analysis model
model = pipeline("sentiment-analysis")

# Serialize the pipeline object to disk
import pickle
with open('./model.pkl', 'wb') as model_pkl:
    pickle.dump(model, model_pkl)
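Before wrapping the pipeline in a web app, a quick local sanity check is useful. The output format below is what the sentiment-analysis pipeline returns; the exact score will vary with the model version.

# Quick local check of the pipeline before serving it via Flask
print(model("I enjoy studying computational algorithms"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]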

2) Web Application

The next step in the process is to convert the sentiment classifier into a web application using the Flask Python web framework, and to build a REST API for making inferences from the classifier.

The steps are explained below

a) Import the Flask class; an instance of this class will be our Web Server Gateway Interface (WSGI) application.

b) Next, create an instance of this class. The first argument is the name of the application's module or package, which Flask uses to locate resources such as templates and static files.

c) We then use the route() decorator to tell Flask which URL should trigger which function, i.e. the decorator tells the app that whenever a user visits the given route (for example, /prediction), it should execute the prediction() function. In addition to the URL of a route, the @app.route() decorator can accept a second argument: a list of HTTP methods. In my application I used the POST method to send the customer data.

d) To access the incoming data inside prediction(), we use the Flask request object. request.get_json() converts the JSON payload into a Python object, and jsonify() returns the response in JSON format.

e) Finally, the run() method runs the application. By default it listens on localhost (127.0.0.1); in my application the host is set to '0.0.0.0' so that the server is reachable externally.

The code for the above steps is shown below:

# Import the pipeline factory used by the pickled model
from transformers import pipeline

# Load the pickled sentiment-analysis pipeline
import pickle
with open('./model.pkl', 'rb') as model_pkl:
    model = pickle.load(model_pkl)

# Import Flask for creating the API
from flask import Flask, request, jsonify

# Initialise a Flask object
app = Flask(__name__)

# A simple endpoint for checking that the service is up
@app.route('/predict')
def predict():
    # return a fixed response
    return "predict"

# An API endpoint for predicting sentiment on POSTed JSON data
@app.route('/prediction', methods=['POST'])
def prediction():
    data = request.get_json(force=True)
    s1 = data['s']
    result = model(s1)[0]
    final = {"Label:": result['label'],
             "Confidence Score:": result['score']}
    return jsonify(final)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Now we can run the app on localhost with the following command at the command line or the Anaconda prompt.

python app.py

Now that the app is running on localhost, we can make inferences from it by sending input text with cURL commands, as shown below.

curl -H "Content-Type: application/json" http://127.0.0.1:5000/prediction -d "{\"s\" : \"I enjoy studying computational algorithms\"}"

The result of the above prediction is Positive, with a confidence score of 0.99, as shown below:

{"Confidence Score:":0.998389720916748,"Label:":"POSITIVE"}

It is tested with another input:

curl -H "Content-Type: application/json" http://127.0.0.1:5000/prediction -d "{\"s\" : \"This Challenge is Terrifying!\"}"

The result of the above prediction is Negative, with a confidence score of 0.98:

{"Confidence Score:":0.983649492263794,"Label:":"NEGATIVE"}

3) Dockerfile

This step is all about building a Dockerfile with ubuntu:18.04 as the base image and a Conda environment to manage all the Python dependencies, with the specifications provided in the environment.yml file shown below:

name: test
channels:
- conda-forge
- defaults
- pytorch
dependencies:
- pip
- python=3.6
- pytorch
- transformers
- flask

A brief note about Ubuntu and Conda:

Ubuntu is an open-source Linux distribution that runs on the desktop, in the cloud, and on internet-connected things.

Conda is an open-source package management and environment management system that runs on Windows, macOS, and Linux. Conda quickly installs, runs, and updates packages and their dependencies, and easily creates, saves, loads, and switches between environments on your local computer. It was created for Python programs, but it can package and distribute software for any language.

Conda as a package manager helps you find and install packages. If you need a package that requires a different version of Python, you do not need to switch to a different environment manager, because Conda is also an environment manager. With just a few commands, you can set up a totally separate environment to run that different version of Python, while continuing to run your usual version of Python in your normal environment.

The Dockerfile for my application is as follows; I will explain its instructions below.

#Base Image
FROM ubuntu:18.04

# Update Ubuntu image and install curl package
RUN apt-get update && apt-get install -y curl

# Install miniconda to /miniconda
RUN curl -LO http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
RUN bash Miniconda3-latest-Linux-x86_64.sh -p /miniconda -b
RUN rm Miniconda3-latest-Linux-x86_64.sh
ENV PATH=/miniconda/bin:${PATH}
RUN conda update -y conda

WORKDIR /app
COPY . /app

# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml

# Make RUN commands use the new environment:
SHELL ["conda", "run", "-n", "test", "/bin/bash", "-c"]

# The code to run when container is started:
COPY app.py .
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "test", "python", "app.py"]

Docker starts building the new image, taking ubuntu:18.04 as the base image:

FROM ubuntu:18.04

The Ubuntu base image is updated and cURL is installed:

RUN apt-get update && apt-get install -y curl

The commands below download the latest Miniconda3 installer, run it, and delete it after installation. The ENV instruction adds conda to the PATH, and finally the conda package itself is updated to keep it current:

RUN curl -LO http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
RUN bash Miniconda3-latest-Linux-x86_64.sh -p /miniconda -b
RUN rm Miniconda3-latest-Linux-x86_64.sh
ENV PATH=/miniconda/bin:${PATH}
RUN conda update -y conda

The command below sets /app as the working directory of the image:

WORKDIR /app

The command below copies all the files from your local directory into the /app directory:

COPY . /app

The commands below copy the environment.yml file into the /app directory; conda then reads the environment specification in this file to create a new conda environment in the container:

COPY environment.yml .
RUN conda env create -f environment.yml

The SHELL instruction makes subsequent RUN commands execute inside the new conda environment:

SHELL ["conda", "run", "-n", "test", "/bin/bash", "-c"]

The commands below copy app.py into the /app directory of the container; the ENTRYPOINT runs app.py inside the conda environment when the container starts:

COPY app.py .
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "test", "python", "app.py"]

4) Docker Compose

There are two ways to build and run a Docker image:

a) docker build and docker run

b) create a docker-compose.yml file and use the docker-compose up command

In this work, I used Docker Compose to build and run the Docker image.

Before going into the process of building and running the image, let us see:

What is Docker Compose?

Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application's services. The advantage of Docker Compose is that, with a single command, we can create and start all the services from the configuration. Here is the docker-compose.yml file for my application.

version : "3.7"
services:
  app:
    build: .
    container_name: arq_test
    restart: always
    ports:
     - "5000:5000"

Its version is 3.7, it runs a single service called app, the container name is arq_test, and the host port is mapped to the container port:

The Docker image can be built and run with a single command:

docker-compose up


5) test_service.sh

Now that my sentiment analysis app is running in the conda environment inside the Docker container, we can test the app's inferences using cURL commands. The cURL commands are written in a bash script named test_service.sh.

The test_service.sh file is:

#!/bin/bash
curl -H "Content-Type: application/json" http://0.0.0.0:5000/prediction -d @pos.json
curl -H "Content-Type: application/json" http://0.0.0.0:5000/prediction -d @neg.json

The pos.json file is:

{
  "s": "This challenge is lots of fun!"
}

The neg.json file is:

{
  "s": "This challenge is terrifying!"
}

The results of the predictions are the label and the confidence score for each input.


Reproduced with permission
Author: Srinivas Pratapgiri (Goanna student)
Source: https://srinipratapgiri.medium.com/sentiment-analysis-on-customer-data-microservice-deployed-in-conda-environment-on-a-docker-7e9f63439c96