Understanding Containerized Blockchain behavior on AWS Fargate

This is the second part of my series on evaluating automated orchestration services for Blockchain.

In the previous article, “Orchestrating Resilient Blockchain on Kubernetes”, I took you through orchestrating Blockchain and evaluated the case for Kubernetes in a non-scalable, non-elastic environment. With the aid of simple blockchain source code, we interacted with the Blockchain via REST APIs. We also saw how the monitoring capabilities of Kubernetes allow us to monitor the Blockchain down to the Pod level. Finally, we disrupted the Blockchain to observe how Kubernetes can create resiliency within the Blockchain infrastructure.

Problem Definition

As our Kubernetes cluster was running on a local, static resource, the number of pods orchestrated by Kubernetes for the Blockchain infrastructure failed to scale out to handle the number of requests. Although Kubernetes helped achieve resiliency within the Blockchain infrastructure, it still lacked automated, managed scalability, which caused the running Blockchain services to fail.

Given the enormous amount of elastic resources available on the cloud, we will evaluate AWS Fargate as an automated orchestration service for Blockchain, covering the following topics:

  • Viability of orchestrating Blockchain on AWS Fargate (a serverless compute engine for containers).
  • Basic architectural design for Blockchain on Fargate.
  • Simple blockchain deployment on Fargate.
  • Monitoring and observing the effects of the chaos using CloudWatch Container Insights.
  • Limitations

Note: In this article, we will not be covering the cost aspects.

Architectural Overview

Keeping the problem in view, our architecture is based on a Fargate cluster running within a VPC comprising multiple subnets for high availability. The route tables of these subnets send and receive network traffic through an Internet Gateway, making them public subnets. To ensure high availability and load balancing across all the running blockchain tasks, an internet-facing Application Load Balancer (ALB) takes care of distributing the requests across multiple Availability Zones over multiple subnets. The ALB accepts traffic from clients and distributes it across the running blockchain tasks within the Fargate cluster, as shown in the figure below.

Basic architectural overview of the Blockchain service running on AWS Fargate

For simplicity and comprehension, the architecture has been broken down into the following parts:

  • Networking stack: The networking stack is responsible for deploying and managing VPC resources with public subnets where our Fargate blockchain containers will run. This stack also creates several IAM roles with the necessary permissions to perform operations such as managing ECS resources, and creates an internet-facing Application Load Balancer (ALB).
  • Service stack: The service stack is responsible for defining Fargate tasks, deploying our Blockchain service on the Fargate cluster, and handling auto-scaling of the Blockchain service. It also manages a Target Group, which is connected to the ALB to distribute traffic across the registered targets (our Blockchain service here).
  • Monitoring: CloudWatch enables us to monitor several metrics relevant to the load balancer, registered targets, networking, and the Fargate cluster, among others. CloudWatch Container Insights lets us drill down to the level of the running services and tasks within the cluster.

Keeping the architecture in view, we will deploy CloudFormation templates that follow the same structure.
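The two stacks are wired together through CloudFormation cross-stack references: the networking stack exports values that the service stack later imports by name. A minimal sketch of what such an export could look like is shown below (the output and export names are illustrative assumptions, not necessarily those used in the repository):

```yaml
# Hypothetical excerpt from the networking stack template:
# values exported so the service stack can import them by name.
Outputs:
  VPCId:
    Description: ID of the VPC hosting the Fargate cluster
    Value: !Ref VPC
    Export:
      Name: !Sub ${AWS::StackName}:VPCId
  PublicSubnetOne:
    Description: First public subnet for Fargate tasks
    Value: !Ref PublicSubnetOne
    Export:
      Name: !Sub ${AWS::StackName}:PublicSubnetOne
```

The service stack can then reference these with `!ImportValue`, e.g. `!ImportValue 'FargateECSNetworkingStack:VPCId'`, which is why the networking stack's name matters during deployment.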

But … Why Fargate?

Good question! Earlier, when AWS launched the stand-alone ECS service as one of its orchestration engines (like Kubernetes), developers were expected to put effort and invest time in setting up the infrastructure: the VPC on which the cluster would run, the cluster itself with task and service definitions, EC2 instances within the cluster, Elastic Block Storage (EBS) sizes, etc. Additionally, Blockchain infrastructure requires managing physical nodes, which increases the complexity of handling infrastructure.

AWS Fargate has enabled developers to focus more on the part we are actually paid for, i.e. business logic!

Hence, using our networking stack, a developer can deploy a Fargate cluster running a basic Blockchain infrastructure with a handful of commands, and then focus on the business logic of the service stack, which handles the Blockchain. Since the stacks are deployed using CloudFormation templates, developers also retain the freedom to manage and scale the resources as needed.

Since Fargate manages resource requirements via task definitions and provisions them by abstracting away the underlying infrastructure, the developer can simply specify the required resources in the task definition. Scaling of the running services also remains the responsibility of AWS Fargate, and can be configured through the task and service definitions.

We will again take a step-by-step approach, and in this article I will primarily focus on the ECS launch type of Fargate. Furthermore, there are ongoing discussions about the vision versus the reality of orchestrating Blockchain on serverless compute engines (like AWS Fargate). We will address both topics in detail in one of the future articles.

Which Fargate launch types to choose?

Fargate provides two launch types for provisioning microservices: Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Let's check out the differences below:

  1. ECS is a proprietary AWS compute service for running containers on AWS infrastructure; EKS is a managed Kubernetes service for containers that uses a Kubernetes control plane.
  2. ECS is designed to integrate with (and exploit) other AWS services, and configuration of ECS services and tasks is quite straightforward; Kubernetes is open-source, and EKS as a managed Kubernetes service provides portability to other cloud providers.
  3. In ECS, deployments are managed through task and service definitions; Kubernetes can provide fine-grained control over deployed services.

Differences between AWS Fargate launch types

Let’s get (technically) deeper!

Since this is NOT a tutorial on how to create a Fargate cluster for blockchain, we will use CloudFormation templates to automate the deployment. But before we move ahead, the basics are most important to understand. Let's start with some AWS Fargate basics, move on to the request flow, and finally discuss the overall design.

Fargate basics

Our Fargate cluster is based on ECS, which consists of several constructs. The first basic ones are task and service definitions:

  • Task definition: A task definition is the blueprint of our containerized application. Since our blockchain application is based on Docker, we first define a task definition with the container image, resource requirements, environment variables, launch type, etc.

  • Service definition: A service definition defines which tasks are to be executed and the desired number of tasks to handle requests, and configures auto-scaling based on different policies. From a networking perspective, the service also determines on which subnets our tasks are executed (given that our cluster spans multiple Availability Zones) and which security group is attached to each task's Elastic Network Interface (ENI).

    The service is also responsible for performing health checks on the tasks. Hence, if any task/container goes down for some reason, the service replaces it by spinning up a new one so that the desired count is maintained at all times.

    Furthermore, a service can run behind an internet-facing load balancer, which is what we will do for the blockchain application.
Screenshot from the AWS Console, depicting the scopes of basic ECS constructs
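To make these constructs concrete, here is a minimal sketch of what a blockchain task and service definition could look like in a CloudFormation template. The resource names, image URI, CPU/memory sizes, and subnet/security-group references are illustrative assumptions, not the repository's actual template; only the container port 3000 comes from the article.

```yaml
BlockchainTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: blockchain
    RequiresCompatibilities: [FARGATE]
    NetworkMode: awsvpc          # each Fargate task gets its own ENI
    Cpu: 256                     # 0.25 vCPU
    Memory: 512                  # MiB
    ContainerDefinitions:
      - Name: blockchain
        Image: <account>.dkr.ecr.<region>.amazonaws.com/blockchain:latest
        PortMappings:
          - ContainerPort: 3000  # port exposed by the blockchain container

BlockchainService:
  Type: AWS::ECS::Service
  Properties:
    LaunchType: FARGATE
    DesiredCount: 2              # number of tasks the service keeps running
    TaskDefinition: !Ref BlockchainTaskDefinition
    NetworkConfiguration:
      AwsvpcConfiguration:
        AssignPublicIp: ENABLED  # tasks run in public subnets here
        Subnets: [!Ref PublicSubnetOne, !Ref PublicSubnetTwo]
        SecurityGroups: [!Ref TaskSecurityGroup]
```

Note how the task definition carries the per-container resource requirements while the service carries the desired count and networking configuration, matching the split described above.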

Request lifecycle overview

So how does a request flow from the client to the blockchain service running on AWS Fargate? The flow is depicted in the diagram below, along with a step-by-step description:

  1. When our networking stack is created, the internet-facing load balancer is created in the public subnets of the VPC, exposing its DNS name. The load balancer becomes the single point of entry to the service for clients.
  2. Whenever a client sends a request using the exposed DNS name, it passes through the Internet Gateway into the VPC, which routes it to the Application Load Balancer.
  3. The load balancer forwards the request to the task's ENI, which communicates with the exposed container port.
    • The Fargate service has already been started with the desired number of Blockchain tasks. Each task runs a container exposed on port 3000.
    • With each deployment of a Fargate task, an ENI is also created so the task can communicate with the internal VPC network.
  4. As the number of requests starts to increase, an auto-scaling event is generated to scale out more Blockchain tasks to handle them.
  5. To handle scalability within the Blockchain, we use a ‘Target Tracking Policy’: a dynamic scaling policy that tries to keep a performance metric (such as average CPU utilization) at a target value, e.g. 50% average CPU utilization across all running tasks. If the average CPU utilization rises above 50%, it scales out a new task to bring the average back within the target.
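A target tracking policy of this kind can be sketched in CloudFormation as follows. The resource names, capacity bounds, and IAM role reference are illustrative assumptions; only the 50% average-CPU target comes from the description above.

```yaml
ScalableTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    ServiceNamespace: ecs
    ScalableDimension: ecs:service:DesiredCount
    ResourceId: !Sub service/${ClusterName}/${ServiceName}
    MinCapacity: 2               # never fewer than 2 blockchain tasks
    MaxCapacity: 10              # upper bound for scale-out
    RoleARN: !GetAtt AutoScalingRole.Arn

ScaleOutPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: blockchain-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
      TargetValue: 50.0          # keep average CPU around 50%
```

With this in place, Application Auto Scaling adjusts the service's desired count automatically; no manual intervention is required during request bursts.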

This flow has been depicted in the diagram below:

Request flow from the client over the internet via Internet Gateway to the Blockchain container

Let’s build!

In this tutorial, we will be using Fargate with the ECS launch type, and the CloudFormation templates for the architectural diagram depicted above can be found in my GitHub repository. The code and deployment are organized in the following manner:

The process of deploying these stacks is quite straightforward:

  1. Deploying the Networking stack: Execute the following command to deploy the networking stack first:
aws cloudformation deploy --template-file=deployment/ecs-cf-template/vpc-stack.yaml --stack-name=FargateECSNetworkingStack --capabilities CAPABILITY_IAM

Note: Please note the stack name provided in the command, i.e. FargateECSNetworkingStack. The networking stack exports several values which are later used in the deployment of the Blockchain service stack.

  2. Deploying the Service stack: Once the networking stack has been deployed, execute the following CloudFormation command to deploy the service stack:
aws cloudformation deploy --template-file=deployment/ecs-cf-template/service-stack.yaml --stack-name=FargateECSServiceStack --capabilities CAPABILITY_IAM
  3. Enable Monitoring: Enable CloudWatch Container Insights to get performance data from the containers running in the Fargate cluster:
aws ecs put-account-setting --name "containerInsights" --value "enabled" && aws ecs update-cluster-settings --cluster FargateECSNetworkingStack-ECSCluster-BW652L6JV184 --settings name=containerInsights,value=enabled

Note: Please ensure that the cluster name here corresponds to the stack name given in the first step while deploying the networking stack, i.e. FargateECSNetworkingStack.

Let’s create Chaos!

Chaos makes the muse ~ UNKNOWN

Whenever we discuss large-scale, distributed applications like Blockchain, high availability, scalability, and resiliency are major concerns. Even when all the distributed applications function properly under normal circumstances, a simple disruption, such as a spike in requests during peak hours or a hardware failure, can expose weaknesses within the system. In this article, we will examine the case of handling an enormous number of requests and observe how our Fargate-based infrastructure responds for the Blockchain service.

Diagram depicting the increase in capacity (LCU) of our application load balancer with the burst of requests

Using ApacheBench, a workload combining the Blockchain APIs defined in “Orchestrating Resilient Blockchain on Kubernetes” was executed to send an enormous number of requests to the load balancer. This increased the capacity of the load balancer, as can be seen in the diagram above. Using CloudWatch, we can observe the following three behaviours in the performance graphs:
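For reference, a burst like the one above can be generated with an ApacheBench invocation along these lines. The host is a placeholder for the ALB's DNS name, and the endpoint path is an assumption based on the blockchain APIs of the previous article; adjust both to your deployment.

```shell
# 50,000 GET requests, 100 concurrent, against the blockchain's chain endpoint
# (replace <alb-dns-name> with the DNS name exposed by the networking stack)
ab -n 50000 -c 100 "http://<alb-dns-name>/chain"
```

Running several such invocations in parallel, mixing read and write endpoints, approximates the sustained burst used for this experiment.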

Cloudwatch performance graphs depicting scalability of Blockchain infrastructure on AWS Fargate
  1. As the enormous number of requests started to reach the Blockchain load balancer, the blockchain service started to scale out blockchain tasks (containers) to handle them.
  2. However, when the clients stopped sending requests, the number of blockchain tasks remained constant for some time, waiting for requests. After a while, these Blockchain tasks started to scale in. (Note: this behaviour is quite interesting in itself, and I will discuss it in more detail in subsequent articles.)
  3. After a while, an enormous number of requests was sent again, reaching as many as 75K requests. Since the Blockchain tasks had already scaled in by this point, HTTP 500 failures can be observed in the initial stage. Once the Blockchain tasks started to scale out again, the HTTP 500 failures reduced and successful HTTP 200 responses were served. This can also be seen in the diagram below.
Diagram depicting how scaling Blockchain impacts successes and failures of HTTP requests

Furthermore, CloudWatch also enables us to analyze cluster utilization while processing the requests. Since we are using a target tracking scaling policy set at 60%, cluster utilization peaked as soon as requests started arriving: the increased consumption of the running blockchain tasks eventually led the blockchain service to fire scale-out events, which kept the cluster CPU busy as well.

Cluster CPU utilization for scaling the tasks

Limitations

Several limitations can be observed within our current infrastructure:

  1. During the scale-out events, several requests failed while our Blockchain service was scaling out. In critical systems like Blockchain, with important transactions, this should not be the case. To tackle this, an additional layer should be introduced that decouples the running tasks from the incoming requests.
  2. Apart from what CloudWatch provides for monitoring the Blockchain at the infrastructure and networking level, it lacks support for basic Blockchain metrics: the number of blocks in the chain, the number of failed/successful mining attempts, the total number of contracts, the time to mine a transaction, the number of mined transactions per second, the time to resolve conflicts between nodes, etc.
  3. The Fargate tasks are executed within the public subnets of the VPC. A better-secured architecture would create private subnets and execute the Fargate tasks within them.
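The second limitation can be partially worked around by publishing custom metrics to CloudWatch from within the blockchain application or a sidecar. A hypothetical sketch using the AWS CLI is shown below; the namespace, metric name, and value are made up for illustration.

```shell
# Publish the current chain length as a custom CloudWatch metric
# (hypothetical namespace/metric; requires valid AWS credentials)
aws cloudwatch put-metric-data \
  --namespace "Blockchain" \
  --metric-name "ChainLength" \
  --unit Count \
  --value 42
```

Such custom metrics can then be graphed in CloudWatch alongside the infrastructure metrics, and even used as inputs for alarms or scaling policies.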

Conclusion:

In the previous article, we observed a resilient but non-scalable Blockchain service. In this article, we tried to resolve the scalability issue of Blockchain using AWS Fargate and evaluated it using AWS CloudWatch. I took you through the architectural discussion of the Blockchain networking and service stacks, which were deployed using the CloudFormation templates in this GitHub repository. At a later stage, we disrupted the Blockchain service by flooding the load balancer with requests. We observed how the Blockchain service scaled out the Blockchain tasks as requests increased, thereby improving the scalability of the Blockchain. However, a question remains about the implications of scaling in the Blockchain service, which will be addressed in one of the following articles. In the end, we also pointed out several limitations of the infrastructure.

Published by backdoorcodr

Passionate about Cloud, Data; and everything that revolves around them!
