The easiest way to implement Auto-scaling per HTTP request in your AKS cluster
There are many ways to implement auto-scaling in a Kubernetes cluster. The most common one is to configure the Horizontal Pod Autoscaler (HPA), which uses out-of-the-box resource metrics (CPU and memory) and scales up or down according to a given threshold. But what if you want to scale per HTTP request? I spent a lot of time implementing an effective solution, so I decided to write an article that guides you through the process.
Let’s start with the tools we are going to use:
- KEDA: a Kubernetes-based Event Driven Autoscaler. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed.
- Log Analytics Workspace: a tool in the Azure portal used to edit and run log queries against data in Azure Monitor Logs.
- API Management (optional): provides a way to create consistent and modern API gateways for existing back-end services in Azure.
- Application Insights (optional): an extensible Application Performance Management (APM) service for developers and DevOps professionals in Azure.
In this scenario I’m going to use both API Management and Application Insights (even though they are optional), but keep in mind that you can apply the same idea with only KEDA and a Log Analytics Workspace.
Let’s take a look at a diagram that shows how the architecture works:
Explanation: once a client hits API Management, two processes begin in parallel:
- The request is processed (the upper section).
- The request data is sent to Log Analytics (the lower section).
KEDA reads the data from the Log Analytics Workspace and scales the Kubernetes deployment based on a given query.
Armed with this knowledge, let’s dive straight into the details.
Assuming you already have the resources mentioned above, let’s point out the steps of the implementation:
- Install KEDA in your cluster.
- Send data from API Management to Application Insights.
- Send data from Application Insights to the Log Analytics Workspace.
- Configure KEDA to work with your Azure resources.
- Test it.
Now let’s start working!
1st step — Installing KEDA
You may use KEDA’s official deployment documentation, or just follow the steps below:
A. Install Helm on your machine.
B. Install KEDA (using Helm) by running the following commands against your cluster:
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda
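Before moving on, it can help to confirm that the install worked. The following is a minimal sketch, assuming the `keda` namespace used above and the CRD names that KEDA registers under the `keda.sh` group. The function name `check_keda` is only for illustration.

```shell
# Check that the KEDA pods exist and that its CRDs are registered.
# check_keda is a hypothetical helper name, not part of KEDA.
check_keda() {
  ns="${1:-keda}"
  # The KEDA operator and metrics server pods should be listed as Running.
  kubectl get pods --namespace "$ns" || return 1
  # These CRDs back the TriggerAuthentication and ScaledObject files used later.
  kubectl get crd scaledobjects.keda.sh triggerauthentications.keda.sh || return 1
  echo "KEDA looks installed in namespace $ns"
}
```

Run `check_keda` after the Helm install finishes, or `check_keda <namespace>` if you installed KEDA somewhere else.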
By doing this, we installed KEDA’s custom resource definitions, which live under the keda.sh API group. We will work with three Kubernetes objects (YAML files): a Secret, a TriggerAuthentication, and a ScaledObject. Together they produce a custom HPA that is responsible for scaling our deployment (application).
We will configure those three later, so don’t worry. For now, let’s proceed to step number two.
2nd step — Send data from APIM to Application insights
A. Go to your API management resource in Azure and choose one API.
B. Go to the ‘Settings’ tab and scroll down to the bottom.
C. If you already have Application Insights configured for this API Management resource, go to ‘Diagnostics Logs’ and enable Application Insights. Otherwise, click on ‘Manage’ and add it to your API Management resource.
After doing this, you should see something similar to this:
D. Go to your Application Insights resource and verify that your data is there. You have two options:
- Go to the ‘Performance’ tab and check your APIs’ performance.
- Go to the ‘Logs’ tab → ‘Tables’ → click on the eye icon next to ‘dependencies’ → ‘See in query editor’.
3rd step — Send data from Application insights to Log Analytics
A. Go to the ‘Properties’ tab in Application Insights.
B. In the ‘Workspace’ property, click on ‘Migrate to Workspace-based’.
C. Choose the Log Analytics Workspace under the relevant subscription:
D. Verify that you see the data by running a query like the following against the relevant table in the Log Analytics Workspace:
|where TimeGenerated > ago(24h)
If you want to look for a specific operation, add this to your query:
|where Name == "POST /example-service/v1/abcd"
Example output for a random query:
Now that we have all the data in the Log Analytics Workspace, we can use KEDA to query this resource.
4th step — Configure KEDA to work with Azure resources
The scaler we chose is Azure Log Analytics. Since most Azure users already use this resource, we are going to make the most of it.
Again, I refer you to the KEDA documentation for the Azure Log Analytics scaler in case more details are required.
This step is the most complicated one, so stay focused.
A. First, we need a service principal for authentication. If you already have one, you can just assign it as a Contributor on your Log Analytics Workspace. If not, you can create one by running this command in PowerShell:
az ad sp create-for-rbac --name ServicePrincipalName
Note: save the password somewhere safe so you won’t lose it. It is impossible to recover it afterwards.
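The command above does not by itself scope the service principal to the workspace. The sketch below creates it with the Contributor role limited to the workspace, then prints the workspace ID that the Secret in the next step needs. It is written for a POSIX shell such as bash (for example Azure Cloud Shell) rather than PowerShell. The resource group, workspace name, and the helper name `create_scaler_sp` are placeholders, and the flags are from the Azure CLI as I know it, so check `az ad sp create-for-rbac --help` for your version.

```shell
# create_scaler_sp is a hypothetical helper; pass your own names.
create_scaler_sp() {
  rg="$1"; workspace="$2"; sp_name="${3:-ServicePrincipalName}"

  # Full Azure resource ID of the workspace, used as the role scope.
  scope="$(az monitor log-analytics workspace show \
    --resource-group "$rg" --workspace-name "$workspace" \
    --query id --output tsv)" || return 1

  # Create the service principal with Contributor on that workspace only.
  # The output contains appId (clientId), password (clientSecret) and tenant.
  az ad sp create-for-rbac --name "$sp_name" \
    --role Contributor --scopes "$scope" || return 1

  # customerId is the workspace ID (a GUID) that the scaler needs.
  az monitor log-analytics workspace show \
    --resource-group "$rg" --workspace-name "$workspace" \
    --query customerId --output tsv
}
```

A call would look like `create_scaler_sp <resource-group> <workspace-name>`. Keep the printed password, since it is the clientSecret used in the Secret file.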
B. Preparing the Secret.yaml
- In the secret file, insert your service principal values.
- The values must be base64-encoded (the Secret is of type: Opaque). Any base64 encoder or decoder can do the conversion, but be careful about pasting real secrets into online tools.
- Replace the example values with your own.
tenantId: "QVpVUkVfQURfVEVOQU5UX0lE" #Base64 encoded Azure Active Directory tenant id
clientId: "U0VSVklDRV9QUklOQ0lQQUxfQ0xJRU5UX0lE" #Base64 encoded Application id from your Azure AD Application/service principal
clientSecret: "U0VSVklDRV9QUklOQ0lQQUxfUEFTU1dPUkQ=" #Base64 encoded Password from your Azure AD Application/service principal
workspaceId: "TE9HX0FOQUxZVElDU19XT1JLU1BBQ0VfSUQ=" #Base64 encoded Log Analytics workspace id
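A local shell can do the encoding, which keeps the client secret away from third-party websites. This is a small sketch, assuming the `base64` utility that ships with most Linux and macOS systems; `b64` is just an illustrative helper name.

```shell
# Encode a value for a Secret's data field.
# printf avoids the trailing newline that echo would add; tr removes line wrapping.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# The placeholder name encodes to the tenantId value shown in the Secret above.
b64 'AZURE_AD_TENANT_ID'; echo
# Decoding is a quick way to double-check an encoded value.
printf '%s' 'QVpVUkVfQURfVEVOQU5UX0lE' | base64 --decode; echo
```

The first command prints `QVpVUkVfQURfVEVOQU5UX0lE` and the second prints `AZURE_AD_TENANT_ID`, which shows that the values in the example are encoded placeholder names rather than real credentials.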
C. Preparing TriggerAuthentication.yaml
- This file uses the values from the Secret.yaml file.
- Only the namespace value should be changed.
- parameter: tenantId
- parameter: clientId
- parameter: clientSecret
- parameter: workspaceId
D. Preparing the ScaledObject.yaml
- Replace the example values with your own.
- The query below is just an example. You can use it as a template to build a custom query for your application.
- type: azure-log-analytics
let AvgDuration = ago(5m);
let ThresholdCoefficient = 0.8;
| where Name == "POST /example-service/v1/abcde"
| where TimeGenerated > AvgDuration
| summarize MetricValue = count()
| project MetricValue, Threshold = MetricValue * ThresholdCoefficient
E. Save the three YAML sections above in one file, and apply it to your cluster with the following command:
kubectl apply -f <yourfile>.yaml
Applying the ScaledObject creates a custom HPA based on your query. We will check that in the next step.
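The three YAML sections above are shown only in part, so here is a sketch of what a complete file for step E might look like, generated from a shell script. It follows the layout of KEDA's Azure Log Analytics scaler as I understand it and is not the author's original file. Every resource name (`http-scaler-secret`, `http-scaler-auth`, `http-scaler`), the namespace, the deployment name, the replica and interval numbers, the `threshold` value, and the `AppRequests` table are assumptions to replace with your own.

```shell
#!/bin/sh
# Sketch only: all names and numbers below are placeholders or assumptions.
NAMESPACE="${NAMESPACE:-my-app-namespace}"
DEPLOYMENT="${DEPLOYMENT:-example-service}"
OUT="${OUT:-keda-http-scaling.yaml}"

# Base64-encode without a trailing newline or line wrapping.
b64() { printf '%s' "$1" | base64 | tr -d '\n'; }

cat > "$OUT" <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: http-scaler-secret
  namespace: $NAMESPACE
type: Opaque
data:
  tenantId: "$(b64 "${TENANT_ID:-AZURE_AD_TENANT_ID}")"
  clientId: "$(b64 "${CLIENT_ID:-SERVICE_PRINCIPAL_CLIENT_ID}")"
  clientSecret: "$(b64 "${CLIENT_SECRET:-SERVICE_PRINCIPAL_PASSWORD}")"
  workspaceId: "$(b64 "${WORKSPACE_ID:-LOG_ANALYTICS_WORKSPACE_ID}")"
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: http-scaler-auth
  namespace: $NAMESPACE
spec:
  secretTargetRef:
    - parameter: tenantId
      name: http-scaler-secret
      key: tenantId
    - parameter: clientId
      name: http-scaler-secret
      key: clientId
    - parameter: clientSecret
      name: http-scaler-secret
      key: clientSecret
    - parameter: workspaceId
      name: http-scaler-secret
      key: workspaceId
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: http-scaler
  namespace: $NAMESPACE
spec:
  scaleTargetRef:
    name: $DEPLOYMENT
  pollingInterval: 30
  cooldownPeriod: 300
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: azure-log-analytics
      metadata:
        # AppRequests is an assumed table name; use the table that holds your data.
        query: |
          let AvgDuration = ago(5m);
          let ThresholdCoefficient = 0.8;
          AppRequests
          | where Name == "POST /example-service/v1/abcde"
          | where TimeGenerated > AvgDuration
          | summarize MetricValue = count()
          | project MetricValue, Threshold = MetricValue * ThresholdCoefficient
        # Placeholder; check the scaler docs for how this interacts with
        # a Threshold column returned by the query.
        threshold: "10"
      authenticationRef:
        name: http-scaler-auth
EOF
echo "wrote $OUT"
```

With `TENANT_ID`, `CLIENT_ID`, `CLIENT_SECRET`, and `WORKSPACE_ID` exported, the script writes a single file that can be passed to `kubectl apply -f`. Generating the Secret this way also avoids encoding the values by hand.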
5th step — Testing
A. First, let’s check the HPA by running the command below.
kubectl get hpa -n <your-application-namespace>
You should receive the following output:
If you have this resource in your cluster, it means that KEDA was deployed successfully.
B. Send some requests.
To verify that everything is really working, we need to send requests to our application. (I recommend setting the threshold value relatively low so that the scaling process starts quickly.)
Shortly after the first requests have been sent, you should see the TARGETS value start to increase.
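To generate some traffic, a simple loop around `curl` is enough. The following is a sketch, assuming the example POST operation used in the query earlier; the gateway URL is a placeholder and `send_requests` is an illustrative helper name.

```shell
# Send N POST requests to a URL and report how many succeeded.
send_requests() {
  url="$1"; total="${2:-100}"
  i=0; ok=0
  while [ "$i" -lt "$total" ]; do
    # --fail makes curl return non-zero on HTTP errors (4xx/5xx).
    if curl --silent --fail --output /dev/null --request POST "$url"; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "sent $total requests, $ok succeeded"
}
```

For example, `send_requests "https://<your-apim-name>.azure-api.net/example-service/v1/abcde" 200`, adding any headers your API requires to the `curl` call. In a second terminal, `kubectl get hpa -n <your-application-namespace> --watch` shows the TARGETS column as it changes.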
C. Finally, we can run the command:
kubectl get pods -n <your-application-namespace>
As you can see, new pods are being created.
Now you can test your auto-scaling and fine-tune the query.
As I said, I wrote this guide because I couldn’t find anything similar on the internet, so I decided to make one myself.
Feel free to contact me for any questions.
Hope you found it useful!
I’m 26 y/o from Israel, working as DevOps Engineer in AU10TIX.