Over the past decade, industries have been evolving their businesses by migrating applications from monoliths to microservices, using container platforms such as Kubernetes (K8s), ECS and Docker. A lot of effort goes into designing an architecture that is scalable, robust and reliable. As the application and the business evolve, the focus must shift to optimizing resource requests and limits so that the application neither under-utilizes nor over-utilizes its resources, which reduces cost through right-sized workloads.
Before hopping into the concepts, I would like to convey the intent of this article through a problem statement.
A Python-based RESTful API runs as a Pod in a K8s cluster with the following constraints:
- No autoscaler (HPA or VPA)
- No correct QoS (Quality of Service) class
- No observability dashboard
- No application workload trends
- Stress/load testing can't benchmark resource requests and limits, because traffic depends largely on real data driven by customer profiles.
Under high traffic or heavy load, the API crashes. To keep it available we need to scale out the Pods, which requires an HPA to be in place. However, we have no QoS settings and no application history that could help us benchmark the HPA thresholds. This is where Goldilocks comes into the picture.
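To illustrate why HPA thresholds cannot be benchmarked without sensible requests, here is a minimal sketch of the scaling rule described in the Kubernetes HPA documentation: CPU utilization is measured as a percentage of the container's CPU request, and the desired replica count is the ceiling of `currentReplicas * currentUtilization / targetUtilization`. The usage figure (12m) and the 70% target are hypothetical values chosen for illustration; the sketch also ignores the HPA tolerance window and min/max replica bounds.

```python
import math


def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU usage expressed as a percentage of the container's CPU request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int, current_percent: float, target_percent: float) -> int:
    """Core HPA scaling rule (tolerance and min/max bounds are ignored here)."""
    return math.ceil(current_replicas * current_percent / target_percent)


if __name__ == "__main__":
    usage = 12      # hypothetical measured usage, in millicores
    target = 70.0   # hypothetical HPA target utilization, in percent

    # Same usage, two different CPU requests (100m and 15m).
    for request in (100, 15):
        percent = cpu_utilization_percent(usage, request)
        replicas = desired_replicas(1, percent, target)
        print(f"request={request}m utilization={percent:.0f}% desired_replicas={replicas}")
```

With identical real usage, the same HPA target leads to different scaling decisions depending on the request, so an arbitrary request makes the threshold meaningless.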
In this article we share guidance from our journey of implementing Goldilocks, which helped us optimize resource allocation and right-size the application via QoS recommendations.
Table of Contents:
1. Ramification of Resources Right-Sizing in K8s Application
2. Introduction to Goldilocks
3. Solution Overview
   a. Architecture of Solution
   b. Installation of Goldilocks Pre-requisites and Dependencies
   c. Labelling and Enabling Target Namespace
   d. Deploying Goldilocks in Cluster
   e. Application Integration to Goldilocks
   f. Check Goldilocks Recommendation Dashboard
Ramification of Resources Right-Sizing in K8s Application
Right-sizing resource allocation in K8s is done by embedding a resources specification block (requests and limits) in the application's deployment manifests, and it affects the application in the ways below.
Introduction to Goldilocks
Goldilocks is a Fairwinds open source project that helps organizations arrive quickly at the right size of resource requests and limits for their K8s applications. It builds on the K8s VPA (Vertical Pod Autoscaler): Goldilocks provides a controller that creates VPA objects for the workloads in your cluster, along with a dashboard that visualizes the resource recommendations for all enabled/monitored workloads.
Solution Overview
This section covers the solution as a whole: its architecture, the installations, the deployment of the application, and checking the resource recommendations on the dashboard.
3.a. Architecture of Solution
3.b. Installation of Goldilocks Pre-requisites and Dependencies
A few prerequisites and dependencies need to be in place before installing Goldilocks.
Below are the prerequisites:
3.b.1 Check whether Metrics Server is deployed
In this step we check whether Metrics Server is installed on the existing cluster. If it is not, run the commands below:
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server
helm upgrade --install metrics-server metrics-server/metrics-server
# Verify metric server installation
kubectl top pods -n kube-system
Note: Metrics Server is then used by the VPA to fetch resource metrics.
3.c. Labelling and Enabling Target Namespace
How can we enable a target namespace for resource recommendations?
Target namespaces are the namespaces for which we want resource recommendations. A namespace is enabled simply by adding the label goldilocks.fairwinds.com/enabled: true to it.
kubectl create ns mvc
kubectl label ns mvc goldilocks.fairwinds.com/enabled=true
3.d. Deploying Goldilocks in Cluster
The Goldilocks deployment installs three K8s components (Controller, VPA Recommender, Dashboard):
- The Controller is responsible for creating VPA objects for the workloads in each target namespace that is enabled for Goldilocks recommendations.
- The VPA Recommender is responsible for providing the resource recommendations for the workloads.
- The Dashboard visualizes a summary of the resource recommendations made by the VPA Recommender.
Goldilocks can be deployed by installing the chart below.
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm upgrade --install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace --set vpa.enabled=true
cloudmonk@Garvits-MacBook-Air ~ % kubectl get po -n goldilocks
NAME READY STATUS RESTARTS AGE
goldilocks-controller-b764bbb9-r9sxt 1/1 Running 2 (6d2h ago) 16d
goldilocks-dashboard-85c954ff99-bbmdt 1/1 Running 2 (6d2h ago) 16d
goldilocks-dashboard-85c954ff99-vqstm 1/1 Running 2 (6d2h ago) 16d
goldilocks-vpa-admission-controller-bbb69d975-952d6 1/1 Running 2 (6d2h ago) 16d
goldilocks-vpa-recommender-68d77754b4-ms7vv 1/1 Running 2 (6d2h ago) 16d
3.e. Application Integration to Goldilocks
In this step, we integrate the existing application in the cluster for which we want resource recommendations. The moment you set the label from step 3.c, Goldilocks creates a VPA object for each workload deployed in that namespace.
cloudmonk@Garvits-MacBook-Air Desktop % kubectl get vpa -n mvc
NAME MODE CPU MEM PROVIDED AGE
goldilocks-mvc-app Off 15m 104857600 True 23s
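The MEM column above is reported in raw bytes. As a small sketch (the helper name `describe_bytes` is made up for illustration), the value can be converted to the binary (Mi) and decimal (M) units that Kubernetes manifests use, which is presumably why the same recommendation appears as roughly 105M on the dashboard:

```python
def describe_bytes(n: int) -> str:
    """Render a byte count in mebibytes (Mi, 2**20) and megabytes (M, 10**6)."""
    mebibytes = n / 2**20
    megabytes = n / 10**6
    return f"{n} bytes = {mebibytes:g}Mi = {megabytes:.1f}M"


if __name__ == "__main__":
    # Value taken from the MEM column of the VPA object above.
    print(describe_bytes(104857600))
```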
3.f. Check Goldilocks Recommendation Dashboard
The Goldilocks dashboard can be reached on local port 8080 through a port-forward. Run the command below, then open http://localhost:8080 to check the resource recommendations.
kubectl -n goldilocks port-forward svc/goldilocks-dashboard 8080:80
Before checking the recommendations, let's understand what QoS is.
QoS stands for Quality of Service class. Kubernetes assigns each Pod a QoS class based on the resource requests and limits of its component containers, and uses these classes to decide which Pods to evict from a node experiencing node pressure. On the dashboard, recommendations are available for two QoS classes: Guaranteed and Burstable. For more detail, see "Configure Quality of Service for Pods", listed at the end of this article.
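The classification rules can be sketched in a few lines. This is a simplified illustration of the rules in the Kubernetes documentation, not the kubelet's actual implementation: a Pod is Guaranteed when every container has CPU and memory limits with requests equal to them, BestEffort when no container sets any CPU or memory request or limit, and Burstable otherwise. The function name and the input shape are assumptions for this example, and quantities are compared as plain strings (so "100m" and "0.1" would not be treated as equal).

```python
from typing import Dict, List

# One container's resources: {"requests": {...}, "limits": {...}} (both optional).
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Return the QoS class a Pod with these container resources would get."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit when a limit is set.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    current = {"requests": {"cpu": "100m", "memory": "180Mi"},
               "limits": {"cpu": "300m", "memory": "300Mi"}}
    recommended = {"requests": {"cpu": "15m", "memory": "105M"},
                   "limits": {"cpu": "15m", "memory": "105M"}}
    print(qos_class([current]))       # requests differ from limits
    print(qos_class([recommended]))   # requests equal limits
    print(qos_class([{}]))            # nothing set
```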
Let's analyze our target namespace (mvc), where our application is deployed as mvc-app.
In the screenshot above, we can see recommendations for two QoS classes, each giving CPU and memory requests and limits for our application.
We can see that our application was given more compute than it needs and is over-provisioned; based on the Goldilocks recommendations it can be optimized. The recommended CPU request and limit are 15m and 15m, compared to the current setting of 100m and 300m. The recommended memory request and limit are 105M and 105M, compared to the current setting of 180Mi and 300Mi. Because requests equal limits in this recommendation, a Pod whose containers all use it falls into the Guaranteed QoS class, whereas the current setting (requests lower than limits) is Burstable.
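To put numbers on the over-provisioning, the following sketch parses a subset of Kubernetes quantity suffixes and computes how much each setting would shrink. The parser is a simplified assumption for this example (it handles only the suffixes listed), not the official quantity grammar. Note that 105M uses decimal megabytes while 180Mi uses binary mebibytes, so the two cannot be compared by their numbers alone.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes: binary (Ki, Mi, Gi),
# decimal (k, M, G) and milli (m, used for CPU millicores).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6, "G": Decimal(10) ** 9,
    "m": Decimal(1) / 1000,
}


def parse_quantity(quantity: str) -> Decimal:
    """Convert a quantity such as '15m', '105M' or '180Mi' to base units."""
    # Try two-letter suffixes first so that 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(quantity)


def reduction_percent(current: str, recommended: str) -> float:
    """Percentage by which the recommended value is below the current one."""
    cur, rec = parse_quantity(current), parse_quantity(recommended)
    return float((cur - rec) / cur * 100)


if __name__ == "__main__":
    settings = [
        ("cpu request", "100m", "15m"),
        ("cpu limit", "300m", "15m"),
        ("memory request", "180Mi", "105M"),
        ("memory limit", "300Mi", "105M"),
    ]
    for name, current, recommended in settings:
        saved = reduction_percent(current, recommended)
        print(f"{name}: {current} -> {recommended} ({saved:.1f}% lower)")
```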
To adopt the recommended resource specs, copy the manifest snippet for the desired QoS class and apply it to the workload, either by editing the Deployment directly or by updating its Helm chart or standalone manifest. The workload is then right-sized and optimized.
For instance, to apply the QoS recommendation to our application, we can edit the Deployment. Let's run the kubectl edit command on the Deployment to apply the recommendations:
kubectl edit deployment mvc-app -n mvc
Put the recommended YAML in the resources block of the Deployment manifest. Once it is applied, the Pods restart and come back with the updated configuration.
kubectl describe deployment mvc-app -n mvc
CreationTimestamp: Fri, 15 Sep 2023 03:18:28 +0530
Annotations: deployment.kubernetes.io/revision: 2
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Ports: 8080/TCP, 9404/TCP
Host Ports: 0/TCP, 0/TCP
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
NewReplicaSet: mvc-app-67cdc49555 (1/1 replicas created)
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 57m deployment-controller Scaled up replica set mvc-app-6cbbcf458d to 1
Normal ScalingReplicaSet 41s deployment-controller Scaled up replica set mvc-app-67cdc49555 to 1
Normal ScalingReplicaSet 24s deployment-controller Scaled down replica set mvc-app-6cbbcf458d to 0
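As a non-interactive alternative to kubectl edit, the same change can be expressed as a patch document. The sketch below only builds that document; it assumes the container inside the Deployment is also named mvc-app, which the article does not state, and it uses the recommended values with requests equal to limits.

```python
import json


def resources_patch(container: str, cpu: str, memory: str) -> dict:
    """Build a Deployment patch that sets requests == limits for one container."""
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    return {
        "spec": {"template": {"spec": {"containers": [
            # Strategic merge patches match containers by name.
            {"name": container, "resources": resources},
        ]}}}
    }


if __name__ == "__main__":
    # "mvc-app" as the container name is an assumption for illustration.
    patch = resources_patch("mvc-app", cpu="15m", memory="105M")
    print(json.dumps(patch, indent=2))
```

If the output is saved to a file (say patch.json, a hypothetical name), it could be applied with kubectl patch deployment mvc-app -n mvc --patch "$(cat patch.json)". Like the manual edit, this changes the Pod template and therefore triggers a rolling update.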
In this article we saw how Goldilocks identified the right size for our resource requests and limits, which in turn helped us benchmark the HPA. It let us quickly set the correct QoS for our application with minimal effort, something that usually takes a great deal of effort spent studying observability trends. It also made our client happy, since it had a significant impact on cost.
References
- GitHub - FairwindsOps/goldilocks: Get your resource requests "Just Right"
- Goldilocks Documentation | Fairwinds: Goldilocks is a utility that can help you identify a starting point for resource requests and limits in Kubernetes.
- Configure Quality of Service for Pods (Kubernetes documentation)