✍ Posted by Immersive Builder Seong
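What is Karpenter?
Karpenter is an open-source, high-performance Kubernetes cluster autoscaler built by AWS.
It responds to changes in application workload by quickly launching right-sized compute resources. It also automatically optimizes the cluster's compute footprint to reduce cost and improve performance.
Its most notable feature is that it works without an intermediate Auto Scaling group (ASG), so compared with Cluster Autoscaler (CA) it can provide compute resources within seconds.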
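How It Works
- Provisioning : monitor ▶ detect unschedulable pods ▶ evaluate their specs ▶ create a node
- Deprovisioning : monitor ▶ detect empty nodes ▶ remove them
- Nodes are added using the cheapest instance type that fits the pending pods.
- Nodes are automatically created in the subnet where the pod's PV resides.
- ttlSecondsAfterEmpty : when a node has no pods other than DaemonSet pods, the node is automatically removed after this many seconds.
- ttlSecondsUntilExpired : a node that has passed the configured expiry period is automatically cordoned and drained, then removed.
- Consolidation : if a single larger node costs less than several smaller nodes, the workloads are automatically consolidated onto it.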
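Note that `ttlSecondsAfterEmpty` and `ttlSecondsUntilExpired` are field names from the older `Provisioner` (v1alpha5) API. In the `NodePool` (v1beta1) API used in step 4 of this post, the equivalent settings live under `spec.disruption`. The following is a minimal sketch of that mapping, not a complete manifest: it only writes the `disruption` fragment to a file so it can be compared with the NodePool created later. The file name `nodepool-disruption.yaml` and the `30s` value are illustrative assumptions.

```shell
# Write an illustrative v1beta1 NodePool `disruption` fragment to a file.
# This is a fragment for reading, not a manifest to apply as-is.
cat > nodepool-disruption.yaml <<'EOF'
spec:
  disruption:
    # Counterpart of ttlSecondsAfterEmpty: remove a node once it has been
    # empty (DaemonSet pods aside) for the given duration.
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s   # illustrative value
    # Counterpart of ttlSecondsUntilExpired: expire nodes after this lifetime.
    expireAfter: 720h       # 30 days
EOF
cat nodepool-disruption.yaml
```

`consolidateAfter` can only be combined with `consolidationPolicy: WhenEmpty`. The NodePool in step 4 uses `WhenUnderutilized` instead, which is the policy that enables the consolidation behavior described in the list above.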
Getting Started with Karpenter
1. Deploy the EKS cluster
# Check variable values
$ export | egrep 'ACCOUNT|AWS_' | egrep -v 'SECRET|KEY'
declare -x ACCOUNT_ID="732659419746"
declare -x AWS_ACCOUNT_ID="732659419746"
declare -x AWS_DEFAULT_REGION="ap-northeast-2"
declare -x AWS_PAGER=""
declare -x AWS_REGION="ap-northeast-2"
# Set environment variables
$ export KARPENTER_NAMESPACE="kube-system"
$ export K8S_VERSION="1.29"
$ export KARPENTER_VERSION="0.35.2"
$ export TEMPOUT=$(mktemp)
$ export ARM_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-arm64/recommended/image_id --query Parameter.Value --output text)"
$ export AMD_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)"
$ export GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"
$ export AWS_PARTITION="aws"
$ export CLUSTER_NAME="${USER}-karpenter-demo"
$ echo "export CLUSTER_NAME=$CLUSTER_NAME" >> /etc/profile
$ echo $KARPENTER_VERSION $CLUSTER_NAME $AWS_DEFAULT_REGION $AWS_ACCOUNT_ID $TEMPOUT $ARM_AMI_ID $AMD_AMI_ID $GPU_AMI_ID
0.35.2 root-karpenter-demo ap-northeast-2 732659419746 /tmp/tmp.AXGY0WJ6I6 ami-02664ab2476ef662b ami-04c86c383de71b083 ami-05d368d4276378e7c
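The `echo` above is a manual check that the SSM lookups returned something. As a small sketch, the same check can be automated. `is_ami_id` is a hypothetical helper, and it assumes that an AMI ID is the prefix `ami-` followed only by lowercase hexadecimal characters.

```shell
# Hypothetical helper: succeed only if $1 looks like an AMI ID
# (assumption: "ami-" followed by one or more lowercase hex characters).
is_ami_id() {
  case "$1" in
    ami-*[!0-9a-f]*) return 1 ;;  # a non-hex character after the prefix
    ami-?*)          return 0 ;;  # prefix plus at least one hex character
    *)               return 1 ;;  # empty, "None", or anything else
  esac
}

# Example with the AMI IDs printed above and one deliberately bad value.
for id in ami-02664ab2476ef662b ami-04c86c383de71b083 None; do
  if is_ami_id "$id"; then
    echo "ok: $id"
  else
    echo "not an AMI ID: $id"
  fi
done
```

In the session above it could be called as `is_ami_id "$AMD_AMI_ID" || echo "AMD_AMI_ID lookup failed"` before continuing to the NodePool step.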
# Create the IAM policies and role (KarpenterNodeRole-${CLUSTER_NAME})
$ curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}" \
  && aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"
# Deploy the EKS cluster
$ eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: karpenter
      namespace: "${KARPENTER_NAMESPACE}"
    roleName: ${CLUSTER_NAME}-karpenter
    attachPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}
    roleOnly: true
iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes
managedNodeGroups:
- instanceType: m5.large
  amiFamily: AmazonLinux2
  name: ${CLUSTER_NAME}-ng
  desiredCapacity: 2
  minSize: 1
  maxSize: 10
  iam:
    withAddonPolicies:
      externalDNS: true
EOF
# Verify the EKS cluster deployment
$ eksctl get cluster
NAME REGION EKSCTL CREATED
root-karpenter-demo ap-northeast-2 True
$ eksctl get nodegroup --cluster $CLUSTER_NAME
CLUSTER NODEGROUP STATUS CREATED MIN SIZE MAX SIZE DESIRED CAPACITY INSTANCE TYPE IMAGE ID ASG NAME TYPE
root-karpenter-demo root-karpenter-demo-ng ACTIVE 2024-04-06T23:58:37Z 1 10 2 m5.large AL2_x86_64 eks-root-karpenter-demo-ng-a8c75aec-c03f-5a46-7d20-e8976aee8644 managed
$ eksctl get iamidentitymapping --cluster $CLUSTER_NAME
ARN USERNAME GROUPS ACCOUNT
arn:aws:iam::732659419746:role/KarpenterNodeRole-root-karpenter-demo system:node:{{EC2PrivateDNSName}} system:bootstrappers,system:nodes
arn:aws:iam::732659419746:role/eksctl-root-karpenter-demo-nodegro-NodeInstanceRole-ynZpFVeiXlm5 system:node:{{EC2PrivateDNSName}} system:bootstrappers,system:nodes
$ eksctl get iamserviceaccount --cluster $CLUSTER_NAME
NAMESPACE NAME ROLE ARN
kube-system aws-node arn:aws:iam::732659419746:role/eksctl-root-karpenter-demo-addon-iamserviceac-Role1-PCPg44mreCGF
kube-system karpenter arn:aws:iam::732659419746:role/root-karpenter-demo-karpenter
# Check worker node information
$ kubectl get node --label-columns=node.kubernetes.io/instance-type,eks.amazonaws.com/capacityType,topology.kubernetes.io/zone
NAME STATUS ROLES AGE VERSION INSTANCE-TYPE CAPACITYTYPE ZONE
ip-192-168-29-128.ap-northeast-2.compute.internal Ready <none> 3m25s v1.29.0-eks-5e0fdde m5.large ON_DEMAND ap-northeast-2c
ip-192-168-55-141.ap-northeast-2.compute.internal Ready <none> 3m21s v1.29.0-eks-5e0fdde m5.large ON_DEMAND ap-northeast-2a
2. Install Karpenter
# Set environment variables for the Karpenter installation
$ export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name "${CLUSTER_NAME}" --query "cluster.endpoint" --output text)"
$ export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
$ echo "${CLUSTER_ENDPOINT} ${KARPENTER_IAM_ROLE_ARN}"
https://EBBEF77A895A32747912C23D41CDFE34.gr7.ap-northeast-2.eks.amazonaws.com arn:aws:iam::732659419746:role/root-karpenter-demo-karpenter
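If `AWS_PARTITION` is not set in the current shell, the role ARN comes out with an empty partition field (`arn::iam::...`), which is malformed. The following is a minimal sketch of a guard against that. `build_role_arn` is a hypothetical helper, and the account ID `111122223333` is a placeholder.

```shell
# Hypothetical helper: build an IAM role ARN and fail if any part is empty.
# $1 = partition (e.g. aws), $2 = account ID, $3 = role name
build_role_arn() {
  for part in "$1" "$2" "$3"; do
    if [ -z "$part" ]; then
      echo "error: empty ARN component" >&2
      return 1
    fi
  done
  echo "arn:$1:iam::$2:role/$3"
}

# Example with placeholder values (111122223333 is a dummy account ID).
build_role_arn "aws" "111122223333" "demo-karpenter"
```

With the variables from this post, the call would be `build_role_arn "${AWS_PARTITION}" "${AWS_ACCOUNT_ID}" "${CLUSTER_NAME}-karpenter"`, and the result can be assigned to `KARPENTER_IAM_ROLE_ARN` before running `helm install`.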
# Install Karpenter
$ helm install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
  --set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=${KARPENTER_IAM_ROLE_ARN}" \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
# Verify the Karpenter installation
$ kubectl get crd | grep karpenter
ec2nodeclasses.karpenter.k8s.aws 2024-04-07T00:21:45Z
nodeclaims.karpenter.sh 2024-04-07T00:21:45Z
nodepools.karpenter.sh 2024-04-07T00:21:45Z
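The three CRDs listed above are the ones this walkthrough relies on. As a rough sketch, the check can be scripted so that a missing CRD is reported explicitly. `check_karpenter_crds` is a hypothetical helper that reads lines shaped like the `kubectl get crd` output on stdin. The sample input below is copied from the output above, so the block runs without a cluster.

```shell
# Hypothetical helper: read `kubectl get crd`-style lines on stdin and
# report any of the three Karpenter CRDs that is missing.
check_karpenter_crds() {
  names=$(awk '{print $1}')   # keep only the NAME column from stdin
  missing=0
  for crd in ec2nodeclasses.karpenter.k8s.aws nodeclaims.karpenter.sh nodepools.karpenter.sh; do
    if ! printf '%s\n' "$names" | grep -qxF "$crd"; then
      echo "missing CRD: $crd"
      missing=1
    fi
  done
  return "$missing"
}

# Sample input taken from the output shown above.
printf '%s\n' \
  'ec2nodeclasses.karpenter.k8s.aws 2024-04-07T00:21:45Z' \
  'nodeclaims.karpenter.sh 2024-04-07T00:21:45Z' \
  'nodepools.karpenter.sh 2024-04-07T00:21:45Z' \
  | check_karpenter_crds && echo "all Karpenter CRDs present"
```

Against a live cluster the input would come from `kubectl get crd | check_karpenter_crds`.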
3. Set up the Grafana dashboards
# Helm Repo
$ helm repo add grafana-charts https://grafana.github.io/helm-charts
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ kubectl create namespace monitoring
# Install Prometheus
$ curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/prometheus-values.yaml | envsubst | tee prometheus-values.yaml
alertmanager:
  persistentVolume:
    enabled: false
server:
  fullnameOverride: prometheus-server
  persistentVolume:
    enabled: false
extraScrapeConfigs: |
    - job_name: karpenter
      kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
          - kube-system
      relabel_configs:
      - source_labels:
        - __meta_kubernetes_endpoints_name
        - __meta_kubernetes_endpoint_port_name
        action: keep
        regex: karpenter;http-metrics
$ helm upgrade --install --namespace monitoring prometheus prometheus-community/prometheus --values prometheus-values.yaml
Release "prometheus" has been upgraded. Happy Helming!
NAME: prometheus
NAMESPACE: monitoring
STATUS: deployed
REVISION: 2
TEST SUITE: None
# Install Grafana
$ curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/grafana-values.yaml | tee grafana-values.yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      version: 1
      url: http://prometheus-server:80
      access: proxy
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
    - name: 'default'
      orgId: 1
      folder: ''
      type: file
      disableDeletion: false
      editable: true
      options:
        path: /var/lib/grafana/dashboards/default
dashboards:
  default:
    capacity-dashboard:
      url: https://karpenter.sh/preview/getting-started/getting-started-with-karpenter/karpenter-capacity-dashboard.json
    performance-dashboard:
      url: https://karpenter.sh/preview/getting-started/getting-started-with-karpenter/karpenter-performance-dashboard.json
$ helm install --namespace monitoring grafana grafana-charts/grafana --values grafana-values.yaml
NAME: grafana
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
$ kubectl patch svc -n monitoring grafana -p '{"spec":{"type":"LoadBalancer"}}'
service/grafana patched
# Admin password
$ kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
DL4JYG**************
# Access Grafana (MyDomain must already be set to your own domain)
$ kubectl annotate service grafana -n monitoring "external-dns.alpha.kubernetes.io/hostname=grafana.$MyDomain"
service/grafana annotated
$ echo -e "grafana URL = http://grafana.$MyDomain"
grafana URL = http://grafana.okms1017.name
4. Create a NodePool
# Create NodePool
$ cat <<EOF | envsubst | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # 30 * 24h = 720h
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2 # Amazon Linux 2
  role: "KarpenterNodeRole-${CLUSTER_NAME}" # replace with your cluster name
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
  amiSelectorTerms:
    - id: "${ARM_AMI_ID}"
    - id: "${AMD_AMI_ID}"
#   - id: "${GPU_AMI_ID}" # <- GPU Optimized AMD AMI
#   - name: "amazon-eks-node-${K8S_VERSION}-*" # <- automatically upgrade when a new AL2 EKS Optimized AMI is released. This is unsafe for production workloads. Validate AMIs in lower environments before deploying them to production.
EOF
nodepool.karpenter.sh/default created
ec2nodeclass.karpenter.k8s.aws/default created
# Check the NodePool
$ kubectl get nodepool,ec2nodeclass
NAME NODECLASS
nodepool.karpenter.sh/default default
NAME AGE
ec2nodeclass.karpenter.k8s.aws/default 21s
5. Scale up deployment
# Each pod requests (is guaranteed) at least 1 CPU core
$ cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
EOF
deployment.apps/inflate created
# Scale up
$ kubectl get pod
$ kubectl scale deployment inflate --replicas 5
deployment.apps/inflate scaled
$ kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller
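To see why this scale-up triggers Karpenter: each `inflate` pod requests 1 CPU, so 5 replicas request 5 CPU in total. The two m5.large nodes from step 1 have 2 vCPU each (4 in total, part of which is already used by system pods), so some pods cannot be scheduled and Karpenter has to add capacity. The sketch below is rough arithmetic only, not Karpenter's actual accounting logic. Both helper functions are hypothetical, and `1000` is the `limits.cpu` value of the NodePool from step 4.

```shell
# Rough arithmetic only; not Karpenter's actual accounting logic.
# $1 = replicas, $2 = CPU cores requested per pod
total_cpu_request() {
  echo $(( $1 * $2 ))
}

# $1 = cores already provisioned, $2 = additional cores, $3 = NodePool limits.cpu
within_cpu_limit() {
  [ $(( $1 + $2 )) -le "$3" ]
}

requested=$(total_cpu_request 5 1)
echo "inflate requests ${requested} CPU cores in total"
if within_cpu_limit 0 "$requested" 1000; then
  echo "within limits.cpu=1000"
fi
```

While the controller log is streaming, the result can also be observed from another terminal with `kubectl get nodeclaims` and `kubectl get nodes -l karpenter.sh/nodepool=default`.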
6. Scale down deployment
# Scale down
$ kubectl delete deployment inflate && date
$ kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller
[Sources]
1) CloudNet@, AEWS hands-on study
2) https://aws.amazon.com/ko/blogs/korea/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/
3) How-to-monitor-and-reduce-your-compute-costs.pdf, 6p
4) https://karpenter.sh/docs/getting-started/getting-started-with-karpenter/
5) https://www.eksworkshop.com/docs/autoscaling/compute/karpenter/setup-provisioner/
6) https://www.eksworkshop.com/docs/autoscaling/compute/karpenter/node-provisioning/