
[AEWS2] 5-2. Karpenter

by okms1017 2024. 4. 7.

✍ Posted by Immersive Builder  Seong
 

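What is Karpenter?

Karpenter is an open-source, high-performance Kubernetes cluster autoscaler built with AWS.
 
It quickly launches right-sized compute resources in response to changes in application workload. It also continuously optimizes the cluster's compute footprint to reduce cost and improve performance.
 
Because it provisions nodes directly, without an Auto Scaling group (ASG) in between, its biggest advantage over Cluster Autoscaler (CA) is that it can bring up compute capacity within seconds.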
 

[Image: Karpenter - 1]
[Image: Karpenter - 2]

 

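How it works

  • Provisioning: monitor ▶ find unschedulable pods ▶ evaluate their specs ▶ create nodes
  • Deprovisioning: monitor ▶ find empty nodes ▶ remove them
  • Karpenter scales out with the cheapest instance type that fits the pending pods.
  • Nodes are created automatically in a subnet of the Availability Zone where the pod's PV resides.
  • ttlSecondsAfterEmpty: when a node runs no pods other than DaemonSet pods, it is removed automatically after this many seconds.
  • ttlSecondsUntilExpired: a node that has passed the configured expiry is automatically cordoned, drained, and removed.
  • Consolidation: when one larger node would cost less than several smaller ones, Karpenter merges the workloads onto it automatically.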
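
The two TTL settings above are field names from the older Provisioner API. The sketch below shows how they map onto the disruption block of a karpenter.sh/v1beta1 NodePool, the API used later in this post. The file name and the 30s value are illustrative assumptions, and the fragment is not a complete manifest.

```shell
# Fragment only (not a complete NodePool); the file name is an arbitrary choice.
cat <<'EOF' > nodepool-disruption-fragment.yaml
spec:
  disruption:
    # Provisioner: ttlSecondsAfterEmpty: 30
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
    # Provisioner: ttlSecondsUntilExpired: 2592000 (30 days)
    expireAfter: 720h
    # Provisioner: consolidation.enabled: true -> consolidationPolicy: WhenUnderutilized
    # (in v1beta1, consolidateAfter is only valid together with WhenEmpty)
EOF
```

The NodePool created in step 4 uses the WhenUnderutilized policy together with expireAfter: 720h, which corresponds to the Consolidation and expiry bullets above.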

 

Getting Started with Karpenter

 
1. Deploy the EKS cluster

 

# Check the variables that are already set
$ export | egrep 'ACCOUNT|AWS_' | egrep -v 'SECRET|KEY'
declare -x ACCOUNT_ID="732659419746"
declare -x AWS_ACCOUNT_ID="732659419746"
declare -x AWS_DEFAULT_REGION="ap-northeast-2"
declare -x AWS_PAGER=""
declare -x AWS_REGION="ap-northeast-2"

# Set environment variables
$ export KARPENTER_NAMESPACE="kube-system"
$ export K8S_VERSION="1.29"
$ export KARPENTER_VERSION="0.35.2"
$ export TEMPOUT=$(mktemp)
$ export ARM_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-arm64/recommended/image_id --query Parameter.Value --output text)"
$ export AMD_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)"
$ export GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"
$ export AWS_PARTITION="aws"
$ export CLUSTER_NAME="${USER}-karpenter-demo"
$ echo "export CLUSTER_NAME=$CLUSTER_NAME" >> /etc/profile
$ echo $KARPENTER_VERSION $CLUSTER_NAME $AWS_DEFAULT_REGION $AWS_ACCOUNT_ID $TEMPOUT $ARM_AMI_ID $AMD_AMI_ID $GPU_AMI_ID
0.35.2 root-karpenter-demo ap-northeast-2 732659419746 /tmp/tmp.AXGY0WJ6I6 ami-02664ab2476ef662b ami-04c86c383de71b083 ami-05d368d4276378e7c

# Create the IAM policy and role (KarpenterNodeRole-${CLUSTER_NAME}) with CloudFormation
$ curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml  > "${TEMPOUT}" \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

# Deploy the EKS cluster
$ eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "${K8S_VERSION}"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}

iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: karpenter
      namespace: "${KARPENTER_NAMESPACE}"
    roleName: ${CLUSTER_NAME}-karpenter
    attachPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}
    roleOnly: true

iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes

managedNodeGroups:
- instanceType: m5.large
  amiFamily: AmazonLinux2
  name: ${CLUSTER_NAME}-ng
  desiredCapacity: 2
  minSize: 1
  maxSize: 10
  iam:
    withAddonPolicies:
      externalDNS: true
EOF

# Verify the EKS cluster deployment
$ eksctl get cluster
NAME                    REGION          EKSCTL CREATED
root-karpenter-demo     ap-northeast-2  True
$ eksctl get nodegroup --cluster $CLUSTER_NAME
CLUSTER                 NODEGROUP               STATUS  CREATED                 MIN SIZE        MAX SIZE        DESIRED CAPACITY        INSTANCE TYPE   IMAGE ID        ASG NAME                                                        TYPE
root-karpenter-demo     root-karpenter-demo-ng  ACTIVE  2024-04-06T23:58:37Z    1               10              2                       m5.large        AL2_x86_64      eks-root-karpenter-demo-ng-a8c75aec-c03f-5a46-7d20-e8976aee8644 managed
$ eksctl get iamidentitymapping --cluster $CLUSTER_NAME
ARN                                                                                             USERNAME                                GROUPS ACCOUNT
arn:aws:iam::732659419746:role/KarpenterNodeRole-root-karpenter-demo                            system:node:{{EC2PrivateDNSName}}       system:bootstrappers,system:nodes
arn:aws:iam::732659419746:role/eksctl-root-karpenter-demo-nodegro-NodeInstanceRole-ynZpFVeiXlm5 system:node:{{EC2PrivateDNSName}}       system:bootstrappers,system:nodes
$ eksctl get iamserviceaccount --cluster $CLUSTER_NAME
NAMESPACE       NAME            ROLE ARN
kube-system     aws-node        arn:aws:iam::732659419746:role/eksctl-root-karpenter-demo-addon-iamserviceac-Role1-PCPg44mreCGF
kube-system     karpenter       arn:aws:iam::732659419746:role/root-karpenter-demo-karpenter

# Check the worker nodes
$ kubectl get node --label-columns=node.kubernetes.io/instance-type,eks.amazonaws.com/capacityType,topology.kubernetes.io/zone
NAME                                                STATUS   ROLES    AGE     VERSION               INSTANCE-TYPE   CAPACITYTYPE   ZONE
ip-192-168-29-128.ap-northeast-2.compute.internal   Ready    <none>   3m25s   v1.29.0-eks-5e0fdde   m5.large        ON_DEMAND      ap-northeast-2c
ip-192-168-55-141.ap-northeast-2.compute.internal   Ready    <none>   3m21s   v1.29.0-eks-5e0fdde   m5.large        ON_DEMAND      ap-northeast-2a
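
Every later step reuses the variables exported at the top of this section, and an empty value (for example, an AMI lookup that returned nothing) tends to fail far from its cause. A minimal guard could look like the following. require_vars is a hypothetical helper name, not part of eksctl or the Karpenter guide.

```shell
# require_vars: hypothetical helper. Returns non-zero and names every listed
# variable that is unset or empty. Written for plain POSIX sh, so the loop
# variables (missing, name, value) are global rather than local.
require_vars() {
  missing=0
  for name in "$@"; do
    # Read the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

It could be called as `require_vars CLUSTER_NAME AWS_ACCOUNT_ID AWS_PARTITION ARM_AMI_ID AMD_AMI_ID` before moving on to the Karpenter installation.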

 

 

2. Install Karpenter

 

# Set the environment variables needed to install Karpenter
$ export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name "${CLUSTER_NAME}" --query "cluster.endpoint" --output text)"
$ export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
$ echo "${CLUSTER_ENDPOINT} ${KARPENTER_IAM_ROLE_ARN}"
https://EBBEF77A895A32747912C23D41CDFE34.gr7.ap-northeast-2.eks.amazonaws.com arn:aws:iam::732659419746:role/root-karpenter-demo-karpenter

# Install Karpenter
$ helm install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace "${KARPENTER_NAMESPACE}" --create-namespace \
  --set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=${KARPENTER_IAM_ROLE_ARN}" \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
  
# Verify the Karpenter installation
$ kubectl get crd | grep karpenter
ec2nodeclasses.karpenter.k8s.aws             2024-04-07T00:21:45Z
nodeclaims.karpenter.sh                      2024-04-07T00:21:45Z
nodepools.karpenter.sh                       2024-04-07T00:21:45Z
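
KARPENTER_IAM_ROLE_ARN is interpolated from AWS_PARTITION and AWS_ACCOUNT_ID, so an unset variable silently yields a string such as arn::iam::732659419746:role/..., which Helm will accept without complaint. A small format check before running helm install could catch that. is_iam_role_arn is a hypothetical helper, and the pattern is a simplification that only checks the overall shape of the ARN.

```shell
# is_iam_role_arn: hypothetical helper. Succeeds only if $1 looks like
# arn:<partition>:iam::<12-digit account id>:role/<name>,
# where the partition starts with "aws" (aws, aws-cn, aws-us-gov).
is_iam_role_arn() {
  printf '%s\n' "$1" | grep -Eq '^arn:aws[a-z-]*:iam::[0-9]{12}:role/.+$'
}
```

For example, `is_iam_role_arn "${KARPENTER_IAM_ROLE_ARN}" || echo "check AWS_PARTITION and AWS_ACCOUNT_ID"`.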

 

3. Set up the Grafana dashboards

 

# Helm Repo
$ helm repo add grafana-charts https://grafana.github.io/helm-charts
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ kubectl create namespace monitoring

# Install Prometheus
$ curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/prometheus-values.yaml | envsubst | tee prometheus-values.yaml
alertmanager:
  persistentVolume:
    enabled: false

server:
  fullnameOverride: prometheus-server
  persistentVolume:
    enabled: false

extraScrapeConfigs: |
    - job_name: karpenter
      kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
          - kube-system
      relabel_configs:
      - source_labels:
        - __meta_kubernetes_endpoints_name
        - __meta_kubernetes_endpoint_port_name
        action: keep
        regex: karpenter;http-metrics
$ helm upgrade --install --namespace monitoring prometheus prometheus-community/prometheus --values prometheus-values.yaml
Release "prometheus" has been upgraded. Happy Helming!
NAME: prometheus
NAMESPACE: monitoring
STATUS: deployed
REVISION: 2
TEST SUITE: None

# Install Grafana
$ curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/grafana-values.yaml | tee grafana-values.yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      version: 1
      url: http://prometheus-server:80
      access: proxy
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
    - name: 'default'
      orgId: 1
      folder: ''
      type: file
      disableDeletion: false
      editable: true
      options:
        path: /var/lib/grafana/dashboards/default
dashboards:
  default:
    capacity-dashboard:
      url: https://karpenter.sh/preview/getting-started/getting-started-with-karpenter/karpenter-capacity-dashboard.json
    performance-dashboard:
      url: https://karpenter.sh/preview/getting-started/getting-started-with-karpenter/karpenter-performance-dashboard.json
$ helm install --namespace monitoring grafana grafana-charts/grafana --values grafana-values.yaml
NAME: grafana
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
$ kubectl patch svc -n monitoring grafana -p '{"spec":{"type":"LoadBalancer"}}'
service/grafana patched

# Admin password
$ kubectl get secret --namespace monitoring grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
DL4JYG**************

# Access Grafana (MyDomain must already hold your own domain name)
$ kubectl annotate service grafana -n monitoring "external-dns.alpha.kubernetes.io/hostname=grafana.$MyDomain"
service/grafana annotated
$ echo -e "grafana URL = http://grafana.$MyDomain"
grafana URL = http://grafana.okms1017.name
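
The relabel_configs rule in prometheus-values.yaml above decides which endpoints Prometheus scrapes for the karpenter job. Prometheus joins the values of the listed source_labels with ";" by default and, with action: keep, retains a target only when the joined string matches the whole regex. The shell function below merely imitates that rule to make the behavior concrete. keep_target is a hypothetical name and is not something Prometheus provides.

```shell
# keep_target: hypothetical helper imitating the relabel rule above.
#   $1 = value of __meta_kubernetes_endpoints_name
#   $2 = value of __meta_kubernetes_endpoint_port_name
keep_target() {
  joined="$1;$2"   # source_labels are joined with ';' by default
  # Prometheus anchors the regex at both ends, so the match must be exact.
  printf '%s\n' "${joined}" | grep -Eq '^(karpenter;http-metrics)$'
}
```

With this rule, an endpoint named karpenter with a port named http-metrics is kept, while any other endpoint in kube-system is dropped from the job.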

 

 

4. Create the NodePool

 

# Create NodePool
$ cat <<EOF | envsubst | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # 30 * 24h = 720h
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2 # Amazon Linux 2
  role: "KarpenterNodeRole-${CLUSTER_NAME}" # replace with your cluster name
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER_NAME}" # replace with your cluster name
  amiSelectorTerms:
    - id: "${ARM_AMI_ID}"
    - id: "${AMD_AMI_ID}"
#   - id: "${GPU_AMI_ID}" # <- GPU Optimized AMD AMI 
#   - name: "amazon-eks-node-${K8S_VERSION}-*" # <- automatically upgrade when a new AL2 EKS Optimized AMI is released. This is unsafe for production workloads. Validate AMIs in lower environments before deploying them to production.
EOF
nodepool.karpenter.sh/default created
ec2nodeclass.karpenter.k8s.aws/default created

# Verify the NodePool
$ kubectl get nodepool,ec2nodeclass
NAME                            NODECLASS
nodepool.karpenter.sh/default   default
NAME                                     AGE
ec2nodeclass.karpenter.k8s.aws/default   21s
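
To make the instance-category and instance-generation requirements of the NodePool above concrete, the sketch below approximates them from an instance type name alone. This is an assumption-laden simplification: Karpenter evaluates these requirements against its own instance metadata, not by parsing names, and the sketch ignores the architecture, OS, and capacity-type requirements. matches_default_nodepool is a hypothetical helper.

```shell
# matches_default_nodepool: hypothetical helper that approximates two of the
# NodePool requirements from the instance type NAME only.
matches_default_nodepool() {
  family="${1%%.*}"    # "m5" from "m5.large"
  category="$(printf '%s' "${family}" | sed -E 's/^([a-z]+).*$/\1/')"
  generation="$(printf '%s' "${family}" | sed -E 's/^[a-z]+([0-9]+).*$/\1/')"

  # karpenter.k8s.aws/instance-category In ["c", "m", "r"]
  case "${category}" in
    c|m|r) ;;
    *) return 1 ;;
  esac

  # Reject names from which no numeric generation could be extracted.
  case "${generation}" in
    ''|*[!0-9]*) return 1 ;;
  esac

  # karpenter.k8s.aws/instance-generation Gt "2" (strictly greater than 2)
  [ "${generation}" -gt 2 ]
}
```

Under these assumptions m5.large and c6i.xlarge pass, while t3.micro (category t) and m2.xlarge (generation 2, which is not greater than 2) do not.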

 

5. Scale up deployment

 

# Each pod requests 1 CPU core, so it is guaranteed at least that much
$ cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
EOF
deployment.apps/inflate created

# Scale up
$ kubectl get pod

$ kubectl scale deployment inflate --replicas 5
deployment.apps/inflate scaled
$ kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller
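
After scaling to 5 replicas, the pending pods should cause Karpenter to launch capacity. Besides tailing the controller log, one option is to poll for nodes that carry the karpenter.sh/nodepool label, which Karpenter sets on the nodes it launches. wait_for_karpenter_nodes is a hypothetical helper, and the default retry count and interval are arbitrary choices.

```shell
# wait_for_karpenter_nodes: hypothetical helper.
#   $1 = minimum number of Karpenter-managed nodes to wait for
#   $2 = number of attempts (default 30)
#   $3 = seconds between attempts (default 10)
wait_for_karpenter_nodes() {
  want="$1"
  tries="${2:-30}"
  interval="${3:-10}"
  i=0
  while [ "${i}" -lt "${tries}" ]; do
    # Count nodes that have the karpenter.sh/nodepool label, whatever its value.
    have="$(kubectl get nodes -l karpenter.sh/nodepool --no-headers 2>/dev/null | wc -l | tr -d ' ')"
    if [ "${have}" -ge "${want}" ]; then
      echo "found ${have} Karpenter-managed node(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "${interval}"
  done
  echo "timed out waiting for ${want} Karpenter-managed node(s)" >&2
  return 1
}
```

For example, `wait_for_karpenter_nodes 1` waits up to about five minutes for the first node to register.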

 

6. Scale down deployment 

 

# Scale down 
$ kubectl delete deployment inflate && date
$ kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller

 
 
 


[References]
1) CloudNet@, AEWS hands-on study
2) https://aws.amazon.com/ko/blogs/korea/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/

 


3) How-to-monitor-and-reduce-your-compute-costs.pdf, 6p
4) https://karpenter.sh/docs/getting-started/getting-started-with-karpenter/

 


5) https://www.eksworkshop.com/docs/autoscaling/compute/karpenter/setup-provisioner/

 


6) https://www.eksworkshop.com/docs/autoscaling/compute/karpenter/node-provisioning/

 
