Custom User Data and AMI Configuration

Learn how to configure custom UserData and AMIs with Karpenter

This document describes how you can customize the UserData and AMIs for your EC2 worker nodes, without using a launch template.

Configuration

To specify custom UserData and AMIs, define them in an AWSNodeTemplate resource, then reference that resource through spec.providerRef in your provisioner.

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  providerRef:
    name: bottlerocket-example
  ...

Examples

Specify your UserData in spec.userData and select your AMIs with spec.amiSelector in the AWSNodeTemplate resource -

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: bottlerocket-example
spec:
  amiFamily: Bottlerocket
  instanceProfile: MyInstanceProfile
  subnetSelector:
    karpenter.sh/discovery: my-cluster
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  userData: |
    [settings.kubernetes]
    kube-api-qps = 30
    [settings.kubernetes.eviction-hard]
    "memory.available" = "20%"
  amiSelector:
    karpenter.sh/discovery: my-cluster

For more examples on configuring these fields for different AMI families, see the examples here.

UserData Content and Merge Semantics

Karpenter will evaluate and merge the UserData that you specify in the AWSNodeTemplate resources depending upon the AMIFamily that you have chosen.

Bottlerocket

  • Your UserData must be valid TOML.
  • Karpenter automatically merges in the settings required for a successful bootstrap, including cluster-name, api-server and cluster-certificate. Any labels and taints that need to be set based on pod requirements are also included in the final merged UserData.
    • All Kubelet settings that Karpenter applies will override the corresponding settings in the provided UserData. For example, if you’ve specified settings.kubernetes.cluster-name, it will be overridden.
    • If MaxPods is specified via the binary arg to Karpenter, the value will override anything specified in the UserData.
    • If ClusterDNS is specified via spec.kubeletConfiguration, then that value will override anything specified in the UserData.
  • Unknown TOML fields will be ignored when the final merged UserData is generated by Karpenter.

Consider the following example to understand how your custom UserData settings will be merged in.

Your UserData -

[settings.kubernetes.eviction-hard]
"memory.available" = "12%"
[settings.kubernetes]
"unknown-setting" = "unknown"
[settings.kubernetes.node-labels]
'field.controlled.by/karpenter' = 'will-be-overridden'

Final merged UserData -

[settings]
[settings.kubernetes]
api-server = 'https://cluster'
cluster-certificate = 'ca-bundle'
cluster-name = 'cluster'

[settings.kubernetes.node-labels]
'karpenter.sh/capacity-type' = 'on-demand'
'karpenter.sh/provisioner-name' = 'provisioner'

[settings.kubernetes.node-taints]

[settings.kubernetes.eviction-hard]
'memory.available' = '12%'

AL2 and Ubuntu

  • Your UserData must be in the MIME multi-part archive format.
  • Karpenter will append a final MIME part after your UserData parts; this part bootstraps the worker node. Karpenter has full control over all the parameters passed to the bootstrap script.
    • Karpenter will continue to set MaxPods, ClusterDNS and all other parameters defined in spec.kubeletConfiguration as before.

Consider the following example to understand how your custom UserData will be merged -

Your UserData -

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
echo "Running custom user data script"

--BOUNDARY--

The final merged UserData that will be applied to your worker nodes -

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="//"

--//
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
echo "Running custom user data script"

--//
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash -xe
exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
/etc/eks/bootstrap.sh 'test-cluster' --apiserver-endpoint 'https://test-cluster' --b64-cluster-ca 'ca-bundle' \
--use-max-pods false \
--container-runtime containerd \
--kubelet-extra-args '--node-labels=karpenter.sh/capacity-type=on-demand,karpenter.sh/provisioner-name=test  --max-pods=110'
--//--

You can also set kubelet-config properties by modifying the kubelet-config.json file before the EKS bootstrap script starts the kubelet:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: kubelet-config-example
spec:
  subnetSelector:
    karpenter.sh/discovery: my-cluster
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  userData: |
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary="BOUNDARY"

    --BOUNDARY
    Content-Type: text/x-shellscript; charset="us-ascii"

    #!/bin/bash
    echo "$(jq '.kubeAPIQPS=50' /etc/kubernetes/kubelet/kubelet-config.json)" > /etc/kubernetes/kubelet/kubelet-config.json

    --BOUNDARY--

Custom AMIs

You can restrict a provisioner to a set of AMIs by specifying an AMISelector, which identifies the AMIs either through EC2 tags or through a comma-separated list of AMI IDs.
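
As a minimal sketch of the tag-based form, the AWSNodeTemplate below selects AMIs by a single EC2 tag. The resource name and the tag value are illustrative placeholders; the field layout follows the AWSNodeTemplate example shown earlier on this page.

```yaml
# Illustrative sketch only: select AMIs by an EC2 tag.
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: custom-ami-example            # hypothetical name
spec:
  amiFamily: AL2                      # bootstrap the selected AMIs like EKS-optimized AL2
  subnetSelector:
    karpenter.sh/discovery: my-cluster
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  amiSelector:
    # Tag key and value are placeholders; use whatever tag your AMIs carry.
    Name: amazon-eks-node-1.21-customized-v0
```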

Defining AMI constraints

Karpenter will automatically determine the architecture that an EC2 AMI is compatible with (amd64, arm64), but other constraints of an AMI can be expressed as tags on the EC2 AMI. For example, if you want to limit an EC2 AMI to instance types that have an NVIDIA GPU, you can add an EC2 tag with the key karpenter.k8s.aws/instance-gpu-manufacturer and the value nvidia to that AMI.

All labels defined in the scheduling documentation can be used as requirements for an EC2 AMI.

> aws ec2 describe-images --image-ids ami-123 --query 'Images[0].Tags'
[
    {
        "Key": "karpenter.sh/discovery",
        "Value": "my-cluster"
    },
    {
        "Key": "Name",
        "Value": "amazon-eks-node-1.21-customized-v0"
    },
    {
        "Key": "karpenter.k8s.aws/instance-gpu-manufacturer",
        "Value": "nvidia"
    }
]

AMIFamily

When you give Karpenter AMIs to use, you can specify which AMIFamily they belong to. This determines how Karpenter uses your AMIs. For example, if you set the AMIFamily to AL2, Karpenter will assume that a worker node using one of those AMIs should be bootstrapped in the same manner as EKS-optimized AL2 AMIs. This is useful when your custom images are variants of EKS-optimized AMIs and there are no differences in how bootstrapping needs to be performed.

When the AMIFamily is set to Custom, Karpenter will not attempt to bootstrap the worker node. You must supply the necessary commands through spec.userData to ensure that your worker node joins the cluster.
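
As a rough sketch of what a Custom setup might look like, the AWSNodeTemplate below passes every bootstrap argument explicitly in spec.userData. It assumes the custom AMI still ships the EKS bootstrap script at /etc/eks/bootstrap.sh, as in the AL2 example earlier on this page. The resource name, cluster name, endpoint and CA value are placeholders, not values taken from a real cluster.

```yaml
# Illustrative sketch only: with amiFamily Custom, the UserData must do all of the bootstrapping.
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: custom-family-example         # hypothetical name
spec:
  amiFamily: Custom
  subnetSelector:
    karpenter.sh/discovery: my-cluster
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  amiSelector:
    karpenter.sh/discovery: my-cluster
  userData: |
    #!/bin/bash -xe
    # Assumption: the AMI provides the EKS bootstrap script at this path.
    # Every argument must be supplied here; the values below are placeholders.
    /etc/eks/bootstrap.sh 'my-cluster' \
      --apiserver-endpoint 'https://my-cluster-endpoint' \
      --b64-cluster-ca 'ca-bundle'
```

If your image joins the cluster through a different mechanism, replace the script body with the commands that mechanism requires.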

Binpacking semantics for AMIFamily

In order for Karpenter to accurately binpack your pods in a worker node, it needs to know the eventual allocatable capacity on your node. This capacity has several dimensions (cpu, memory, ephemeral-storage) and is a function of the instanceType as well as the AMI.

  • When the AMIFamily is AL2, Bottlerocket or Ubuntu, Karpenter will binpack your pods in the same way as other EKS-optimized AMIs of that family.
  • When the AMIFamily is Custom, Karpenter assumes that the amount of allocatable cpu, memory and ephemeral-storage is identical to AL2 EKS-optimized AMIs, regardless of how the node is being bootstrapped.
    • When the AMIFamily is Custom, Karpenter has no way of knowing which ephemeral volume will be used for pods. Therefore, it will default to using the last volume in spec.blockDeviceMappings to determine the total available ephemeral capacity on a worker node.
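
To illustrate the last point, here is a hedged sketch of a Custom AWSNodeTemplate with two block device mappings. Under the rule above, the second (last) volume is the one Karpenter would use to estimate ephemeral capacity. The device names, sizes, volume type and resource name are illustrative, and the field names under spec.blockDeviceMappings are assumed to follow the usual deviceName and ebs layout; check them against the API reference for your Karpenter version.

```yaml
# Illustrative sketch only: the last entry in blockDeviceMappings drives the
# ephemeral-storage estimate when amiFamily is Custom.
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: custom-volumes-example        # hypothetical name
spec:
  amiFamily: Custom
  subnetSelector:
    karpenter.sh/discovery: my-cluster
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  amiSelector:
    karpenter.sh/discovery: my-cluster
  blockDeviceMappings:
    - deviceName: /dev/xvda           # root volume (placeholder device name)
      ebs:
        volumeSize: 20Gi
        volumeType: gp3
    - deviceName: /dev/xvdb           # listed last, so used for the ephemeral capacity estimate
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
```

Listing the volume that backs pod storage last keeps the estimate aligned with the space pods can actually use.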