This repository contains the implementation of Kata remote hypervisor.
Kata remote hypervisor enables creation of Kata VMs on any environment without requiring baremetal servers or nested
virtualization support.
Goals
Accept requests from Kata shim to create/delete Kata VM instances without requiring nested virtualization support.
Manage VM instances in the cloud to run pods using cloud (virtualization) provider APIs
Forward communication between kata shim on a worker node VM and kata agent on a pod VM
Provide a mechanism to establish a network tunnel between a worker VM and a pod VM, connecting the pod VM to the Kubernetes pod network
Architecture
The background and description of the components involved in ‘peer pods’ can be found in the architecture documentation.
Components
Cloud API adaptor (cmd/cloud-api-adaptor) - cloud-api-adaptor implements the remote hypervisor support.
The cloud-api-adaptor-daemonset might not be deployed correctly. To find out what is wrong, run the following command and check the events:
$ kubectl -n confidential-containers-system describe ds cloud-api-adaptor-daemonset
Name: cloud-api-adaptor-daemonset
Selector: app=cloud-api-adaptor
Node-Selector: node-role.kubernetes.io/worker=
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 8m13s daemonset-controller Created pod: cloud-api-adaptor-daemonset-2pjbb
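Besides reading the events, it can help to compare how many pods the DaemonSet wants with how many are ready. The following is a minimal sketch, not part of the project: the function name caa_ds_ready is hypothetical, and it assumes kubectl is already configured for the cluster. The two status fields it reads are part of the standard DaemonSet status.

```shell
# Hypothetical helper: report desired vs. ready pod counts for the CAA DaemonSet.
# Succeeds only when at least one pod is desired and all desired pods are ready.
caa_ds_ready() {
    ns="confidential-containers-system"
    ds="cloud-api-adaptor-daemonset"
    counts="$(kubectl -n "$ns" get ds "$ds" \
        -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}')" || return 2
    desired="${counts% *}"   # text before the space
    ready="${counts#* }"     # text after the space
    echo "desired=${desired} ready=${ready}"
    # desired=0 usually means that no node matches the DaemonSet node selector.
    [ "${desired:-0}" -gt 0 ] 2>/dev/null && [ "$desired" = "$ready" ]
}
```

Run caa_ds_ready after deploying. A non-zero exit status means the describe output above deserves a closer look.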
If the cloud-api-adaptor-daemonset is up and its pods are in the Running state, as shown above, look at the pod logs for more insight:
Note: This is a single-node cluster, so there is only one pod named cloud-api-adaptor-daemonset-*. On a multi-node cluster, identify the node where your workload fails to come up and check only the logs of the corresponding CAA pod.
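On a multi-node cluster, the CAA pod for a given node can be looked up with a label selector and a field selector. This is a sketch under assumptions: the function name caa_pod_on_node is hypothetical, and the label app=cloud-api-adaptor is taken from the DaemonSet selector shown in the describe output above.

```shell
# Hypothetical helper: print the name of the CAA pod scheduled on a given node.
caa_pod_on_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: caa_pod_on_node <node-name>" >&2
        return 2
    fi
    kubectl -n confidential-containers-system get pods \
        -l app=cloud-api-adaptor \
        --field-selector "spec.nodeName=${node}" \
        -o name
}
```

The result can be passed straight to the logs command, for example: kubectl -n confidential-containers-system logs "$(caa_pod_on_node <node-name>)".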
If the logs suggest a configuration problem, inspect the configmaps and secrets that CAA needs:
$ kubectl -n confidential-containers-system get cm
NAME DATA AGE
cc-operator-manager-config 1 1h
kube-root-ca.crt 1 1h
peer-pods-cm 7 1h
$ kubectl -n confidential-containers-system get secret
NAME TYPE DATA AGE
peer-pods-secret Opaque 0 1h
ssh-key-secret Opaque 1 1h
export AWS_REGION="us-east-1" # mandatory
export PODVM_DISTRO=rhel # mandatory
export INSTANCE_TYPE=t3.small # optional, default is t3.small
export IMAGE_NAME=peer-pod-ami # optional
export VPC_ID=vpc-01234567890abcdef # optional, otherwise, it creates and uses the default vpc in the specific region
export SUBNET_ID=subnet-01234567890abcdef # must be set if VPC_ID is set
If you want to change the volume size of the generated AMI, then set the VOLUME_SIZE environment variable.
For example if you want to set the volume size to 40 GiB, then do the following:
export VOLUME_SIZE=40
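Before starting the image build, the variables above can be sanity-checked. The sketch below only encodes the rules stated in the comments (AWS_REGION and PODVM_DISTRO are mandatory, SUBNET_ID is required when VPC_ID is set) plus an assumed rule that VOLUME_SIZE is a whole number of GiB. The function name check_podvm_build_env is hypothetical and is not part of the build scripts.

```shell
# Hypothetical pre-flight check for the AWS pod VM image build environment.
check_podvm_build_env() {
    rc=0
    for var in AWS_REGION PODVM_DISTRO; do
        eval "val=\${$var:-}"   # indirect lookup of a fixed variable name
        if [ -z "$val" ]; then
            echo "missing mandatory variable: ${var}" >&2
            rc=1
        fi
    done
    if [ -n "${VPC_ID:-}" ] && [ -z "${SUBNET_ID:-}" ]; then
        echo "SUBNET_ID must be set when VPC_ID is set" >&2
        rc=1
    fi
    if [ -n "${VOLUME_SIZE:-}" ]; then
        # Assumption: the size is given as a plain integer (GiB).
        case "$VOLUME_SIZE" in
            *[!0-9]*) echo "VOLUME_SIZE must be an integer" >&2; rc=1 ;;
        esac
    fi
    return "$rc"
}
```

Calling check_podvm_build_env && make image (from the image directory) would skip the build when a setting is missing.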
Create a custom AWS VM image based on Ubuntu 22.04 having kata-agent and other dependencies
NOTE: For setting up authenticated registry support read this documentation.
cd image
make image
You can also build the custom AMI by running the packer build inside a container:
Once the image creation is complete, you can use the following CLI command as well to
get the AMI_ID. The command assumes that you are using the default AMI name: peer-pod-ami
Convert QCOW2 image to RAW format
You’ll need the qemu-img tool for conversion.
qemu-img convert -O raw podvm.qcow2 podvm.raw
Upload RAW image to S3 and create AMI
You can use the following helper script to upload the podvm.raw image to S3 and create an AMI
Note that the AWS CLI must be configured before you use the helper script.
Install kubectl by following the instructions here.
Ensure that the tools curl, git, jq and sipcalc are installed.
Azure Preparation
Azure login
Several of the following steps require you to be logged in to your Azure account:
az login
Retrieve your subscription ID:
export AZURE_SUBSCRIPTION_ID=$(az account show --query id --output tsv)
Set the region:
export AZURE_REGION="eastus"
Note: We selected the eastus region because it not only offers AMD SEV-SNP machines but also has prebuilt pod VM images readily available.
export AZURE_REGION="eastus2"
Note: We selected the eastus2 region because it not only offers Intel TDX machines but also has prebuilt pod VM images readily available.
export AZURE_REGION="eastus"
Note: We chose the eastus region because it has prebuilt pod VM images readily available.
Resource group
Note: Skip this step if you already have a resource group you want to use. Please, export the resource group name in the AZURE_RESOURCE_GROUP environment variable.
Create an Azure resource group by running the following command:
export AZURE_RESOURCE_GROUP="caa-rg-$(date '+%Y%m%b%d%H%M%S')"
az group create \
  --name "${AZURE_RESOURCE_GROUP}" \
  --location "${AZURE_REGION}"
Deploy Kubernetes using AKS
Make changes to the following environment variable as you see fit:
Note: Optionally, deploy the worker nodes into an existing Azure Virtual Network (VNet) and subnet by adding the following flag: --vnet-subnet-id $MY_SUBNET_ID.
Deploy AKS with a single worker node to the same resource group you created earlier:
Download kubeconfig locally to access the cluster using kubectl:
az aks get-credentials \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --name "${CLUSTER_NAME}"
User assigned identity and federated credentials
CAA needs privileges to talk to Azure API. This privilege is granted to CAA by associating a workload identity to the CAA service account. This workload identity (a.k.a. user assigned identity) is given permissions to create VMs, fetch images and join networks in the next step.
Note: If you use an existing AKS cluster, it might need to be configured to support workload identity and OpenID Connect (OIDC). Please refer to the instructions in this guide.
The VMs that will host Pods will commonly require access to internet services, e.g. to pull images from a public OCI registry. A discrete subnet can be created next to the AKS cluster subnet in the same VNet. We then attach a NAT gateway with a public IP to that subnet:
For CAA to be able to manage VMs, assign the identity the Virtual Machine Contributor, Reader and Network Contributor roles. These grant the privileges to spawn VMs in $AZURE_RESOURCE_GROUP and to attach them to a VNet in $AKS_RG.
az role assignment create \
  --role "Virtual Machine Contributor" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
az role assignment create \
  --role "Reader" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
az role assignment create \
  --role "Network Contributor" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AKS_RG}"
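To confirm that the three role assignments took effect, the roles held by the identity can be listed and compared against the expected names. This is a rough sketch: the function name check_caa_roles is hypothetical, and it checks role names only, not the scope each role was granted on.

```shell
# Hypothetical helper: verify that the identity holds the three roles assigned above.
check_caa_roles() {
    assignee="$1"
    assigned="$(az role assignment list --assignee "$assignee" --all \
        --query '[].roleDefinitionName' --output tsv)" || return 2
    rc=0
    for role in "Virtual Machine Contributor" "Reader" "Network Contributor"; do
        # -x: whole-line match, -F: treat the role name as a fixed string.
        if ! printf '%s\n' "$assigned" | grep -q -x -F "$role"; then
            echo "role not assigned: ${role}" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

A typical call would be check_caa_roles "$USER_ASSIGNED_CLIENT_ID". Role assignments can take a short while to propagate, so a failure right after creation may be transient.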
Create the federated credential for the CAA ServiceAccount using the OIDC endpoint from the AKS cluster:
export AKS_OIDC_ISSUER="$(az aks show \
  --name "${CLUSTER_NAME}" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --query "oidcIssuerProfile.issuerUrl" \
  -otsv)"
Note: If you are using the Calico Container Network Interface (CNI) on the Kubernetes cluster, configure Virtual Extensible LAN (VXLAN) encapsulation for all inter-workload traffic.
The image version above is in the format YYYY.MM.DD, so to use the latest image, set the version to today’s or yesterday’s date.
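The two candidate version strings in the YYYY.MM.DD format can be generated rather than typed by hand. This is a small sketch that assumes GNU date (for the -d option) and uses UTC. The function name candidate_image_versions is hypothetical, and whether an image was actually published for either date still needs to be checked.

```shell
# Hypothetical helper: print today's and yesterday's date as YYYY.MM.DD (UTC).
# Requires GNU date for the -d option.
candidate_image_versions() {
    date -u '+%Y.%m.%d'
    date -u -d 'yesterday' '+%Y.%m.%d'
}
```

The first line of output is today's date and the second is yesterday's, so the second is the fallback when no image has been published yet today.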
If you have made changes to the CAA code that affect the pod VM image and you want to deploy those changes, follow these instructions to build the pod VM image. Once the image build has finished, export the image ID in the environment variable AZURE_IMAGE_ID.
CAA container image
Export the following environment variable to use the latest release image of CAA:
Find an appropriate tag of pre-built image suitable to your needs here.
export CAA_TAG=""
Caution: You can also use the latest tag but it is not recommended, because of its lack of version control and potential for unpredictable updates, impacting stability and reproducibility in deployments.
If you have made changes to the CAA code and you want to deploy those changes then follow these instructions to build the container image. Once the image is built export the environment variables CAA_IMAGE and CAA_TAG.
Annotate Service Account
Annotate the CAA Service Account with the workload identity’s CLIENT_ID and make the CAA DaemonSet use workload identity for authentication:
Generic CAA deployment instructions are also described here.
Run sample application
Ensure runtimeclass is present
Verify that the runtimeclass is created after deploying CAA:
kubectl get runtimeclass
If a runtimeclass named kata-remote is listed, the deployment was successful. A successful deployment looks like this:
$ kubectl get runtimeclass
NAME HANDLER AGE
kata-remote kata-remote 7m18s
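Since the runtimeclass may only appear some time after CAA is deployed, the check can be wrapped in a polling loop. The sketch below is illustrative: the function name wait_for_runtimeclass, the default of 30 attempts and the POLL_INTERVAL variable are all assumptions made for this example.

```shell
# Hypothetical helper: poll until the given runtimeclass exists.
# Arguments: runtimeclass name (default kata-remote), number of attempts (default 30).
# POLL_INTERVAL sets the pause between attempts in seconds (default 10).
wait_for_runtimeclass() {
    name="${1:-kata-remote}"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        if kubectl get runtimeclass "$name" >/dev/null 2>&1; then
            echo "runtimeclass ${name} is present"
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-10}"
    done
    echo "runtimeclass ${name} not found" >&2
    return 1
}
```

With the defaults, wait_for_runtimeclass gives up after roughly five minutes.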
This guide describes how to set up a demo environment on IBM Cloud for peer pod VMs using the operator deployment approach.
The high level flow involved is:
Build and upload a peer pod custom image to IBM Cloud
Create a ‘self-managed’ Kubernetes cluster on IBM Cloud provided infrastructure
Deploy Confidential-containers operator
Deploy and validate that the nginx demo works
Clean-up and deprovision
Pre-reqs
When building the peer pod VM image, it is simplest to use the container based approach, which only requires either
docker, or podman, but it can also be built locally.
Note: the peer pod VM image build and upload is de-coupled from the cluster creation and operator deployment stage,
so can be built on a different machine.
There are a number of packages that you will need to install in order to create the Kubernetes cluster and peer pod enable it:
Terraform, Ansible, the IBM Cloud CLI and kubectl are all required for the cluster creation and explained in
the cluster pre-reqs guide.
Tip: If you are using Ubuntu Linux, you can run the following command:
$ sudo apt-get install jq
You will also require go and make to be installed.
Peer Pod VM Image
A peer pod VM image needs to be created as a VPC custom image in IBM Cloud in order to create the peer pod instances
from. The peer pod VM image contains components like the agent protocol forwarder and Kata agent that communicate with
the Kubernetes worker node and carry out the received instructions inside the peer pod.
Building a Peer Pod VM Image via Docker [Optional]
You may skip this step and use one of the release images (see Import Release VM Image), but to get the latest features you may wish to build your own.
You can do this by following the process document. If building within a container, ensure that --build-arg CLOUD_PROVIDER=ibmcloud is set, along with --build-arg ARCH=s390x for an s390x architecture image.
Note: At the time of writing, issue #649 means that when creating an s390x image you also need to add two extra
build args: --build-arg UBUNTU_IMAGE_URL="" and --build-arg UBUNTU_IMAGE_CHECKSUM=""
Note: If building the peer pod qcow2 image within a VM, it may take a lot of resources e.g. 8 vCPU and
32GB RAM due to the nested virtualization performance limitations. When running without enough resources, the failure
seen is similar to:
Build 'qemu.ubuntu' errored after 5 minutes 57 seconds: Timeout waiting for SSH.
Upload the built peer pod VM image to IBM Cloud
You can follow the process documented from the cloud-api-adaptor/ibmcloud/image to extract and upload
the peer pod image you’ve just built to IBM Cloud as a custom image, noting to replace the
quay.io/confidential-containers/podvm-ibmcloud-ubuntu-s390x reference with the local container image that you built
above e.g. localhost/podvm_ibmcloud_s390x:latest.
This script will end with the line: Image <image-name> with id <image-id> is available. The image-id field will be
needed in the kustomize step later.
Import Release VM Image
Alternatively, to use a pre-built peer pod VM image, you can follow the process documented with the release images found at quay.io/confidential-containers/podvm-generic-ubuntu-<ARCH>. Running this command requires docker or podman, as per tools
This script will end with the line: Image <image-name> with id <image-id> is available. The image-id field will be
needed in later steps.
Create a ‘self-managed’ Kubernetes cluster on IBM Cloud provided infrastructure
If you don’t have a Kubernetes cluster for testing, you can follow the open-source
instructions
to set up a basic cluster where the Kubernetes nodes run on IBM Cloud provided infrastructure.
The caa-provisioner-cli simplifies deploying the operator and the cloud-api-adaptor resources on to any cluster. See the test/tools/README.md for full instructions. To create an ibmcloud ready version follow these steps:
# Starting from the cloud-api-adaptor root directory
pushd test/tools
make BUILTIN_CLOUD_PROVIDERS="ibmcloud" all
popd
This will create caa-provisioner-cli in the test/tools directory. To use the tool with an existing self-managed cluster you will need to set up a .properties file containing the relevant ibmcloud information to enable your cluster to create and use peer-pods. Use the following commands to generate the .properties file. If you are not using a self-managed cluster, replace the terraform commands with the appropriate values manually.
export IBMCLOUD_API_KEY= # your ibmcloud apikey
export PODVM_IMAGE_ID= # the image id of the peerpod vm uploaded in the previous step
export PODVM_INSTANCE_PROFILE= # instance profile name that runs the peerpod (bx2-2x8 or bz2-2x8 for example)
export CAA_IMAGE_TAG= # cloud-api-adaptor image tag that supports this arch, see quay.io/confidential-containers/cloud-api-adaptor
pushd ibmcloud/cluster
cat <<EOF > ../../selfmanaged_cluster.properties
IBMCLOUD_PROVIDER="ibmcloud"
APIKEY="$IBMCLOUD_API_KEY"
PODVM_IMAGE_ID="$PODVM_IMAGE_ID"
INSTANCE_PROFILE_NAME="$PODVM_INSTANCE_PROFILE"
CAA_IMAGE_TAG="$CAA_IMAGE_TAG"
SSH_KEY_ID="$(terraform output --raw ssh_key_id)"
EOF
popd
This will create a selfmanaged_cluster.properties file in the cloud-api-adaptor root directory.
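Because the properties file is generated from environment variables and terraform output, an unset variable silently produces an empty value. A quick check like the following sketch can catch that before running the provisioner. The function name check_properties_file is hypothetical, and the key list is simply the set of keys written by the heredoc above.

```shell
# Hypothetical helper: verify that every key written to the properties file
# has a non-empty quoted value.
check_properties_file() {
    file="$1"
    rc=0
    for key in IBMCLOUD_PROVIDER APIKEY PODVM_IMAGE_ID INSTANCE_PROFILE_NAME \
               CAA_IMAGE_TAG SSH_KEY_ID; do
        if ! grep -Eq "^${key}=\"[^\"]+\"" "$file"; then
            echo "missing or empty: ${key}" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

For example, run check_properties_file selfmanaged_cluster.properties from the cloud-api-adaptor root directory.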
The final step is to run the caa-provisioner-cli to install the operator.
export CLOUD_PROVIDER=ibmcloud
# must be run from the directory containing the properties file
export TEST_PROVISION_FILE="$(pwd)/selfmanaged_cluster.properties"
# prevent the test from removing the cloud-api-adaptor resources from the cluster
export TEST_TEARDOWN="no"
pushd test/tools
./caa-provisioner-cli -action=install
popd
End-2-End Test Framework
To validate that a cluster has been set up properly, there is a suite of tests that validate peer-pods across different providers.
The implementation of these tests can be found in test/e2e/common_suite_test.go.
Assuming CLOUD_PROVIDER and TEST_PROVISION_FILE are still set in your current terminal, you can execute these tests
from the cloud-api-adaptor root directory by running the following commands:
export KUBECONFIG=$(pwd)/ibmcloud/cluster/config
make test-e2e
Uninstall and clean up
There are two options for cleaning up the environment once testing has finished, or if you want to re-install from a
clean state:
If using a self-managed cluster you can delete the whole cluster following the
Delete the cluster documentation and then start again.
If you instead want to keep the cluster and only uninstall the Confidential Containers and peer pods
feature, you can use the caa-provisioner-cli to remove the resources.
export CLOUD_PROVIDER=ibmcloud
# must be run from the directory containing the properties file
export TEST_PROVISION_FILE="$(pwd)/selfmanaged_cluster.properties"
pushd test/tools
./caa-provisioner-cli -action=uninstall
popd
This document contains instructions for using, developing and testing the cloud-api-adaptor with libvirt.
Creating an end-to-end environment for testing and development
In this section you will learn how to set up an environment on your local machine to run peer pods with
the libvirt cloud API adaptor. Bear in mind that many different tools can be used to set up the environment;
here we only suggest the tools that seem to be used by most of the peer pods developers.
Requirements
You must have a Linux/KVM system with libvirt installed and the following tools:
We assume that you have a ‘default’ network and a ‘default’ storage pool created in the libvirtd system instance (qemu:///system). However,
if you have a different pool name then the scripts should be able to handle it properly.
Create the Kubernetes cluster
Use the kcli_cluster.sh script to create a simple two-VM cluster (one control plane and one worker)
with the kcli tool:
./kcli_cluster.sh create
With kcli_cluster.sh you can configure, among other parameters, the libvirt network and storage pools in which the cluster VMs
will be created. Run ./kcli_cluster.sh -h to see the help for further information.
If everything goes well you will be able to see the cluster running after setting your Kubernetes config with:
$ kcli list kube
+-----------+---------+-----------+-----------------------------------------+
| Cluster | Type | Plan | Vms |
+-----------+---------+-----------+-----------------------------------------+
| peer-pods | generic | peer-pods | peer-pods-ctlplane-0,peer-pods-worker-0 |
+-----------+---------+-----------+-----------------------------------------+
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
peer-pods-ctlplane-0 Ready control-plane,master 6m8s v1.25.3
peer-pods-worker-0 Ready worker 2m47s v1.25.3
Prepare the Pod VM volume
In order to build the Pod VM without installing the build tools, you can use the Dockerfiles hosted in the ../podvm directory to run the entire process inside a container. Refer to podvm/README.md for further details. Alternatively you can consume pre-built podvm images as explained here.
Next you will need to create a volume on libvirt’s system storage and upload the image content. That volume is used by
the cloud-api-adaptor program to instantiate a new Pod VM. Run the following commands:
Install and configure Confidential Containers and cloud-api-adaptor in the cluster
The easiest way to install the cloud-api-adaptor along with Confidential Containers in the cluster is through the
Kubernetes operator available in the install directory of this repository.
Start by creating a public/private RSA key pair that will be used by the cloud-api-adaptor program, running on the
cluster workers, to connect with your local libvirtd instance without password authentication. Assuming you are in the
libvirt directory, run:
cd ../install/overlays/libvirt
ssh-keygen -f ./id_rsa -N ""
cat id_rsa.pub >> ~/.ssh/authorized_keys
Note: ensure that ~/.ssh/authorized_keys has the right permissions (read/write for the user only) otherwise the
authentication can silently fail. You can run chmod 600 ~/.ssh/authorized_keys to set the right permissions.
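Since a wrong file mode makes the authentication fail silently, a quick check of both the mode and the presence of the key can save some debugging. This is a minimal sketch that assumes GNU stat (as found on typical Linux/KVM hosts). The function name check_authorized_keys and its default arguments are hypothetical.

```shell
# Hypothetical helper: check that authorized_keys has mode 600 and contains the public key.
# Arguments: path to authorized_keys, path to the public key file.
check_authorized_keys() {
    auth_file="${1:-$HOME/.ssh/authorized_keys}"
    pub_key="${2:-./id_rsa.pub}"
    # GNU stat prints the octal permission bits with -c '%a'.
    mode="$(stat -c '%a' "$auth_file")" || return 2
    if [ "$mode" != "600" ]; then
        echo "${auth_file} has mode ${mode}, expected 600" >&2
        return 1
    fi
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file.
    if ! grep -q -x -F -f "$pub_key" "$auth_file"; then
        echo "public key from ${pub_key} not found in ${auth_file}" >&2
        return 1
    fi
    echo "${auth_file} looks fine"
}
```

Run check_authorized_keys from the install/overlays/libvirt directory, where id_rsa.pub was generated.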
You will need to figure out the IP address of your local host (e.g. 192.168.122.1). Then try to connect remotely to
libvirt to check that the key setup works, for example:
$ virsh -c "qemu+ssh://$USER@192.168.122.1/system?keyfile=$(pwd)/id_rsa" nodeinfo
CPU model: x86_64
CPU(s): 12
CPU frequency: 1084 MHz
CPU socket(s): 1
Core(s) per socket: 6
Thread(s) per core: 2
NUMA cell(s): 1
Memory size: 32600636 KiB
Finally, install the Kubernetes operator in the cluster with the help of the install_operator.sh script. Ensure that you have your IP address exported in the environment, as shown below, then run the install script:
cd ../../../libvirt/
export LIBVIRT_IP="192.168.122.1"
export SSH_KEY_FILE="id_rsa"
./install_operator.sh
If everything goes well you will be able to see the operator’s controller manager and cloud-api-adaptor Pods running:
You will also notice that Kubernetes runtimeClass resources
were created on the cluster, for example:
$ kubectl get runtimeclass
NAME HANDLER AGE
kata-remote kata-remote 7m18s
Create a sample peer-pods pod
At this point everything should be ready to create a sample Pod. Let’s first list the running VMs so that we can later check
that the Pod VM is really running. Notice below that only the cluster node VMs are up:
$ virsh -c qemu:///system list
Id Name State
------------------------------------
3 peer-pods-ctlplane-0 running
4 peer-pods-worker-0 running
Create the sample_busybox.yaml file with the following content:
$ kubectl apply -f sample_busybox.yaml
pod/busybox created
$ kubectl wait --for=condition=Ready pod/busybox
pod/busybox condition met
Check that the Pod VM is up and running. See on the following listing that podvm-busybox-88a70031 was
created:
$ virsh -c qemu:///system list
Id Name State
----------------------------------------
5 peer-pods-ctlplane-0 running
6 peer-pods-worker-0 running
7 podvm-busybox-88a70031 running
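To pick the Pod VMs out of a longer listing, the virsh output can be filtered on the name column. The sketch below assumes that Pod VM names start with podvm-, as in the listing above. The function name list_podvms is hypothetical.

```shell
# Hypothetical helper: read `virsh list` output on stdin and print the names
# of domains whose name starts with "podvm-" (the second column).
list_podvms() {
    awk '$2 ~ /^podvm-/ { print $2 }'
}
```

Usage: virsh -c qemu:///system list | list_podvms. Empty output means that no Pod VM is currently running.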
You should also check that the container is running fine. For example, verify that the kernel in the container differs from the one on the worker node, as shown below:
The peer-pods pod can be deleted like any regular pod. In the listing below the pod was removed, and you can see that the
Pod VM no longer exists in libvirt:
$ kubectl delete -f sample_busybox.yaml
pod "busybox" deleted
$ virsh -c qemu:///system list
Id Name State
------------------------------------
5 peer-pods-ctlplane-0 running
6 peer-pods-worker-0 running
Delete Confidential Containers and cloud-api-adaptor from the cluster
You might want to reinstall the Confidential Containers and cloud-api-adaptor into your cluster. There are two options:
Delete the Kubernetes cluster entirely and start over. In this case you should just run ./kcli_cluster.sh delete to
wipe out the cluster created with kcli
Uninstall the operator resources then install them again with the install_operator.sh script
Let’s see how to delete the operator resources. The listing below shows the pods currently running in
the confidential-containers-system namespace:
$ kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-fbb5dcf9d-h42nn 2/2 Running 0 20h
cc-operator-daemon-install-fkkzz 1/1 Running 0 20h
cloud-api-adaptor-daemonset-libvirt-lxj7v 1/1 Running 0 20h
In order to remove the *-daemon-install-* and *-cloud-api-adaptor-daemonset-* pods, run the following command from the
root directory:
CLOUD_PROVIDER=libvirt make delete
It can take a few minutes for those pods to be deleted. Afterwards you will notice that only the controller-manager is
still up. The following shows how to delete that pod and its associated resources as well:
$ kubectl get pods -n confidential-containers-system
NAME READY STATUS RESTARTS AGE
cc-operator-controller-manager-fbb5dcf9d-h42nn 2/2 Running 0 20h
$ kubectl delete -f install/yamls/deploy.yaml
namespace "confidential-containers-system" deleted
serviceaccount "cc-operator-controller-manager" deleted
role.rbac.authorization.k8s.io "cc-operator-leader-election-role" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-manager-role" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-metrics-reader" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-proxy-role" deleted
rolebinding.rbac.authorization.k8s.io "cc-operator-leader-election-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "cc-operator-manager-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "cc-operator-proxy-rolebinding" deleted
configmap "cc-operator-manager-config" deleted
service "cc-operator-controller-manager-metrics-service" deleted
deployment.apps "cc-operator-controller-manager" deleted
customresourcedefinition.apiextensions.k8s.io "ccruntimes.confidentialcontainers.org" deleted
$ kubectl get pods -n confidential-containers-system
No resources found in confidential-containers-system namespace.