Cloud API Adaptor

Documentation for Cloud API Adaptor a.k.a Peer Pods

Introduction

This repository contains the implementation of the Kata remote hypervisor. The Kata remote hypervisor enables the creation of Kata VMs in any environment without requiring bare-metal servers or nested virtualization support.

Goals

  • Accept requests from the Kata shim to create/delete Kata VM instances without requiring nested virtualization support.
  • Manage VM instances in the cloud to run pods, using cloud (virtualization) provider APIs.
  • Forward communication between the Kata shim on a worker node VM and the Kata agent on a pod VM.
  • Provide a mechanism to establish a network tunnel between worker and pod VMs, connecting them to the Kubernetes pod network.

Architecture

The background and description of the components involved in ‘peer pods’ can be found in the architecture documentation.

Components

Installation

Please refer to the instructions mentioned in the following doc.

Supported Providers

  • aws
  • azure
  • ibmcloud
  • libvirt
  • vsphere

Adding a new provider

Please refer to the instructions mentioned in the following doc.

Contribution

This project uses the Apache 2.0 license. Contributions to this project must follow the DCO 1.1 process.
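For reference, the DCO sign-off is the Signed-off-by trailer that git can add for you. A minimal example of general git usage (not project-specific tooling):

# Sign off a commit to add the DCO "Signed-off-by" trailer
git commit -s -m "my change"

# Or amend the last commit to add a missing sign-off
git commit --amend -s --no-edit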

Collaborations

1 - Cloud API Adaptor Troubleshooting

Generic troubleshooting steps after installation of Cloud API Adaptor

Application pod is created but stays in the ContainerCreating state

Let’s start by looking at the pods deployed in the confidential-containers-system namespace:

$ kubectl get pods -n confidential-containers-system -o wide
NAME                                              READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
cc-operator-controller-manager-76755f9c96-pjj92   2/2     Running   0          1h    10.244.0.14   aks-nodepool1-22620003-vmss000000   <none>           <none>
cc-operator-daemon-install-79c2b                  1/1     Running   0          1h    10.244.0.16   aks-nodepool1-22620003-vmss000000   <none>           <none>
cc-operator-pre-install-daemon-gsggj              1/1     Running   0          1h    10.244.0.15   aks-nodepool1-22620003-vmss000000   <none>           <none>
cloud-api-adaptor-daemonset-2pjbb                 1/1     Running   0          1h    10.224.0.4    aks-nodepool1-22620003-vmss000000   <none>           <none>

It is possible that the cloud-api-adaptor-daemonset is not deployed correctly. To see what is wrong with it, run the following command and look at the events for insights:

$ kubectl -n confidential-containers-system describe ds cloud-api-adaptor-daemonset
Name:           cloud-api-adaptor-daemonset
Selector:       app=cloud-api-adaptor
Node-Selector:  node-role.kubernetes.io/worker=
...
Events:
  Type    Reason            Age    From                  Message
  ----    ------            ----   ----                  -------
  Normal  SuccessfulCreate  8m13s  daemonset-controller  Created pod: cloud-api-adaptor-daemonset-2pjbb

But if the cloud-api-adaptor-daemonset is up and in the Running state, as shown above, then look at the pod’s logs for more insights:

kubectl -n confidential-containers-system logs daemonset/cloud-api-adaptor-daemonset

Note: This is a single-node cluster, so there is only one pod named cloud-api-adaptor-daemonset-*. If you are running a multi-node cluster, find the node where your workload fails to come up and check the logs of the corresponding CAA pod only.
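For example, on a multi-node cluster you could locate the CAA pod scheduled on a given worker node like this (the node name is a placeholder):

# Find the CAA pod running on the node where the workload fails to come up
kubectl -n confidential-containers-system get pods -o wide \
  --field-selector spec.nodeName=<your-node-name>

# Then read that pod's logs
kubectl -n confidential-containers-system logs <cloud-api-adaptor-daemonset-pod-name>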

If the problem hints that something is wrong with the configuration, then look at the configmaps or secrets needed to run CAA:

$ kubectl -n confidential-containers-system get cm
NAME                         DATA   AGE
cc-operator-manager-config   1      1h
kube-root-ca.crt             1      1h
peer-pods-cm                 7      1h
$ kubectl -n confidential-containers-system get secret
NAME               TYPE     DATA   AGE
peer-pods-secret   Opaque   0      1h
ssh-key-secret     Opaque   1      1h
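To inspect the actual configuration values (for example, to confirm the pod VM image ID and region are what you expect), you can dump the configmap contents and check which secret keys are present:

# Show the peer-pods configuration entries
kubectl -n confidential-containers-system get cm peer-pods-cm -o yaml

# describe shows which keys a secret contains without revealing the values
kubectl -n confidential-containers-system describe secret peer-pods-secret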

2 - AWS

Documentation for peerpods on AWS

Prerequisites

  • Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for AWS cli access

  • Install packer by following the instructions in the following link

  • Install packer’s Amazon plugin: packer plugins install github.com/hashicorp/amazon

Build Pod VM Image

Option-1: Modifying existing marketplace image

  • Set environment variables
export AWS_REGION="us-east-1" # mandatory
export PODVM_DISTRO=rhel # mandatory
export INSTANCE_TYPE=t3.small # optional, default is t3.small
export IMAGE_NAME=peer-pod-ami # optional
export VPC_ID=vpc-01234567890abcdef # optional, otherwise, it creates and uses the default vpc in the specific region
export SUBNET_ID=subnet-01234567890abcdef # must be set if VPC_ID is set

If you want to change the volume size of the generated AMI, then set the VOLUME_SIZE environment variable. For example if you want to set the volume size to 40 GiB, then do the following:

export VOLUME_SIZE=40
  • Create a custom AWS VM image based on Ubuntu 22.04 with the kata-agent and other dependencies

NOTE: For setting up authenticated registry support read this documentation.

cd image
make image

You can also build the custom AMI by running the packer build inside a container:

docker build -t aws \
--secret id=AWS_ACCESS_KEY_ID \
--secret id=AWS_SECRET_ACCESS_KEY \
--build-arg AWS_REGION=${AWS_REGION} \
-f Dockerfile .

If you want to use an existing VPC_ID with public SUBNET_ID then use the following command:

docker build -t aws \
--secret id=AWS_ACCESS_KEY_ID \
--secret id=AWS_SECRET_ACCESS_KEY \
--build-arg AWS_REGION=${AWS_REGION} \
--build-arg VPC_ID=${VPC_ID} \
--build-arg SUBNET_ID=${SUBNET_ID} \
-f Dockerfile .

If you want to build a CentOS-based custom AMI then you’ll need to first accept the terms by visiting this link.

Once done, run the following command:

docker build -t aws \
--secret id=AWS_ACCESS_KEY_ID \
--secret id=AWS_SECRET_ACCESS_KEY \
--build-arg AWS_REGION=${AWS_REGION} \
--build-arg BINARIES_IMG=quay.io/confidential-containers/podvm-binaries-centos-amd64 \
--build-arg PODVM_DISTRO=centos \
-f Dockerfile .
  • Note down your newly created AMI_ID

Once the image creation is complete, you can use the following CLI command as well to get the AMI_ID. The command assumes that you are using the default AMI name: peer-pod-ami

aws ec2 describe-images --query "Images[*].[ImageId]" --filters "Name=name,Values=peer-pod-ami" --region ${AWS_REGION} --output text
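If you prefer, you can capture the result directly into an environment variable. This is just a convenience sketch, assuming a single image matches the name filter:

# Store the AMI ID for later use as PODVM_AMI_ID in the peer-pods-cm configmap
export PODVM_AMI_ID=$(aws ec2 describe-images \
  --query "Images[*].[ImageId]" \
  --filters "Name=name,Values=peer-pod-ami" \
  --region ${AWS_REGION} \
  --output text)
echo "${PODVM_AMI_ID}"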

Option-2: Using precreated QCOW2 image

  • Download QCOW2 image
mkdir -p qcow2-img && cd qcow2-img

curl -LO https://raw.githubusercontent.com/confidential-containers/cloud-api-adaptor/staging/podvm/hack/download-image.sh

bash download-image.sh quay.io/confidential-containers/podvm-generic-ubuntu-amd64:latest . -o podvm.qcow2
  • Convert the QCOW2 image to RAW format. You’ll need the qemu-img tool for the conversion.
qemu-img convert -O raw podvm.qcow2 podvm.raw
  • Upload the RAW image to S3 and create an AMI. You can use the following helper script to upload the podvm.raw image to S3 and create an AMI. Note that the AWS CLI must be configured in order to use the helper script.
curl -LO https://raw.githubusercontent.com/confidential-containers/cloud-api-adaptor/staging/aws/raw-to-ami.sh

bash raw-to-ami.sh podvm.raw <Some S3 Bucket Name> <AWS Region>

On success, the command will generate the AMI_ID, which needs to be used to set the value of PODVM_AMI_ID in the peer-pods-cm configmap.
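As an illustrative sketch only: one way to set that value on a cluster where CAA is already deployed is to patch the configmap and restart the daemonset; adjust this to your own deployment workflow:

# Set PODVM_AMI_ID in the peer-pods-cm configmap (replace the placeholder with your AMI ID)
kubectl -n confidential-containers-system patch cm peer-pods-cm \
  --type merge -p '{"data":{"PODVM_AMI_ID":"<your-AMI_ID>"}}'

# Restart the CAA pods so they pick up the new value
kubectl -n confidential-containers-system rollout restart ds/cloud-api-adaptor-daemonset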

Running cloud-api-adaptor

3 - Azure

Cloud API Adaptor (CAA) on Azure

This documentation will walk you through setting up CAA (a.k.a. Peer Pods) on Azure Kubernetes Service (AKS). It explains how to deploy:

  • A single worker node Kubernetes cluster using Azure Kubernetes Service (AKS)
  • CAA on that Kubernetes cluster
  • An Nginx pod backed by CAA pod VM

Pre-requisites

  • Install Azure CLI by following instructions here.
  • Install kubectl by following the instructions here.
  • Ensure that the tools curl, git, jq and sipcalc are installed.

Azure Preparation

Azure login

Several steps require you to be logged into your Azure account:

az login

Retrieve your subscription ID:

export AZURE_SUBSCRIPTION_ID=$(az account show --query id --output tsv)

Set the region, choosing one of the following options.

For AMD SEV-SNP machines:

export AZURE_REGION="eastus"

Note: We selected the eastus region as it not only offers AMD SEV-SNP machines but also has prebuilt pod VM images readily available.

For Intel TDX machines:

export AZURE_REGION="eastus2"

Note: We selected the eastus2 region as it not only offers Intel TDX machines but also has prebuilt pod VM images readily available.

For non-confidential VMs:

export AZURE_REGION="eastus"

Note: We chose the eastus region because it has prebuilt pod VM images readily available.

Resource group

Note: Skip this step if you already have a resource group you want to use. In that case, please export the resource group name in the AZURE_RESOURCE_GROUP environment variable.

Create an Azure resource group by running the following command:

export AZURE_RESOURCE_GROUP="caa-rg-$(date '+%Y%m%b%d%H%M%S')"

az group create \
  --name "${AZURE_RESOURCE_GROUP}" \
  --location "${AZURE_REGION}"

Deploy Kubernetes using AKS

Make changes to the following environment variables as you see fit:

export CLUSTER_NAME="caa-$(date '+%Y%m%b%d%H%M%S')"
export AKS_WORKER_USER_NAME="azuser"
export AKS_RG="${AZURE_RESOURCE_GROUP}-aks"
export SSH_KEY=~/.ssh/id_rsa.pub

Note: Optionally, deploy the worker nodes into an existing Azure Virtual Network (VNet) and subnet by adding the following flag: --vnet-subnet-id $MY_SUBNET_ID.

Deploy AKS with single worker node to the same resource group you created earlier:

az aks create \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --node-resource-group "${AKS_RG}" \
  --name "${CLUSTER_NAME}" \
  --enable-oidc-issuer \
  --enable-workload-identity \
  --location "${AZURE_REGION}" \
  --node-count 1 \
  --node-vm-size Standard_F4s_v2 \
  --nodepool-labels node.kubernetes.io/worker= \
  --ssh-key-value "${SSH_KEY}" \
  --admin-username "${AKS_WORKER_USER_NAME}" \
  --os-sku Ubuntu

Download kubeconfig locally to access the cluster using kubectl:

az aks get-credentials \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --name "${CLUSTER_NAME}"

User assigned identity and federated credentials

CAA needs privileges to talk to the Azure API. These privileges are granted to CAA by associating a workload identity with the CAA service account. This workload identity (a.k.a. user assigned identity) is given permissions to create VMs, fetch images and join networks in the next step.

Note: If you use an existing AKS cluster, it might need to be configured to support workload identity and OpenID Connect (OIDC); please refer to the instructions in this guide.

Start by creating an identity for CAA:

export AZURE_WORKLOAD_IDENTITY_NAME="caa-${CLUSTER_NAME}"

az identity create \
  --name "${AZURE_WORKLOAD_IDENTITY_NAME}" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --location "${AZURE_REGION}"
export USER_ASSIGNED_CLIENT_ID="$(az identity show \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --name "${AZURE_WORKLOAD_IDENTITY_NAME}" \
  --query 'clientId' \
  -otsv)"

Networking

The VMs that will host Pods will commonly require access to internet services, e.g. to pull images from a public OCI registry. A discrete subnet can be created next to the AKS cluster subnet in the same VNet. We then attach a NAT gateway with a public IP to that subnet:

export AZURE_VNET_NAME="$(az network vnet list -g ${AKS_RG} --query '[].name' -o tsv)"
export AKS_CIDR="$(az network vnet show -n $AZURE_VNET_NAME -g $AKS_RG --query "subnets[?name == 'aks-subnet'].addressPrefix" -o tsv)"
# 10.224.0.0/16
export MASK="${AKS_CIDR#*/}"
# 16
PEERPOD_CIDR="$(sipcalc $AKS_CIDR -n 2 | grep ^Network | grep -v current | cut -d' ' -f2)/${MASK}"
# 10.225.0.0/16
az network public-ip create -g "$AKS_RG" -n peerpod
az network nat gateway create -g "$AKS_RG" -l "$AZURE_REGION" --public-ip-addresses peerpod -n peerpod
az network vnet subnet create -g "$AKS_RG" --vnet-name "$AZURE_VNET_NAME" --nat-gateway peerpod --address-prefixes "$PEERPOD_CIDR" -n peerpod
export AZURE_SUBNET_ID="$(az network vnet subnet show -g "$AKS_RG" --vnet-name "$AZURE_VNET_NAME" -n peerpod --query id -o tsv)"
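Optionally, you can verify that the new subnet exists and has the NAT gateway attached. This is a quick sanity check, not a required step, and field names can vary slightly between Azure API versions:

# Confirm the peerpod subnet was created with the expected prefix and NAT gateway
az network vnet subnet show \
  -g "$AKS_RG" \
  --vnet-name "$AZURE_VNET_NAME" \
  -n peerpod \
  --query "{prefix:addressPrefix, natGateway:natGateway.id}" \
  -o table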

AKS resource group permissions

For CAA to be able to manage VMs, assign the identity the VM and Network Contributor roles: privileges to spawn VMs in $AZURE_RESOURCE_GROUP and to attach to a VNet in $AKS_RG.

az role assignment create \
  --role "Virtual Machine Contributor" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
az role assignment create \
  --role "Reader" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
az role assignment create \
  --role "Network Contributor" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AKS_RG}"

Create the federated credential for the CAA ServiceAccount using the OIDC endpoint from the AKS cluster:

export AKS_OIDC_ISSUER="$(az aks show \
  --name "${CLUSTER_NAME}" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --query "oidcIssuerProfile.issuerUrl" \
  -otsv)"
az identity federated-credential create \
  --name "caa-${CLUSTER_NAME}" \
  --identity-name "${AZURE_WORKLOAD_IDENTITY_NAME}" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --issuer "${AKS_OIDC_ISSUER}" \
  --subject system:serviceaccount:confidential-containers-system:cloud-api-adaptor \
  --audience api://AzureADTokenExchange
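You can optionally confirm that the federated credential was created with the expected service account subject (a sanity check only, assuming your Azure CLI supports the federated-credential subcommands):

# List federated credentials on the workload identity
az identity federated-credential list \
  --identity-name "${AZURE_WORKLOAD_IDENTITY_NAME}" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --query "[].{name:name, subject:subject}" \
  -o table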

Deploy CAA

Note: If you are using the Calico Container Network Interface (CNI) on the Kubernetes cluster, then configure Virtual Extensible LAN (VXLAN) encapsulation for all inter-workload traffic.
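As a hedged sketch only: with Calico this is typically done by switching the default IPPool to VXLAN encapsulation, for example with calicoctl, assuming the default pool is named default-ipv4-ippool; verify against the Calico documentation for your setup:

# Switch the default IP pool to VXLAN encapsulation (Calico-specific; adjust pool name as needed)
calicoctl patch ippool default-ipv4-ippool \
  --patch='{"spec":{"vxlanMode":"Always","ipipMode":"Never"}}'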

Download the CAA deployment artifacts

To use a release version of the deployment artifacts:

export CAA_VERSION="0.10.0"
curl -LO "https://github.com/confidential-containers/cloud-api-adaptor/archive/refs/tags/v${CAA_VERSION}.tar.gz"
tar -xvzf "v${CAA_VERSION}.tar.gz"
cd "cloud-api-adaptor-${CAA_VERSION}/src/cloud-api-adaptor"

To use the latest artifacts from a branch instead:

export CAA_BRANCH="main"
curl -LO "https://github.com/confidential-containers/cloud-api-adaptor/archive/refs/heads/${CAA_BRANCH}.tar.gz"
tar -xvzf "${CAA_BRANCH}.tar.gz"
cd "cloud-api-adaptor-${CAA_BRANCH}/src/cloud-api-adaptor"

Alternatively, if you already have the code checked out, change directory on your terminal to the Cloud API Adaptor’s code base.

CAA pod VM image

To use the release pod VM image for the peer pod VM, export this environment variable:

export AZURE_IMAGE_ID="/CommunityGalleries/cococommunity-42d8482d-92cd-415b-b332-7648bd978eff/Images/peerpod-podvm-ubuntu2204-cvm-snp/Versions/${CAA_VERSION}"

An automated job builds the pod VM image each night at 00:00 UTC. You can use that image by exporting the following environment variable:

SUCCESS_TIME=$(curl -s \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/confidential-containers/cloud-api-adaptor/actions/workflows/azure-podvm-image-nightly-build.yml/runs?status=success" \
  | jq -r '.workflow_runs[0].updated_at')

export AZURE_IMAGE_ID="/CommunityGalleries/cocopodvm-d0e4f35f-5530-4b9c-8596-112487cdea85/Images/podvm_image0/Versions/$(date -u -jf "%Y-%m-%dT%H:%M:%SZ" "$SUCCESS_TIME" "+%Y.%m.%d" 2>/dev/null || date -d "$SUCCESS_TIME" +%Y.%m.%d)"

The image version above is in the format YYYY.MM.DD, so the latest image will carry today’s or yesterday’s date.
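If you want to confirm which nightly versions are actually published before pinning one, you can list the community gallery image versions. This assumes a reasonably recent Azure CLI with community gallery support:

# List published versions of the nightly pod VM image in the community gallery
az sig image-version list-community \
  --location "${AZURE_REGION}" \
  --public-gallery-name "cocopodvm-d0e4f35f-5530-4b9c-8596-112487cdea85" \
  --gallery-image-definition "podvm_image0" \
  --query "[].name" \
  -o tsv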

If you have made changes to the CAA code that affect the pod VM image and you want to deploy those changes, then follow these instructions to build the pod VM image. Once the image build is finished, export the image id in the environment variable AZURE_IMAGE_ID.

CAA container image

Export the following environment variable to use the latest release image of CAA:

export CAA_IMAGE="quay.io/confidential-containers/cloud-api-adaptor"
export CAA_TAG="v0.10.0-amd64"

Export the following environment variable to use the image built by the CAA CI on each merge to main:

export CAA_IMAGE="quay.io/confidential-containers/cloud-api-adaptor"

Find an appropriate tag of the pre-built image suitable for your needs here.

export CAA_TAG=""

Caution: You can also use the latest tag, but this is not recommended because of its lack of version control and potential for unpredictable updates, impacting stability and reproducibility in deployments.

If you have made changes to the CAA code and you want to deploy those changes then follow these instructions to build the container image. Once the image is built export the environment variables CAA_IMAGE and CAA_TAG.

Annotate Service Account

Annotate the CAA Service Account with the workload identity’s CLIENT_ID and make the CAA DaemonSet use workload identity for authentication:

cat <<EOF > install/overlays/azure/workload-identity.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloud-api-adaptor-daemonset
  namespace: confidential-containers-system
spec:
  template:
    metadata:
      labels:
        azure.workload.identity/use: "true"
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-api-adaptor
  namespace: confidential-containers-system
  annotations:
    azure.workload.identity/client-id: "$USER_ASSIGNED_CLIENT_ID"
EOF

Select peer-pods machine type

To use an AMD SEV-SNP confidential VM:

export AZURE_INSTANCE_SIZE="Standard_DC2as_v5"
export DISABLECVM="false"

Find more AMD SEV-SNP machine types on this Azure documentation.

To use an Intel TDX confidential VM:

export AZURE_INSTANCE_SIZE="Standard_DC2es_v5"
export DISABLECVM="false"

Find more Intel TDX machine types on this Azure documentation.

To use a regular (non-confidential) VM:

export AZURE_INSTANCE_SIZE="Standard_D2as_v5"
export DISABLECVM="true"
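Whichever size you pick, you can check that it is available in your region before deploying (an optional sanity check):

# Verify the chosen instance size exists in the selected region
az vm list-sizes --location "${AZURE_REGION}" \
  --query "[?name=='${AZURE_INSTANCE_SIZE}']" \
  -o table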

Populate the kustomization.yaml file

Run the following command to update the kustomization.yaml file:

cat <<EOF > install/overlays/azure/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../yamls
images:
- name: cloud-api-adaptor
  newName: "${CAA_IMAGE}"
  newTag: "${CAA_TAG}"
generatorOptions:
  disableNameSuffixHash: true
configMapGenerator:
- name: peer-pods-cm
  namespace: confidential-containers-system
  literals:
  - CLOUD_PROVIDER="azure"
  - AZURE_SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID}"
  - AZURE_REGION="${AZURE_REGION}"
  - AZURE_INSTANCE_SIZE="${AZURE_INSTANCE_SIZE}"
  - AZURE_RESOURCE_GROUP="${AZURE_RESOURCE_GROUP}"
  - AZURE_SUBNET_ID="${AZURE_SUBNET_ID}"
  - AZURE_IMAGE_ID="${AZURE_IMAGE_ID}"
  - DISABLECVM="${DISABLECVM}"
secretGenerator:
- name: peer-pods-secret
  namespace: confidential-containers-system
- name: ssh-key-secret
  namespace: confidential-containers-system
  files:
  - id_rsa.pub
patchesStrategicMerge:
- workload-identity.yaml
EOF

Copy the SSH public key so it is accessible to the kustomization.yaml file:

cp $SSH_KEY install/overlays/azure/id_rsa.pub

Deploy CAA on the Kubernetes cluster

Deploy the CoCo operator:

export COCO_OPERATOR_VERSION="0.10.0"
kubectl apply -k "github.com/confidential-containers/operator/config/release?ref=v${COCO_OPERATOR_VERSION}"
kubectl apply -k "github.com/confidential-containers/operator/config/samples/ccruntime/peer-pods?ref=v${COCO_OPERATOR_VERSION}"

Run the following command to deploy CAA:

kubectl apply -k "install/overlays/azure"

Generic CAA deployment instructions are also described here.
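After applying the overlay, you can watch the CAA daemonset come up before moving on (names as used earlier in this guide):

# Wait for the CAA pods to reach the Running state
kubectl -n confidential-containers-system get ds cloud-api-adaptor-daemonset
kubectl -n confidential-containers-system get pods --watch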

Run sample application

Ensure runtimeclass is present

Verify that the runtimeclass is created after deploying CAA:

kubectl get runtimeclass

Once you find a runtimeclass named kata-remote, you can be sure that the deployment was successful. A successful deployment will look like this:

$ kubectl get runtimeclass
NAME          HANDLER       AGE
kata-remote   kata-remote   7m18s

Deploy workload

Create an nginx deployment:

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      runtimeClassName: kata-remote
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        imagePullPolicy: Always
EOF

Ensure that the pod is up and running:

kubectl get pods -n default

You can verify that the peer pod VM was created by running the following command:

az vm list \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --output table

Here you should see the VM associated with the pod nginx.

Note: If you run into problems then check the troubleshooting guide here.

Cleanup

If you wish to clean up the whole setup, you can delete the resource group by running the following command:

az group delete \
  --name "${AZURE_RESOURCE_GROUP}" \
  --yes --no-wait

4 - IBM Cloud

Documentation for peerpods on IBM Cloud

This guide describes how to set up a demo environment on IBM Cloud for peer pod VMs using the operator deployment approach.

The high level flow involved is:

  • Build and upload a peer pod custom image to IBM Cloud
  • Create a ‘self-managed’ Kubernetes cluster on IBM Cloud provided infrastructure
  • Deploy Confidential-containers operator
  • Deploy and validate that the nginx demo works
  • Clean-up and deprovision

Pre-reqs

When building the peer pod VM image, it is simplest to use the container-based approach, which only requires docker or podman, but the image can also be built locally.

Note: the peer pod VM image build and upload is de-coupled from the cluster creation and operator deployment stage, so it can be done on a different machine.

There are a number of packages that you will need to install in order to create the Kubernetes cluster and enable it for peer pods:

  • Terraform, Ansible, the IBM Cloud CLI and kubectl are all required for the cluster creation and explained in the cluster pre-reqs guide.

In addition to this you will need to install jq

Tip: If you are using Ubuntu Linux, you can run the following command:

$ sudo apt-get install jq

You will also require go and make to be installed.

Peer Pod VM Image

A peer pod VM image needs to be created as a VPC custom image in IBM Cloud in order to create the peer pod instances from it. The peer pod VM image contains components like the agent protocol forwarder and Kata agent that communicate with the Kubernetes worker node and carry out the received instructions inside the peer pod.

Building a Peer Pod VM Image via Docker [Optional]

You may skip this step and use one of the release images (skip to Import Release VM Image), but for the latest features you may wish to build your own.

You can do this by following the process document. If building within a container ensure that --build-arg CLOUD_PROVIDER=ibmcloud is set and --build-arg ARCH=s390x for an s390x architecture image.

Note: At the time of writing, issue #649 means that when creating an s390x image you also need to add two extra build args: --build-arg UBUNTU_IMAGE_URL="" and --build-arg UBUNTU_IMAGE_CHECKSUM=""

Note: If building the peer pod qcow2 image within a VM, it may take a lot of resources, e.g. 8 vCPU and 32GB RAM, due to nested virtualization performance limitations. When running without enough resources, the failure seen is similar to:

Build 'qemu.ubuntu' errored after 5 minutes 57 seconds: Timeout waiting for SSH.

Upload the built peer pod VM image to IBM Cloud

You can follow the process documented from the cloud-api-adaptor/ibmcloud/image to extract and upload the peer pod image you’ve just built to IBM Cloud as a custom image, noting to replace the quay.io/confidential-containers/podvm-ibmcloud-ubuntu-s390x reference with the local container image that you built above e.g. localhost/podvm_ibmcloud_s390x:latest.

This script will end with the line: Image <image-name> with id <image-id> is available. The image-id field will be needed in the kustomize step later.

Import Release VM Image

Alternatively, to use a pre-built peer pod VM image, you can follow the process documented with the release images found at quay.io/confidential-containers/podvm-generic-ubuntu-<ARCH>. Running this command requires docker or podman, as per the tools pre-reqs.

 ./import.sh quay.io/confidential-containers/podvm-generic-ubuntu-s390x eu-gb --bucket example-bucket --instance example-cos-instance

This script will end with the line: Image <image-name> with id <image-id> is available. The image-id field will be needed in later steps.

Create a ‘self-managed’ Kubernetes cluster on IBM Cloud provided infrastructure

If you don’t have a Kubernetes cluster for testing, you can follow the open-source instructions to set up a basic cluster where the Kubernetes nodes run on IBM Cloud provided infrastructure.

Deploy PeerPod Webhook

Deploy cert-manager

  • Deploy cert-manager with:

    kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.9.1/cert-manager.yaml
    
  • Wait for the pods to all be in running state with:

    kubectl get pods -n cert-manager --watch
    

Deploy the peer-pods webhook

  • From within the root directory of the cloud-api-adaptor repository, deploy the webhook with:

    kubectl apply -f ./webhook/hack/webhook-deploy.yaml
    
  • Wait for the pods to all be in running state with:

    kubectl get pods -n peer-pods-webhook-system --watch
    
  • Advertise the extended resource kata.peerpods.io/vm by running the following commands:

    pushd webhook/hack/extended-resources
    ./setup.sh
    popd
    

Deploy the Confidential-containers operator

The caa-provisioner-cli simplifies deploying the operator and the cloud-api-adaptor resources on to any cluster. See the test/tools/README.md for full instructions. To create an ibmcloud-ready version, follow these steps:

# Starting from the cloud-api-adaptor root directory
pushd test/tools
make BUILTIN_CLOUD_PROVIDERS="ibmcloud" all
popd

This will create caa-provisioner-cli in the test/tools directory. To use the tool with an existing self-managed cluster you will need to set up a .properties file containing the relevant ibmcloud information to enable your cluster to create and use peer pods. Use the following commands to generate the .properties file. If you are not using a self-managed cluster, please update the terraform commands with the appropriate values manually.

export IBMCLOUD_API_KEY= # your ibmcloud apikey
export PODVM_IMAGE_ID= # the image id of the peerpod vm uploaded in the previous step
export PODVM_INSTANCE_PROFILE= # instance profile name that runs the peerpod (bx2-2x8 or bz2-2x8 for example)
export CAA_IMAGE_TAG= # cloud-api-adaptor image tag that supports this arch, see quay.io/confidential-containers/cloud-api-adaptor
pushd ibmcloud/cluster

cat <<EOF > ../../selfmanaged_cluster.properties
IBMCLOUD_PROVIDER="ibmcloud"
APIKEY="$IBMCLOUD_API_KEY"
PODVM_IMAGE_ID="$PODVM_IMAGE_ID"
INSTANCE_PROFILE_NAME="$PODVM_INSTANCE_PROFILE"
CAA_IMAGE_TAG="$CAA_IMAGE_TAG"
SSH_KEY_ID="$(terraform output --raw ssh_key_id)"
EOF

popd

This will create a selfmanaged_cluster.properties file in the cloud-api-adaptor root directory.

The final step is to run the caa-provisioner-cli to install the operator.

export CLOUD_PROVIDER=ibmcloud
# must be run from the directory containing the properties file
export TEST_PROVISION_FILE="$(pwd)/selfmanaged_cluster.properties"
# prevent the test from removing the cloud-api-adaptor resources from the cluster
export TEST_TEARDOWN="no"
pushd test/tools
./caa-provisioner-cli -action=install
popd

End-2-End Test Framework

To validate that a cluster has been set up properly, there is a suite of tests that validate peer pods across different providers. The implementation of these tests can be found in test/e2e/common_suite_test.go.

Assuming CLOUD_PROVIDER and TEST_PROVISION_FILE are still set in your current terminal, you can execute these tests from the cloud-api-adaptor root directory by running the following commands:

export KUBECONFIG=$(pwd)/ibmcloud/cluster/config
make test-e2e

Uninstall and clean up

There are two options for cleaning up the environment once testing has finished, or if you want to re-install from a clean state:

  • If using a self-managed cluster you can delete the whole cluster following the Delete the cluster documentation and then start again.
  • If you instead just want to leave the cluster, but uninstall the Confidential Containers and peer pods feature, you can use the caa-provisioner-cli to remove the resources.
export CLOUD_PROVIDER=ibmcloud
# must be run from the directory containing the properties file
export TEST_PROVISION_FILE="$(pwd)/selfmanaged_cluster.properties"
pushd test/tools
./caa-provisioner-cli -action=uninstall
popd

5 - Libvirt

Documentation for peerpods on Libvirt

Introduction

This document contains instructions for using, developing and testing the cloud-api-adaptor with libvirt.

Creating an end-to-end environment for testing and development

In this section you will learn how to set up an environment on your local machine to run peer pods with the libvirt cloud API adaptor. Bear in mind that many different tools can be used to set up the environment; here we simply suggest tools that seem to be used by most peer pods developers.

Requirements

You must have a Linux/KVM system with libvirt installed and the following tools:

It is assumed that you have ‘default’ network and storage pools created in the libvirtd system instance (qemu:///system). However, if you have a different pool name, the scripts should be able to handle it properly.

Create the Kubernetes cluster

Use the kcli_cluster.sh script to create a simple two-VM cluster (one control plane and one worker) with the kcli tool, as shown:

./kcli_cluster.sh create

With kcli_cluster.sh you can configure, among other parameters, the libvirt network and storage pools in which the cluster VMs will be created. Run ./kcli_cluster.sh -h to see the help for further information.

If everything goes well you will be able to see the cluster running after setting your Kubernetes config with:

export KUBECONFIG=$HOME/.kcli/clusters/peer-pods/auth/kubeconfig

For example, as shown below:

$ kcli list kube
+-----------+---------+-----------+-----------------------------------------+
|  Cluster  |   Type  |    Plan   |                  Vms                    |
+-----------+---------+-----------+-----------------------------------------+
| peer-pods | generic | peer-pods | peer-pods-ctlplane-0,peer-pods-worker-0 |
+-----------+---------+-----------+-----------------------------------------+

$ kubectl get nodes
NAME                 STATUS   ROLES                  AGE     VERSION
peer-pods-ctlplane-0 Ready    control-plane,master   6m8s    v1.25.3
peer-pods-worker-0   Ready    worker                 2m47s   v1.25.3

Prepare the Pod VM volume

In order to build the Pod VM without installing the build tools, you can use the Dockerfiles hosted in the ../podvm directory to run the entire process inside a container. Refer to podvm/README.md for further details. Alternatively, you can consume pre-built podvm images as explained here.
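For example, to consume a pre-built podvm image you could reuse the download helper shown in the AWS section of this guide (a hedged sketch; adjust the image tag and architecture to your needs):

# Download a pre-built generic Ubuntu podvm image as a qcow2 file
curl -LO https://raw.githubusercontent.com/confidential-containers/cloud-api-adaptor/staging/podvm/hack/download-image.sh
bash download-image.sh quay.io/confidential-containers/podvm-generic-ubuntu-amd64:latest . -o podvm.qcow2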

Next you will need to create a volume on libvirt’s system storage and upload the image content. That volume is used by the cloud-api-adaptor program to instantiate a new Pod VM. Run the following commands:

export IMAGE=<full-path-to-qcow2>

virsh -c qemu:///system vol-create-as --pool default --name podvm-base.qcow2 --capacity 20G --allocation 2G --prealloc-metadata --format qcow2
virsh -c qemu:///system vol-upload --vol podvm-base.qcow2 $IMAGE --pool default --sparse

You should see that the podvm-base.qcow2 volume was properly created:

$ virsh -c qemu:///system vol-info --pool default podvm-base.qcow2
Name:           podvm-base.qcow2
Type:           file
Capacity:       6.00 GiB
Allocation:     631.52 MiB

Install and configure Confidential Containers and cloud-api-adaptor in the cluster

The easiest way to install the cloud-api-adaptor along with Confidential Containers in the cluster is through the Kubernetes operator available in the install directory of this repository.

Start by creating a public/private RSA key pair that will be used by the cloud-api-adaptor program, running on the cluster workers, to connect to your local libvirtd instance without password authentication. Assuming you are in the libvirt directory, do:

cd ../install/overlays/libvirt
ssh-keygen -f ./id_rsa -N ""
cat id_rsa.pub >> ~/.ssh/authorized_keys

Note: ensure that ~/.ssh/authorized_keys has the right permissions (read/write for the user only) otherwise the authentication can silently fail. You can run chmod 600 ~/.ssh/authorized_keys to set the right permissions.

You will need to figure out the IP address of your local host (e.g. 192.168.122.1). Then try a remote connection with libvirt to check that the key setup is fine, for example:

$ virsh -c "qemu+ssh://$USER@192.168.122.1/system?keyfile=$(pwd)/id_rsa" nodeinfo
CPU model:           x86_64
CPU(s):              12
CPU frequency:       1084 MHz
CPU socket(s):       1
Core(s) per socket:  6
Thread(s) per core:  2
NUMA cell(s):        1
Memory size:         32600636 KiB

Now you should finally install the Kubernetes operator in the cluster with the help of the install_operator.sh script. Ensure that you have your IP address exported in the environment, as shown below, then run the install script:

cd ../../../libvirt/
export LIBVIRT_IP="192.168.122.1"
export SSH_KEY_FILE="id_rsa"
./install_operator.sh

If everything goes well you will be able to see the operator’s controller manager and cloud-api-adaptor Pods running:

$ kubectl get pods -n confidential-containers-system
NAME                                              READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-5df7584679-5dbmr   2/2     Running   0          3m58s
cloud-api-adaptor-daemonset-vgj2s                 1/1     Running   0          3m57s

$ kubectl logs pod/cloud-api-adaptor-daemonset-vgj2s -n confidential-containers-system
+ exec cloud-api-adaptor libvirt -uri 'qemu+ssh://wmoschet@192.168.122.1/system?no_verify=1' -data-dir /opt/data-dir -pods-dir /run/peerpod/pods -network-name default -pool-name default -socket /run/peerpod/hypervisor.sock
2022/11/09 18:18:00 [helper/hypervisor] hypervisor config {/run/peerpod/hypervisor.sock  registry.k8s.io/pause:3.7 /run/peerpod/pods libvirt}
2022/11/09 18:18:00 [helper/hypervisor] cloud config {qemu+ssh://wmoschet@192.168.122.1/system?no_verify=1 default default /opt/data-dir}
2022/11/09 18:18:00 [helper/hypervisor] service config &{qemu+ssh://wmoschet@192.168.122.1/system?no_verify=1 default default /opt/data-dir}

You will also notice that Kubernetes runtimeClass resources were created on the cluster, as for example:

$ kubectl get runtimeclass
NAME          HANDLER       AGE
kata-remote   kata-remote   7m18s

Create a sample peer-pods pod

At this point everything should be in place to create a sample Pod. Let’s first list the running VMs so that we can later check that the Pod VM is really running. Notice below that only the cluster node VMs are up:

$ virsh -c qemu:///system list
 Id   Name                   State
------------------------------------
 3    peer-pods-ctlplane-0   running
 4    peer-pods-worker-0     running

Create the sample_busybox.yaml file with the following content:

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: busybox
  name: busybox
spec:
  containers:
  - image: quay.io/prometheus/busybox
    name: busybox
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  runtimeClassName: kata-remote

And create the Pod:

$ kubectl apply -f sample_busybox.yaml
pod/busybox created

$ kubectl wait --for=condition=Ready pod/busybox
pod/busybox condition met

Check that the Pod VM is up and running. See on the following listing that podvm-busybox-88a70031 was created:

$ virsh -c qemu:///system list
 Id   Name                     State
----------------------------------------
 5    peer-pods-ctlplane-0     running
 6    peer-pods-worker-0       running
 7    podvm-busybox-88a70031   running

You should also check that the container is running fine. For example, compare the kernel versions, which should differ as shown below:

$ uname -r
5.17.12-100.fc34.x86_64

$ kubectl exec pod/busybox -- uname -r
5.4.0-131-generic

The peer-pods pod can be deleted like any regular pod. In the listing below the pod was removed, and you can see that the Pod VM no longer exists in libvirt:

$ kubectl delete -f sample_busybox.yaml
pod "busybox" deleted

$ virsh -c qemu:///system list
 Id   Name                 State
------------------------------------
 5    peer-pods-ctlplane-0 running
 6    peer-pods-worker-0   running

Delete Confidential Containers and cloud-api-adaptor from the cluster

You might want to reinstall the Confidential Containers and cloud-api-adaptor into your cluster. There are two options:

  1. Delete the Kubernetes cluster entirely and start over. In this case you should just run ./kcli_cluster.sh delete to wipe out the cluster created with kcli.
  2. Uninstall the operator resources then install them again with the install_operator.sh script

Let’s show how to delete the operator resources. In the listing below you can see the pods currently running in the confidential-containers-system namespace:

$ kubectl get pods -n confidential-containers-system
NAME                                             READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-fbb5dcf9d-h42nn   2/2     Running   0          20h
cc-operator-daemon-install-fkkzz                 1/1     Running   0          20h
cloud-api-adaptor-daemonset-libvirt-lxj7v        1/1     Running   0          20h

In order to remove the *-daemon-install-* and *-cloud-api-adaptor-daemonset-* pods, run the following command from the root directory:

CLOUD_PROVIDER=libvirt make delete

It can take a few minutes for those pods to be deleted; afterwards you will notice that only the controller-manager is still up. The listing below shows how to delete that pod and its associated resources as well:

$ kubectl get pods -n confidential-containers-system
NAME                                             READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-fbb5dcf9d-h42nn   2/2     Running   0          20h

$ kubectl delete -f install/yamls/deploy.yaml
namespace "confidential-containers-system" deleted
serviceaccount "cc-operator-controller-manager" deleted
role.rbac.authorization.k8s.io "cc-operator-leader-election-role" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-manager-role" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-metrics-reader" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-proxy-role" deleted
rolebinding.rbac.authorization.k8s.io "cc-operator-leader-election-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "cc-operator-manager-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "cc-operator-proxy-rolebinding" deleted
configmap "cc-operator-manager-config" deleted
service "cc-operator-controller-manager-metrics-service" deleted
deployment.apps "cc-operator-controller-manager" deleted
customresourcedefinition.apiextensions.k8s.io "ccruntimes.confidentialcontainers.org" deleted

$ kubectl get pods -n confidential-containers-system
No resources found in confidential-containers-system namespace.