This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Documentation

The documentation for each sub-project of the Confidential Containers project is available in the respective tabs, checkout the links below.

These slides provide a high-level overview of all the subprojects in the Confidential Containers.

1 - Overview

High level overview of Confidential Containers

1.1 - Trust Model

Documentation on the Confidential Containers trust model

1.1.1 - Trust Model for Confidential Containers

Definitions, references, and scope

A clear definition of trust for the confidential containers project is needed to ensure the components and architecture deliver the security principles expected for cloud native confidential computing. It provides the solid foundations and unifying security principles against which we can assess architecture and implementation ideas and discussions.

Trust Model Definition

The Trust Modeling for Security Architecture Development article defines Trust Modeling as:

A trust model identifies the specific mechanisms that are necessary to respond to a specific threat profile.

A trust model must include implicit or explicit validation of an entity’s identity or the characteristics necessary for a particular event or transaction to occur.

Trust Boundary

The trust model also helps determine the location and direction of the trust boundaries where a trust boundary describes a location where program data or execution changes its level of “trust”, or where two principals with different capabilities exchange data or commands. Specific to Confidential Containers is the trust boundary that corresponds to the boundary of the Trusted Execution Environment (TEE). The TEE side of the trust boundary will be hardened to prevent the violation of the trust boundary.

Required Documentation

In order to describe and understand particular threats we need to establish trust boundaries and trust models relating to the key aspects, components and actors involved in Cloud Native Confidential Computing. We explore trust using different orthogonal ways of considering cloud native approaches when they use an underlying TEE technology and identifying where there may be considerations to preserve the value of using a TEE.

Trust Model Considerations

Further documentation will highlight specific threat vectors in detail, considering risk, impact, mitigation etc as the project progresses. The Security Assurance section, Page 31, of Cloud Native Computing Foundation (CNCF) Cloud Native Security Paper will guide this more detailed threat vector effort.

Confidential Containers brings confidential computing into a cloud native context and should therefore refer to and build on trust and security models already defined.

For example:

The commonality between confidential containers project and confidential computing is to reduce the ability for unauthorised access to data and code inside TEEs sufficiently such that this path is not an economically or logically viable attack during execution (5.1 Goal within the CCC publication A Technical Analysis of Confidential Computing).

This means our trust and threat modelling should

  • Focus on which aspects of code and data have integrity and/or confidentiality protections.
  • Focus on enhancing existing Cloud Native models in the context of exploiting TEEs.
  • Consider existing Cloud Native technologies and the role they can play for confidential containers.
  • Consider additional technologies to fulfil a role in Cloud Native exploitation of TEEs.

Out of Scope

The following items are considered out-of-scope for the trust/threat modelling within confidential containers :

  • Vulnerabilities within the application/code which has been requested to run inside a TEE.
  • Availability part of the Confidentiality/Integrity/Availability in CIA Triad.
  • Software TEEs. At this time we are focused on hardware TEEs.
  • Certain security guarantees are defined by the underlying TEE and these may vary between TEEs and generations of the same TEE. We take these guarantees at face value and will only highlight them where they become relevant to the trust model or threats we consider.

Summary

In practice, those deploying workloads into TEE environments may have varying levels of trust in the personas who have privileges regarding orchestration or hosting the workload. This trust may be based on factors such as the relationship with the owner or operator of the host, the software and hardware it comprises, and the likelihood of physical, software, or social engineering compromise.

Confidential containers will have specific focus on preventing potential security threats at the TEE boundary and ensure privileges which are accepted within cloud native environment as crossing the boundary are mitigated from threats within the boundary. We cannot allow the security of the TEE to be under control of operations outside the TEE or from areas not trusted by the TEE.

1.1.2 - Personas

Description and discussion of relevant agents/actors in the context of Confidential Containers

Personas

Otherwise referred to as actors or agents, these are individuals or groups capable of carrying out a particular threat. In identifying personas we consider :

  • The Runtime Environment, Figure 5, Page 19 of CNCF Cloud Native Security Paper. This highlights three layers, Cloud/Environment, Workload Orchestration, Application.
  • The Kubernetes Overview of Cloud Native Security identifies the 4C’s of Cloud Native Security as Cloud, Cluster, Container and Code. However data is core to confidential containers rather than code.
  • The Confidential Computing Consortium paper A Technical Analysis of Confidential Computing defines Confidential Computing as the protection of data in use by performing computations in a hardware-based Trusted Execution Environment (TEE).

In considering personas we recognise that a trust boundary exists between each persona and we explore how the least privilege principle (as described on Page 40 of Cloud Native Security Paper ) should apply to any actions which cross these boundaries.

Confidential containers can provide enhancements to ensure that the expected code/containers are the only code that can operate over the data. However any vulnerabilities within this code are not mitigated by using confidential containers, the Cloud Native Security Whitepaper details Lifecycle aspects that relate to the security of the code being placed into containers such as Static/Dynamic Analysis, Security Tests, Code Review etc which must still be followed.

Personas model

Any of these personas could attempt to perform malicious actions:

Infrastructure Operator

This persona has privileges within the Cloud Infrastructure which includes the hardware and firmware used to provide compute, network and storage to the Cloud Native solution. They are responsible for availability of infrastructure used by the cloud native environment.

  • Have access to the physical hardware.
  • Have access to the processes involved in the deployment of compute/storage/memory used by any orchestration components and by the workload.
  • Have control over TEE hardware availability/type.
  • Responsibility for applying firmware updates to infrastructure including the TEE Technology.

Examples: Cloud Service Provider (CSP), Site Reliability Engineer, etc. (SRE)

Orchestration Operator

This persona has privileges within the Orchestration/Cluster. They are responsible for deploying a solution into a particular cloud native environment and managing the orchestration environment. For managed cluster this would also include the administration of the cluster control plane.

  • Control availability of service.
  • Control webhooks and deployment of workloads.
  • Control availability of cluster resources (data/networking/storage) and cluster services (Logging/Monitoring/Load Balancing) for the workloads.
  • Control the deployment of runtime artifacts required by the TEE during initialisation, before hosting the confidential workload.

Example: A Kubernetes administrator responsible for deploying pods to a cluster and maintaining the cluster.

Workload Provider

This persona designs and creates the orchestration objects comprising the solution (e.g. Kubernetes Pod spec, etc). These objects reference containers published by Container Image Providers. In some cases the Workload and Container Image Providers may be the same entity. The solution defined is intended to provide the Application or Workload which in turn provides value to the Data Owners (customers and clients). The Workload Provider and Data Owner could be part of same company/organisation but following the least privilege principle the Workload Provider should not be able to view or manipulate end user data without informed consent.

  • Need to prove to customer aspects of compliance.
  • Defines what the solution requires in order to run and maintain compliance (resources, utility containers/services, storage).
  • Chooses the method of verifying the container images (from those supported by Container Image Provider) and obtains artifacts needed to allow verification to be completed within the TEE.
  • Provide the boot images initially required by the TEE during initialisation or designates a trusted party to do so.
  • Provide the attestation verification service, or designate a trusted party to provide the attestation verification service.

Examples: 3rd party software vendor, CSP

Container Image Provider

This persona is responsible for the part of the supply chain that builds container images and provides them for use by the solution. Since a workload can be composed of multiple containers, there may be multiple container image providers, some will be closely connected to the workload provider (business logic containers), others more independent to the workload provider (side car containers). The container image provider is expected to use a mechanism to allow provenance of container image to be established when a workload pulls in these images at deployment time. This can take the form of signing or encrypting the container images.

  • Builds container images.
  • Owner of business logic containers. These may contain proprietary algorithms, models or secrets.
  • Signs or encrypts the images.
  • Defines the methods available for verifying the container images to be used.
  • Publishes the signature verification key (public key).
  • Provides any decryption keys through a secure channel (generally to a key management system controlled by a Key Broker Service).
  • Provides other required verification artifacts (secure channel may be considered).
  • Protects the keys used to sign or encrypt the container images.

It is recognised that hybrid options exist surrounding workload provider and container provider. For example the workload provider may choose to protect their supply chain by signing/encrypting their own container images after following the build patterns already established by the container image provider.

Example : Istio

Data Owner

Owner of data used, and manipulated by the application.

  • Concerned with visibility and integrity of their data.
  • Concerned with compliance and protection of their data.
  • Uses and shares data with solutions.
  • Wishes to ensure no visibility or manipulation of data is possible by Orchestration Operator or Cloud Operator personas.

Discussion

Data Owner vs. All Other Personas

The key trust relationship here is between the Data Owner and the other personas. The Data Owner trusts the code in the form of container images chosen by the Workload Provider to operate across their data, however they do not trust the Orchestration Operator or Cloud Operator with their data and wish to ensure data confidentiality.

Workload Provider vs. Container Image Provider

The Workload Provider is free to choose Container Image Providers that will provide not only the images they need but also support the verification method they require. A key aspect to this relationship is the Workload Provider applying Supply Chain Security practices (as described on Page 42 of Cloud Native Security Paper ) when considering Container Image Providers. So the Container Image Provider must support the Workload Providers ability to provide assurance to the Data Owner regarding integrity of the code.

With Confidential Containers we match the TEE boundary to the most restrictive boundary which is between the Workload Provider and the Orchestration Operator.

Orchestration Operator vs. Infrastructure Operator

Outside the TEE we distinguish between the Orchestration Operator and the Infrastructure Operator due to nature of how they can impact the TEE and the concerns of Workload Provider and Data Owner. Direct threats exist from the Orchestration Operator as some orchestration actions must be permitted to cross the TEE boundary otherwise orchestration cannot occur. A key goal is to deprivilege orchestration and restrict the Orchestration Operators privileges across the boundary. However indirect threats exist from the Infrastructure Operator who would not be permitted to exercise orchestration APIs but could exploit the low-level hardware or firmware capabilities to access or impact the contents of a TEE.

Workload Provider vs. Data Owner

Inside the TEE we need to be able to distinguish between the Workload Provider and Data Owner in recognition that the same workload (or parts such as logging/monitoring etc) can be re-used with different data sets to provide a service/solution. In the case of bespoke workload, the workload provider and Data Owner may be the same persona. As mentioned the Data Owner must have a level of trust in the Workload Provider to use and expose the data provided in an expected and approved manner. Page 10 of A Technical Analysis of Confidential Computing , suggests some approaches to establish trust between them.

The TEE boundary allows the introduction of secrets but just as we recognised the TEE does not provide protection from code vulnerabilities, we also recognised that a TEE cannot enforce complete distrust between Workload Provider and Data Owner. This means secrets within the TEE are at risk from both Workload Provider and Data Owner and trying to keep secrets which protect the workload (container encryption etc), separated from secrets to protect the data (data encryption) is not provided simply by using a TEE.

Recognising that Data Owner and Workload Provider are separate personas helps us to identify threats to both data and workload independently and to recognise that any solution must consider the potential independent nature of these personas. Two examples of trust between Data Owner and Workload Provider are :

  • AI Models which are proprietary and protected requires the workload to be encrypted and not shared with the Data Owner. In this case secrets private to the Workload Provider are needed to access the workload, secrets requiring access to the data are provided by the Data Owner while trusting the workload/model without having direct access to how the workload functions. The Data Owner completely trusts the workload and Workload Provider, whereas the Workload Provider does not trust the Data Owner with the full details of their workload.
  • Data Owner verifies and approves certain versions of a workload, the workload provides the data owner with secrets in order to fulfil this. These secrets are available in the TEE for use by the Data Owner to verify the workload, once achieved the data owner will then provide secrets and data into the TEE for use by the workload in full confidence of what the workload will do with their data. The Data Owner will independently verify versions of the workload and will only trust specific versions of the workload with the data whereas the Workload Provider completely trusts the Data Owner.

Data Owner vs. End User

We do not draw a distinction between data owner and end user though we do recognise that in some cases these may not be identical. For example data may be provided to a workload to allow analysis and results to be made available to an end user. The original data is never provided directly to the end user but the derived data is, in this case the data owner can be different from the end user and may wish to protect this data from the end user.

1.1.3 - Threat Vectors/Profiles

TODO An (inexhaustive) list of threat vectors against Confidential Containers

Links to further documentation detailing specific threats and how Confidential Containers uses the trust concepts described in the context of the Trust Model will be added here.

Current TODO List for Threats to be covered is tracked under Issues #2

2 - Demos

Demo you can reproduce to try Confidential Containers

2.1 - CCv0 Operator Demo

The demo shows CCv0 Kata runtime installation and configuration using the coco-operator.

Demo Video

Watch the demo in youtube

Demo Environment setup

Kubernetes cluster

Setup a two nodes Kubernetes cluster using Ubuntu 20.04. You can use your preferred Kubernetes setup tool. Here is an example using kcli.

Download ubuntu 20.04 image if not present by running the following command:

kcli download image ubuntu2004

Install the cluster:

kcli create kube generic -P image=ubuntu2004 -P workers=1 testk8s

Replace containerd

Replace containerd on the worker node by building a new containerd from the following branch: https://github.com/confidential-containers/containerd/tree/CC-main (build instructions)

Modify systemd configuration to use the new binary and restart containerd and kubelet.

Verify if the cluster nodes are all up

kubectl get nodes

Sample output from the demo environment:

$ kubectl get nodes
NAME                  STATUS   ROLES                  AGE   VERSION
cck8s-demo-master-0   Ready    control-plane,master   25d   v1.22.3
cck8s-demo-worker-0   Ready    worker                 25d   v1.22.3

Make sure at least one Kubernetes node in the cluster has the label node.kubernetes.io/worker=.

kubectl label node $NODENAME node.kubernetes.io/worker=

Operator Setup

RELEASE_VERSION="main"
kubectl apply -k "github.com/confidential-containers/operator/config/release?ref=${RELEASE_VERSION}"

The operator installs everything under the confidential-containers-system namespace:

Verify if the operator is running by running the following command:

kubectl get pods -n confidential-containers-system

Sample output from the demo environment:

$ kubectl get pods -n confidential-containers-system
NAME                                              READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-7f8d6dd988-t9zdm   2/2     Running   0          13s

Confidential Containers Runtime setup

Creating a CCruntime object sets up the container runtime. The default payload image sets up the CCv0 demo image of the kata-containers runtime.

RELEASE_VERSION="main"
kubectl apply -k "github.com/confidential-containers/operator/config/samples/ccruntime/default?ref=${RELEASE_VERSION}"

This will create an install daemonset targeting the worker nodes for installation. You can verify the status under the confidential-containers-system namespace.

$ kubectl get pods -n confidential-containers-system
NAME                                              READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-7f8d6dd988-t9zdm   2/2     Running   0          82s
cc-operator-daemon-install-p9ntc                  1/1     Running   0          45s

On successful installation, you’ll see the following runtimeClasses being setup:

$ kubectl get runtimeclasses.node.k8s.io
NAME        HANDLER     AGE
kata        kata        92s
kata-cc     kata-cc     92s
kata-qemu   kata-qemu   92s

kata-cc runtimeclass uses CCv0 specific configurations.

Now you can deploy the PODs targeting the specific runtimeclasses. The SSH demo can be used as a compatible workload.

2.2 - SSH Demo

SSH Demo to showcase encrypted memory provided by the TEE

To demonstrate confidential containers capabilities, we run a pod with SSH public key authentication.

Compared to the execution of and login to a shell on a pod, an SSH connection is cryptographically secured and requires a private key. It cannot be established by unauthorized parties, such as someone who controls the node. The container image contains the SSH host key that can be used for impersonating the host we will connect to. Because this container image is encrypted, and the key to decrypting this image is only provided in measurable ways (e.g. attestation or encrypted initrd), and because the pod/guest memory is protected, even someone who controls the node cannot steal this key.

Using a pre-provided container image

If you would rather build the image with your own keys, skip to Building the container image. The operator can be used to set up a compatible runtime.

A demo image is provided at docker.io/katadocker/ccv0-ssh. It is encrypted with Attestation Agent’s offline file system key broker and aa-offline_fs_kbc-keys.json as its key file. The private key for establishing an SSH connection to this container is given in ccv0-ssh. To use it with SSH, its permissions should be adjusted: chmod 600 ccv0-ssh. The host key fingerprint is SHA256:wK7uOpqpYQczcgV00fGCh+X97sJL3f6G1Ku4rvlwtR0.

All keys shown here are for demonstration purposes. To achieve actually confidential containers, use a hardware trusted execution environment and do not reuse these keys.

Continue at Connecting to the guest.

Building the container image

The image built should be encrypted. To receive a decryption key at run time, the Confidential Containers project utilizes the Attestation Agent.

Generating SSH keys

ssh-keygen -t ed25519 -f ccv0-ssh -P "" -C ""

generates an SSH key ccv0-ssh and the correspondent public key ccv0-ssh.pub.

Building the image

The provided Dockerfile expects ccv0-sh.pub to exist. Using Docker, you can build with

docker build --progress=plain -t ccv0-ssh .

Alternatively, Buildah can be used (buildah build or formerly buildah bud). The SSH host key fingerprint is displayed during the build.

Connecting to the guest

A Kubernetes YAML file specifying the kata runtime is included. If you use a self-built image, you should replace the image specification with the image you built. The default tag points to an amd64 image, an s390x tag is also available. With common CNI setups, on the same host, with the service running, you can connect via SSH with

ssh -i ccv0-ssh root@$(kubectl get service ccv0-ssh -o jsonpath="{.spec.clusterIP}")

You will be prompted about whether the host key fingerprint is correct. This fingerprint should match the one specified above/displayed in the Docker build.

crictl-compatible sandbox and container configurations are also included, which forward the pod SSH port (22) to 2222 on the host (use the -p flag in SSH).

3 - Use cases

Depiction of typical Confidential Container use cases and how they can be addressed using the project’s tools.

3.1 - Encrypted images

Procedures to encrypt and consume OCI images in a TEE

Context

A user might want to bundle sensitive data on an OCI (Docker) image. The image layers should only be accessible within a Trusted Execution Environment (TEE).

The project provides the means to encrypt an image with a symmetric key that is released to the TEE only after successful verification and appraisal in a Remote Attestation process. CoCo infrastructure components within the TEE will transparently decrypt the image layers as they are pulled from a registry without exposing the decrypted data outside the boundaries of the TEE.

Instructions

The following steps require a functional CoCo installation on a Kubernetes cluster. A Key Broker Client (KBC) has to be configured for TEEs to be able to retrieve confidential secrets. We assume cc_kbc as a KBC for the CoCo project’s Key Broker Service (KBS) in the following instructions, but image encryption should work with other Key Broker implementations in a similar fashion.

Please ensure you have a recent version of Skopeo (v1.14.2+) installed locally.

Encrypt an image

We extend public image with secret data.

docker build -t unencrypted - <<EOF
FROM nginx:stable
RUN echo "something confidential" > /secret
EOF

The encryption key needs to be a 32 byte sequence and provided to the encryption step as base64-encoded string.

KEY_FILE="image_key"
head -c 32 /dev/urandom | openssl enc > "$KEY_FILE"
KEY_B64="$(base64 < $KEY_FILE)"

The key id is a generic resource descriptor used by the key broker to look up secrets in its storage. For KBS this is composed of three segments: $repository_name/$resource_type/$resource_tag

KEY_PATH="/default/image_key/nginx"
KEY_ID="kbs://${KEY_PATH}"

The image encryption logic is bundled and invoked in a container:

git clone https://github.com/confidential-containers/guest-components.git
cd guest-components
docker build -t coco-keyprovider -f ./attestation-agent/docker/Dockerfile.keyprovider .

To access the image from within the container, Skopeo can be used to buffer the image in a directory, which is then made available to the container. Similarly, the resulting encrypted image will be put into an output directory.

mkdir -p oci/{input,output}
skopeo copy docker-daemon:unencrypted:latest dir:./oci/input
docker run -v "${PWD}/oci:/oci" coco-keyprovider /encrypt.sh -k "$KEY_B64" -i "$KEY_ID" -s dir:/oci/input -d dir:/oci/output

We can inspect layer annotations to confirm the expected encryption was applied:

skopeo inspect dir:./oci/output | jq '.LayersData[0].Annotations["org.opencontainers.image.enc.keys.provider.attestation-agent"] | @base64d | fromjson'
{
  "kid": "kbs:///default/image_key/nginx",
  "wrapped_data": "lGaLf2Ge5bwYXHO2g2riJRXyr5a2zrhiXLQnOzZ1LKEQ4ePyE8bWi1GswfBNFkZdd2Abvbvn17XzpOoQETmYPqde0oaYAqVTMcnzTlgdYYzpWZcb3X0ymf9bS0gmMkqO3dPH+Jf4axXuic+ITOKy7MfSVGTLzay6jH/PnSc5TJ2WuUJY2rRtNaTY65kKF2K9YP6mtYBqcHqvPDlFiVNNeTAGv2w1zwaMlgZaSHV+Z1y+xxbOV5e98bxuo6861rMchjCiE7FY37PHD3a5ISogq90=",
  "iv": "Z8bGQL7r6qxSpd4L",
  "wrap_type": "A256GCM"
}

Finally the resulting encrypted image can be provisioned to an image registry.

ENCRYPTED_IMAGE=some-private.registry.io/coco/nginx:encrypted
skopeo copy dir:./oci/output "docker://${ENCRYPTED_IMAGE}"

Provision image key

Prior to launching a Pod the image key needs to be provisioned to the Key Broker’s repository. For a KBS deployment on Kubernetes using the local filesystem as repository storage it would work like this:

kubectl exec deploy/kbs -- mkdir -p "/opt/confidential-containers/kbs/repository/$(dirname "$KEY_PATH")"
cat "$KEY_FILE" | kubectl exec -i deploy/kbs -- tee "/opt/confidential-containers/kbs/repository/${KEY_PATH}" > /dev/null

Launch a Pod

In this example we default to the Cloud API Adaptor runtime, adjust this depending on the CoCo installation.

kubectl get runtimeclass -o jsonpath='{.items[].handler}'
kata-remote
CC_RUNTIMECLASS=kata-remote

We create a simple deployment using our encrypted image. As the image is being pulled and the CoCo components in the TEE encounter the layer annotations that we saw above, the image key will be retrieved from the Key Broker using the annotated Key ID and the layers will be decrypted transparently and the container should come up.

cat <<EOF> nginx-encrypted.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx-encrypted
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        io.containerd.cri.runtime-handler: ${CC_RUNTIMECLASS}
    spec:
      runtimeClassName: ${CC_RUNTIMECLASS}
      containers:
      - image: ${ENCRYPTED_IMAGE}
        name: nginx
EOF
kubectl apply -f nginx-encrypted.yaml

We can confirm that the image key has been retrieved from KBS.

kubectl logs -f deploy/kbs | grep "$KEY_PATH"
[2024-01-23T10:24:52Z INFO  actix_web::middleware::logger] 10.244.0.1 "GET /kbs/v0/resource/default/image_key/nginx HTTP/1.1" 200 530 "-" "attestation-agent-kbs-client/0.1.0" 0.000670

4 - Kata Containers

Documentation for Kata Containers pertaining to Confidential Containers

4.1 - Kata Confidential Containers

Documentation about Kata Containers in the context of Confidential Computing

CI | Publish Kata Containers payload Kata Containers Nightly CI

Welcome to Kata Containers!

This repository is the home of the Kata Containers code for the 2.0 and newer releases.

If you want to learn about Kata Containers, visit the main Kata Containers website.

Introduction

Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workload isolation and security advantages of VMs.

License

The code is licensed under the Apache 2.0 license. See the license file for further details.

Platform support

Kata Containers currently runs on 64-bit systems supporting the following technologies:

Architecture Virtualization technology
x86_64, amd64 Intel VT-x, AMD SVM
aarch64 ("arm64") ARM Hyp
ppc64le IBM Power
s390x IBM Z & LinuxONE SIE

Hardware requirements

The Kata Containers runtime provides a command to determine if your host system is capable of running and creating a Kata Container:

kata-runtime check

Notes:

  • This command runs a number of checks including connecting to the network to determine if a newer release of Kata Containers is available on GitHub. If you do not wish this to check to run, add the --no-network-checks option.

  • By default, only a brief success / failure message is printed. If more details are needed, the --verbose flag can be used to display the list of all the checks performed.

  • If the command is run as the root user additional checks are run (including checking if another incompatible hypervisor is running). When running as root, network checks are automatically disabled.

Getting started

See the installation documentation.

Documentation

See the official documentation including:

Configuration

Kata Containers uses a single configuration file which contains a number of sections for various parts of the Kata Containers system including the runtime, the agent and the hypervisor.

Hypervisors

See the hypervisors document and the Hypervisor specific configuration details.

Community

To learn more about the project, its community and governance, see the community repository. This is the first place to go if you wish to contribute to the project.

Getting help

See the community section for ways to contact us.

Raising issues

Please raise an issue in this repository.

Note: If you are reporting a security issue, please follow the vulnerability reporting process

Developers

See the developer guide.

Components

Main components

The table below lists the core parts of the project:

Component Type Description
runtime core Main component run by a container manager and providing a containerd shimv2 runtime implementation.
runtime-rs core The Rust version runtime.
agent core Management process running inside the virtual machine / POD that sets up the container environment.
dragonball core An optional built-in VMM brings out-of-the-box Kata Containers experience with optimizations on container workloads
documentation documentation Documentation common to all components (such as design and install documentation).
tests tests Excludes unit tests which live with the main code.

Additional components

The table below lists the remaining parts of the project:

Component Type Description
packaging infrastructure Scripts and metadata for producing packaged binaries
(components, hypervisors, kernel and rootfs).
kernel kernel Linux kernel used by the hypervisor to boot the guest image. Patches are stored here.
osbuilder infrastructure Tool to create “mini O/S” rootfs and initrd images and kernel for the hypervisor.
kata-debug infrastructure Utility tool to gather Kata Containers debug information from Kubernetes clusters.
agent-ctl utility Tool that provides low-level access for testing the agent.
kata-ctl utility Tool that provides advanced commands and debug facilities.
log-parser-rs utility Tool that aid in analyzing logs from the kata runtime.
trace-forwarder utility Agent tracing helper.
runk utility Standard OCI container runtime based on the agent.
ci CI Continuous Integration configuration files and scripts.
katacontainers.io Source for the katacontainers.io site.

Packaging and releases

Kata Containers is now available natively for most distributions.

Metrics tests

See the metrics documentation.

Glossary of Terms

See the glossary of terms related to Kata Containers.

5 - Trustee

Trusted Components for Attestation and Secret Management

Trustee contains tools and components for attesting confidential guests and providing secrets to them. Collectively, these components are known as Trustee. Trustee typically operates on behalf of the “workload provider” / “data owner” and interacts remotely with guest components.

Trustee is developed for the Confidential Containers project, but can be used with a wide variety of applications and hardware platforms.

Architecture

Trustee is flexible and can be deployed in several different configurations. This figure shows one common way to deploy these components in conjunction with certain guest components.

flowchart LR
    AA -- attests guest ----> KBS
    CDH -- requests resource --> KBS
    subgraph Guest
        CDH <.-> AA
    end
    subgraph Trustee
        KBS -- validates evidence --> AS
        RVPS -- provides reference values--> AS
    end
    client-tool -- configures --> KBS

Legend

  • CDH: Confidential Data Hub
  • AA: Attestation Agent
  • KBS: Key Broker Service
  • RVPS: Reference Value Provider Service
  • AS: Attestation Service

5.1 - Key Broker Service (KBS)

This service facilitates remote attestation and secret delivery

The Confidential Containers Key Broker Service (KBS) facilitates remote attestation and secret delivery. The KBS is an implementation of a Relying Party from the Remote ATtestation ProcedureS (RATS) Architecture. The KBS itself does not validate attestation evidence. Instead, it relies on the Attestation-Service (AS) to verify TEE evidence.

In conjunction with the AS or Intel Trust Authority (ITA), the KBS supports the following TEEs:

  • AMD SEV-SNP
  • AMD SEV-SNP on Azure with vTPM
  • Intel TDX
  • Intel TDX on Azure with vTPM
  • Intel SGX
  • ARM CCA
  • Hygon CSV

Deployment Configurations

The KBS can be deployed in several different environments, including as part of a docker compose cluster, part of a Kubernetes cluster or without any containerization. Additionally, the KBS can interact with other attestation components in different ways. This section focuses on the different ways the KBS can interact with other components.

Background Check Mode

Background check mode is a more straightforward and simple way to configure the Key Broker Service (KBS) and Attestation-Service (AS). The term “Background Check” is from the RATS architecture. In background check mode, the KBS directly forwards the hardware evidence of a confidential guest to the AS to validate. Once the validation passes, the KBS will release secrets to the confidential guest.

flowchart LR
    AA -- attests guest --> KBS
    CDH -- requests resource ----> KBS
    subgraph Guest
        AA <.-> CDH
    end
    subgraph Trustee
        KBS -- validates evidence --> AS
    end

In background check mode, the KBS is the relying party and the AS is the verifier.

Passport Mode

Passport mode decouples the provisioning of resources from the validation of evidence. In background check mode these tasks are already handled by separate components, but in passport mode they are decoupled even more. The term “Passport” is from the RATS architecture.

In passport mode, there are two Key Broker Services (KBSes), one that uses a KBS to verify the evidence and a second to provision resources.

flowchart LR
    CDH -- requests resource ----> KBS2
    AA -- attests guest --> KBS1
    subgraph Guest
        CDH <.-> AA
    end
    subgraph Trustee 1
        KBS1 -- validates evidence --> AS
    end
    subgraph Trustee 2
        KBS2
    end

In the RATS passport model the client typically connects directly to the verifier to get an attestation token (a passport). In CoCo we do not support direct connections to the AS, so KBS1 serves as an intermediary. Together KBS1 and the AS represent the verifier. KBS2 is the relying party.

Passport mode is good for use cases when resource provisioning and attestation are handled by separate entities.

5.1.1 - KBS backed by AKV

This documentation describes how to mount secrets stored in Azure Key Vault into a KBS deployment

Premise

AKS

We assume an AKS cluster configured with Workload Identity and Key Vault Secrets Provider. The former provides a KBS pod with the privileges to access an Azure Key Vault (AKV) instance. The latter is an implementation of Kubernetes’ Secret Store CSI Driver, mapping secrets from external key vaults into pods. The guides below provide instructions on how to configure a cluster accordingly:

AKV

There should be an AKV instance that has been configured with role based access control (RBAC), containing two secrets named coco_one coco_two for the purpose of the example. Find out how to configure your instance for RBAC in the guide below.

Provide access to Key Vault keys, certificates, and secrets with an Azure role-based access control

Note: You might have to toggle between Access Policy and RBAC modes to create your secrets on the CLI or via the Portal if your user doesn’t have the necessary role assignments.

CoCo

While the steps describe a deployment of KBS, the configuration of a Confidential Containers environment is out of scope for this document. CoCo should be configured with KBS as a Key Broker Client (KBC) and the resulting KBS deployment should be available and configured for confidential pods.

Azure environment

Configure your Resource group, Subscription and AKS cluster name. Adjust accordingly:

export SUBSCRIPTION_ID="$(az account show --query id -o tsv)"
export RESOURCE_GROUP=my-group
export KEYVAULT_NAME=kbs-secrets
export CLUSTER_NAME=coco

Instructions

Create Identity

Create a User managed identity for KBS:

az identity create --name kbs -g "$RESOURCE_GROUP"
export KBS_CLIENT_ID="$(az identity show -g "$RESOURCE_GROUP" --name kbs --query clientId -o tsv)"
export KBS_TENANT_ID=$(az aks show --name "$CLUSTER_NAME" --resource-group "$RESOURCE_GROUP" --query identity.tenantId -o tsv)

Assign a role to access secrets:

export KEYVAULT_SCOPE=$(az keyvault show --name "$KEYVAULT_NAME" --query id -o tsv)
az role assignment create --role "Key Vault Administrator" --assignee "$KBS_CLIENT_ID" --scope "$KEYVAULT_SCOPE"

Namespace

By default KBS is deployed into a coco-tenant Namespace:

export NAMESPACE=coco-tenant
kubectl create namespace $NAMESPACE

KBS identity and Service Account

Workload Identity provides individual pods with IAM privileges to access Azure infrastructure resources. An azure identity is bridged to a Service Account using OIDC and Federated Credentials. Those are scoped to a Namespace, we assume we deploy the Service Account and KBS into the default Namespace, adjust accordingly if necessary.

export AKS_OIDC_ISSUER="$(az aks show --resource-group "$RESOURCE_GROUP" --name "$CLUSTER_NAME" --query "oidcIssuerProfile.issuerUrl" -o tsv)"
az identity federated-credential create \
 --name kbsfederatedidentity \
 --identity-name kbs \
 --resource-group "$RESOURCE_GROUP" \
 --issuer "$AKS_OIDC_ISSUER" \
 --subject "system:serviceaccount:${NAMESPACE}:kbs"

Create a Service Account object and annotate it with the identity’s client id.

cat <<EOF> service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    azure.workload.identity/client-id: ${KBS_CLIENT_ID}
  name: kbs
  namespace: ${NAMESPACE}
EOF
kubectl apply -f service-account.yaml

Secret Provider Class

A Secret Provider Class specifies a set of secrets that should be made available to k8s workloads.

cat <<EOF> secret-provider-class.yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: ${KEYVAULT_NAME}
  namespace: ${NAMESPACE}
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    clientID: ${KBS_CLIENT_ID}
    keyvaultName: ${KEYVAULT_NAME}
    objects: |
      array:
      - |
        objectName: coco_one
        objectType: secret
      - |
        objectName: coco_two
        objectType: secret
    tenantId: ${KBS_TENANT_ID}
EOF
kubectl create -f secret-provider-class.yaml

Deploy KBS

The default KBS deployment needs to be extended with label annotations and CSI volume. The secrets are mounted into the storage hierarchy default/akv.

git clone https://github.com/confidential-containers/kbs.git
cd kbs
git checkout v0.8.2
cd kbs/config/kubernetes
mkdir akv
cat <<EOF> akv/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: coco-tenant

resources:
- ../base

patches:
- path: patch.yaml
  target:
    group: apps
    kind: Deployment
    name: kbs
    version: v1
EOF
cat <<EOF> akv/patch.yaml
- op: add
  path: /spec/template/metadata/labels/azure.workload.identity~1use
  value: "true"
- op: add
  path: /spec/template/spec/serviceAccountName
  value: kbs
- op: add
  path: /spec/template/spec/containers/0/volumeMounts/-
  value:
    name: secrets
    mountPath: /opt/confidential-containers/kbs/repository/default/akv
    readOnly: true
- op: add
  path: /spec/template/spec/volumes/-
  value:
    name: secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: ${KEYVAULT_NAME}
EOF
kubectl apply -k akv/

Test

The KBS pod should be running, the pod events should give indication of possible errors. From a confidential pod the AKV secrets should be retrievable via Confidential Data Hub:

$ kubectl exec -it deploy/nginx-coco -- curl http://127.0.0.1:8006/cdh/resource/default/akv/coco_one
a secret

5.2 - Attestation Service (AS)

This service verifies TEE evidence

The Attestation Service (AS or CoCo-AS) verifies hardware evidence. The AS was designed to be used with the Key Broker Service (KBS) for Confidential Containers, but it can be used in a wide variety of situations. The AS can be used anytime TEE evidence needs to be validated.

Today, the AS can validate evidence from the following TEEs:

  • Intel TDX
  • Intel SGX
  • AMD SEV-SNP
  • ARM CCA
  • Hygon CSV
  • Intel TDX with vTPM on Azure
  • AMD SEV-SNP with vTPM on Azure

Overview

                                      ┌───────────────────────┐
┌───────────────────────┐ Evidence    │  Attestation Service  │
│                       ├────────────►│                       │
│ Verification Demander │             │ ┌────────┐ ┌──────────┴───────┐
│    (Such as KBS)      │             │ │ Policy │ │ Reference Value  │◄───Reference Value
│                       │◄────────────┤ │ Engine │ │ Provider Service │
└───────────────────────┘ Attestation │ └────────┘ └──────────┬───────┘
                        Results Token │                       │
                                      │ ┌───────────────────┐ │
                                      │ │  Verifier Drivers │ │
                                      │ └───────────────────┘ │
                                      │                       │
                                      └───────────────────────┘

The Attestation Service (AS) has a simple API. It receives attestation evidence and returns an attestation token containing the results of a two-step verification process. The AS can be consumed directly as a Rust crate (library) or built as a standalone service, exposing a REST or gRPC API. In Confidential Containers, the client of the AS is the Key Broker Service (KBS), but the evidence originates from the Attestation Agent inside the guest.

The AS has a two-step verification process.

  1. Verify the format and provenance of evidence itself (e.g. check the signature of the evidence).
  2. Appraise the claims presented in the evidence (e.g. check that measurements match reference values).

The first step is accomplished by one of the platform-specific Verifier Drivers. The second step is driven by the Policy Engine with help from the Reference Value Provider Service (RVPS).

5.3 - Reference Value Provider Service (RVPS)

This service manages reference values used to verify TEE evidence

Reference Value Provider Service (RVPS) is a component to receive software supply chain provenances / metadata, verify them and extract the reference values. All the reference values are stored inside RVPS. When Attestation Service (AS) queries specific software claims, RVPS will response with related reference values.

Architecture

RVPS contains the following components:

  • Pre-Processor: Pre-Processor contains a set of *wares (like middleware). These wares can process the input Message and then deliver it to the Extractors.

  • Extractors: Extractors has sub-modules to process different type of provenance. Each sub-module will consume the input Message, and then generate an output Reference Value.

  • Store: Store is a trait object, which can provide key-value like API. All verified reference values will be stored in the Store. When requested by Attestation Service (AS), related reference value will be provided.

Message Flow

The message flow of RVPS is like the following figure:

Message

A protocol helps to distribute provenance of binaries. It will be received and processed by RVPS, then RVPS will generate Reference Value if working correctly.

{
    "version": <VERSION-NUMBER-STRING>,
    "type": <TYPE-OF-THE-PROVENANCE-STRING>,
    "provenance": #provenance,
}
  • "version": This field is the version of this message, making extensibility possible.
  • "type": This field specifies the concrete type of the provenance the message carries.
  • "provenance": This field is the main content passed to RVPS. This field contains the payload to be decrypted by RVPS. The meaning of the provenance depends on the type and concrete Extractor which process this.

Trust Digests

It is the reference values really requested and used by Attestation Service to compare with the gathered evidence generated from HW TEE. They are usually digests. To avoid ambiguity, they are named trust digests rather than reference values.

5.4 - KBS Client Tool

Simple tool to test or configure Key Broker Service and Attestation Service

This is a simple client for the Key Broker Client (KBS) that facilitates testing of the KBS and other basic attestation flows.

You can run this tool inside of a TEE to make a request with real attestation evidence. You can also provide pre-existing evidence or use the sample attester as a fallback.

The client tool can also be used to provision the KBS/AS with resources and policies.

6 - Guest Components

Confidential Container Tools and Components

FOSSA Status

This repository includes tools and components for confidential container images.

  • Attestation Agent: An agent for facilitating attestation protocols. Can be built as a library to run in a process-based enclave or built as a process that runs inside a confidential vm.

  • image-rs: Rust implementation of the container image management library.

  • ocicrypt-rs: Rust implementation of the OCI image encryption library.

  • api-server-rest](api-server-rest): CoCo Restful API server.

License

FOSSA Status

6.1 - API Server Rest

Documentation for CoCo Restful API Server

CoCo guest components use lightweight ttRPC for internal communication to reduce the memory footprint and dependency. But many internal services also needed by containers like get_resource, get_evidence and get_token, we export these services with restful API, now CoCo containers can easy access these API with http client. Here are some examples, for detail info, please refer rest API

$ ./api-server-rest --features=all
Starting API server on 127.0.0.1:8006
API Server listening on http://127.0.0.1:8006
$ curl http://127.0.0.1:8006/cdh/resource/default/key/1
12345678901234567890123456xxxx
$ curl http://127.0.0.1:8006/aa/evidence\?runtime_data\=xxxx
{"svn":"1","report_data":"eHh4eA=="}
$ curl http://127.0.0.1:8006/aa/token\?token_type\=kbs
{"token":"eyJhbGciOiJFi...","tee_keypair":"-----BEGIN... "}

6.2 - Attestation Agent

Documentation for Attestation Agent

Attestation Agent (AA for short) is a service function set for attestation procedure in Confidential Containers. It provides kinds of service APIs that need to make requests to the Relying Party (Key Broker Service) in Confidential Containers, and performs an attestation and establishes connection between the Key Broker Client (KBC) and corresponding KBS, so as to obtain the trusted services or resources of KBS.

Current consumers of AA include:

Components

The main body of AA is a rust library crate, which contains KBC modules used to communicate with various KBS. In addition, this project also provides a gRPC service application, which allows callers to call the services provided by AA through gRPC.

Library crate

Import AA in Cargo.toml of your project with specific KBC(s):

attestation-agent = { git = "https://github.com/confidential-containers/guest-components", features = ["sample_kbc"] }

Note: When the version is stable, we will release AA on https://crate.io.

gRPC Application

Here are the steps of building and running gRPC application of AA:

Build

Build and install with default KBC modules:

git clone https://github.com/confidential-containers/guest-components
cd guest-components/attestation-agent
make && make install

or explicitly specify the KBS modules it contains. Taking sample_kbc as example:

make KBC=sample_kbc

Musl

To build and install with musl, just run:

make LIBC=musl && make install

Openssl support

To build and install with openssl support (which is helpful in specific machines like s390x)

make OPENSSL=1 && make install

Run

For help information, just run:

attestation-agent --help

Start AA and specify the endpoint of AA’s gRPC service:

attestation-agent --keyprovider_sock 127.0.0.1:50000 --getresource_sock 127.0.0.1:50001

Or start AA with default keyprovider address (127.0.0.1:50000) and default getresource address (127.0.0.1:50001):

attestation-agent

If you want to see the runtime log:

RUST_LOG=attestation_agent attestation-agent --keyprovider_sock 127.0.0.1:50000 --getresource_sock 127.0.0.1:50001

ttRPC

To build and install ttRPC Attestation Agent, just run:

make ttrpc=true && make install

ttRPC AA now only support Unix Socket, for example:

attestation-agent --keyprovider_sock unix:///tmp/keyprovider.sock --getresource_sock unix:///tmp/getresource.sock

Supported KBC modules

AA provides a flexible KBC module mechanism to support different KBS protocols required to make the communication between KBC and KBS. If the KBC modules currently supported by AA cannot meet your use requirement (e.g, need to use a new KBS protocol), you can write a new KBC module complying with the KBC development GUIDE. Welcome to contribute new KBC module to this project!

List of supported KBC modules:

KBC module name README KBS protocol Maintainer
sample_kbc Null Null Attestation Agent Authors
offline_fs_kbc Offline file system KBC Null IBM
eaa_kbc EAA KBC EAA protocol Alibaba Cloud
offline_sev_kbc Offline SEV KBC Null IBM
online_sev_kbc Online SEV KBC simple-kbs IBM
cc_kbc CC KBC CoCo KBS protocol CoCo Community

CC KBC

CC KBC supports different kinds of hardware TEE attesters, now

Attester name Info
tdx-attester Intel TDX
sgx-attester Intel SGX DCAP
snp-attester AMD SEV-SNP
az-snp-vtpm-attester Azure SEV-SNP CVM

To build cc kbc with all available attesters and install, use

make KBC=cc_kbc && make install

Tools

6.3 - Confidential Data Hub

Documentation for Confidential Data Hub

Confidential Data Hub is a service running inside guest to provide resource related APIs.

Build

Build and install with default KBC modules:

git clone https://github.com/confidential-containers/guest-components
cd guest-components/confidential-data-hub
make

or explicitly specify the confidential resource provider and KMS plugin, please refer to Supported Features

make RESOURCE_PROVIDER=kbs PROVIDER=aliyun

Supported Features

Confidential resource providers (flag RESOURCE_PROVIDER)

Feature name Note
kbs For TDX/SNP/Azure-SNP-vTPM based on KBS Attestation Protocol
sev For SEV based on efi secret pre-attestation

Note: offline-fs is built-in, we do not need to manually enable. If no RESOURCE_PROVIDER is given, all features will be enabled.

KMS plugins (flag PROVIDER)

Feature name Note
aliyun Use aliyun KMS suites to unseal secrets, etc.

Note: If no PROVIDER is given, all features will be enabled.

6.4 - image-rs

Documentation for image-rs

Container Images Rust Crate

Documentation

Design document

CCv1 Image Security Design document

6.5 - ocicrypt-rs

Documentation for ocicrypt-rs

ocicrypt-rs

This repo contains the rust version of the containers/ocicrypt library.

7 - Cloud API Adaptor

Documentation for Cloud API Adaptor a.k.a Peer Pods

Introduction

This repository contains the implementation of Kata remote hypervisor. Kata remote hypervisor enables creation of Kata VMs on any environment without requiring baremetal servers or nested virtualization support.

Goals

  • Accept requests from Kata shim to create/delete Kata VM instances without requiring nested virtualization support.
  • Manage VM instances in the cloud to run pods using cloud (virtualization) provider APIs
  • Forward communication between kata shim on a worker node VM and kata agent on a pod VM
  • Provide a mechanism to establish a network tunnel between a worker and pod VMs to Kubernetes pod network

Architecture

The background and description of the components involved in ‘peer pods’ can be found in the architecture documentation.

Components

Installation

Please refer to the instructions mentioned in the following doc.

Supported Providers

  • aws
  • azure
  • ibmcloud
  • libvirt
  • vsphere

Adding a new provider

Please refer to the instructions mentioned in the following doc.

Contribution

This project uses the Apache 2.0 license. Contribution to this project requires the DCO 1.1 process to be followed.

Collaborations

7.1 - Cloud API Adaptor Troubleshooting

Generic troubleshooting steps after installation of Cloud API Adaptor

Application pod created but it stays in ContainerCreating state

Let’s start by looking at the pods deployed in the confidential-containers-system namespace:

$ kubectl get pods -n confidential-containers-system -o wide
NAME                                              READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
cc-operator-controller-manager-76755f9c96-pjj92   2/2     Running   0          1h    10.244.0.14   aks-nodepool1-22620003-vmss000000   <none>           <none>
cc-operator-daemon-install-79c2b                  1/1     Running   0          1h    10.244.0.16   aks-nodepool1-22620003-vmss000000   <none>           <none>
cc-operator-pre-install-daemon-gsggj              1/1     Running   0          1h    10.244.0.15   aks-nodepool1-22620003-vmss000000   <none>           <none>
cloud-api-adaptor-daemonset-2pjbb                 1/1     Running   0          1h    10.224.0.4    aks-nodepool1-22620003-vmss000000   <none>           <none>

It is possible that the cloud-api-adaptor-daemonset is not deployed correctly. To see what is wrong with it run the following command and look at the events to get insights:

$ kubectl -n confidential-containers-system describe ds cloud-api-adaptor-daemonset
Name:           cloud-api-adaptor-daemonset
Selector:       app=cloud-api-adaptor
Node-Selector:  node-role.kubernetes.io/worker=
...
Events:
  Type    Reason            Age    From                  Message
  ----    ------            ----   ----                  -------
  Normal  SuccessfulCreate  8m13s  daemonset-controller  Created pod: cloud-api-adaptor-daemonset-2pjbb

But if the cloud-api-adaptor-daemonset is up and in the Running state, like shown above then look at the pods’ logs, for more insights:

kubectl -n confidential-containers-system logs daemonset/cloud-api-adaptor-daemonset

Note: This is a single node cluster. So there is only one pod named cloud-api-adaptor-daemonset-*. But if you are running on a multi-node cluster then look for the node your workload fails to come up and only see the logs of corresponding CAA pod.

If the problem hints that something is wrong with the configuration then look at the configmaps or secrets needed to run CAA:

$ kubectl -n confidential-containers-system get cm
NAME                         DATA   AGE
cc-operator-manager-config   1      1h
kube-root-ca.crt             1      1h
peer-pods-cm                 7      1h
$ kubectl -n confidential-containers-system get secret
NAME               TYPE     DATA   AGE
peer-pods-secret   Opaque   0      1h
ssh-key-secret     Opaque   1      1h

7.2 - AWS

Documentation for peerpods on AWS

Prerequisites

  • Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for AWS cli access

  • Install packer by following the instructions in the following link

  • Install packer’s Amazon plugin packer plugins install github.com/hashicorp/amazon

Build Pod VM Image

Option-1: Modifying existing marketplace image

  • Set environment variables
export AWS_REGION="us-east-1" # mandatory
export PODVM_DISTRO=rhel # mandatory
export INSTANCE_TYPE=t3.small # optional, default is t3.small
export IMAGE_NAME=peer-pod-ami # optional
export VPC_ID=vpc-01234567890abcdef # optional, otherwise, it creates and uses the default vpc in the specific region
export SUBNET_ID=subnet-01234567890abcdef # must be set if VPC_ID is set

If you want to change the volume size of the generated AMI, then set the VOLUME_SIZE environment variable. For example if you want to set the volume size to 40 GiB, then do the following:

export VOLUME_SIZE=40
  • Create a custom AWS VM image based on Ubuntu 22.04 having kata-agent and other dependencies

NOTE: For setting up authenticated registry support read this documentation.

cd image
make image

You can also build the custom AMI by running the packer build inside a container:

docker build -t aws \
--secret id=AWS_ACCESS_KEY_ID \
--secret id=AWS_SECRET_ACCESS_KEY \
--build-arg AWS_REGION=${AWS_REGION} \
-f Dockerfile .

If you want to use an existing VPC_ID with public SUBNET_ID then use the following command:

docker build -t aws \
--secret id=AWS_ACCESS_KEY_ID \
--secret id=AWS_SECRET_ACCESS_KEY \
--build-arg AWS_REGION=${AWS_REGION} \
--build-arg VPC_ID=${VPC_ID} \
--build-arg SUBNET_ID=${SUBNET_ID}\
-f Dockerfile .

If you want to build a CentOS based custom AMI then you’ll need to first accept the terms by visiting this link

Once done, run the following command:

docker build -t aws \
--secret id=AWS_ACCESS_KEY_ID \
--secret id=AWS_SECRET_ACCESS_KEY \
--build-arg AWS_REGION=${AWS_REGION} \
--build-arg BINARIES_IMG=quay.io/confidential-containers/podvm-binaries-centos-amd64 \
--build-arg PODVM_DISTRO=centos \
-f Dockerfile .
  • Note down your newly created AMI_ID

Once the image creation is complete, you can use the following CLI command as well to get the AMI_ID. The command assumes that you are using the default AMI name: peer-pod-ami

aws ec2 describe-images --query "Images[*].[ImageId]" --filters "Name=name,Values=peer-pod-ami" --region ${AWS_REGION} --output text

Option-2: Using precreated QCOW2 image

  • Download QCOW2 image
mkdir -p qcow2-img && cd qcow2-img

curl -LO https://raw.githubusercontent.com/confidential-containers/cloud-api-adaptor/staging/podvm/hack/download-image.sh

bash download-image.sh quay.io/confidential-containers/podvm-generic-ubuntu-amd64:latest . -o podvm.qcow2
  • Convert QCOW2 image to RAW format You’ll need the qemu-img tool for conversion.
qemu-img convert -O raw podvm.qcow2 podvm.raw
  • Upload RAW image to S3 and create AMI You can use the following helper script to upload the podvm.raw image to S3 and create an AMI Note that AWS cli should be configured to use the helper script.
curl -L0 https://raw.githubusercontent.com/confidential-containers/cloud-api-adaptor/staging/aws/raw-to-ami.sh

bash raw-to-ami.sh podvm.raw <Some S3 Bucket Name> <AWS Region>

On success, the command will generate the AMI_ID, which needs to be used to set the value of PODVM_AMI_ID in peer-pods-cm configmap.

Running cloud-api-adaptor

7.3 - Azure

Cloud API Adaptor (CAA) on Azure

This documentation will walk you through setting up CAA (a.k.a. Peer Pods) on Azure Kubernetes Service (AKS). It explains how to deploy:

  • A single worker node Kubernetes cluster using Azure Kubernetes Service (AKS)
  • CAA on that Kubernetes cluster
  • An Nginx pod backed by CAA pod VM

Pre-requisites

  • Install Azure CLI by following instructions here.
  • Install kubectl by following the instructions here.
  • Ensure that the tools curl, git and jq are installed.

Azure Preparation

Azure login

There are a bunch of steps that require you to be logged into your Azure account:

az login

Retrieve your subscription ID:

export AZURE_SUBSCRIPTION_ID=$(az account show --query id --output tsv)

Set the region:

export AZURE_REGION="eastus"

Note: We selected the eastus region as it not only offers AMD SEV-SNP machines but also has prebuilt pod VM images readily available.

export AZURE_REGION="eastus2"

Note: We selected the eastus2 region as it not only offers Intel TDX machines but also has prebuilt pod VM images readily available.

export AZURE_REGION="eastus"

Note: We have chose region eastus because it has prebuilt pod VM images readily available.

Resource group

Note: Skip this step if you already have a resource group you want to use. Please, export the resource group name in the AZURE_RESOURCE_GROUP environment variable.

Create an Azure resource group by running the following command:

export AZURE_RESOURCE_GROUP="caa-rg-$(date '+%Y%m%b%d%H%M%S')"

az group create \
  --name "${AZURE_RESOURCE_GROUP}" \
  --location "${AZURE_REGION}"

Deploy Kubernetes using AKS

Make changes to the following environment variable as you see fit:

export CLUSTER_NAME="caa-$(date '+%Y%m%b%d%H%M%S')"
export AKS_WORKER_USER_NAME="azuser"
export AKS_RG="${AZURE_RESOURCE_GROUP}-aks"
export SSH_KEY=~/.ssh/id_rsa.pub

Note: Optionally, deploy the worker nodes into an existing Azure Virtual Network (VNet) and subnet by adding the following flag: --vnet-subnet-id $SUBNET_ID.

Deploy AKS with single worker node to the same resource group you created earlier:

az aks create \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --node-resource-group "${AKS_RG}" \
  --name "${CLUSTER_NAME}" \
  --enable-oidc-issuer \
  --enable-workload-identity \
  --location "${AZURE_REGION}" \
  --node-count 1 \
  --node-vm-size Standard_F4s_v2 \
  --nodepool-labels node.kubernetes.io/worker= \
  --ssh-key-value "${SSH_KEY}" \
  --admin-username "${AKS_WORKER_USER_NAME}" \
  --os-sku Ubuntu

Download kubeconfig locally to access the cluster using kubectl:

az aks get-credentials \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --name "${CLUSTER_NAME}"

User assigned identity and federated credentials

CAA needs privileges to talk to Azure API. This privilege is granted to CAA by associating a workload identity to the CAA service account. This workload identity (a.k.a. user assigned identity) is given permissions to create VMs, fetch images and join networks in the next step.

Note: If you use an existing AKS cluster it might need to be configured to support workload identity and OpenID Connect (OIDC), please refer to the instructions in this guide.

Start by creating an identity for CAA:

export AZURE_WORKLOAD_IDENTITY_NAME="caa-identity"

az identity create \
  --name "${AZURE_WORKLOAD_IDENTITY_NAME}" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --location "${AZURE_REGION}"
export USER_ASSIGNED_CLIENT_ID="$(az identity show \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --name "${AZURE_WORKLOAD_IDENTITY_NAME}" \
  --query 'clientId' \
  -otsv)"

AKS resource group permissions

For CAA to be able to manage VMs assign the identity VM and Network contributor roles, privileges to spawn VMs in $AZURE_RESOURCE_GROUP and attach to a VNet in $AKS_RG.

az role assignment create \
  --role "Virtual Machine Contributor" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
az role assignment create \
  --role "Reader" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AZURE_RESOURCE_GROUP}"
az role assignment create \
  --role "Network Contributor" \
  --assignee "$USER_ASSIGNED_CLIENT_ID" \
  --scope "/subscriptions/${AZURE_SUBSCRIPTION_ID}/resourcegroups/${AKS_RG}"

Create the federated credential for the CAA ServiceAccount using the OIDC endpoint from the AKS cluster:

export AKS_OIDC_ISSUER="$(az aks show \
  --name "$CLUSTER_NAME" \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --query "oidcIssuerProfile.issuerUrl" \
  -otsv)"
az identity federated-credential create \
  --name caa-fedcred \
  --identity-name caa-identity \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --issuer "${AKS_OIDC_ISSUER}" \
  --subject system:serviceaccount:confidential-containers-system:cloud-api-adaptor \
  --audience api://AzureADTokenExchange

AKS subnet ID

Fetch the AKS created VNet name:

export AZURE_VNET_NAME=$(az network vnet list \
  --resource-group "${AKS_RG}" \
  --query "[0].name" \
  --output tsv)

Export the subnet ID to be used for CAA DaemonSet deployment:

export AZURE_SUBNET_ID=$(az network vnet subnet list \
  --resource-group "${AKS_RG}" \
  --vnet-name "${AZURE_VNET_NAME}" \
  --query "[0].id" \
  --output tsv)

Deploy CAA

Note: If you are using Calico Container Network Interface (CNI) on the Kubernetes cluster, then, configure Virtual Extensible LAN (VXLAN) encapsulation for all inter workload traffic.

Download the CAA deployment artifacts

export CAA_VERSION="0.8.2"
curl -LO "https://github.com/confidential-containers/cloud-api-adaptor/archive/refs/tags/v${CAA_VERSION}.tar.gz"
tar -xvzf "v${CAA_VERSION}.tar.gz"
cd "cloud-api-adaptor-${CAA_VERSION}/src/cloud-api-adaptor"
export CAA_BRANCH="main"
curl -LO "https://github.com/confidential-containers/cloud-api-adaptor/archive/refs/heads/${CAA_BRANCH}.tar.gz"
tar -xvzf "${CAA_BRANCH}.tar.gz"
cd "cloud-api-adaptor-${CAA_BRANCH}/src/cloud-api-adaptor"

This assumes that you already have the code ready to use. On your terminal change directory to the Cloud API Adaptor’s code base.

CAA pod VM image

Export this environment variable to use for the peer pod VM:

export AZURE_IMAGE_ID="/CommunityGalleries/cococommunity-42d8482d-92cd-415b-b332-7648bd978eff/Images/peerpod-podvm-ubuntu2204-cvm-snp/Versions/${CAA_VERSION}"

An automated job builds the pod VM image each night at 00:00 UTC. You can use that image by exporting the following environment variable:

SUCCESS_TIME=$(curl -s \
  -H "Accept: application/vnd.github+json" \
  "https://api.github.com/repos/confidential-containers/cloud-api-adaptor/actions/workflows/azure-podvm-image-nightly-build.yml/runs?status=success" \
  | jq -r '.workflow_runs[0].updated_at')

export AZURE_IMAGE_ID="/CommunityGalleries/cocopodvm-d0e4f35f-5530-4b9c-8596-112487cdea85/Images/podvm_image0/Versions/$(date -u -jf "%Y-%m-%dT%H:%M:%SZ" "$SUCCESS_TIME" "+%Y.%m.%d" 2>/dev/null || date -d "$SUCCESS_TIME" +%Y.%m.%d)"

Above image version is in the format YYYY.MM.DD, so to use the latest image should be today’s date or yesterday’s date.

If you have made changes to the CAA code that affects the pod VM image and you want to deploy those changes then follow these instructions to build the pod VM image. Once image build is finished then export image id to the environment variable AZURE_IMAGE_ID.

CAA container image

Export the following environment variable to use the latest release image of CAA:

export CAA_IMAGE="quay.io/confidential-containers/cloud-api-adaptor"
export CAA_TAG="v0.8.2-amd64"

Export the following environment variable to use the image built by the CAA CI on each merge to main:

export CAA_IMAGE="quay.io/confidential-containers/cloud-api-adaptor"

Find an appropriate tag of pre-built image suitable to your needs here.

export CAA_TAG=""

Caution: You can also use the latest tag but it is not recommended, because of its lack of version control and potential for unpredictable updates, impacting stability and reproducibility in deployments.

If you have made changes to the CAA code and you want to deploy those changes then follow these instructions to build the container image. Once the image is built export the environment variables CAA_IMAGE and CAA_TAG.

Annotate Service Account

Annotate the CAA Service Account with the workload identity’s CLIENT_ID and make the CAA DaemonSet use workload identity for authentication:

cat <<EOF > install/overlays/azure/workload-identity.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloud-api-adaptor-daemonset
  namespace: confidential-containers-system
spec:
  template:
    metadata:
      labels:
        azure.workload.identity/use: "true"
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-api-adaptor
  namespace: confidential-containers-system
  annotations:
    azure.workload.identity/client-id: "$USER_ASSIGNED_CLIENT_ID"
EOF

Select peer-pods machine type

export AZURE_INSTANCE_SIZE="Standard_DC2as_v5"
export DISABLECVM="false"

Find more AMD SEV-SNP machine types on this Azure documentation.

export AZURE_INSTANCE_SIZE="Standard_DC2es_v5"
export DISABLECVM="false"

Find more Intel TDX machine types on this Azure documentation.

export AZURE_INSTANCE_SIZE="Standard_D2as_v5"
export DISABLECVM="true"

Populate the kustomization.yaml file

Run the following command to update the kustomization.yaml file:

cat <<EOF > install/overlays/azure/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../yamls
images:
- name: cloud-api-adaptor
  newName: "${CAA_IMAGE}"
  newTag: "${CAA_TAG}"
generatorOptions:
  disableNameSuffixHash: true
configMapGenerator:
- name: peer-pods-cm
  namespace: confidential-containers-system
  literals:
  - CLOUD_PROVIDER="azure"
  - AZURE_SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID}"
  - AZURE_REGION="${AZURE_REGION}"
  - AZURE_INSTANCE_SIZE="${AZURE_INSTANCE_SIZE}"
  - AZURE_RESOURCE_GROUP="${AZURE_RESOURCE_GROUP}"
  - AZURE_SUBNET_ID="${AZURE_SUBNET_ID}"
  - AZURE_IMAGE_ID="${AZURE_IMAGE_ID}"
  - DISABLECVM="${DISABLECVM}"
secretGenerator:
- name: peer-pods-secret
  namespace: confidential-containers-system
- name: ssh-key-secret
  namespace: confidential-containers-system
  files:
  - id_rsa.pub
patchesStrategicMerge:
- workload-identity.yaml
EOF

The SSH public key should be accessible to the kustomization.yaml file:

cp $SSH_KEY install/overlays/azure/id_rsa.pub

Deploy CAA on the Kubernetes cluster

Deploy coco operator:

export COCO_OPERATOR_VERSION="0.8.0"
kubectl apply -k "github.com/confidential-containers/operator/config/release?ref=v${COCO_OPERATOR_VERSION}"
kubectl apply -k "github.com/confidential-containers/operator/config/samples/ccruntime/peer-pods?ref=v${COCO_OPERATOR_VERSION}"

Run the following command to deploy CAA:

kubectl apply -k "install/overlays/azure"

Generic CAA deployment instructions are also described here.

Run sample application

Ensure runtimeclass is present

Verify that the runtimeclass is created after deploying CAA:

kubectl get runtimeclass

Once you can find a runtimeclass named kata-remote then you can be sure that the deployment was successful. A successful deployment will look like this:

$ kubectl get runtimeclass
NAME          HANDLER       AGE
kata-remote   kata-remote   7m18s

Deploy workload

Create an nginx deployment:

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      runtimeClassName: kata-remote
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        imagePullPolicy: Always
EOF

Ensure that the pod is up and running:

kubectl get pods -n default

You can verify that the peer pod VM was created by running the following command:

az vm list \
  --resource-group "${AZURE_RESOURCE_GROUP}" \
  --output table

Here you should see the VM associated with the pod nginx.

Note: If you run into problems then check the troubleshooting guide here.

Cleanup

If you wish to clean up the whole set up, you can delete the resource group by running the following command:

az group delete \
  --name "${AZURE_RESOURCE_GROUP}" \
  --yes --no-wait

7.4 - IBM Cloud

Documentation for peerpods on IBM Cloud

This guide describes how to set up a demo environment on IBM Cloud for peer pod VMs using the operator deployment approach.

The high level flow involved is:

  • Build and upload a peer pod custom image to IBM Cloud
  • Create a ‘self-managed’ Kubernetes cluster on IBM Cloud provided infrastructure
  • Deploy Confidential-containers operator
  • Deploy and validate that the nginx demo works
  • Clean-up and deprovision

Pre-reqs

When building the peer pod VM image, it is simplest to use the container based approach, which only requires either docker, or podman, but it can also be built locally.

Note: the peer pod VM image build and upload is de-coupled from the cluster creation and operator deployment stage, so can be built on a different machine.

There are a number of packages that you will need to install in order to create the Kubernetes cluster and peer pod enable it:

  • Terraform, Ansible, the IBM Cloud CLI and kubectl are all required for the cluster creation and explained in the cluster pre-reqs guide.

In addition to this you will need to install jq

Tip: If you are using Ubuntu linux, you can run follow command:

$ sudo apt-get install jq

You will also require go and make to be installed.

Peer Pod VM Image

A peer pod VM image needs to be created as a VPC custom image in IBM Cloud in order to create the peer pod instances from. The peer pod VM image contains components like the agent protocol forwarder and Kata agent that communicate with the Kubernetes worker node and carry out the received instructions inside the peer pod.

Building a Peer Pod VM Image via Docker [Optional]

You may skip this step and use one of the release images, skip to Import Release VM Image but for the latest features you may wish to build your own.

You can do this by following the process document. If building within a container ensure that --build-arg CLOUD_PROVIDER=ibmcloud is set and --build-arg ARCH=s390x for an s390x architecture image.

Note: At the time of writing issue, #649 means when creating an s390x image you also need to add two extra build args: --build-arg UBUNTU_IMAGE_URL="" and --build-arg UBUNTU_IMAGE_CHECKSUM=""

Note: If building the peer pod qcow2 image within a VM, it may take a lot of resources e.g. 8 vCPU and 32GB RAM due to the nested virtualization performance limitations. When running without enough resources, the failure seen is similar to:

Build 'qemu.ubuntu' errored after 5 minutes 57 seconds: Timeout waiting for SSH.

Upload the built peer pod VM image to IBM Cloud

You can follow the process documented from the cloud-api-adaptor/ibmcloud/image to extract and upload the peer pod image you’ve just built to IBM Cloud as a custom image, noting to replace the quay.io/confidential-containers/podvm-ibmcloud-ubuntu-s390x reference with the local container image that you built above e.g. localhost/podvm_ibmcloud_s390x:latest.

This script will end with the line: Image <image-name> with id <image-id> is available. The image-id field will be needed in the kustomize step later.

Import Release VM Image

Alternatively to use a pre-built peer pod VM image you can follow the process documented with the release images found at quay.io/confidential-containers/podvm-generic-ubuntu-<ARCH>. Running this command will require docker or podman, as per tools

 ./import.sh quay.io/confidential-containers/podvm-generic-ubuntu-s390x eu-gb --bucket example-bucket --instance example-cos-instance

This script will end with the line: Image <image-name> with id <image-id> is available. The image-id field will be needed in later steps.

Create a ‘self-managed’ Kubernetes cluster on IBM Cloud provided infrastructure

If you don’t have a Kubernetes cluster for testing, you can follow the open-source instructions to set up a basic cluster where the Kubernetes nodes run on IBM Cloud provided infrastructure.

Deploy PeerPod Webhook

Deploy cert-manager

  • Deploy cert-manager with:

    kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.9.1/cert-manager.yaml
    
  • Wait for the pods to all be in running state with:

    kubectl get pods -n cert-manager --watch
    

Deploy the peer-pods webhook

  • From within the root directory of the cloud-api-adaptor repository, deploy the webhook with:

    kubectl apply -f ./webhook/hack/webhook-deploy.yaml
    
  • Wait for the pods to all be in running state with:

    kubectl get pods -n peer-pods-webhook-system --watch
    
  • Advertise the extended resource kata.peerpods.io/vm. by running the following commands:

    pushd webhook/hack/extended-resources
    ./setup.sh
    popd
    

Deploy the Confidential-containers operator

The caa-provisioner-cli simplifies deploying the operator and the cloud-api-adaptor resources on to any cluster. See the test/tools/README.md for full instructions. To create an ibmcloud ready version follow these steps

# Starting from the cloud-api-adaptor root directory
pushd test/tools
make BUILTIN_CLOUD_PROVIDERS="ibmcloud" all
popd

This will create caa-provisioner-cli in the test/tools directory. To use the tool with an existing self-managed cluster you will need to setup a .properties file containing the relevant ibmcloud information to enable your cluster to create and use peer-pods. Use the following commands to generate the .properties file, if not using a selfmanaged cluster please update the terraform commands with the appropriate values manually.

export IBMCLOUD_API_KEY= # your ibmcloud apikey
export PODVM_IMAGE_ID= # the image id of the peerpod vm uploaded in the previous step
export PODVM_INSTANCE_PROFILE= # instance profile name that runs the peerpod (bx2-2x8 or bz2-2x8 for example)
export CAA_IMAGE_TAG= # cloud-api-adaptor image tag that supports this arch, see quay.io/confidential-containers/cloud-api-adaptor
pushd ibmcloud/cluster

cat <<EOF > ../../selfmanaged_cluster.properties
IBMCLOUD_PROVIDER="ibmcloud"
APIKEY="$IBMCLOUD_API_KEY"
PODVM_IMAGE_ID="$PODVM_IMAGE_ID"
INSTANCE_PROFILE_NAME="$PODVM_INSTANCE_PROFILE"
CAA_IMAGE_TAG="$CAA_IMAGE_TAG"
SSH_KEY_ID="$(terraform output --raw ssh_key_id)"
EOF

popd

This will create a selfmanaged_cluster.properties files in the cloud-api-adaptor root directory.

The final step is to run the caa-provisioner-cli to install the operator.

export CLOUD_PROVIDER=ibmcloud
# must be run from the directory containing the properties file
export TEST_PROVISION_FILE="$(pwd)/selfmanaged_cluster.properties"
# prevent the test from removing the cloud-api-adaptor resources from the cluster
export TEST_TEARDOWN="no"
pushd test/tools
./caa-provisioner-cli -action=install
popd

End-2-End Test Framework

To validate that a cluster has been setup properly, there is a suite of tests that validate peer-pods across different providers, the implementation of these tests can be found in test/e2e/common_suite_test.go).

Assuming CLOUD_PROVIDER and TEST_PROVISION_FILE are still set in your current terminal you can execute these tests from the cloud-api-adaptor root directory by running the following commands

export KUBECONFIG=$(pwd)/ibmcloud/cluster/config
make test-e2e

Uninstall and clean up

There are two options for cleaning up the environment once testing has finished, or if you want to re-install from a clean state:

  • If using a self-managed cluster you can delete the whole cluster following the Delete the cluster documentation and then start again.
  • If you instead just want to leave the cluster, but uninstall the Confidential Containers and peer pods feature, you can use the caa-provisioner-cli to remove the resources.
export CLOUD_PROVIDER=ibmcloud
# must be run from the directory containing the properties file
export TEST_PROVISION_FILE="$(pwd)/selfmanaged_cluster.properties"
pushd test/tools
./caa-provisioner-cli -action=uninstall
popd

7.5 - Libvirt

Documentation for peerpods on Libvirt

Introduction

This document contains instructions for using, developing and testing the cloud-api-adaptor with libvirt.

Creating an end-to-end environment for testing and development

In this section you will learn how to setup an environment in your local machine to run peer pods with the libvirt cloud API adaptor. Bear in mind that many different tools can be used to setup the environment and here we just make suggestions of tools that seems used by most of the peer pods developers.

Requirements

You must have a Linux/KVM system with libvirt installed and the following tools:

Assume that you have a ‘default’ network and storage pools created in libvirtd system instance (qemu:///system). However, if you have a different pool name then the scripts should be able to handle it properly.

Create the Kubernetes cluster

Use the kcli_cluster.sh script to create a simple two VMs (one control plane and one worker) cluster with the kcli tool, as:

./kcli_cluster.sh create

With kcli_cluster.sh you can configure the libvirt network and storage pools that the cluster VMs will be created, among other parameters. Run ./kcli_cluster.sh -h to see the help for further information.

If everything goes well you will be able to see the cluster running after setting your Kubernetes config with:

export KUBECONFIG=$HOME/.kcli/clusters/peer-pods/auth/kubeconfig

For example, shown below:

$ kcli list kube
+-----------+---------+-----------+-----------------------------------------+
|  Cluster  |   Type  |    Plan   |                  Vms                    |
+-----------+---------+-----------+-----------------------------------------+
| peer-pods | generic | peer-pods | peer-pods-ctlplane-0,peer-pods-worker-0 |
+-----------+---------+-----------+-----------------------------------------+

$ kubectl get nodes
NAME                 STATUS   ROLES                  AGE     VERSION
peer-pods-ctlplane-0 Ready    control-plane,master   6m8s    v1.25.3
peer-pods-worker-0   Ready    worker                 2m47s   v1.25.3

Prepare the Pod VM volume

In order to build the Pod VM without installing the build tools, you can use the Dockerfiles hosted on ../podvm directory to run the entire process inside a container. Refer to podvm/README.md for further details. Alternatively you can consume pre-built podvm images as explained here.

Next you will need to create a volume on libvirt’s system storage and upload the image content. That volume is used by the cloud-api-adaptor program to instantiate a new Pod VM. Run the following commands:

export IMAGE=<full-path-to-qcow2>

virsh -c qemu:///system vol-create-as --pool default --name podvm-base.qcow2 --capacity 20G --allocation 2G --prealloc-metadata --format qcow2
virsh -c qemu:///system vol-upload --vol podvm-base.qcow2 $IMAGE --pool default --sparse

You should see that the podvm-base.qcow2 volume was properly created:

$ virsh -c qemu:///system vol-info --pool default podvm-base.qcow2
Name:           podvm-base.qcow2
Type:           file
Capacity:       6.00 GiB
Allocation:     631.52 MiB

Install and configure Confidential Containers and cloud-api-adaptor in the cluster

The easiest way to install the cloud-api-adaptor along with Confidential Containers in the cluster is through the Kubernetes operator available in the install directory of this repository.

Start by creating a public/private RSA key pair that will be used by the cloud-api-provider program, running on the cluster workers, to connect with your local libvirtd instance without password authentication. Assume you are in the libvirt directory, do:

cd ../install/overlays/libvirt
ssh-keygen -f ./id_rsa -N ""
cat id_rsa.pub >> ~/.ssh/authorized_keys

Note: ensure that ~/.ssh/authorized_keys has the right permissions (read/write for the user only) otherwise the authentication can silently fail. You can run chmod 600 ~/.ssh/authorized_keys to set the right permissions.

You will need to figure out the IP address of your local host (e.g. 192.168.122.1). Then try to remote connect with libvirt to check the keys setup is fine, for example:

$ virsh -c "qemu+ssh://$USER@192.168.122.1/system?keyfile=$(pwd)/id_rsa" nodeinfo
CPU model:           x86_64
CPU(s):              12
CPU frequency:       1084 MHz
CPU socket(s):       1
Core(s) per socket:  6
Thread(s) per core:  2
NUMA cell(s):        1
Memory size:         32600636 KiB

Now you should finally install the Kubernetes operator in the cluster with the help of the install_operator.sh script. Ensure that you have your IP address exported in the environment, as shown below, then run the install script:

cd ../../../libvirt/
export LIBVIRT_IP="192.168.122.1"
export SSH_KEY_FILE="id_rsa"
./install_operator.sh

If everything goes well you will be able to see the operator’s controller manager and cloud-api-adaptor Pods running:

$ kubectl get pods -n confidential-containers-system
NAME                                              READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-5df7584679-5dbmr   2/2     Running   0          3m58s
cloud-api-adaptor-daemonset-vgj2s                 1/1     Running   0          3m57s

$ kubectl logs pod/cloud-api-adaptor-daemonset-vgj2s -n confidential-containers-system
+ exec cloud-api-adaptor libvirt -uri 'qemu+ssh://wmoschet@192.168.122.1/system?no_verify=1' -data-dir /opt/data-dir -pods-dir /run/peerpod/pods -network-name default -pool-name default -socket /run/peerpod/hypervisor.sock
2022/11/09 18:18:00 [helper/hypervisor] hypervisor config {/run/peerpod/hypervisor.sock  registry.k8s.io/pause:3.7 /run/peerpod/pods libvirt}
2022/11/09 18:18:00 [helper/hypervisor] cloud config {qemu+ssh://wmoschet@192.168.122.1/system?no_verify=1 default default /opt/data-dir}
2022/11/09 18:18:00 [helper/hypervisor] service config &{qemu+ssh://wmoschet@192.168.122.1/system?no_verify=1 default default /opt/data-dir}

You will also notice that Kubernetes runtimeClass resources were created on the cluster, as for example:

$ kubectl get runtimeclass
NAME          HANDLER       AGE
kata-remote   kata-remote   7m18s

Create a sample peer-pods pod

At this point everything should be fine to get a sample Pod created. Let’s first list the running VMs so that we can later check the Pod VM will be really running. Notice below that we got only the cluster node VMs up:

$ virsh -c qemu:///system list
 Id   Name                   State
------------------------------------
 3    peer-pods-ctlplane-0   running
 4    peer-pods-worker-0     running

Create the sample_busybox.yaml file with the following content:

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: busybox
  name: busybox
spec:
  containers:
  - image: quay.io/prometheus/busybox
    name: busybox
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  runtimeClassName: kata-remote

And create the Pod:

$ kubectl apply -f sample_busybox.yaml
pod/busybox created

$ kubectl wait --for=condition=Ready pod/busybox
pod/busybox condition met

Check that the Pod VM is up and running. See on the following listing that podvm-busybox-88a70031 was created:

$ virsh -c qemu:///system list
 Id   Name                     State
----------------------------------------
 5    peer-pods-ctlplane-0     running
 6    peer-pods-worker-0       running
 7    podvm-busybox-88a70031   running

You should also check that the container is running fine. For example, compare the kernels are different as shown below:

$ uname -r
5.17.12-100.fc34.x86_64

$ kubectl exec pod/busybox -- uname -r
5.4.0-131-generic

The peer-pods pod can be deleted as any regular pod. On the listing below the pod was removed and you can note that the Pod VM no longer exists on Libvirt:

$ kubectl delete -f sample_busybox.yaml
pod "busybox" deleted

$ virsh -c qemu:///system list
 Id   Name                 State
------------------------------------
 5    peer-pods-ctlplane-0 running
 6    peer-pods-worker-0   running

Delete Confidential Containers and cloud-api-adaptor from the cluster

You might want to reinstall the Confidential Containers and cloud-api-adaptor into your cluster. There are two options:

  1. Delete the Kubernetes cluster entirely and start over. In this case you should just run ./kcli_cluster.sh delete to wipe out the cluster created with kcli
  2. Uninstall the operator resources then install them again with the install_operator.sh script

Let’s show you how to delete the operator resources. On the listing below you can see the actual pods running on the confidential-containers-system namespace:

$ kubectl get pods -n confidential-containers-system
NAME                                             READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-fbb5dcf9d-h42nn   2/2     Running   0          20h
cc-operator-daemon-install-fkkzz                 1/1     Running   0          20h
cloud-api-adaptor-daemonset-libvirt-lxj7v        1/1     Running   0          20h

In order to remove the *-daemon-install-* and *-cloud-api-adaptor-daemonset-* pods, run the following command from the root directory:

CLOUD_PROVIDER=libvirt make delete

It can take some minutes to get those pods deleted, afterwards you will notice that only the controller-manager is still up. Below is shown how to delete that pod and associated resources as well:

$ kubectl get pods -n confidential-containers-system
NAME                                             READY   STATUS    RESTARTS   AGE
cc-operator-controller-manager-fbb5dcf9d-h42nn   2/2     Running   0          20h

$ kubectl delete -f install/yamls/deploy.yaml
namespace "confidential-containers-system" deleted
serviceaccount "cc-operator-controller-manager" deleted
role.rbac.authorization.k8s.io "cc-operator-leader-election-role" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-manager-role" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-metrics-reader" deleted
clusterrole.rbac.authorization.k8s.io "cc-operator-proxy-role" deleted
rolebinding.rbac.authorization.k8s.io "cc-operator-leader-election-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "cc-operator-manager-rolebinding" deleted
clusterrolebinding.rbac.authorization.k8s.io "cc-operator-proxy-rolebinding" deleted
configmap "cc-operator-manager-config" deleted
service "cc-operator-controller-manager-metrics-service" deleted
deployment.apps "cc-operator-controller-manager" deleted
customresourcedefinition.apiextensions.k8s.io "ccruntimes.confidentialcontainers.org" deleted

$ kubectl get pods -n confidential-containers-system
No resources found in confidential-containers-system namespace.