Releases

Release v0.8.0

Release Notes Confidential Containers v0.8.0

Please see the quickstart guide for details on how to try out Confidential Containers.

Please refer to our Acronyms and Glossary pages for a definition of the acronyms used in this document.

What’s new

  • Upstream containerd is now supported by all deployment types except enclave-cc.
    • This release includes the Nydus snapshotter (for the first time) to support upstream containerd.
    • In this release images are still pulled inside the guest.
    • The Nydus snapshotter requires the annotation io.containerd.cri.runtime-handler: <runtime-class> on each pod.
    • Support for the Nydus snapshotter with peer pods is still experimental. To avoid using it with peer pods, do not set the above annotation.
    • Nydus snapshotter support in general is still evolving. See the Limitations section below for details.
  • A new component, the Confidential Data Hub (CDH) is now deployed inside the guest.
    • CDH is an evolution of the Attestation Agent that supports advanced features.
    • CDH supports sealed Kubernetes secrets which are managed by the control plane, but securely unwrapped inside the enclave.
    • CDH supports connections to both KBS and KMS.
  • New architecture of Attestation Agent and CDH allows a client to deploy multiple KBSes.
    • One KBS can be used for validating evidence with the Attestation Service while another can provide resources.
  • Pulling from an authenticated registry now requires imagePullSecrets.
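To make the new requirements above concrete, here is a minimal pod manifest sketch combining the runtime-handler annotation and imagePullSecrets. The runtime class, registry, and secret names are illustrative assumptions, not values mandated by this release; substitute your own.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: confidential-pod            # hypothetical name, for illustration
  annotations:
    # Tells the Nydus snapshotter which runtime class handles this pod.
    # Omit this annotation when using peer pods.
    io.containerd.cri.runtime-handler: kata-qemu-tdx
spec:
  runtimeClassName: kata-qemu-tdx   # must match the annotation value
  containers:
  - name: app
    image: registry.example.com/private/app:latest
  # Required when pulling from an authenticated registry as of v0.8.0.
  imagePullSecrets:
  - name: my-registry-secret        # hypothetical Secret holding registry credentials
```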

Peer Pods

  • The peerpod-ctl tool has been expanded:
    • Checks for and cleans up old peerpod objects
    • Adds SSH authentication support to the libvirt provider
    • Supports IBM Cloud
  • Support for secure key release at runtime and image decryption via remote attestation on AKS
  • Added AMD SEV and IBM s390x support for the Libvirt provider
  • Container registry authentication is now bootstrapped from userdata.
  • Enabled public IP usage for pod VM on AWS and PowerVS providers
  • webhook: added IBM ppc64le platform support
  • Support adding custom tags to podvm instances
  • Switched to launching CVM by default on AWS and Azure providers
  • Added rollingUpdate strategy in cloud-api-adaptor daemonset
  • Disabled secureboot by default

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV(-ES)
  • Intel SGX

The following platforms are untested or partially supported:

  • IBM Secure Execution (SE) on IBM zSystems (s390x) running LinuxONE
  • AMD SEV-SNP
  • ARM CCA

Limitations

The following are known limitations of this release:

  • Nydus snapshotter support is not mature.
    • The Nydus snapshotter sometimes conflicts with existing node configuration.
    • You may need to remove existing container images/snapshots before installing the Nydus snapshotter.
    • The Nydus snapshotter may not support pulling the same image under multiple runtime handler annotations, even across different pods.
    • Host pulling with Nydus snapshotter is not yet enabled.
    • Nydus snapshotter is not supported with enclave-cc.
  • Pulling container images inside the guest may have negative performance implications, including greater resource usage and slower startup.
  • crio support is still evolving.
  • Platform support is rapidly changing
    • Image signature validation with AMD SEV-ES is not covered by CI.
  • SELinux is not supported on the host and must be set to permissive if in use.
  • The generic KBS does not yet support all platforms.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Not all image repositories support encrypted container images.
  • Complete integration with Kubernetes is still in progress.
    • OpenShift support is not yet complete.
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which improved to 69% at the time of this release.
    • Vulnerability reporting mechanisms still need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.
  • Container metadata, such as environment variables, is not measured.
  • Kata Agent does not validate mount requests. A malicious host might be able to mount a shared filesystem into the PodVM.

CVE Fixes

None

Release v0.7.0

Release Notes Confidential Containers v0.7.0

Please see the quickstart guide for details on how to try out Confidential Containers.

Please refer to our Acronyms and Glossary pages for a definition of the acronyms used in this document.

What’s new

  • Flexible instance types/profiles support for peer-pods
  • Ability to use CSI Persistent Volume with peer-pods on Azure and IBM Cloud
  • EAA-KBC/Verdictd support removed from enclave-cc
  • Baremetal SNP without attestation available via operator
  • Guest components (attestation-agent, image-rs and ocicrypt-rs) merged into one repository
  • Documentation and community repositories merged together

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV(-ES)
  • Intel SGX

The following platforms are untested or partially supported:

  • IBM Secure Execution (SE) on IBM zSystems (s390x) running LinuxONE
  • AMD SEV-SNP

The following platforms are in development:

  • ARM CCA

Limitations

The following are known limitations of this release:

  • Platform support is rapidly changing
    • Image signature validation with AMD SEV-ES is not covered by CI.
  • SELinux is not supported on the host and must be set to permissive if in use.
  • The generic KBS does not yet support all platforms.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Not all image repositories support encrypted container images.
  • CoCo currently requires a custom build of containerd, which is installed by the operator.
    • Codepath for pulling images will change significantly in future releases.
    • crio is only supported with cloud-api-adaptor.
  • Complete integration with Kubernetes is still in progress.
    • OpenShift support is not yet complete.
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container images must be downloaded separately (inside guest) for each pod. More info
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which remained at 64% at the time of this release.
    • Vulnerability reporting mechanisms still need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None

Release v0.6.0

Release Notes Confidential Containers v0.6.0

Please see the quickstart guide for details on how to try out Confidential Containers.

Please refer to our Acronyms and Glossary pages for a definition of the acronyms used in this document.

What’s new

  • Support for attesting pod VMs with Azure vTPMs on SEV-SNP
  • Support for using Project Amber as an attestation service
  • Support for Cosign signature validation with s390x
  • Pulling guest images with many layers can no longer cause guest CPU starvation.
  • Attestation Service upgraded to avoid several security issues in Go packages.
  • CC-KBC & KBS support with SGX attester/verifier for Occlum and CI for enclave-cc

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV(-ES)
  • Intel SGX

The following platforms are untested or partially supported:

  • IBM Secure Execution (SE) on IBM zSystems (s390x) running LinuxONE
  • AMD SEV-SNP

The following platforms are in development:

  • ARM CCA

Limitations

The following are known limitations of this release:

  • Platform support is rapidly changing
    • Image signature validation with AMD SEV-ES is not covered by CI.
  • SELinux is not supported on the host and must be set to permissive if in use.
  • The generic KBS does not yet support all platforms.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Not all image repositories support encrypted container images.
  • CoCo currently requires a custom build of containerd, which is installed by the operator.
    • Codepath for pulling images will change significantly in future releases.
    • crio is only supported with cloud-api-adaptor.
  • Complete integration with Kubernetes is still in progress.
    • OpenShift support is not yet complete.
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container images must be downloaded separately (inside guest) for each pod. More info
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which remained at 64% at the time of this release.
    • Vulnerability reporting mechanisms still need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None

Release v0.5.0

Release Notes Confidential Containers v0.5.0

Please see the quickstart guide for details on how to try out Confidential Containers.

Please refer to our Acronyms and Glossary pages for a definition of the acronyms used in this document.

What’s new

  • Process-based isolation is now fully supported with SGX hardware added to enclave-cc CI

  • Remote hypervisor support added to the CoCo operator, which helps to enable creating containers as ‘peer pods’, either locally, or on Cloud Service Provider Infrastructure. See README for more information and installation instructions.

  • KBS Resource URI Scheme is published to identify all confidential resources.

  • Different KBCs now share the image encryption format, allowing for interchangeable use.

  • Generic Key Broker System (KBS) is now supported. This includes the KBS itself, which relies on the Attestation Service (AS) for attestation evidence verification. Reference Values are provided to the AS by the Reference Value Provider Service (RVPS). Currently only TDX and a sample mode are supported with generic KBS. Other platforms are in development.

  • SEV configuration can be set with annotations.

  • SEV-ES is now tested in the CI.

  • Some developmental SEV-SNP components can be manually enabled to test SNP containers without attestation.
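The KBS Resource URI Scheme mentioned above identifies confidential resources with a uniform address. The shape shown here is a sketch from the upstream KBS documentation as I recall it; the hostname, repository, and tag values are illustrative assumptions:

```
kbs://<kbs-host>:<kbs-port>/<repository>/<type>/<tag>

e.g., a hypothetical image decryption key:
kbs://kbs.example.com:8080/my-org/image-decryption-key/1
```

The repository/type/tag triple lets one KBS namespace resources for multiple tenants and resource kinds behind a single endpoint.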

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV(-ES)
  • Intel SGX

The following platforms are untested or partially supported:

  • IBM Secure Execution (SE) on IBM zSystems (s390x) running LinuxONE

The following platforms are in development:

  • AMD SEV-SNP

Limitations

The following are known limitations of this release:

  • Platform support is currently limited, and rapidly changing
    • Image signature validation with AMD SEV-ES is not covered by CI.
    • s390x does not support cosign signature validation
  • SELinux is not supported on the host and must be set to permissive if in use.
  • Attestation and key brokering support varies by platform.
    • The generic KBS is only supported on TDX. Other platforms have different solutions.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Image repository support for encrypted images is unequal
  • CoCo currently requires a custom build of containerd
    • The CoCo operator will deploy the correct version of containerd for you
    • Changes are required to delegate PullImage to the agent in the virtual machine
    • The required changes are not part of the vanilla containerd
    • The final form of the required changes in containerd is expected to be different
    • crio is not supported
  • CoCo is not fully integrated with the orchestration ecosystem (Kubernetes, OpenShift)
    • OpenShift support is not yet complete.
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container image sharing is not possible in this release
    • Container images are downloaded by the guest (with encryption), not by the host
    • As a result, the same image will be downloaded separately by every pod using it, not shared between pods on the same host. More info
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which increased from 49% to 64% at the time of this release.
    • All CoCo repos now have automated tests, including linting, incorporated into CI.
    • Vulnerability reporting mechanisms still need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None

Release v0.4.0

Release Notes Confidential Containers v0.4.0

Please see the quickstart guide for details on how to try out Confidential Containers.

Please refer to our Acronyms and Glossary pages for a definition of the acronyms used in this document.

What’s new

  • This release focused on reducing technical debt. You will not observe as many new features in this release but you will be running on top of more robust code.
  • The Skopeo and umoci dependencies have been removed now that our image-rs component is fully integrated
  • Improved CI for SEV
  • Improved container support for enclave-cc / SGX

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV

The following platforms are untested or partially supported:

  • Intel SGX
  • AMD SEV-ES
  • IBM Secure Execution (SE) on IBM zSystems (s390x) running LinuxONE

The following platforms are in development:

  • AMD SEV-SNP

Limitations

The following are known limitations of this release:

  • Platform support is currently limited, and rapidly changing
    • AMD SEV-ES is not tested in the CI.
    • Image signature validation has not been tested with AMD SEV.
    • s390x does not support cosign signature validation
  • SELinux is not supported on the host and must be set to permissive if in use.
  • Attestation and key brokering support is still under development
    • The disk-based key broker client (KBC) is used for non-tee testing, but is not suitable for production, except with encrypted VM images.
    • Currently, there are two key broker services (KBS) that can be used:
      • simple-kbs: simple key broker service for SEV(-ES).
      • Verdictd: An external project with which Attestation Agent can conduct remote attestation communication and key acquisition via EAA KBC
    • The full-featured generic KBS and the corresponding KBC are still in the development stage.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Image repository support for encrypted images is unequal
  • CoCo currently requires a custom build of containerd
    • The CoCo operator will deploy the correct version of containerd for you
    • Changes are required to delegate PullImage to the agent in the virtual machine
    • The required changes are not part of the vanilla containerd
    • The final form of the required changes in containerd is expected to be different
    • crio is not supported
  • CoCo is not fully integrated with the orchestration ecosystem (Kubernetes, OpenShift)
    • OpenShift is a non-starter at the moment due to its dependency on CRI-O
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container image sharing is not possible in this release
    • Container images are downloaded by the guest (with encryption), not by the host
    • As a result, the same image will be downloaded separately by every pod using it, not shared between pods on the same host. More info
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which increased to 49% at the time of this release.
    • The main gaps are in test coverage, both general and security tests.
    • Vulnerability reporting mechanisms also need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None

Release v0.3.0

Release Notes Confidential Containers v0.3.0

Code Freeze: January 13th, 2023

Please see the quickstart guide for details on how to try out Confidential Containers

What’s new

  • Support for pulling images from authenticated container registries. See design info.
  • Significantly reduced resource requirements for image pulling
  • Attestation support for AMD SEV-ES
  • kata-qemu-tdx supports and has been tested with Verdictd
  • Support for get_resource endpoint with SEV(-ES)
  • Enabled cosign signature support in enclave-cc / SGX
  • SEV attestation bug fixes
  • Measured rootfs now works with kata-clh, kata-qemu, kata-clh-tdx, and kata-qemu-tdx runtime classes.
  • IBM zSystems / LinuxONE (s390x) enablement and CI verification on non-TEE environments
  • Enhanced docs, config, CI pipeline and test coverage for enclave-cc / SGX

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV

The following platforms are untested or partially supported:

  • Intel SGX
  • AMD SEV-ES
  • IBM Secure Execution (SE) on IBM zSystems & LinuxONE

The following platforms are in development:

  • AMD SEV-SNP

Limitations

The following are known limitations of this release:

  • Platform support is currently limited, and rapidly changing
    • AMD SEV-ES is not tested in the CI.
    • Image signature validation has not been tested with AMD SEV.
    • s390x does not support cosign signature validation
  • SELinux is not supported on the host and must be set to permissive if in use.
  • Attestation and key brokering support is still under development
    • The disk-based key broker client (KBC) is used for non-tee testing, but is not suitable for production, except with encrypted VM images.
    • Currently, there are two KBS that can be used:
      • simple-kbs: simple key broker service (KBS) for SEV(-ES).
      • Verdictd: An external project with which Attestation Agent can conduct remote attestation communication and key acquisition via EAA KBC
    • The full-featured generic KBS and the corresponding KBC are still in the development stage.
    • For developers, other KBCs can be experimented with.
    • AMD SEV must use a KBS even for unencrypted images.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Image repository support for encrypted images is unequal
  • CoCo currently requires a custom build of containerd
    • The CoCo operator will deploy the correct version of containerd for you
    • Changes are required to delegate PullImage to the agent in the virtual machine
    • The required changes are not part of the vanilla containerd
    • The final form of the required changes in containerd is expected to be different
    • crio is not supported
  • CoCo is not fully integrated with the orchestration ecosystem (Kubernetes, OpenShift)
    • OpenShift is a non-starter at the moment due to its dependency on CRI-O
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container image sharing is not possible in this release
    • Container images are downloaded by the guest (with encryption), not by the host
    • As a result, the same image will be downloaded separately by every pod using it, not shared between pods on the same host. More info
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which increased to 49% at the time of this release.
    • The main gaps are in test coverage, both general and security tests.
    • Vulnerability reporting mechanisms also need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None

Release v0.2.0

Release Notes Confidential Containers v0.2.0

Confidential Containers has adopted a six-week release cadence. This is our first release on this schedule. This release mainly features incremental improvements to our build system and tests as well as minor features, adjustments, and cleanup.

Please see the quickstart guide for details on how to try out Confidential Containers

What’s new

  • Kata CI uses existing Kata tooling to build components.
  • Kata CI caches build environments for components.
  • Pod VM can be launched with measured boot. See more info
  • Incremental advances in signature support including verification of cosign-signed images.
  • Enclave-cc added to operator, providing initial SGX support.
  • KBS no longer required to use unencrypted images with SEV.
  • More rigorous versioning of sub-projects

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV

The following platforms are untested or partially supported:

  • Intel SGX
  • AMD SEV-ES
  • IBM Z SE

The following platforms are in development:

  • AMD SEV-SNP

Limitations

The following are known limitations of this release:

  • Platform support is currently limited, and rapidly changing
    • s390x is not supported by the CoCo operator
    • AMD SEV-ES has not been tested.
    • AMD SEV does not support container image signature validation.
    • s390x does not support cosign signature validation
  • SELinux is not supported on the host and must be set to permissive if in use.
  • Attestation and key brokering support is still under development
    • The disk-based key broker client (KBC) is used for non-tee testing, but is not suitable for production, except with encrypted VM images.
    • Currently, there are two KBS that can be used:
      • simple-kbs: simple key broker service (KBS) for SEV(-ES).
      • Verdictd: An external project with which Attestation Agent can conduct remote attestation communication and key acquisition via EAA KBC
    • The full-featured generic KBS and the corresponding KBC are still in the development stage.
    • For developers, other KBCs can be experimented with.
    • AMD SEV must use a KBS even for unencrypted images.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Image repository support for encrypted images is unequal
  • CoCo currently requires a custom build of containerd
    • The CoCo operator will deploy the correct version of containerd for you
    • Changes are required to delegate PullImage to the agent in the virtual machine
    • The required changes are not part of the vanilla containerd
    • The final form of the required changes in containerd is expected to be different
    • crio is not supported
  • CoCo is not fully integrated with the orchestration ecosystem (Kubernetes, OpenShift)
    • OpenShift is a non-starter at the moment due to its dependency on CRI-O
    • Existing APIs do not fully support the CoCo security and threat model. More info
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container image sharing is not possible in this release
    • Container images are downloaded by the guest (with encryption), not by the host
    • As a result, the same image will be downloaded separately by every pod using it, not shared between pods on the same host. More info
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which increased to 46% at the time of this release.
    • The main gaps are in test coverage, both general and security tests.
    • Vulnerability reporting mechanisms also need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None

Release v0.1.0

Release Notes Confidential Containers v0.1.0

This is the first full release of Confidential Containers. The goal of this release is to provide a stable, simple, and well-documented base for the Confidential Containers project. The Confidential Containers operator is the focal point of the release. The operator allows users to install Confidential Containers on an existing Kubernetes cluster. This release also provides core Confidential Containers features, such as being able to run encrypted containers on Intel-TDX and AMD-SEV.

Please see the quickstart guide for details on how to try out Confidential Containers.

Hardware Support

Confidential Containers is tested with attestation on the following platforms:

  • Intel TDX
  • AMD SEV

The following platforms are untested or partially supported:

  • AMD SEV-ES
  • IBM Z SE

The following platforms are in development:

  • Intel SGX
  • AMD SEV-SNP

Limitations

The following are known limitations of this release:

  • Platform support is currently limited, and rapidly changing
    • S390x is not supported by the CoCo operator
    • AMD SEV-ES has not been tested.
    • AMD SEV does not support container image signature validation.
  • Attestation and key brokering support is still under development
    • The disk-based key broker client (KBC) is used when there is no HW support, but is not suitable for production (except with encrypted VM images).
    • Currently, there are two KBS that can be used:
      • simple-kbs: simple key broker service (KBS) for SEV(-ES).
      • Verdictd: An external project with which Attestation Agent can conduct remote attestation communication and key acquisition via EAA KBC
    • The full-featured generic KBS and the corresponding KBC are still in the development stage.
    • For developers, other KBCs can be experimented with.
    • AMD SEV must use a KBS even for unencrypted images.
  • The format of encrypted container images is still subject to change
    • The oci-crypt container image format itself may still change
    • The tools to generate images are not in their final form
    • The image format itself is subject to change in upcoming releases
    • Image repository support for encrypted images is unequal
  • CoCo currently requires a custom build of containerd
    • The CoCo operator will deploy the correct version of containerd for you
    • Changes are required to delegate PullImage to the agent in the virtual machine
    • The required changes are not part of the vanilla containerd
    • The final form of the required changes in containerd is expected to be different
    • crio is not supported
  • CoCo is not fully integrated with the orchestration ecosystem (Kubernetes, OpenShift)
    • OpenShift is a non-starter at the moment due to its dependency on CRI-O
    • Existing APIs do not fully support the CoCo security and threat model
    • Some commands accessing confidential data, such as kubectl exec, may either fail to work, or incorrectly expose information to the host
    • Container image sharing is not possible in this release
    • Container images are downloaded by the guest (with encryption), not by the host
    • As a result, the same image will be downloaded separately by every pod using it, not shared between pods on the same host.
  • The CoCo community aspires to adopt open source security best practices, but not all practices are adopted yet.
    • We track our status with the OpenSSF Best Practices Badge, which was at 43% at the time of this release.
    • The main gaps are in test coverage, both general and security tests.
    • Vulnerability reporting mechanisms also need to be created. Public GitHub issues are still appropriate for this release until private reporting is established.

CVE Fixes

None - This is our first release.

Confidential Containers without confidential hardware

How to use Confidential Containers without confidential hardware

Note: This blog post was originally published here, based on the very first versions of Confidential Containers (CoCo), which at that time was just a Proof-of-Concept (PoC) project. Since then the project has evolved considerably: the work was merged into the Kata Containers mainline, the forked containerd code branch was removed, many features were introduced or improved, new sub-projects emerged, and the community reached maturity. This new version of the post therefore revisits installing and using CoCo on workstations without confidential hardware, taking into account the changes since the project's early versions.

Introduction

The Confidential Containers (CoCo) project aims to implement a cloud-native solution for confidential computing using the most advanced trusted execution environments (TEE) technologies available from hardware vendors like AMD, IBM and Intel.

The community recognizes that not every developer has access to TEE-capable machines and we don’t want this to be a blocker for contributions. So version 0.10.0 and later come with a custom runtime that lets developers play with CoCo on either a simple virtual or bare-metal machine.

In this tutorial you will learn:

  • How to install CoCo and create a simple confidential pod on Kubernetes
  • The main features that keep your pod confidential

Since we will be using a custom runtime environment without confidential hardware, we will not be able to perform the real attestation implemented by CoCo; instead we will use a sample verifier, so the pod created won’t be strictly “confidential”.

A brief introduction to Confidential Containers

Confidential Containers is a sandbox project of the Cloud Native Computing Foundation (CNCF) that enables cloud-native confidential computing by taking advantage of a variety of hardware platforms and technologies, such as Intel SGX, Intel TDX, AMD SEV-SNP and IBM Secure Execution for Linux. The project aims to integrate hardware and software technologies to deliver a seamless experience to users running applications on Kubernetes.

For a high level overview of the CoCo project, please see: What is the Confidential Containers project?

What is required for this tutorial?

As mentioned above, you don’t need TEE-capable hardware for this tutorial. You will only be required to have:

  • An Ubuntu 22.04 virtual or bare-metal machine with a minimum of 8 GB of RAM and 4 vCPUs
  • Kubernetes 1.30.1 or above

It is beyond the scope of this blog to tell you how to install Kubernetes, but there are some details that should be taken into consideration:

  1. CoCo v0.10.0 was tested on Continuous Integration (CI) with Kubernetes installed via kubeadm (see here for how to create a cluster with that tool). Also, some community members reported that it works fine on Kubernetes running over kind.
  2. Containerd is supported, while CRI-O still lacks some features (e.g. encrypted images). On CI, most of the tests were executed on Kubernetes configured with Containerd, so that is the container runtime chosen for this blog.
  3. Ensure that your cluster nodes are not tainted with NoSchedule, otherwise the installation will fail. This is very common on single-node Kubernetes installed with kubeadm.
  4. Ensure that the worker nodes where CoCo will be installed have SELinux disabled as this is a current limitation (refer to the v0.10.0 limitations for further details).

How to install Confidential Containers

The CoCo runtime is bundled in a Kubernetes operator that should be deployed on your cluster.

In this section you will learn how to get the CoCo operator installed.

First, add the node.kubernetes.io/worker= label to every cluster node on which you want the runtime installed. This is how the cluster admin tells the operator controller which nodes, in a multi-node cluster, need the runtime. Use the command kubectl label node NODE_NAME "node.kubernetes.io/worker=" as in the listing below to add the label:

$ kubectl get nodes
NAME    	STATUS   ROLES       	AGE   VERSION
coco-demo   Ready	control-plane   87s   v1.30.1
$ kubectl label node "coco-demo" "node.kubernetes.io/worker="
node/coco-demo labeled

Once the target worker nodes are properly labeled, the next step is to install the operator controller. First, ensure that SELinux is disabled or in permissive mode, because the operator controller will attempt to restart services on your system and SELinux may deny that. The following sequence of commands sets SELinux to permissive and installs the operator controller:

$ sudo setenforce 0
$ kubectl apply -k github.com/confidential-containers/operator/config/release?ref=v0.10.0

This will create a series of resources in the confidential-containers-system namespace. In particular, it creates a deployment with pods that all need to be running before you continue the installation, as shown below:

$ kubectl get pods -n confidential-containers-system
NAME                                          	READY   STATUS	RESTARTS   AGE
cc-operator-controller-manager-557b5cbdc5-q7wk7   2/2 	Running   0      	2m42s

The operator controller is capable of managing the installation of different CoCo runtimes through Kubernetes custom resources. In the v0.10.0 release, the following runtimes are supported:

  • ccruntime - the default, Kata Containers based implementation of CoCo. This is the runtime that we will use here.
  • enclave-cc - provides process-based isolation using Intel SGX

Now it is time to install the ccruntime runtime. You should run the following commands and wait a few minutes while it downloads and installs Kata Containers and configures your node for CoCo:

$ kubectl apply -k github.com/confidential-containers/operator/config/samples/ccruntime/default?ref=v0.10.0
ccruntime.confidentialcontainers.org/ccruntime-sample created
$ kubectl get pods -n confidential-containers-system --watch
NAME                                          	READY   STATUS	RESTARTS   AGE
cc-operator-controller-manager-557b5cbdc5-q7wk7   2/2 	Running   0      	26m
cc-operator-daemon-install-q27qz              	1/1 	Running   0      	8m10s
cc-operator-pre-install-daemon-d55v2          	1/1 	Running   0      	8m35s

Notice that a couple of Kubernetes runtime classes get installed, as shown in the listing below. Each class defines a container runtime configuration; for example, kata-qemu-tdx should be used to launch QEMU/KVM on Intel TDX hardware (similarly, kata-qemu-snp for AMD SEV-SNP). For the purpose of creating a confidential pod in a non-TEE environment, we will use the kata-qemu-coco-dev runtime class.

$ kubectl get runtimeclasses
NAME             	HANDLER          	AGE
kata             	kata-qemu        	26m
kata-clh         	kata-clh         	26m
kata-qemu        	kata-qemu        	26m
kata-qemu-coco-dev    kata-qemu-coco-dev    26m
kata-qemu-sev    	kata-qemu-sev    	26m
kata-qemu-snp    	kata-qemu-snp    	26m
kata-qemu-tdx    	kata-qemu-tdx    	26m

Creating your first confidential pod

In this section we will create the bare-minimum confidential pod using a regular busybox image. Later on we will show how to use encrypted container images.

You should create the coco-demo-01.yaml file with the content:

---
apiVersion: v1
kind: Pod
metadata:
  name: coco-demo-01
  annotations:
    "io.containerd.cri.runtime-handler": "kata-qemu-coco-dev"
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: busybox
      image: quay.io/prometheus/busybox:latest
      imagePullPolicy: Always
      command:
        - sleep
        - "infinity"
  restartPolicy: Never

Then you should apply that manifest and wait for the pod to be RUNNING as shown below:

$ kubectl apply -f coco-demo-01.yaml
pod/coco-demo-01 created
$ kubectl get pods
NAME       	READY   STATUS	RESTARTS   AGE
coco-demo-01   1/1 	Running   0      	24s

Congrats! Your first Confidential Containers pod has been created and you don’t need confidential hardware!

A view of what’s going on behind the scenes

In this section we’ll show you some concepts and details of the CoCo implementation that can be demonstrated with this simple coco-demo-01 pod. Later we will create more complex and interesting examples.

Containers inside a confidential virtual machine (CVM)

Our confidential containers implementation is built on Kata Containers, whose most notable feature is running the containers in a virtual machine (VM), so the created demo pod is naturally isolated from the host kernel.

Currently CoCo supports launching pods with QEMU only, although Kata Containers supports other hypervisors. An instance of QEMU was launched to run coco-demo-01, as you can see below:

$ ps aux | grep /opt/kata/bin/qemu-system-x86_64
root   	15892  0.8  3.6 2648004 295424 ?  	Sl   20:36   0:04 /opt/kata/bin/qemu-system-x86_64 -name sandbox-baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612 -uuid e8a3fb26-eafa-4d6b-b74e-93d0314b6e35 -machine q35,accel=kvm,nvdimm=on -cpu host,pmu=off -qmp unix:fd=3,server=on,wait=off -m 2048M,slots=10,maxmem=8961M -device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=off,addr=2,io-reserve=4k,mem-reserve=1m,pref64-reserve=1m -device virtio-serial-pci,disable-modern=true,id=serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612/console.sock,server=on,wait=off -device nvdimm,id=nv0,memdev=mem0,unarmed=on -object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-ubuntu-latest-confidential.image,size=268435456,readonly=on -device virtio-scsi-pci,id=scsi0,disable-modern=true -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0 -device vhost-vsock-pci,disable-modern=true,vhostfd=4,id=vsock-1515224306,guest-cid=1515224306 -netdev tap,id=network-0,vhost=on,vhostfds=5,fds=6 -device driver=virtio-net-pci,netdev=network-0,mac=6a:e6:eb:34:52:32,disable-modern=true,mq=on,vectors=4 -rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic --no-reboot -object memory-backend-ram,id=dimm1,size=2048M -numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-6.7-136-confidential -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 console=hvc0 console=hvc1 quiet systemd.show_status=false panic=1 nr_cpus=4 selinux=0 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service 
systemd.mask=systemd-networkd.socket scsi_mod.scan=none -pidfile /run/vc/vm/baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612/pid -smp 1,cores=1,threads=1,sockets=4,maxcpus=4

The launched kernel (/opt/kata/share/kata-containers/vmlinuz-6.7-136-confidential) and guest image (/opt/kata/share/kata-containers/kata-ubuntu-latest-confidential.image), as well as QEMU (/opt/kata/bin/qemu-system-x86_64) were all installed on the host system by the CoCo operator runtime.

If you run uname -a inside coco-demo-01 and compare it with the value obtained from the host, you will notice that the container is isolated by a different kernel, as shown below:

$ kubectl exec coco-demo-01 -- uname -a
Linux 6.7.0 #1 SMP Mon Sep  9 09:48:13 UTC 2024 x86_64 GNU/Linux
$ uname -a
Linux coco-demo 5.15.0-97-generic #107-Ubuntu SMP Wed Feb 7 13:26:48 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

If you were running on a platform with a supported TEE, you would be able to check whether the VM is enabled with confidential features such as memory and register state encryption, as well as hardware-based measurement and attestation.

Inside the VM, there is an agent (Kata Agent) process which responds to requests from the Kata Containers runtime to manage the containers’ lifecycle. In the next sections, we explain how that agent cooperates with other elements of the architecture to increase the confidentiality of the workload.

The host cannot see the container image

Oversimplifying, in a normal Kata Containers pod the container image is pulled by the container runtime on the host and is mounted inside the VM. The CoCo implementation changes that behavior through a chain of delegations so that the image is directly pulled from the guest VM, resulting in the host having no access to its content (except for some metadata).

If you have the ctr command in your environment, you can check that only the manifest of quay.io/prometheus/busybox was cached in containerd’s storage, and that no rootfs directory exists in /run/kata-containers/shared/sandboxes/<pod id>, as shown below:

$ sudo ctr -n "k8s.io" image check name==quay.io/prometheus/busybox:latest
REF                           	TYPE                                                  	DIGEST                                                              	STATUS       	SIZE        	UNPACKED
quay.io/prometheus/busybox:latest application/vnd.docker.distribution.manifest.list.v2+json sha256:dfa54ef35e438b9e71ac5549159074576b6382f95ce1a434088e05fd6b730bc4 incomplete (1/3) 1.0 KiB/1.2 MiB false
$ sudo find /run/kata-containers/shared/sandboxes/baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612/
/run/kata-containers/shared/sandboxes/baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612/
/run/kata-containers/shared/sandboxes/baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612/mounts
/run/kata-containers/shared/sandboxes/baabb31ff0c798a31bca7373f2abdbf2936375a5729a3599799c0a225f3b9612/shared

It is worth mentioning that not caching on the host has its downside: images cannot be shared across pods, which impacts container bring-up performance. This is an area that the CoCo community will be addressing with a better solution in upcoming releases.

Going towards confidentiality

In this section we will increase the complexity of the pod and the configuration of CoCo to showcase more features.

Adding Kata Containers agent policies

You get points if you noticed in the coco-demo-01 pod example that the host owner can execute arbitrary commands in the container, potentially stealing sensitive data, which obviously goes against the confidentiality mantra of “never trust the host”. Fortunately, this operation and other default behaviors can be configured using the Kata Containers agent policy mechanism.

As an example, let’s show how to block the ExecProcessRequest endpoint of the kata-agent to deny the execution of commands in the container. First you need to encode a Rego policy file in base64, as shown below:

$ curl -s https://raw.githubusercontent.com/kata-containers/kata-containers/refs/heads/main/src/kata-opa/allow-all-except-exec-process.rego | base64 -w 0
IyBDb3B5cmlnaHQgKGMpIDIwMjMgTWljcm9zb2Z0IENvcnBvcmF0aW9uCiMKIyBTUERYLUxpY2Vuc2UtSWRlbnRpZmllcjogQXBhY2hlLTIuMAojCgpwYWNrYWdlIGFnZW50X3BvbGljeQoKZGVmYXVsdCBBZGRBUlBOZWlnaGJvcnNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBBZGRTd2FwUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ2xvc2VTdGRpblJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IENvcHlGaWxlUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlQ29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlU2FuZGJveFJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IERlc3Ryb3lTYW5kYm94UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR2V0TWV0cmljc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IEdldE9PTUV2ZW50UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR3Vlc3REZXRhaWxzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgTGlzdEludGVyZmFjZXNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBMaXN0Um91dGVzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgTWVtSG90cGx1Z0J5UHJvYmVSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBPbmxpbmVDUFVNZW1SZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBQYXVzZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFB1bGxJbWFnZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlYWRTdHJlYW1SZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZW1vdmVDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZW1vdmVTdGFsZVZpcnRpb2ZzU2hhcmVNb3VudHNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZXNlZWRSYW5kb21EZXZSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZXN1bWVDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTZXRHdWVzdERhdGVUaW1lUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU2V0UG9saWN5UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU2lnbmFsUHJvY2Vzc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFN0YXJ0Q29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhcnRUcmFjaW5nUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhdHNDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTdG9wVHJhY2luZ1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFR0eVdpblJlc2l6ZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUVwaGVtZXJhbE1vdW50c1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUludGVyZmFjZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZVJvdXRlc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFdhaXRQcm9jZXNzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgV3JpdGVTdHJlYW1SZXF1ZXN0IDo9IHRydWUKCmRlZmF1bHQgRXhlY1Byb2Nlc3NSZXF1ZXN0IDo9IGZhbHNlCg==
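
For readability, you can keep the policy in a local file and encode it yourself; decoding the annotation value back is also a good way to audit what you are about to deploy. A minimal sketch using a short stand-in policy (demo_policy.rego is a hypothetical file name, and the trimmed content is only for demonstrating the round trip, not a usable agent policy):

```shell
# Round-trip a stand-in policy through base64 to show how the annotation
# value is produced and how to audit it. For real use, encode the full
# allow-all-except-exec-process.rego file instead.
printf 'package agent_policy\n\ndefault ExecProcessRequest := false\n' > demo_policy.rego
POLICY_B64=$(base64 -w 0 < demo_policy.rego)
echo "$POLICY_B64" | base64 -d   # prints the original policy text back
```

The same base64 -d pipeline works on the full blob printed above.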

Then you pass the policy to the runtime via io.katacontainers.config.agent.policy pod annotation. You should create the coco-demo-02.yaml file with the content:

---
apiVersion: v1
kind: Pod
metadata:
  name: coco-demo-02
  annotations:
    "io.containerd.cri.runtime-handler": "kata-qemu-coco-dev"
    io.katacontainers.config.agent.policy: IyBDb3B5cmlnaHQgKGMpIDIwMjMgTWljcm9zb2Z0IENvcnBvcmF0aW9uCiMKIyBTUERYLUxpY2Vuc2UtSWRlbnRpZmllcjogQXBhY2hlLTIuMAojCgpwYWNrYWdlIGFnZW50X3BvbGljeQoKZGVmYXVsdCBBZGRBUlBOZWlnaGJvcnNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBBZGRTd2FwUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ2xvc2VTdGRpblJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IENvcHlGaWxlUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlQ29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgQ3JlYXRlU2FuZGJveFJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IERlc3Ryb3lTYW5kYm94UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR2V0TWV0cmljc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IEdldE9PTUV2ZW50UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgR3Vlc3REZXRhaWxzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgTGlzdEludGVyZmFjZXNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBMaXN0Um91dGVzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgTWVtSG90cGx1Z0J5UHJvYmVSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBPbmxpbmVDUFVNZW1SZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBQYXVzZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFB1bGxJbWFnZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFJlYWRTdHJlYW1SZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZW1vdmVDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZW1vdmVTdGFsZVZpcnRpb2ZzU2hhcmVNb3VudHNSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZXNlZWRSYW5kb21EZXZSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBSZXN1bWVDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTZXRHdWVzdERhdGVUaW1lUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU2V0UG9saWN5UmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU2lnbmFsUHJvY2Vzc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFN0YXJ0Q29udGFpbmVyUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhcnRUcmFjaW5nUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgU3RhdHNDb250YWluZXJSZXF1ZXN0IDo9IHRydWUKZGVmYXVsdCBTdG9wVHJhY2luZ1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFR0eVdpblJlc2l6ZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUNvbnRhaW5lclJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUVwaGVtZXJhbE1vdW50c1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZUludGVyZmFjZVJlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFVwZGF0ZVJvdXRlc1JlcXVlc3QgOj0gdHJ1ZQpkZWZhdWx0IFdhaXRQcm9jZXNzUmVxdWVzdCA6PSB0cnVlCmRlZmF1bHQgV3JpdGVTdHJlYW1SZXF1ZXN0IDo9IHRydWUKCmRlZmF1bHQgRXhlY1Byb2Nlc3NSZXF1ZXN0IDo9IGZhbHNlCg==
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: busybox
      image: quay.io/prometheus/busybox:latest
      imagePullPolicy: Always
      command:
        - sleep
        - "infinity"
  restartPolicy: Never

Create the pod, wait for it to be RUNNING, then check that kubectl cannot exec into the container. For comparison, run exec on coco-demo-01, as shown below:

$ kubectl apply -f coco-demo-02.yaml
pod/coco-demo-02 created
$ kubectl get pod
NAME       	READY   STATUS	RESTARTS   AGE
coco-demo-01   1/1 	Running   0      	16h
coco-demo-02   1/1 	Running   0      	41s
$ kubectl exec coco-demo-02 -- uname -a
error: Internal error occurred: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "69dc5c8c5b36f0afa330e9ddcc022117203493826cfd95c7900807d6cad985fd": cannot enter container 2da76aa4f403fb234507a23d852528d839eca20ad364219306815ca90478e00f, with err rpc error: code = PermissionDenied desc = "ExecProcessRequest is blocked by policy: ": unknown
$ kubectl exec coco-demo-01 -- uname -a
Linux 6.7.0 #1 SMP Mon Sep  9 09:48:13 UTC 2024 x86_64 GNU/Linux

Great! Now you know how to set a policy for the kata-agent. You can read more about that feature here.

But you might be asking yourself: “what if a malicious actor modifies or simply drops the policy annotation?” The short answer is: “let’s attest its integrity!”. In the next section we will talk about the role of remote attestation in CoCo.

Attesting them all!

Attestation has changed considerably since the early versions of the CoCo project and has received a lot of attention from the community. The attestation area is too complex to explain fully in this blog, so we recommend that you read the following documents before continuing:

Preparing a KBS

In this blog we will deploy a development/test version of the Key Broker Service (KBS) on the same Kubernetes cluster where the CoCo runtime is running. In production, the KBS should instead run in a trusted environment, and you may want to use our trustee operator or perhaps even your own implementation.

The following instructions install the KBS on your cluster and expose its service via a NodePort. For further information about deploying the KBS on Kubernetes, see this README. So do:

$ git clone https://github.com/confidential-containers/trustee --single-branch -b v0.10.1
$ cd trustee/kbs/config/kubernetes
$ echo "somesecret" > overlays/$(uname -m)/key.bin
$ export DEPLOYMENT_DIR=nodeport
$ ./deploy-kbs.sh
<output omitted>
$ export KBS_PRIVATE_KEY="${PWD}/base/kbs.key"

Wait for the KBS deployment to be ready and running, just like below:

$ kubectl -n coco-tenant get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
kbs	1/1 	1        	1       	26m

You will need the KBS host and port to configure the pod. These values can be obtained as in the listing below:

$ export KBS_HOST=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}' -n coco-tenant)
$ export KBS_PORT=$(kubectl get svc "kbs" -n "coco-tenant" -o jsonpath='{.spec.ports[0].nodePort}')

At this point KBS is up and running but lacks policies and resources. To facilitate its configuration we will be using the kbs-client tool. Use the oras tool to download a build of kbs-client:

$ curl -LOs "https://github.com/oras-project/oras/releases/download/v1.2.0/oras_1.2.0_linux_amd64.tar.gz"
$ tar xvzf oras_1.2.0_linux_amd64.tar.gz
$ ./oras pull ghcr.io/confidential-containers/staged-images/kbs-client:sample_only-x86_64-linux-gnu-68607d4300dda5a8ae948e2562fd06d09cbd7eca
$ chmod +x kbs-client

The KBS address is passed to the kata-agent in the confidential VM via the kernel_params annotation (io.katacontainers.config.hypervisor.kernel_params). You should set the agent.aa_kbc_params parameter to cc_kbc::http://host:port, where host:port is the KBS address.
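
Rather than hard-coding the address, the annotation value can be assembled from the KBS_HOST and KBS_PORT variables exported earlier. A small sketch (the address values below are hypothetical; substitute your own):

```shell
# Assemble the kernel_params annotation value from the KBS address.
# 192.168.122.153:31491 is a hypothetical address; use your $KBS_HOST/$KBS_PORT.
KBS_HOST=192.168.122.153
KBS_PORT=31491
KERNEL_PARAMS=" agent.aa_kbc_params=cc_kbc::http://${KBS_HOST}:${KBS_PORT}"
echo "$KERNEL_PARAMS"
```

The resulting string is what goes into the io.katacontainers.config.hypervisor.kernel_params annotation of the manifests that follow.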

As an example, create coco-demo-03.yaml with the content below but replace http://192.168.122.153:31491 with the http://$KBS_HOST:$KBS_PORT value that matches the KBS address of your installation:

---
apiVersion: v1
kind: Pod
metadata:
  name: coco-demo-03
  annotations:
    "io.containerd.cri.runtime-handler": "kata-qemu-coco-dev"
    io.katacontainers.config.hypervisor.kernel_params: " agent.aa_kbc_params=cc_kbc::http://192.168.122.153:31491"
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: busybox
      image: quay.io/prometheus/busybox:latest
      imagePullPolicy: Always
      command:
        - sleep
        - "infinity"
  restartPolicy: Never

If you apply coco-demo-03.yaml, the pod should run as expected, but nothing apparently interesting happens. However, we now have everything in place to dive into some features in the next sections.

Getting resources from the Confidential Data Hub

The Confidential Data Hub (CDH) is a service running inside the confidential VM that shares the pod’s network namespace, meaning that its REST API can be accessed from within a confidential container. Among the services provided by the CDH are sealed secret unsealing (not covered in this blog) and fetching resources from the KBS. In this section we will show the latter as a tool for learning more about the attestation flow.
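
The CDH resource endpoint mirrors the KBS resource URI scheme kbs:///<repository>/<type>/<tag>, so a resource registered at default/secret/1 is fetched from the matching /cdh/resource/ path. A quick sketch of how the in-guest URL is composed:

```shell
# Compose the in-guest CDH URL for a KBS resource path.
CDH_ADDR="127.0.0.1:8006"     # the CDH listens here inside the pod
RESOURCE="default/secret/1"   # <repository>/<type>/<tag>
echo "http://${CDH_ADDR}/cdh/resource/${RESOURCE}"
# prints http://127.0.0.1:8006/cdh/resource/default/secret/1
```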

Let’s add the to-be-fetched resource to the KBS first. Think of that resource as a secret key required to decrypt an important file for data processing. Using kbs-client, do the following (KBS_HOST, KBS_PORT and KBS_PRIVATE_KEY are previously defined variables):

$ echo "MySecretKey" > secret.txt
$ ./kbs-client --url "http://$KBS_HOST:$KBS_PORT" config --auth-private-key "$KBS_PRIVATE_KEY" set-resource --path default/secret/1 --resource-file secret.txt
Set resource success
 resource: TXlTZWNyZXRLZXkK
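
The resource: value echoed by kbs-client is simply the base64 encoding of the uploaded file, which you can reproduce locally:

```shell
# Reproduce the base64 value reported by kbs-client for secret.txt.
echo "MySecretKey" > secret.txt
base64 -w 0 < secret.txt   # prints TXlTZWNyZXRLZXkK
```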

The CDH service listens at address 127.0.0.1:8006. So you just need to modify the coco-demo-03.yaml file to run the wget -O- http://127.0.0.1:8006/cdh/resource/default/secret/1 command:

---
apiVersion: v1
kind: Pod
metadata:
  name: coco-demo-04
  annotations:
    "io.containerd.cri.runtime-handler": "kata-qemu-coco-dev"
    io.katacontainers.config.hypervisor.kernel_params: " agent.aa_kbc_params=cc_kbc::http://192.168.122.153:31491"
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: busybox
      image: quay.io/prometheus/busybox:latest
      imagePullPolicy: Always
      command:
        - sh
        - -c
        - |
          wget -O- http://127.0.0.1:8006/cdh/resource/default/secret/1; sleep infinity
  restartPolicy: Never

Apply coco-demo-04.yaml and wait for it to reach the RUNNING state. Checking the pod logs, you will notice that wget failed to fetch the secret:

$ kubectl apply -f coco-demo-04.yaml
pod/coco-demo-04 created
$ kubectl wait --for=condition=Ready pod/coco-demo-04
pod/coco-demo-04 condition met
$ kubectl logs pod/coco-demo-04
Connecting to 127.0.0.1:8006 (127.0.0.1:8006)
wget: server returned error: HTTP/1.1 500 Internal Server Error

Looking at the KBS logs, we can see that the problem was caused by a Resource not permitted denial:

$ kubectl logs -l app=kbs -n coco-tenant
Defaulted container "kbs" out of: kbs, copy-config (init)
[2024-11-07T20:04:32Z INFO  kbs::http::resource] Get resource from kbs:///default/secret/1
[2024-11-07T20:04:32Z ERROR kbs::http::error] Resource not permitted.
[2024-11-07T20:04:32Z INFO  actix_web::middleware::logger] 10.244.0.1 "GET /kbs/v0/resource/default/secret/1 HTTP/1.1" 401 112 "-" "attestation-agent-kbs-client/0.1.0" 0.000351
[2024-11-07T20:04:32Z INFO  kbs::http::attest] Auth API called.
[2024-11-07T20:04:32Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000125
[2024-11-07T20:04:32Z INFO  kbs::http::attest] Attest API called.
[2024-11-07T20:04:32Z INFO  attestation_service] Sample Verifier/endorsement check passed.
[2024-11-07T20:04:32Z INFO  attestation_service] Policy check passed.
[2024-11-07T20:04:32Z INFO  attestation_service] Attestation Token (Simple) generated.
[2024-11-07T20:04:32Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 200 2171 "-" "attestation-agent-kbs-client/0.1.0" 0.001940

Because you aren’t running on a real TEE, the attestation verification failed, so you need to configure a permissive resource policy in the KBS. See some sample policies for more examples. Create a sample permissive resources_policy.rego file (note: do not use this in production, as it doesn’t validate confidential hardware) with the content:

package policy

default allow = false

allow {
    input["tee"] == "sample"
}
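
To create the file from the snippet above in one step, and to see the base64 value that kbs-client echoes back when the policy is set, you can use a heredoc:

```shell
# Write resources_policy.rego and compute its base64 encoding; this encoding
# is what kbs-client reports back after set-resource-policy.
cat > resources_policy.rego <<'EOF'
package policy

default allow = false

allow {
    input["tee"] == "sample"
}
EOF
base64 -w 0 < resources_policy.rego
```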

The GetResource request to CDH is an attested operation. The policy in resources_policy.rego denies access to any resource by default, but releases it when the request comes from a sample TEE.

Apply the resources_policy.rego policy to the KBS, then re-create the coco-demo-04 pod, and you will see that MySecretKey is now fetched:

$ ./kbs-client --url "http://$KBS_HOST:$KBS_PORT" config --auth-private-key "$KBS_PRIVATE_KEY" set-resource-policy --policy-file resources_policy.rego
Set resource policy success
 policy: cGFja2FnZSBwb2xpY3kKCmRlZmF1bHQgYWxsb3cgPSBmYWxzZQoKYWxsb3cgewogICAgaW5wdXRbInRlZSJdID09ICJzYW1wbGUiCn0K
$ kubectl apply -f coco-demo-04.yaml
pod/coco-demo-04 created
$ kubectl wait --for=condition=Ready pod/coco-demo-04
pod/coco-demo-04 condition met
$ kubectl logs pod/coco-demo-04
Connecting to 127.0.0.1:8006 (127.0.0.1:8006)
writing to stdout
-                	100% |********************************|	12  0:00:00 ETA
written to stdout
MySecretKey

In the KBS log messages below you can see that the Attestation Service (AS) was involved in the request. A sample verifier was invoked in place of a real hardware-oriented one to emulate the verification process. The generated attestation token (see the Attestation Token (Simple) generated message in the log) is passed all the way back to the CDH in the confidential VM, which can then finally request the resource (the Get resource from kbs:///default/secret/1 message) from the KBS.

$ kubectl logs -l app=kbs -n coco-tenant
Defaulted container "kbs" out of: kbs, copy-config (init)
[2024-11-07T22:04:22Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/auth HTTP/1.1" 200 74 "-" "attestation-agent-kbs-client/0.1.0" 0.000185
[2024-11-07T22:04:22Z INFO  kbs::http::attest] Attest API called.
[2024-11-07T22:04:22Z INFO  attestation_service] Sample Verifier/endorsement check passed.
[2024-11-07T22:04:22Z INFO  attestation_service] Policy check passed.
[2024-11-07T22:04:22Z INFO  attestation_service] Attestation Token (Simple) generated.
[2024-11-07T22:04:22Z INFO  actix_web::middleware::logger] 10.244.0.1 "POST /kbs/v0/attest HTTP/1.1" 200 2171 "-" "attestation-agent-kbs-client/0.1.0" 0.001931
[2024-11-07T22:04:22Z WARN  kbs::token::coco] No Trusted Certificate in Config, skip verification of JWK cert of Attestation Token
[2024-11-07T22:04:22Z INFO  kbs::http::resource] Get resource from kbs:///default/secret/1
[2024-11-07T22:04:22Z INFO  kbs::http::resource] Resource access request passes policy check.
[2024-11-07T22:04:22Z INFO  actix_web::middleware::logger] 10.244.0.1 "GET /kbs/v0/resource/default/secret/1 HTTP/1.1" 200 504 "-" "attestation-agent-kbs-client/0.1.0" 0.001104

What we have shown in this section is obviously a simplification of the workflows involving the guest-side (CDH, Attestation Agent, etc.) and server-side (KBS, Attestation Service, etc.) components. The most important lesson is that even if you are running on your workstation without any TEE hardware, the workflows are still exercised, just with a sample attestation verifier that always passes.

Using encrypted container images

Let’s show one more feature that you can try on your workstation: support for running workloads from encrypted container images.

For this example we will use an already encrypted image (ghcr.io/confidential-containers/test-container:multi-arch-encrypted) that is used in CoCo’s development tests. Create the coco-demo-05.yaml file with the following content:

---
apiVersion: v1
kind: Pod
metadata:
  name: coco-demo-05
  annotations:
    "io.containerd.cri.runtime-handler": "kata-qemu-coco-dev"
    io.katacontainers.config.hypervisor.kernel_params: " agent.aa_kbc_params=cc_kbc::http://192.168.122.153:31491"
spec:
  runtimeClassName: kata-qemu-coco-dev
  containers:
    - name: ssh-demo
      image: ghcr.io/confidential-containers/test-container:multi-arch-encrypted
      imagePullPolicy: Always
      command:
        - sleep
        - "infinity"
  restartPolicy: Never

Apply the pod manifest, wait a bit, and you will see that it failed to start with a StartError status:

$ kubectl describe pods/coco-demo-05
Name:            	coco-demo-05
Namespace:       	default
Priority:        	0
Runtime Class Name:  kata-qemu-coco-dev
Service Account: 	default
Node:            	coco-demo/192.168.122.153
Start Time:      	Mon, 11 Nov 2024 15:39:08 +0000
Labels:          	<none>
Annotations:     	io.containerd.cri.runtime-handler: kata-qemu-coco-dev
                 	io.katacontainers.config.hypervisor.kernel_params:  agent.aa_kbc_params=cc_kbc::http://192.168.122.153:31491
Status:          	Failed
IP:              	10.244.0.15
IPs:
  IP:  10.244.0.15
Containers:
  ssh-demo:
	Container ID:  containerd://a94522e7f9ed08b9384874be7b696fbd25998ed0fc24f7c13fc7f8167fb06c80
	Image:     	ghcr.io/confidential-containers/test-container:multi-arch-encrypted
	Image ID:  	ghcr.io/confidential-containers/test-container@sha256:96d19d2729d83379c8ddc6b2b9551d2dbe6797632c6eb7e6f50cbadc283bfdf6
	Port:      	<none>
	Host Port: 	<none>
	Command:
  	sleep
  	infinity
	State:  	Terminated
  	Reason:   StartError
  	Message:  failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key

Caused by:
	no suitable key found for decrypting layer key:
 	keyprovider: failed to unwrap key by ttrpc

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: <unknown>
  10: <unknown>
  11: <unknown>
  12: <unknown>
  13: <unknown>
  14: <unknown>
  15: <unknown>: unknown
  	Exit Code:	128
  	Started:  	Thu, 01 Jan 1970 00:00:00 +0000
  	Finished: 	Mon, 11 Nov 2024 15:39:19 +0000
	Ready:      	False
	Restart Count:  0
	Environment:	<none>
	Mounts:
  	/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qkvv4 (ro)
Conditions:
  Type                    	Status
  PodReadyToStartContainers   False
  Initialized             	True
  Ready                   	False
  ContainersReady         	False
  PodScheduled            	True
Volumes:
  kube-api-access-qkvv4:
	Type:                	Projected (a volume that contains injected data from multiple sources)
	TokenExpirationSeconds:  3607
	ConfigMapName:       	kube-root-ca.crt
	ConfigMapOptional:   	<nil>
	DownwardAPI:         	true
QoS Class:               	BestEffort
Node-Selectors:          	katacontainers.io/kata-runtime=true
Tolerations:             	node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                         	node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type 	Reason 	Age	From           	Message
  ---- 	------ 	----   ----           	-------
  Normal   Scheduled  2m42s  default-scheduler  Successfully assigned default/coco-demo-05 to coco-demo
  Normal   Pulling	2m39s  kubelet        	Pulling image "ghcr.io/confidential-containers/test-container:multi-arch-encrypted"
  Normal   Pulled 	2m35s  kubelet        	Successfully pulled image "ghcr.io/confidential-containers/test-container:multi-arch-encrypted" in 3.356s (3.356s including waiting). Image size: 5581550 bytes.
  Normal   Created	2m35s  kubelet        	Created container ssh-demo
  Warning  Failed 	2m32s  kubelet        	Error: failed to create containerd task: failed to create shim task: failed to handle layer: failed to get decrypt key

Caused by:
	no suitable key found for decrypting layer key:
 	keyprovider: failed to unwrap key by ttrpc

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: <unknown>
  10: <unknown>
  11: <unknown>
  12: <unknown>
  13: <unknown>
  14: <unknown>
  15: <unknown>: unknown

It failed because the decryption key wasn’t found in the KBS. So let’s insert the key:

$ echo "HUlOu8NWz8si11OZUzUJMnjiq/iZyHBJZMSD3BaqgMc=" | base64 -d > image_key.txt
$ ./kbs-client --url "http://$KBS_HOST:$KBS_PORT" config --auth-private-key "$KBS_PRIVATE_KEY" set-resource --path default/key/ssh-demo --resource-file image_key.txt
Set resource success
 resource: HUlOu8NWz8si11OZUzUJMnjiq/iZyHBJZMSD3BaqgMc=

Then restart the coco-demo-05 pod and it should start running just fine.

As demonstrated by the listing below, you can inspect the image with skopeo. Note that each of its layers is encrypted (MIMEType is tar+gzip+encrypted) and annotated with org.opencontainers.image.enc.* tags. In particular, the org.opencontainers.image.enc.keys.provider.attestation-agent annotation encodes the decryption key path (e.g. kbs:///default/key/ssh-demo) in the KBS:

$ skopeo inspect --raw docker://ghcr.io/confidential-containers/test-container:multi-arch-encrypted
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.oci.image.index.v1+json",
   "manifests": [
    {
     	"mediaType": "application/vnd.oci.image.manifest.v1+json",
     	"size": 4976,
     	"digest": "sha256:ac0ed3364d54120a1025c74f555b21cb378d4d0f62a398c5ce3e1b89fa4ca637",
     	"platform": {
          "architecture": "amd64",
          "os": "linux"
     	}
  	},
    {
     	"mediaType": "application/vnd.oci.image.manifest.v1+json",
     	"size": 4976,
     	"digest": "sha256:b9b29ea19f50c2ca814d4e0b72d1183c474b1e6d75cee61fb1ac6777bc38f688",
     	"platform": {
          "architecture": "s390x",
          "os": "linux"
     	}
  	}
   ]
}
$ skopeo inspect --raw docker://ghcr.io/confidential-containers/test-container@sha256:ac0ed3364d54120a1025c74f555b21cb378d4d0f62a398c5ce3e1b89fa4ca637 | jq -r '.layers[0]'
{
  "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip+encrypted",
  "digest": "sha256:766192d6b7d02409620f7c1f09bfa11fc273a2ec2e6383a56fb5fcf52d94d73e",
  "size": 2829647,
  "annotations": {
	"org.opencontainers.image.enc.keys.provider.attestation-agent": "eyJraWQiOiJrYnM6Ly8vZGVmYXVsdC9rZXkvc3NoLWRlbW8iLCJ3cmFwcGVkX2RhdGEiOiJxdkY5Rkh4eXZFRDZBd21IYXpGOTV1d01MUjdNb2ZoUHRHaGE4MWZyRGNrb21UNEUvekFkSHB4c1ZJcnpud0JKR3J1bVB3NDFIVUpwN0RNME9qUzlmMFhtQ1dMUWtTYkZ1Y280eGQyMFdoMFBreDdJQmdGTDduMnhKbkZiQ2V5NFNhRktXZDZ4MDJRNVd5VkVvekU3V1h1R2wwaHVNVGJHb3UxV3JIa0FzTVZwRTlYejVGcVNoTlMvUFZ4aTAyUTVGK2d6RGJZRENIb2crZ2ZqQlhNTkdlS2hNSXF6ZnM0LzhSUDRzZ1RaelV3Z3ZXMTFKUmQ2WVhHM1ZySW01NGVxbW1Pci94OG8yM2hFakIvWS85TzhvTHc9IiwiaXYiOiJoTXNWcE1ZZXRwWUtOK0pzIiwid3JhcF90eXBlIjoiQTI1NkdDTSJ9",
	"org.opencontainers.image.enc.pubopts": "eyJjaXBoZXIiOiJBRVNfMjU2X0NUUl9ITUFDX1NIQTI1NiIsImhtYWMiOiJCWXg2dUg2MEZMc2lNMUc2RUk4KzZzTlQ5QlRMN2lVamtvRWlmNVVBM09nPSIsImNpcGhlcm9wdGlvbnMiOnt9fQ=="
  }
}
$ skopeo inspect --raw docker://ghcr.io/confidential-containers/test-container@sha256:ac0ed3364d54120a1025c74f555b21cb378d4d0f62a398c5ce3e1b89fa4ca637 | jq -r '.layers[0].annotations["org.opencontainers.image.enc.keys.provider.attestation-agent"]' | base64 -d
{"kid":"kbs:///default/key/ssh-demo","wrapped_data":"qvF9FHxyvED6AwmHazF95uwMLR7MofhPtGha81frDckomT4E/zAdHpxsVIrznwBJGrumPw41HUJp7DM0OjS9f0XmCWLQkSbFuco4xd20Wh0Pkx7IBgFL7n2xJnFbCey4SaFKWd6x02Q5WyVEozE7WXuGl0huMTbGou1WrHkAsMVpE9Xz5FqShNS/PVxi02Q5F+gzDbYDCHog+gfjBXMNGeKhMIqzfs4/8RP4sgTZzUwgvW11JRd6YXG3VrIm54eqmmOr/x8o23hEjB/Y/9O8oLw=","iv":"hMsVpMYetpYKN+Js","wrap_type":"A256GCM"}
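The second annotation, org.opencontainers.image.enc.pubopts, can be decoded the same way. It carries no key material, only the symmetric cipher configuration used to protect the layer payload:

```shell
# Decode the pubopts annotation of the encrypted layer shown above.
PUBOPTS='eyJjaXBoZXIiOiJBRVNfMjU2X0NUUl9ITUFDX1NIQTI1NiIsImhtYWMiOiJCWXg2dUg2MEZMc2lNMUc2RUk4KzZzTlQ5QlRMN2lVamtvRWlmNVVBM09nPSIsImNpcGhlcm9wdGlvbnMiOnt9fQ=='
echo "$PUBOPTS" | base64 -d
```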

If you are curious about encryption of container images for CoCo, please refer to the Keyprovider tool.

Closing remarks

There are many features not covered in this blog that all leverage the attestation mechanism implemented in CoCo. All of these features can be tested, fixed, and developed on a workstation without confidential hardware, as long as the kata-qemu-coco-dev runtime class is employed. Thanks to all the sample and mock implementations that our community built to overcome the dependency on specialized hardware.

Summary

In this tutorial, we have taken you through the process of deploying CoCo on a Kubernetes cluster and creating your first pod.

We have installed CoCo with a special runtime that allows you to create a pod with encrypted image support, but without having to use any confidential hardware. We also showed you some fundamental concepts of attestation and high-level details of its implementation.

Policing a Sandbox

How can we provide integrity guarantees for dynamic workloads in the Confidential Containers stack?
The image is a black-and-white technical drawing of a sandbox area: a rectangular box filled with sand, a shovel and bucket, a lifeguard chair, a life preserver ring, and architectural plans with dimensions in the background.

In a previous article we discussed how we can establish confidence in the integrity of an OS image for a confidential Guest that is supposed to host a colocated set (Pod) of confidential containers. This article covers the integrity of the more dynamic part of a confidential container’s lifecycle. We’ll refer to this phase as “runtime”, even though from the perspective of the container workload it might well be before the actual execution of an entrypoint or command.

The image is a diagram illustrating the components of a sandbox environment, divided into dynamic and static components. It uses color coding to differentiate between various elements: two blue squares labeled ‘Pod Sandbox’ and ‘Container’ represent the dynamic sandbox, while three green squares labeled ‘Base OS,’ ‘Kata Agent,’ and ‘Guest Components’ represent the static components. A legend at the bottom explains the color coding.

A Confidential Containers (CoCo) OS image contains static components like the kernel and a root filesystem with a container runtime (e.g. the kata-agent) and auxiliary binaries to support remote attestation. We’ve seen that those can be covered by a comprehensive measurement that remains stable across different instantiations of the same image, hosting a variety of Confidential Container workloads.

Why it’s hard

The CoCo project decided to use a Kubernetes Pod as its abstraction for confidential containerized workloads. This is a great choice from a user’s perspective, since it’s a well-known and well-supported paradigm for grouping resources, and a user simply has to set a specific runtimeClass on their Pod to launch it in a confidential TEE (at least that’s the premise).

For the implementers of such a solution, this choice comes with a few challenges. The most prominent one is the dynamic nature of a Pod. In OCI terms, a Pod is a Sandbox, in which one or more containers can be imperatively created, deleted, or updated via RPC calls to the container runtime. So instead of giving guarantees about a concrete piece of software that acts in reasonably predictable ways, we are giving guarantees about something that is inherently dynamic.

This is the sequence of RPC calls issued to the Kata agent in the Guest VM (for brevity we’ll refer to it as the Agent below) when we launch a simple Nginx Pod. Two containers are launched, because a Pod includes the implicit pause container:

create_sandbox
get_guest_details
copy_file
create_container
start_container
wait_process
copy_file
...
copy_file
create_container
stats_container
start_container
stats_container

Furthermore, a Pod is a Kubernetes resource which adds a few hard-to-predict dynamic properties, examples would be SERVICE_* environment variables or admission controllers that modify a Pod spec before it’s launched. The former is maybe tolerable (although it’s not hard to come up with a scenario in which the injection of a malicious environment variable would undermine confidentiality), the latter is definitely problematic. If we assume a Pod spec to express the intent of the user to launch a given workload, we can’t blindly trust the Kubernetes Control Plane to respect that intent when deploying a CoCo Pod.

Restricting the Container Environment

In order to preserve the integrity of a Confidential Pod, we need to observe closely what’s happening in the Sandbox to ensure nothing unintended will be executed in it that would undermine confidentiality. There are various options to do that:

  1. A locked down Kubernetes Control Plane that only allows a specific set of operations on a Pod. It’s tough and implementation-heavy since the Kubernetes API is very expressive and it’s hard to predict all the ways in which a Pod spec can be modified to launch unintended workloads, but there is active research in this area.

    This could be combined with a secure channel between the user and the runtime in the TEE, that allows users to perform certain administrative tasks (e.g. view logs) from which the k8s control plane is locked out.

  2. There could be a log of all the changes that are applied to the sandbox. We can record RPC calls and their request payloads into the runtime measurement registers of a hardware TEE (e.g. TDX RTMRs or TPM PCRs), which are included in the hardware evidence. This would allow us to replay the sequence of events that led to the current state of the sandbox and verify that it’s in line with the user’s intent, before we release a confidential secret to the workload.

    However, not all TEEs provide facilities for such runtime measurements, and as we pointed out above: the sequence of RPC calls might be predictable, but the payloads are determined by a Kubernetes environment that cannot be easily predicted.

  3. We can use a combination of the above two approaches. A policy can describe a set of invariants that we expect to hold true for a Pod (e.g. a specific image layer digest) and relax certain dynamic properties that are deemed acceptable (e.g. the `SERVICE_*` environment variables), or we can just flat-out reject calls to a problematic RPC endpoint (e.g. exec in container). The policy is enforced by the container runtime in the TEE on every RPC invocation.

    This is elegant, as such a policy engine and core policy fragments can be developed alongside the Agent’s API, unburdening the user from understanding its intricacies. To be effective, an event log as described in option #2 would not just need to cover the API but also the underlying semantics of that API.

The image is a diagram illustrating the interaction between different components in a containerized environment. It includes two dashed-line squares labeled ‘Container’ at the top, a green and yellow block at the bottom left labeled ‘Kata Agent’ (green) and ‘Policy’ (yellow), and arrows labeled ‘RPC + Payload’ pointing between these elements.

Kata-Containers currently features an implementation of a policy engine using the popular Rego language. Convenience tooling can assist and automate aspects of authoring a policy for a workload. The following would be an example policy (hand-crafted for brevity, real policy bodies would be larger) in which we allow the launch of specific OCI images, the execution of certain commands, Kata management endpoints, but disallow pretty much everything else during runtime:

package agent_policy

import future.keywords.in
import future.keywords.if
import future.keywords.every

default CopyFileRequest := true
default DestroySandboxRequest := true
default CreateSandboxRequest := true
default GuestDetailsRequest := true
default ReadStreamRequest := true
default RemoveContainerRequest := true
default SignalProcessRequest := true
default StartContainerRequest := true
default StatsContainerRequest := true
default WaitProcessRequest := true

default CreateContainerRequest := false
default ExecProcessRequest := false

CreateContainerRequest if {
    every storage in input.storages {
        some allowed_image in policy_data.allowed_images
        storage.source == allowed_image
    }
}

ExecProcessRequest if {
    input_command := concat(" ", input.process.Args)
    some allowed_command in policy_data.allowed_commands
    input_command == allowed_command
}

policy_data := {
	"allowed_commands": [
		"whoami",
		"false",
		"curl -s http://127.0.0.1:8006/aa/token?token_type=kbs",
	],
	"allowed_images": [
		"pause",
		"docker.io/library/nginx@sha256:e56797eab4a5300158cc015296229e13a390f82bfc88803f45b08912fd5e3348",
	],
}

Policies are in many cases dynamic and specific to the workload. Kata ships the genpolicy tool that will generate a reasonable default policy based on a given k8s manifest, which can be further refined by the user. A dynamic policy cannot be bundled in the rootfs, at least not fully, since it needs to be tailored to the workload. This implies we need to provide the Guest VM with the policy at launch time, in a way that allows us to trust the policy to be genuine and unaltered. In the next section we’ll discuss how we can achieve this.

Init-Data

Confidential Containers need to access configuration data that for practical reasons cannot be baked into the OS image. This data includes URIs and certificates required to access Attestation and Key Broker Services, as well as the policy that is supposed to be enforced by the policy engine. This data is not secret, but maintaining its integrity is crucial for the confidentiality of the workload.

In the CoCo project this data is referred to as Init-Data. Init-Data is specified as a file/content dictionary in the TOML language, optimized for easy authoring and human readability. Below is a (shortened) example of a typical Init-Data block, containing some pieces of metadata, configuration for CoCo guest components, and a policy in the Rego language:

algorithm = "sha256"
version = "0.1.0"

[data]
"aa.toml" = '''
[token_configs]
[token_configs.kbs]
url = 'http://my-as:8080'
cert = """
-----BEGIN CERTIFICATE-----
MIIDEjCCAfqgAwIBAgIUZYcKIJD3QB/LG0FnacDyR1KhoikwDQYJKoZIhvcNAQEL
...
4La0LJGguzEN7y9P59TS4b3E9xFyTg==
-----END CERTIFICATE-----
"""
'''

"cdh.toml"  = '''
socket = 'unix:///run/confidential-containers/cdh.sock'
credentials = []
...
'''

"policy.rego" = '''
package agent_policy
...
'''

A user is supposed to specify Init-Data to a Confidential Guest in the form of a base64-encoded string in a specific Pod annotation. The Kata Containers runtime will then pass this data on to the Agent in the Guest VM, which will decode the Init-Data and use it to configure the runtime environment of the workload. Crucially, since Init-Data is not trusted at launch we need a way to establish that the policy has not been tampered with in the process.

Integrity of Init-Data

The Init-Data body that was illustrated above contains a metadata header which specifies a hash algorithm that is supposed to be used to verify the integrity of the Init-Data. Establishing trust in provided Init-Data is not completely trivial.

Let’s start with a naive approach anyway: Upon retrieval and before applying Init-Data in the guest we can calculate a hash of that Init-Data body and stash the measurement away somewhere in encrypted and integrity protected memory. Later we could append it to the TEE’s hardware evidence as an additional fact about the environment. An Attestation-Service would take that additional fact into account and refuse to release a secret to a confidential workload if e.g. a too permissive policy was applied.

Misdemeanor in the Sandbox

We have to take a step back and look at the bigger picture to understand why this is problematic. In CoCo we are operating a sandbox, i.e. a rather liberal playground for all sorts of containers. This is by design, we want to allow users to migrate existing containerized workloads with as little friction as possible into a TEE. Now we have to assume that some of the provisioned workloads might be malicious and attempting to access secrets they should not have access to. Confidential Computing is also an effort in declaring explicit boundaries.

There are pretty strong claims that VM-based Confidential Computing is secure, because it builds on the proven isolation properties of hardware-based Virtual Machines. Those have been battle-tested in (hostile) multi-tenant environments for decades and the confidentiality boundary between a Host and Confidential Guest VM is defined along those lines.

Now, Kata Containers does provide an isolation mechanism. There is a jail for containers that employs all sorts of Linux technologies (seccomp, namespaces, cgroups, …) to prevent a container from breaking out of its confinement. However, containing containers is a hard problem, and new ways for containers to escape their jail are regularly discovered and exploited (adding VM-based isolation to containers is one of the defining features of Kata Containers, after all).

Circling back to the Init-Data Measurement

The measurement is a prerequisite for accessing a confidential secret. If we keep such a record in the memory of a CoCo management process within the Guest, this would have implications for the Trust Model: A Hardware Root-of-Trust module is indispensable for Confidential Computing. A key property of that module is the strong isolation from the Guest OS. Through clearly defined interfaces it can record measurements of the guest’s software stack. Those measurements are either static or extend-only. A process in the guest VM cannot alter them freely.
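The extend-only semantics can be sketched with ordinary shell tools: the register’s new value is the hash of its previous value concatenated with the new event digest, so past contributions can never be erased or reordered without changing the final value. This is only an illustration of the semantics, not how a real TPM or RTMR is driven:

```shell
# Illustration of an extend-only measurement register (SHA256-sized).
# new_register = SHA256(old_register_bytes || event_digest_bytes)
extend() {
    printf '%s%s' "$1" "$2" | xxd -r -p | sha256sum | cut -d' ' -f1
}

register="$(printf '%064d' 0)"                              # zero-initialized register
event="$(printf 'some event' | sha256sum | cut -d' ' -f1)"  # digest of an event

register="$(extend "$register" "$event")"
echo "$register"

# Extending again, even with the same event, moves the value forward;
# there is no operation that restores a previous state.
register="$(extend "$register" "$event")"
echo "$register"
```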

The image is a diagram illustrating the interaction between different components in a virtualized environment. It uses color coding to differentiate between elements: yellow for ‘Policy,’ green for ‘Guest OS,’ pink for ‘Hardware Root of Trust,’ and blue for ‘VM Host.’ The diagram shows communication flows with dashed arrows, indicating how the ‘Policy’ within the ‘Guest OS’ interacts with the ‘Hardware Root of Trust’ through the ‘VM Host.’

A measurement record in the guest VM’s software stack is not able to provide similar isolation. A process in the guest, like a malicious container, would be able to tamper with such a record and deceive a Relying Party in a Remote Attestation in order to get access to restricted secrets. A user would not only have to trust the CoCo stack to perform the correct measurements before launching a container, but they would also have to trust this stack to not be vulnerable to sandbox escapes. This is a pretty big ask.

Hence a pure software approach to establishing trust in Init-Data is not desirable. We want to move the trust boundary back to the TEE and link Init-Data measurements to the TEE’s hardware evidence. There are generally two options to establish such a link; which one is chosen depends on the capabilities of the TEE:

Host-Data

Host-Data is a field in a TEE’s evidence that is passed verbatim into a confidential Guest from its Host. It’s not secret, but its integrity is guaranteed, as it’s part of the TEE-signed evidence body. We are generalising the term Host-Data from SEV-SNP here; a similar concept exists in other TEEs under different names. Host-Data can hold a limited number of bytes, typically in the 32-64 byte range. This is enough to hold a hash of the Init-Data, calculated at the launch of the Guest. This hash can be used to verify the integrity of the Init-Data in the guest, by comparing the measurement (hash) of the Init-Data in the guest with the host-provided hash in the Host-Data field. If the hashes match, the Init-Data is considered intact.

Example: Producing a SHA256 digest of the Init-Data file

openssl dgst -sha256 --binary init-data.toml | xxd -p -c32
bdc9a7390bb371258fb7fb8be5a8de5ced6a07dd077d1ce04ec26e06eaf68f60
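The comparison itself is straightforward. The following sketch mimics the check with sha256sum (equivalent to the openssl invocation above); HOST_DATA stands in for the value the host sealed into the launch evidence, which in reality would be read from the Host-Data field of the TEE attestation report:

```shell
# Sketch of the Host-Data check; the file content is made up for illustration.
printf 'algorithm = "sha256"\nversion = "0.1.0"\n' > init-data.toml

# Host side: hash the Init-Data and seed it into the launch evidence.
HOST_DATA="$(sha256sum init-data.toml | cut -d' ' -f1)"

# Guest side: hash the Init-Data it received and compare.
GUEST_DIGEST="$(sha256sum init-data.toml | cut -d' ' -f1)"

if [ "$GUEST_DIGEST" = "$HOST_DATA" ]; then
    echo "Init-Data intact"
else
    echo "Init-Data was tampered with" >&2
    exit 1
fi
```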

Runtime Measurements

Instead of seeding the Init-Data hash into a Host-Data field at launch, we can also extend the TEE evidence with a runtime measurement of the Init-Data directly, if the TEE allows for it. This measurement is then a part of the TEE’s evidence and can be verified as part of the TEE’s remote attestation process.

Example: Extending an empty SHA256 runtime measurement register with the digest of an Init-Data file

dd if=/dev/zero of=zeroes bs=32 count=1
openssl dgst -sha256 --binary init-data.toml > init-data.digest
openssl dgst -sha256 --binary <(cat zeroes init-data.digest) | xxd -p -c32
7aaf19294adabd752bf095e1f076baed85d4b088fa990cb575ad0f3e0569f292

Glueing Things Together

Finally, in practice a workflow would look like the steps depicted below. Note that the concrete implementation of the individual steps might vary in future revisions of CoCo (as of this writing v0.10.0 has just been released), so this is not to be taken as a reference but merely to illustrate the concept. There are practical considerations, like limitations in the size of a Pod annotation, or how Init-Data can be provisioned into a guest that might alter details of the workflow in the future.

Creating a Manifest

kubectl’s --dry-run option can be used to produce a JSON manifest for a Pod deployment, using the allow-listed image from the policy example above. We are using jq to specify a CoCo runtime class:

kubectl create deployment \
	--image="docker.io/library/nginx@sha256:e56797eab4a5300158cc015296229e13a390f82bfc88803f45b08912fd5e3348" \
	nginx-cc \
	--dry-run=client \
	-o json | \
	jq '.spec.template.spec.runtimeClassName = "kata-cc"' \
	> nginx-cc.json

An Init-Data file is authored, then encoded in base64 and added to the Pod annotation before the deployment is triggered:

vim init-data.toml
INIT_DATA_B64="$(cat "init-data.toml" | base64 -w0)"
cat nginx-cc.json | jq \
	--arg initdata "$INIT_DATA_B64" \
	'.spec.template.metadata.annotations = { "io.katacontainers.config.runtime.cc_init_data": $initdata }' \
	| kubectl apply -f -

Testing the Policy

If the Pod came up successfully, it passed the initial policy check for the image already.

kubectl get pod
NAME                         READY   STATUS        RESTARTS   AGE
nginx-cc-694cc48b65-lklj7    1/1     Running       0          83s

According to the policy only certain commands are allowed to be executed in the container. Executing whoami should be fine, while ls should be rejected:

kubectl exec -it deploy/nginx-cc -- whoami
root
kubectl exec -it deploy/nginx-cc -- ls
error: Internal error occurred: error executing command in container: failed to
exec in container: failed to start exec "e2d8bad68b64d6918e6bda08a43f457196b5f30d6616baa94a0be0f443238980": cannot enter container 914c589fe74d1fcac834d0dcfa3b6a45562996661278b4a8de5511366d6a4609, with err rpc error: code = PermissionDenied desc = "ExecProcessRequest is blocked by policy: ": unknown

In our example we tie the Init-Data measurement to the TEE evidence using a runtime measurement into PCR8 of a vTPM. Assuming a zero-initialized SHA256 register, we can calculate the expected value by extending the zeroes with the SHA256 digest of the Init-Data file:

dd if=/dev/zero of=zeroes bs=32 count=1
openssl dgst -sha256 --binary init-data.toml > init-data.digest
openssl dgst -sha256 --binary <(cat zeroes init-data.digest) | xxd -p -c32
765156eda5fe806552610f2b6e828509a8b898ad014c76ad8600261eb7c5e63f

As part of the policy we also allow-listed a specific command that can request a KBS token using an endpoint that is exposed to a container by a specific Guest Component. Note: This is not something a user would want to typically enable, since this token is used to retrieve confidential secrets and we would not want it to leak outside the Guest. We are using it here to illustrate that we could retrieve a secret in the container, since we passed remote attestation including the verification of the Init-Data digest.

kubectl exec deploy/nginx-cc -- curl -s http://127.0.0.1:8006/aa/token\?token_type=kbs | jq -c 'keys'
["tee_keypair","token"]

Since this has been successful we can inspect the logs of the Attestation Service (bundled into a KBS here) to confirm it has been considered in the appraisal. The first text block shows the claims from the (successfully verified) TEE evidence, the second block is displaying the acceptable reference values for a PCR8 measurement:

kubectl logs deploy/kbs -n coco-tenant | grep -C 2 765156eda5fe806552610f2b6e828509a8b898ad014c76ad8600261eb7c5e63f
...
        "aztdxvtpm.tpm.pcr06": String("65f0a56c41416fa82d573df151746dc1d6af7bd8d4a503b2ab07664305d01e59"),
        "aztdxvtpm.tpm.pcr07": String("124daf47b4d67179a77dc3c1bcca198ae1ee1d094a2a879974842e44ab98bb06"),
        "aztdxvtpm.tpm.pcr08": String("765156eda5fe806552610f2b6e828509a8b898ad014c76ad8600261eb7c5e63f"),
        "aztdxvtpm.tpm.pcr09": String("1b094d2504eb2f06edcc94e1ffad4e141a3cd5024b885f32179d1b3680f8d88a"),
        "aztdxvtpm.tpm.pcr10": String("bb5dfdf978af8a473dc692f98ddfd6c6bb329caaa5447ac0b3bf46ef68803b17"),
--
        "aztdxvtpm.tpm.pcr08": [
            "7aaf19294adabd752bf095e1f076baed85d4b088fa990cb575ad0f3e0569f292",
            "765156eda5fe806552610f2b6e828509a8b898ad014c76ad8600261eb7c5e63f",
        ],
		"aztdxvtpm.tpm.pcr10": [],

Size Limitations

In practice there are limitations with regard to the size of Init-Data bodies. The policy sections of such a document in particular can reach considerable size for complex Pods and thus exceed the limits that currently exist for annotation values in a Kubernetes Pod. Various options to work around this limitation are being discussed, ranging from simple text compression to more elaborate schemes.
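To get a feeling for the simplest of those options: a Rego policy is highly repetitive text and compresses well, so gzip-compressing the document before base64-encoding it into the annotation can buy considerable headroom. A sketch with a synthetic, repetitive Init-Data body (the file content is made up for illustration):

```shell
# Build a synthetic, repetitive Init-Data document for illustration.
{
    echo 'algorithm = "sha256"'
    echo 'version = "0.1.0"'
    echo '[data]'
    echo "\"policy.rego\" = '''"
    echo 'package agent_policy'
    for i in $(seq 1 200); do
        echo "default Request$i := false"
    done
    echo "'''"
} > init-data.toml

# Compare annotation payload sizes: plain base64 vs gzip + base64.
PLAIN=$(base64 -w0 init-data.toml | wc -c)
COMPRESSED=$(gzip -9 -c init-data.toml | base64 -w0 | wc -c)
echo "plain: $PLAIN bytes, gzipped: $COMPRESSED bytes"
```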

Conclusion

In this article we discussed the challenges of ensuring the integrity of a Confidential Container workload at runtime. We’ve seen that the dynamic nature of a Pod and the Kubernetes Control Plane make it hard to predict exactly what will be executed in a TEE. We’ve discussed how a policy engine can be used to enforce invariants on such a dynamic workload and how Init-Data can be used to provision a policy into a Confidential Guest VM. Finally, we’ve seen how the integrity of Init-Data can be established by linking it to the TEE’s hardware evidence.

Thanks to Fabiano Fidêncio and Pradipta Banerjee for proof-reading and ideas for improvements!

Deploy Trustee in Kubernetes

Introduction to the trustee-operator for deploying Trustee in a Kubernetes cluster.

Introduction

In this blog, we’ll be going through the deployment of Trustee, the Key Broker Service that provides keys/secrets to clients that want to execute workloads confidentially. Trustee provides a built-in Attestation Service that complies with the RATS specification.

In this document, we’ll be focusing on how to deploy Trustee in Kubernetes using the Trustee operator.

Definitions

First of all, let’s introduce some definitions.

In confidential computing environments, Attestation is crucial in verifying the trustworthiness of the location where you plan to run your workload.

The Attester provides Evidence, which is evaluated and appraised to decide its trustworthiness.

The Endorser is the hardware manufacturer who provides an Endorsement, which the Verifier uses to validate the Evidence received from the Attester.

The reference value provider service (RVPS) is a component in the Attestation Service (AS) responsible for storing and providing reference values.

Kubernetes deployment

The following instructions assume a Kubernetes cluster is set up with the Operator Lifecycle Manager (OLM) running. OLM helps users install, update, and manage the lifecycle of Kubernetes-native applications (Operators) and their associated services.

kind create cluster -n trustee
# install the olm operator
kubectl create -f https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart/crds.yaml
kubectl create -f https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart/olm.yaml

Namespace creation

This is the default Namespace, where all the relevant Trustee objects will be created.

kubectl apply -f - << EOF
apiVersion: v1
kind: Namespace
metadata:
  name: kbs-operator-system
EOF

Operator Group

An Operator group, defined by the OperatorGroup resource, provides multi-tenant configuration to OLM-installed Operators:

kubectl apply -f - << EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kbs-operator-system
  namespace: kbs-operator-system
spec:
EOF

Subscription

A subscription, defined by a Subscription object, represents an intention to install an Operator. It is the custom resource that relates an Operator to a catalog source:

kubectl apply -f - << EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: kbs-operator-system
  namespace: kbs-operator-system
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: trustee-operator
  source: operatorhubio-catalog
  sourceNamespace: olm
  startingCSV: trustee-operator.v0.1.0
EOF

Check Trustee Operator installation

Now it is time to check if the Trustee operator has been installed properly, by running the command:

kubectl get csv -n kbs-operator-system

We should expect something like:

NAME                      DISPLAY            VERSION   REPLACES   PHASE
trustee-operator.v0.1.0   Trustee Operator   0.1.0                Succeeded

Configuration

The Trustee Operator configuration requires a few steps. Some steps are provided as examples that you may want to customize for your actual requirements.

Authorization key-pair generation

First of all, we need to create the key pair for Trustee authorization. The public key is used by Trustee for client authorization; the private key is used by the client to prove its identity and to register keys/secrets.

Create secret for client authorization:

openssl genpkey -algorithm ed25519 > privateKey
openssl pkey -in privateKey -pubout -out publicKey
kubectl create secret generic kbs-auth-public-key --from-file=publicKey -n kbs-operator-system

HTTPS configuration

It is recommended to enable HTTPS for the following reasons:

  • to secure the Trustee server API
  • to bind the Trusted Execution Environment (TEE) to a given Trustee server by seeding the public key and certificate into it (as measured init data)

In this example we’re going to create a self-signed certificate using the following template:

cat << EOF > kbs-service-509.conf
[req]
default_bits       = 2048
default_keyfile    = localhost.key
distinguished_name = req_distinguished_name
req_extensions     = req_ext
x509_extensions    = v3_ca

[req_distinguished_name]
countryName                 = Country Name (2 letter code)
countryName_default         = UK
stateOrProvinceName         = State or Province Name (full name)
stateOrProvinceName_default = England
localityName                = Locality Name (eg, city)
localityName_default        = Bristol
organizationName            = Organization Name (eg, company)
organizationName_default    = Red Hat
organizationalUnitName      = organizationalunit
organizationalUnitName_default = Development
commonName                  = Common Name (e.g. server FQDN or YOUR name)
commonName_default          = kbs-service
commonName_max              = 64

[req_ext]
subjectAltName = @alt_names

[v3_ca]
subjectAltName = @alt_names

[alt_names]
DNS.1   = kbs-service
EOF

Create secret for self-signed certificate:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout https.key -out https.crt \
    -config kbs-service-509.conf -passin pass:\
    -subj "/C=UK/ST=England/L=Bristol/O=Red Hat/OU=Development/CN=kbs-service"
kubectl create secret generic kbs-https-certificate --from-file=https.crt -n kbs-operator-system
kubectl create secret generic kbs-https-key --from-file=https.key -n kbs-operator-system
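
Before loading the certificate into the cluster, it can be worth checking that its subjectAltName actually covers the kbs-service DNS name that clients will use. A quick sketch using a throwaway certificate (assuming OpenSSL 1.1.1+ for the -addext option):

```shell
# Generate a throwaway self-signed certificate mirroring the command above,
# then print its subjectAltName to confirm it covers "kbs-service".
openssl req -x509 -nodes -days 1 -newkey rsa:2048 \
    -keyout /tmp/check.key -out /tmp/check.crt \
    -subj "/CN=kbs-service" -addext "subjectAltName=DNS:kbs-service" 2>/dev/null
openssl x509 -in /tmp/check.crt -noout -text | grep -A1 'Subject Alternative Name'
```

The second command should list DNS:kbs-service; if it does not, the kbs-client TLS handshake against https://kbs-service:8080 will fail hostname verification.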

Trustee ConfigMap object

This command will create the ConfigMap object that provides Trustee with all the needed configuration:

kubectl apply -f - << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: kbs-config
  namespace: kbs-operator-system
data:
  kbs-config.json: |
    {
        "insecure_http" : false,
        "private_key": "/etc/https-key/https.key",
        "certificate": "/etc/https-cert/https.crt",
        "sockets": ["0.0.0.0:8080"],
        "auth_public_key": "/etc/auth-secret/publicKey",
        "attestation_token_config": {
          "attestation_token_type": "CoCo"
        },
        "repository_config": {
          "type": "LocalFs",
          "dir_path": "/opt/confidential-containers/kbs/repository"
        },
        "as_config": {
          "work_dir": "/opt/confidential-containers/attestation-service",
          "policy_engine": "opa",
          "attestation_token_broker": "Simple",
          "attestation_token_config": {
            "duration_min": 5
          },
          "rvps_config": {
            "store_type": "LocalJson",
            "store_config": {
              "file_path": "/opt/confidential-containers/rvps/reference-values/reference-values.json"
            }
          }
        },
        "policy_engine_config": {
          "policy_path": "/opt/confidential-containers/opa/policy.rego"
        }
    }
EOF

Reference Values

The reference values are an important part of the attestation process. The client collects measurements (from the running software, the TEE hardware and its firmware) and submits a quote with these claims to the attestation server. For the attestation protocol to succeed, the measurements must match one of the potentially multiple valid values previously registered with Trustee. You can also apply flexible rules, such as requiring the secure processor’s firmware to be newer than v1.30. This process guarantees that the cVM (confidential VM) is running the expected software stack and that it hasn’t been tampered with.

kubectl apply -f - << EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: rvps-reference-values
  namespace: kbs-operator-system
data:
  reference-values.json: |
    [
    ]
EOF

Create secrets

Next, create the secrets to be shared with the attested clients. In this example we create a secret kbsres1 with two entries; these resources (key1, key2) can be retrieved by Trustee clients. You can add more secrets as your requirements dictate.

kubectl create secret generic kbsres1 --from-literal key1=res1val1 --from-literal key2=res1val2 -n kbs-operator-system
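
Under the hood, the operator mirrors each listed secret into Trustee’s resource repository. With the LocalFs backend configured above, a resource URI such as default/kbsres1/key1 resolves to a plain file under dir_path. The sketch below illustrates our understanding of that layout; the actual on-disk structure may differ between Trustee versions:

```shell
# Sketch: how the LocalFs repository backend maps the resource path
# default/kbsres1/key1 onto the filesystem (layout is an assumption).
REPO=./repository   # stands in for /opt/confidential-containers/kbs/repository
mkdir -p "$REPO/default/kbsres1"
printf 'res1val1' > "$REPO/default/kbsres1/key1"
printf 'res1val2' > "$REPO/default/kbsres1/key2"
cat "$REPO/default/kbsres1/key1"   # -> res1val1
```

This is why the kbs-client later addresses the secret as default/kbsres1/key1: repository name, secret name, then entry key.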

Create the KbsConfig custom resource

Finally, create the KbsConfig custom resource for the operator:

kubectl apply -f - << EOF
apiVersion: confidentialcontainers.org/v1alpha1
kind: KbsConfig
metadata:
  labels:
    app.kubernetes.io/name: kbsconfig
    app.kubernetes.io/instance: kbsconfig-sample
    app.kubernetes.io/part-of: kbs-operator
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: kbs-operator
  name: kbsconfig-sample
  namespace: kbs-operator-system
spec:
  kbsConfigMapName: kbs-config
  kbsAuthSecretName: kbs-auth-public-key
  kbsDeploymentType: AllInOneDeployment
  kbsRvpsRefValuesConfigMapName: rvps-reference-values
  kbsSecretResources: ["kbsres1"]
  kbsHttpsKeySecretName: kbs-https-key
  kbsHttpsCertSecretName: kbs-https-certificate
EOF

Set Namespace for the context entry

kubectl config set-context --current --namespace=kbs-operator-system

Check that the pods are running

kubectl get pods -n kbs-operator-system
NAME                                                   READY   STATUS    RESTARTS   AGE
trustee-deployment-7bdc6858d7-bdncx                    1/1     Running   0          69s
trustee-operator-controller-manager-6c584fc969-8dz2d   2/2     Running   0          4h7m

Also, the log should report something like:

POD_NAME=$(kubectl get pods -l app=kbs -o jsonpath='{.items[0].metadata.name}' -n kbs-operator-system)
kubectl logs -n kbs-operator-system $POD_NAME
[2024-06-10T13:38:01Z INFO  kbs] Using config file /etc/kbs-config/kbs-config.json
[2024-06-10T13:38:01Z WARN  attestation_service::rvps] No RVPS address provided and will launch a built-in rvps
[2024-06-10T13:38:01Z INFO  attestation_service::token::simple] No Token Signer key in config file, create an ephemeral key and without CA pubkey cert
[2024-06-10T13:38:01Z INFO  api_server] Starting HTTPS server at [0.0.0.0:8080]
[2024-06-10T13:38:01Z INFO  actix_server::builder] starting 12 workers
[2024-06-10T13:38:01Z INFO  actix_server::server] Tokio runtime found; starting in existing Tokio runtime

End-to-End Attestation

Since we’re running this tutorial on a regular machine (no hardware endorsement), we need to customize the default resource policy to work with the sample attester (no real hardware TEE platform). The default policy would reject claims originating from a sample TEE. This restriction should not be removed in a production scenario.

To showcase how we can assert properties of a TEE, we assert the sample TEE’s “security version number” (SVN). For a real TEE this could be a minimum firmware revision or a similar property.

cat << EOF > policy.rego
package policy

default allow = false
allow {
        input["tcb-status"]["sample.svn"] == "1"
}
EOF

POD_NAME=$(kubectl get pods -l app=kbs -o jsonpath='{.items[0].metadata.name}' -n kbs-operator-system)
kubectl cp --no-preserve policy.rego $POD_NAME:/opt/confidential-containers/opa/policy.rego

We create a pod using an existing image in which the kbs-client is deployed:

kubectl apply -f - << EOF
apiVersion: v1
kind: Pod
metadata:
  name: kbs-client
spec:
  containers:
  - name: kbs-client
    image: quay.io/confidential-containers/kbs-client:latest
    imagePullPolicy: IfNotPresent
    command:
      - sleep
      - "360000"
    env:
      - name: RUST_LOG
        value:  none
EOF

Finally, we are able to test the entire attestation protocol by fetching one of the aforementioned secrets:

kubectl cp https.crt kbs-client:/
kubectl exec -it kbs-client -- kbs-client --cert-file https.crt --url https://kbs-service:8080 get-resource --path default/kbsres1/key1
cmVzMXZhbDE=

If we type the command:

echo cmVzMXZhbDE= | base64 -d

We’ll get res1val1, the secret we created before.

Summary

In this blog we have shown how to use the Trustee operator to deploy Trustee and run the attestation workflow with a sample attester.

Memory Protection for AI ML Model Inferencing

How Confidential Containers (CoCo) can provide additional security to a model inferencing platform on Kubernetes.

Introduction

With the rapid advance of artificial intelligence and machine learning, and with businesses integrating these technologies into their products and operations, safeguarding sensitive data and models is a top priority. That’s where Confidential Containers (CoCo) comes into the picture. Confidential Containers:

  • Provides an extra layer of protection for data in use.
  • Helps prevent data leaks.
  • Prevents tampering and unauthorized access to sensitive data and models.

By integrating CoCo with model-serving frameworks like KServe1, businesses can create a secure environment for deploying and managing machine learning models. This integration is critical in strengthening data protection strategies and ensuring that sensitive information stays safe.

Model Inferencing

Model inferencing typically occurs on large-scale cloud infrastructure. The following diagram illustrates how users interact with these deployments.

Model Inferencing

Importance of Model Protection

Protecting both the model and the data is crucial. The loss of the model leads to a loss of intellectual property (IP), which negatively impacts the organization’s competitive edge and revenue. Additionally, any loss of user data used in conjunction with the model can erode users’ trust, which is a vital asset that, once lost, can be difficult to regain.

Additionally, reputational damage can have long-lasting effects, tarnishing a company’s image in the eyes of both current and potential customers. Ultimately, the loss of a model can diminish a company’s competitive advantage, setting it back in a race where innovation and trustworthiness are key.

Attack Vectors against Model Serving Platforms

Model serving platforms are critical for deploying machine learning solutions at scale. However, they are vulnerable to several common attack vectors. These attack vectors include the following:

  • Data or model poisoning: Introducing malicious data to corrupt the model’s learning process.
  • Data privacy breaches: Unauthorized access to sensitive data.
  • Model theft: Proprietary or fine-tuned models are illicitly copied or stolen.
  • Denial-of-service attacks: Overwhelming the system to degrade performance or render it inoperable.

The OWASP Top 10 for LLMs paper2 provides a detailed explanation of the different attack vectors.

Among these attack vectors, our focus here is “model theft” as it directly jeopardizes the intellectual property and competitive advantage of organizations.

Traditional Model Protection Mechanisms

Kubernetes offers various mechanisms to harden the cluster and limit access to data and code. Role-Based Access Control (RBAC) is a foundational pillar, regulating who can interact with the Kubernetes API and how, ensuring that only authorized personnel have access to sensitive operations. API security mechanisms complement RBAC, acting as gatekeepers that safeguard the integrity of interactions between services within the cluster. Monitoring, logging, and auditing further augment these defences by providing real-time visibility into the system’s operations, enabling prompt detection and remediation of any suspicious activity.
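
As a concrete illustration, a narrowly scoped Role and RoleBinding can restrict reads of a model artifact to a single service account. The namespace, secret and account names below are hypothetical:

```shell
# Grant only the inference-server service account read access to the
# hypothetical "model-weights" secret, and nothing else.
kubectl apply -f - << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: model-reader
  namespace: model-serving
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["model-weights"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: model-reader-binding
  namespace: model-serving
subjects:
- kind: ServiceAccount
  name: inference-server
  namespace: model-serving
roleRef:
  kind: Role
  name: model-reader
  apiGroup: rbac.authorization.k8s.io
EOF
```

This limits the blast radius of a compromised workload, but it cannot protect the model while it is loaded in memory, which is the gap CoCo addresses below.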

Traditional Model Protection Mechanisms on Kubernetes

Additionally, encrypting models at rest ensures that data remains secure even when not in active use, while using Transport Layer Security (TLS) for data in transit between components in the cluster protects sensitive information from interception, maintaining the confidentiality and integrity of data as it moves within the Kubernetes environment. These layered security measures create a robust framework for protecting models against threats, safeguarding the valuable intellectual property and data they encapsulate.

But, is this enough?

Demo: Read Unencrypted Memory

This video showcases how one can read a pod’s memory when it is run using the default runc3 or kata-containers4 runtime. With kata’s confidential compute5 support, however, the memory is not exposed to the underlying worker node.

Confidential Containers (CoCo)

The Confidential Containers (CoCo) project aims at integrating confidential computing6 into Kubernetes, offering a transformative approach to enhancing data security within containerized applications. By leveraging Trusted Execution Environments (TEEs)7 to create secure enclaves for container execution, CoCo ensures that sensitive data and models are processed in a fully isolated and encrypted memory environment. CoCo shields the memory of the applications hosting the models not only from unauthorized access but also from privileged administrators who might have access to the underlying infrastructure.

As a result, it adds a critical layer of security, protecting against both external breaches and internal threats. The confidentiality of memory at runtime means that even if the perimeter defenses are compromised, the data and models within these protected containers remain impenetrable, ensuring the integrity and confidentiality of sensitive information crucial for maintaining competitive advantage and user trust.

KServe

KServe1 is a model inference platform on Kubernetes. By embracing a broad spectrum of model-serving frameworks such as TensorFlow, PyTorch, ONNX, SKLearn, and XGBoost, KServe facilitates a flexible environment for deploying machine learning models. It leverages Custom Resource Definitions (CRDs), controllers, and operators to offer a declarative and uniform interface for model serving, simplifying the operational complexities traditionally associated with such tasks.

Beyond its core functionalities, KServe inherits all the advantageous features of Kubernetes, including high availability (HA), efficient resource utilization through bin-packing, and auto scaling capabilities. These features collectively ensure that KServe can dynamically adapt to changing workloads and demands, guaranteeing both resilience and efficiency in serving machine learning models at scale.

KServe on Confidential Containers (CoCo)

The diagram below shows the containers hosting models running in a confidential computing environment using CoCo. Integrating KServe with CoCo offers a transformative approach to bolstering security in model-serving operations. By running model-serving containers within the secure environment provided by CoCo, these containers gain memory protection. This security measure ensures that both the models and the sensitive data they process, including query inputs and inference outputs, are safeguarded against unauthorized access.

KServe on CoCo Image Source8

Such protection extends beyond external threats, offering a shield against potential vulnerabilities posed by infrastructure providers themselves. This layer of security ensures that the entire inference process, from input to output, remains confidential and secure within the protected memory space, thereby enhancing the overall integrity and reliability of model-serving workflows.

Takeaways

Throughout this exploration, we’ve uncovered the pivotal role of Confidential Containers (CoCo) in fortifying data protection, particularly for data in use. CoCo emerges as a comprehensive solution capable of mitigating unauthorized in-memory data access risks. Model-serving frameworks, such as KServe, stand to gain significantly from the enhanced security layer provided by CoCo, ensuring the protection of sensitive data and models throughout their operational life cycle.

However, it’s essential to recognize that not all components must operate within CoCo’s protected environment. A strategic approach involves identifying critical areas where models and data are most vulnerable to unauthorized access and focusing CoCo’s protective measures on these segments. This selective application ensures efficient resource utilization while maximizing data security and integrity.

Further

In the next blog we will see how to deploy KServe on Confidential Containers for memory protection.

Note: This blog is a transcription of the talk we gave at Kubecon EU 2024. You can find the slides on Sched9 and the talk recording on YouTube10.


  1. KServe Website: https://kserve.github.io/website/ ↩︎ ↩︎

  2. OWASP Top 10 for LLMs paper: https://owasp.org/www-project-top-10-for-large-language-model-applications/assets/PDF/OWASP-Top-10-for-LLMs-2023-v1_1.pdf ↩︎

  3. runc: https://github.com/opencontainers/runc ↩︎

  4. kata-containers: https://katacontainers.io/ ↩︎

  5. kata-cc https://confidentialcontainers.org/docs/kata-containers/ ↩︎

  6. Confidential Computing: https://en.wikipedia.org/wiki/Confidential_computing ↩︎

  7. Trusted Execution Environments (TEEs): https://en.wikipedia.org/wiki/Trusted_execution_environment ↩︎

  8. KServe Control Plane https://kserve.github.io/website/latest/modelserving/control_plane/ ↩︎

  9. Fortifying AI Security in Kubernetes with Confidential Containers (CoCo) - Suraj Deshmukh, Microsoft & Pradipta Banerjee, Red Hat: https://sched.co/1YeOx ↩︎

  10. Fortifying AI Security in Kubernetes with Confidential Containers (CoCo): https://youtu.be/Ko0o5_hpmxI?si=JJRN9VMzvVzUz5vq ↩︎

Building Trust into OS images for Confidential Containers

Describe different ways to establish trust in one of CoCo’s infrastructure components.

Containers and OS Images

Confidential Containers using Kata-Containers are launched in a Confidential Virtual Machine (CVM). Those CVMs require a minimal Linux system which runs in our Trusted Execution Environment (TEE) and hosts the agent side of Kata-Containers (including various auxiliary attestation tools) to launch containers and facilitate secure key releases for a confidential Pod. Integrity of the workload is one of the key pillars of Confidential Computing. Consequently, we must also trust the infrastructure components that host containers in a confidential guest VM, specifically: firmware, rootfs, kernel and kernel cmdline.

For a TEE there are various options to establish this kind of trust. Which option is used depends on the capabilities and specifics of the TEE. All of them involve various degrees of “measurements” (that is, cryptographic hashes over a blob of data and/or code) to reach the same goal: providing a verifiable statement about the integrity of the OS image. We’ll discuss three viable options; they are not exhaustive.

Initial ramdisk

[Figure: a single measurement covers firmware, initrd, kernel and kernel cmdline, and is bound to the hardware attestation report by a signature.]

We can opt to not use a rootfs and instead bundle the required userland components into Linux’s initial ramdisk (initrd), which is loaded by the kernel. Outside a CoCo scenario this facility is used to provide a boot stage in which kernel drivers can be loaded on demand from a memory-backed (compressed) volume, avoiding the need to statically bundle device drivers for all kinds of hardware in each vendor kernel. For CoCo VMs this kind of flexibility is not really required: we know beforehand the virtualized hardware that our CVM is configured with, and it will require only a limited set of drivers. Due to its static nature, relying solely on an initrd would be impractical for many workloads. For CoCo, however, it is a viable option, since the dynamic aspect of the workload is mostly deferred to container execution. This means we can have the kernel launch a kata-agent as PID 1 directly from an initrd.

This option is appealing for certain CoCo deployments. If we have a Trusted Execution Environment (TEE) that produces a launch measurement of the initial RAM state of a CVM, we can use this measurement to gain confidence that our OS image is genuine. We can calculate the expected value of a given launch measurement offline and then verify during remote attestation that the actual launch measurement matches the expected value.

Calculating a launch measurement

An expected SEV-SNP launch measurement for Linux direct boot with QEMU can be calculated from trusted artifacts (firmware, kernel and initrd) and a few platform parameters. Please note that the respective kernel/firmware components and tools are still being actively developed. The AMDESE/AMDSEV repository provides instructions and pointers to a working set of revisions.

$ sev-snp-measure \
	--mode snp \
	--vcpus=1 \
	--vcpu-type=EPYC-Milan-v1 \
	--kernel=vmlinuz \
	--initrd=initrd.img \
	--append="console=ttyS0" \
	--ovmf OVMF.fd
20f28c1e85c4250c2c061d1997cfc815185cefe756c74b37ea1c81eb8f2e0e3c5c43e58d65e0e792ed2bd04a0720f970

DM Verity

[Figure: a measurement covers the rootfs and is referenced from the kernel cmdline; another measurement covers firmware, initrd, kernel and kernel cmdline and is bound to the hardware attestation report by a signature.]

The infrastructure components might outgrow a reasonably sized initrd. We want to keep the initrd small, so as not to spend too much of the CVM’s RAM and to leave as much as possible for the container payload. In the worst case we’ll have to spend ~2x the size of the initrd (we also need to keep the compressed initrd in RAM). With an increasing number of supported TEEs the infrastructure components will inevitably grow, since they need to support various attestation schemes and hardware. Some of those schemes might also have a larger dependency tree, which is unreasonable to statically link or bundle into an initrd. There might be compliance requirements which mandate the use of certain cryptographic libraries that must not be statically compiled. These considerations might nudge us towards a more traditional Linux setup of kernel, initrd and rootfs.

A rootfs can comfortably host the infrastructure components, and we can still package support for all kinds of TEEs in a single OS image artifact, with the dependencies for a given TEE loaded dynamically into RAM. For CVMs there is a restriction when it comes to how an OS image is handled: to uphold the integrity of a CoCo workload, we must prevent the CVM host, storage provider or anyone else outside the TEE from compromising the image. dm-verity is a kernel technology that prevents changes to a read-only filesystem at runtime. Block-level access is guarded by a hash tree, and reads of unexpected block data will fail. This protection scheme requires a root hash that needs to be provided when the verity-protected filesystem is mounted. We can provide this root hash through the kernel cmdline or bake it into the initrd. In either case, the TEE has to include the root hash in a launch measurement to provide verifiable integrity guarantees.

Creating a verity volume

dm-verity volumes feature a hash tree and a root hash in addition to the actual data. The hash tree can be stored on disk next to the verity volume or as a local file. For brevity, we’ll store the hash tree as a file and write the string CoCo into a file /coco on the formatted volume:

$ dd if=/dev/zero of=rootfs.raw bs=1M count=100
$ DEVICE="$(sudo losetup --show -f rootfs.raw)"
$ sudo cfdisk "$DEVICE"
# create 1 partition
$ sudo mkfs.ext4 "$DEVICE"
...
$ sudo mount "$DEVICE" /mnt
$ echo "CoCo" | sudo tee /mnt/coco
CoCo
$ sudo umount /mnt
$ sudo veritysetup format "$DEVICE" ./hash-tree
VERITY header information for ./hash-tree
UUID:                   91bbc990-f0df-48c0-b8f0-1b996cf0c3cf
Hash type:              1
Data blocks:            25600
Data block size:        4096
Hash block size:        4096
Hash algorithm:         sha256
Salt:                   cef7ea72e3487f4f8d26df8731df561f64e03236fa494dc0ae87fe0f07a4825b
Root hash:              ad86ff8492be2ee204cb54d70c84412c2dc89cefd34e263184f4e00295a412f3
$ export ROOT_HASH=ad86ff8492be2ee204cb54d70c84412c2dc89cefd34e263184f4e00295a412f3

Corrupting the image

Now we flip a byte in the raw image (CoCo => DoCo in /coco). If the image is attached as a block device via dm-verity, reading the file will produce I/O errors and corresponding entries in the kernel log.

$ hexdump -C rootfs.raw | grep CoCo
06000000  43 6f 43 6f 0a 00 00 00  00 00 00 00 00 00 00 00  |CoCo............|
$ printf '\x44' | dd of=rootfs.raw bs=1 seek="$((16#06000000))" count=1 conv=notrunc
$ hexdump -C rootfs.raw | grep DoCo
06000000  44 6f 43 6f 0a 00 00 00  00 00 00 00 00 00 00 00  |DoCo............|
$ sudo veritysetup open "$DEVICE" rootfs ./hash-tree "$ROOT_HASH"
$ sudo mount /dev/mapper/rootfs /mnt
$ cat /mnt/coco
cat: /mnt/coco: Input/output error
$ dmesg | tail -1
[194754.361797] device-mapper: verity: 7:0: data block 24576 is corrupted
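
The failing read above comes down to a digest mismatch: per data block, dm-verity stores a salted hash. The toy sketch below mimics that check with plain coreutils; the salt and block contents are illustrative, and hashing salt-then-block mirrors our understanding of hash_type 1 (for brevity we hash text rather than raw device bytes):

```shell
# Toy model of one dm-verity leaf digest: sha256 over a salt and a
# 4096-byte data block. Flipping a single byte breaks the match.
salt='toy-salt'                                  # illustrative, not the salt above
block=$(head -c 4096 /dev/zero | tr '\0' 'A')    # a 4096-byte data block
leaf=$(printf '%s%s' "$salt" "$block" | sha256sum | cut -d' ' -f1)
tampered="B${block#?}"                           # flip the first byte: A -> B
leaf2=$(printf '%s%s' "$salt" "$tampered" | sha256sum | cut -d' ' -f1)
[ "$leaf" != "$leaf2" ] && echo "block corrupted"
```

The kernel performs the equivalent comparison on every block read, walking the hash tree up to the measured root hash.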

vTPM

[Figure: a measurement covers the rootfs and is referenced from the kernel cmdline; multiple measurements cover firmware, initrd, kernel and kernel cmdline and are recorded in a vTPM, which is bound to the hardware attestation report by a signature.]

There are setups in which a launch measurement of the TEE will not cover the kernel and/or initrd. An example of such a TEE is Azure’s Confidential VM offering (L1 VMs provided by a hypervisor running on a physical host). Those CVMs can host Confidential Containers in a CoCo peer-pod setup. The hardware evidence, which attests encrypted RAM and CPU registers, is fetched exclusively during an early boot phase. Only in later stages are the kernel and initrd loaded from an OS image, so the launch measurement does not yet cover the CoCo infrastructure components. To still provide an integrity guarantee, such a TEE can defer measurements of the later boot stages to a virtual TPM device (vTPM).

To isolate it from the host, a confidential vTPM is provisioned within the TEE during early boot and cryptographically linked to the TEE’s hardware evidence. To further protect secrets like private keys from the guest OS, provisioning is performed at a higher privilege level, preventing direct access and manipulation by the guest OS, which runs at a lower privilege level.

TPM is a mature technology, deployed in a lot of hardware to protect operating systems and workloads from compromise. It’s seeing increased adoption and support in the Linux kernel and userland. A TPM device has multiple Platform Configuration Registers (PCRs). Those can hold measurements, and they can be “extended” with additional measurements via a one-way function to create a comprehensive, replayable log of the events that occur during the boot process. “Measured Boot” is a procedure in which each boot step measures the subsequent step into a specific PCR. As a whole this represents a verifiable state of the system, much like an initial launch measurement, but with more granularity.
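
The extend operation itself is easy to sketch: a PCR’s new value is a hash over its old value concatenated with the incoming measurement, so the final register value depends on every stage and on their order. A minimal sketch (for brevity we chain over hex strings; a real TPM concatenates the raw digests):

```shell
# Measured-boot sketch: each stage's digest is "extended" into the PCR via
# pcr = sha256(pcr || measurement). Reordering or altering any stage
# changes the final value, which is what makes the chain verifiable.
pcr=$(printf '%064d' 0)                                        # PCR starts at all zeroes
for stage in firmware kernel initrd cmdline; do
  m=$(printf '%s' "$stage" | sha256sum | cut -d' ' -f1)        # measure the stage
  pcr=$(printf '%s%s' "$pcr" "$m" | sha256sum | cut -d' ' -f1) # extend the PCR
done
echo "$pcr"
```

Because the function is one-way, a later stage cannot reset the register to hide an earlier tampered measurement; a verifier replays the expected event log and compares the final PCR values.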

Image building

Modern OS image build tools for Linux, like systemd’s mkosi, make it trivial to build OS images with dm-verity protection enabled, along with Unified Kernel Images (UKIs), which bundle kernel, initrd and kernel cmdline into conveniently measurable artifacts. A modern distribution packaging recent systemd revisions (v253+), like Fedora (38+), will perform the required TPM measurements.

Creating reference values

To retrieve the expected measurements for a dm-verity protected OS image, we can boot the resulting image locally in a trusted environment. The swtpm project is a great option to provide the virtual machine with a vTPM.

$ swtpm socket \
	--tpmstate dir=/tmp/vtpm \
	--ctrl type=unixio,path=/tmp/vtpm/swtpm.sock \
	--tpm2 \
	--log level=20

We retrieve VM firmware from Debian’s repository and attach the vTPM socket as a character device:

# retrieve vm firmware from debian's repo
$ wget http://security.debian.org/debian-security/pool/updates/main/e/edk2/ovmf_2022.11-6+deb12u1_all.deb
$ mkdir fw
$ dpkg-deb -x ovmf_2022.11-6+deb12u1_all.deb fw/
$ cp fw/usr/share/OVMF/OVMF_*.fd .
$ OS_IMAGE=image.raw
$ qemu-system-x86_64 \
	-machine type=q35,accel=kvm,smm=off \
	-m 1024 \
	-drive file=./OVMF_CODE.fd,format=raw,if=pflash \
	-drive file=./OVMF_VARS.fd,format=raw,if=pflash \
	-drive "file=${OS_IMAGE},format=raw" \
	-chardev socket,id=chrtpm,path=/tmp/vtpm/swtpm.sock \
	-tpmdev emulator,id=tpm0,chardev=chrtpm \
	-device tpm-tis,tpmdev=tpm0 \
	-nographic

Comparing PCRs

Once logged into the VM we can retrieve the relevant measurements in the form of PCRs (the package tpm2_tools needs to be available):

$ tpm2_pcrread sha256:0,1,2,3,4,5,6,7,8,9,10,11
  sha256:
	0 : 0x61E3B90D0862D052BF6C802E0FD2A44A671A37FE2EB67368D89CB56E5D23014E
	1 : 0x33D454DFCF5E46C0B7AFD332272E72ADC3D1A86CCAE25AA98DD475C9FCA36CFC
	2 : 0x7842C772A64365B48AC733EDEE9B131DF5F0E71EA95074F80E32450995C5773D
	3 : 0x3D458CFE55CC03EA1F443F1562BEEC8DF51C75E14A9FCF9A7234A13F198E7969
	4 : 0x22B156BE656EED7542AB03CC76DCC8A82F2A31044B5F17B3B8A388CB8DB37850
	5 : 0x3F72C8A7A38564991898859F725D12E5BE64CBD26265BC8F5E39CBE1101EBD49
	6 : 0x3D458CFE55CC03EA1F443F1562BEEC8DF51C75E14A9FCF9A7234A13F198E7969
	7 : 0x65CAF8DD1E0EA7A6347B635D2B379C93B9A1351EDC2AFC3ECDA700E534EB3068
	8 : 0x0000000000000000000000000000000000000000000000000000000000000000
	9 : 0x8E74577DC5814F2EBF094988CB2E789F1D637B4D43930F3714500F9E2E65615D
	10: 0x961D21A6CB38D377F951748BA7B8DD05A2E1BA6C712BB34EF7A39C5862721F1E
	11: 0x9DBA7A9D3C5200B0E526112151BBD23D77006CBFCF290CFA6249601CA9812608

If we boot the same image on a Confidential VM in Azure’s cloud, we’ll see different measurements. This is expected since the early boot stack does not match our reference setup:

$ tpm2_pcrread sha256:0,1,2,3,4,5,6,7,8,9,10,11
  sha256:
	0 : 0x782B20B10F55CC46E2142CC2145D548698073E5BEB82752C8D7F9279F0D8A273
	1 : 0x3D458CFE55CC03EA1F443F1562BEEC8DF51C75E14A9FCF9A7234A13F198E7969
	2 : 0x3D458CFE55CC03EA1F443F1562BEEC8DF51C75E14A9FCF9A7234A13F198E7969
	3 : 0x3D458CFE55CC03EA1F443F1562BEEC8DF51C75E14A9FCF9A7234A13F198E7969
	4 : 0xC7BB081502F18392EB5837951A9BA48E9DB23F91DE39A9AF8B2B29C333D71EA0
	5 : 0x0358DC1195BBDD59E3C556A452E292A6E7ECF11408BE7DAEC6776E678BEBEC23
	6 : 0x531086506EADC75D0E540F516D68E03095E5700FE8F1BD0F840025B07A3AB4F7
	7 : 0x64CDD65955B69C5ADD78577E32BFE52DDF9ADBF240977AEA39703908F4F6D8BA
	8 : 0x0000000000000000000000000000000000000000000000000000000000000000
	9 : 0x8E74577DC5814F2EBF094988CB2E789F1D637B4D43930F3714500F9E2E65615D
	10: 0x5A7ACDE0EF2AB221551CB24CCFDB7AE959047E3C0E0C39427D329992A9C7FDDF
	11: 0x9DBA7A9D3C5200B0E526112151BBD23D77006CBFCF290CFA6249601CA9812608

We can identify the PCRs that are common between the measurements in a cloud VM and those gathered in our reference setup. Those are good candidates to include as reference values in a relying party, against which a TEE’s evidence can be verified.

$ grep -F -x -f pcr_reference.txt pcr_cloud.txt
	3 : 0x3D458CFE55CC03EA1F443F1562BEEC8DF51C75E14A9FCF9A7234A13F198E7969
	8 : 0x0000000000000000000000000000000000000000000000000000000000000000
	9 : 0x8E74577DC5814F2EBF094988CB2E789F1D637B4D43930F3714500F9E2E65615D
	11: 0x9DBA7A9D3C5200B0E526112151BBD23D77006CBFCF290CFA6249601CA9812608

The UAPI Group’s TPM PCR Registry for Linux and systemd specifies PCR11 as a container for UKI measurements, covering kernel, initrd and kernel cmdline. Further registers that might be worth considering would be PCR4 (shim + UKI) or PCR7 (Secure Boot state).

Conclusion and outlook

We have looked at three different ways of building trust into OS host images for Confidential Containers. The intention was to illustrate how a chain of trust can be established using concrete examples and tools. The scenarios and technologies haven’t been covered comprehensively; each of them would be worth its own in-depth article.

Finally, we have so far only covered the (mostly static) steps and components that provide a sandbox for confidential containers. Asserting the integrity of the containers themselves is a unique challenge for CoCo: there are many dynamic aspects to consider in a realistic container deployment. Future articles may provide insights into how this can be achieved.

Thanks to Pradipta Banerjee, Iago López Galeiras & Tobin Feldman-Fitzthum for reviewing this post!

Introduction to Confidential Containers (CoCo)

Introduction to the project: Motivation, Mechanics, Foundational Principles and Community.

Confidential Containers (CoCo) is an innovative sandbox project under the Cloud Native Computing Foundation (CNCF), revolutionizing cloud-native confidential computing by leveraging diverse hardware platforms and cutting-edge technologies.

The CoCo project builds on existing and emerging hardware security technologies such as Intel SGX, Intel TDX, AMD SEV-SNP and IBM Z Secure Execution, in combination with new software frameworks to protect data in use. The project brings together software and hardware companies including Alibaba Cloud, AMD, Arm, Edgeless Systems, IBM, Intel, Microsoft, NVIDIA, Red Hat, Rivos, and more.

Motivation

At the core of a confidential computing solution lie Trusted Execution Environments (TEEs), and it is this foundational idea that propelled the inception of the CoCo project.

TEEs are isolated environments with heightened security guarantees, provided by confidential computing (CC) capable hardware. This protection ensures that applications and data remain shielded from unauthorized access or tampering while in use.

The driving force behind CoCo is the seamless integration of TEE infrastructure into the realm of cloud-native computing. By bridging the gap between TEEs and the cloud-native world, the project strives to bring enhanced security to the forefront of modern computing practices.

The overarching goal of CoCo is ambitious yet clear: standardize confidential computing at the container level and simplify its integration into Kubernetes.

The aim is to empower Kubernetes users to deploy confidential container workloads effortlessly, using familiar workflows and tools. CoCo envisions a future where Kubernetes users can embrace the benefits of confidential computing without the need for extensive knowledge of the underlying technologies, making security an integral and accessible aspect of their everyday operations.

Mechanics

CoCo helps you deploy your workload on infrastructure beyond the confines of your own. Whether it’s a cloud provider’s domain, a separate division within your organization, or even an external entity, CoCo empowers you to confidently entrust your workload to diverse hands.

This capability hinges on a fundamental approach: encrypting your workload’s memory and fortifying other essential low-level resources at the hardware level. This memory protection ensures that, regardless of the hosting environment, your data remains shielded, and unauthorized access is thwarted.

A key aspect of CoCo’s mechanics lies in the use of cryptography-based proofs which involve employing cryptographic techniques to create verifiable evidence, such as signatures or hashes, ensuring the integrity of your software. These serve a dual purpose: validating that your software runs untampered and, conversely, preventing the execution of your workload if any unauthorized alterations are detected.
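At its simplest, this integrity check amounts to comparing a measured digest against a set of trusted reference values before the workload is allowed to run. The following sketch uses hypothetical component names and reference values purely to illustrate the idea, not any specific CoCo API:

```python
import hashlib

# Hypothetical reference values a relying party would trust,
# e.g. precomputed digests of an approved guest image and kernel
REFERENCE_VALUES = {
    "guest-image": hashlib.sha256(b"approved image contents").hexdigest(),
    "kernel": hashlib.sha256(b"approved kernel").hexdigest(),
}

def verify_evidence(component: str, measured_digest: str) -> bool:
    """Accept a component only if its measured digest matches a trusted reference."""
    expected = REFERENCE_VALUES.get(component)
    return expected is not None and measured_digest == expected

# An untampered component passes; any alteration changes the digest and is rejected
print(verify_evidence("guest-image",
                      hashlib.sha256(b"approved image contents").hexdigest()))
print(verify_evidence("guest-image",
                      hashlib.sha256(b"tampered image contents").hexdigest()))
```

In a real deployment, the measured digests come from hardware-rooted attestation evidence rather than being computed locally, and the comparison is performed by an attestation service on behalf of the relying party.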

In essence, CoCo employs cryptographic mechanisms to provide assurance, creating a secure foundation that allows your software to operate with integrity across varied and potentially untrusted hosting environments.

Foundational Principles

The project puts a strong emphasis on delivering a practical cloud-native solution:

  • Simplicity: CoCo places a premium on simplicity, employing a dedicated Kubernetes operator for deployment and configuration. This strategic choice aims to maximize accessibility by abstracting away much of the hardware-dependent intricacies, ensuring a user-friendly experience.

  • Stability: CoCo maintains continuous integration (CI) coverage for the key workflows of each release.

  • Use case driven development: CoCo adopts a use case-driven development approach, rallying the community around a select set of key use cases. Rather than a feature-centric model, this strategy ensures that development efforts are purposeful, with a spotlight on supporting essential use cases. This pragmatic approach aligns the project with real-world needs, making CoCo a solution crafted for practical cloud-native scenarios.

Community

Discover the vibrant CoCo community and explore ways to actively engage with the project by visiting our dedicated community page. We welcome and actively seek your thoughts, feedback, and contributions. Join us in shaping the future of confidential containers and explore collaborative opportunities to integrate CoCo with other cloud-native projects. Your participation is not just encouraged; it’s integral to the evolution and success of this open-source initiative.

See our CoCo community meeting notes for details on the weekly meetings, recordings, slack channels and more.