Skip to content

Policy Enforcement#

Subsystem Goal#

Simply put, we want to prevent tenants from doing things they shouldn't be able to do that will either compromise the platform, impact other tenants, or increase unnecessary costs.

Components in Use#

While working on this subsystem, we will introduce the following components:

  • Gatekeeper - Gatekeeper wraps the Open Policy Agent engine and makes it more accessible in a Kubernetes environment by allowing us to define policies using Kubernetes objects
  • gatekeeper-policies - our custom policies and templates being applied into the cluster packaged as a deployable Helm chart
  • landlord - our helm chart that creates the policies based on the configuration of each tenant

Background#

Understanding Admission Controllers#

While Kubernetes has a fairly robust RBAC system, it is still based on the ability to perform actions on specific resources. As such, it isn't able to help us with many scenarios, including the following:

  • While tenants should be able to create Services, how can we prevent them from defining their own LoadBalancer or NodePort Services?
  • While tenants should be able to define Ingress objects, how do we prevent them from defining an Ingress for a domain that actually belongs to another tenant?
  • How can we prevent tenants from running privileged pods, mounting host volumes, and other potentially vulnerable actions defined in the Pod Security Standards?

Using admission controllers (and another guide), we can deploy services that sit in the API request process and either mutate an incoming object or validate it to determine if it should be accepted and persisted.

The API request workflow

The API request process with the mutating and validating admission controllers notifying services (webhooks)

Rather than defining our own webhook services and policy engine, we decided to leverage Gatekeeper.

Gatekeeper and OPA#

Gatekeeper is an open-source project that wraps the Open Policy Agent (OPA) engine and provides the ability to define policies without worrying about the semantics of the admission controller.

OPA provides the ability to define policies in a language called Rego. While Rego can take a while to understand, you can read it as a series of truth statements or declarations. As long as the statements are true, execution will continue. For example:

package block_load_balancer_services

violation[{"msg": msg}] {
  input.review.kind.kind == "Service"
  input.review.object.spec.type == "LoadBalancer"
  msg := sprintf("LoadBalancer Services are not authorized - %q", [input.review.object.metadata.name])
}

The input.review object comes from the payload received by the admission controller, specifically the .request portion of the payload (see an example here). With the declared assertions, a message will only be set if the incoming object is for a Service that has spec.type set to LoadBalancer. With Gatekeeper, any messages become validation errors and prevent the object from being stored.

In addition, Gatekeeper provides the ability to provide parameters to the policy, letting you use the same policy for multiple use cases (such as authorizing the hostnames allowed for a tenant to use). We'll see a sample of doing that below.

Deploying it Yourself#

Installing Gatekeeper#

  1. Install Gatekeeper by using its Helm chart. This will install Gatekeeper into a namespace named platform-gatekeeper-system.

    helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
    helm repo update
    helm install gatekeeper/gatekeeper --name-template=gatekeeper --namespace platform-gatekeeper-system --create-namespace
    

    After a moment, you should see a few pods running in the platform-gatekeeper-system namespace.

    > kubectl get pods -n platform-gatekeeper-system
    NAME                                            READY   STATUS    RESTARTS   AGE
    gatekeeper-audit-5f66c86485-g66t9               1/1     Running   0          41s
    gatekeeper-controller-manager-b8fcdfb56-5z99d   1/1     Running   0          41s
    gatekeeper-controller-manager-b8fcdfb56-wp4bj   1/1     Running   0          41s
    gatekeeper-controller-manager-b8fcdfb56-xnx67   1/1     Running   0          41s
    

A simple example#

  1. Now that Gatekeeper is installed, let's define a simple ConstraintTemplate. This will simply block all Service with a spec.type of LoadBalancer. We'll create a more sophisticated policy in the next section.

    Run the following command to define our ConstraintTemplate.

    cat <<EOF | kubectl apply -f -
    apiVersion: templates.gatekeeper.sh/v1beta1
    kind: ConstraintTemplate
    metadata:
      name: blockloadbalancerdemo
    spec:
      crd:
        spec:
          names:
            kind: BlockLoadBalancerDemo
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package block_loadbalancers_demo
    
            violation[{"msg": msg}] {
              input.review.kind.kind == "Service"
              input.review.object.spec.type == "LoadBalancer"
              msg := sprintf("LoadBalancer not permitted on Service - %v", [input.review.object.metadata.name])
            }
    EOF
    

    Once defined, Gatekeeper will actually create a new customer resource using the name specified in spec.crd.spec.names.kind. This is how we actually apply the policy.

  2. Now, let's create a BlockLoadBalancerDemo object and ensure no Service in our tenant namespace can be of the load balancer type.

    cat <<EOF | kubectl apply -f -
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: BlockLoadBalancerDemo
    metadata:
      name: sample-tenant
    spec:
      match:
        namespaces:
          - sample-tenant
        kinds:
          - apiGroups: [""]
            kinds: ["Service"]
    EOF
    
  3. Now, let's try to create a LoadBalancer Service in the sample-tenant namespace. We should see that the request is denied!

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Service
    metadata:
      name: lb-service
      namespace: sample-tenant
    spec:
      type: LoadBalancer
      selector:
        app: sample
      ports:
        - port: 80
    EOF
    

    When you run it, you should see the following error:

    Error from server ([sample-tenant] LoadBalancer not permitted on Service - lb-service): error when creating "STDIN": admission webhook "validation.gatekeeper.sh" denied the request: [sample-tenant] LoadBalancer not permitted on Service - lb-service
    

    Our validation worked!

A parameterized policy#

Now, let's create a slightly more sophisticated policy that includes parameters. This is especially useful when we limit the Ingress names a namespace is authorized to use.

For this example, we're going to do a simplified version of our Ingress policy. The real one supports wildcards in the parameters, but makes the policy a lot more complicated. So... we'll keep it simple here.

  1. Run the following command to define a new ConstraintTemplate.

    cat <<EOF | kubectl apply -f -
    apiVersion: templates.gatekeeper.sh/v1beta1
    kind: ConstraintTemplate
    metadata:
      name: authorizedingresshost
    spec:
      crd:
        spec:
          names:
            kind: AuthorizedIngressHost
          validation:
            # Schema for the "parameters" field
            openAPIV3Schema:
              properties:
                domains:
                  type: array
                  items: string
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package authorized_ingress_host
    
            violation[{"msg": msg}] {
              input.review.kind.kind == "Ingress"
              host := input.review.object.spec.rules[_].host
              valid_ingress_host := input.parameters.domains[_]
              not host == valid_ingress_host
              msg := sprintf("Unauthorized host on Ingress - %q", [host])
            }
    EOF
    

    You might have noticed the new spec.crd.spec.validation which outlines the additional properties that our custom resource should accept.

  2. Now, let's apply our new policy to limit the sample-tenant namespace to only use the sample-app.localhost host (which we used in the last step).

    cat <<EOF | kubectl apply -f -
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: AuthorizedIngressHost
    metadata:
      name: sample-tenant
    spec:
      match:
        namespaces:
          - sample-tenant
        kinds:
          - apiGroups: ["extensions", "networking.k8s.io"]
            kinds: ["Ingress"]
      parameters:
        domains:
          - sample-app.localhost
    EOF
    
  3. Now, let's create an Ingress that should work:

    cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: valid-ingress
      namespace: sample-tenant
    spec:
      rules:
        - host: sample-app.localhost
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: sample
                  port: 
                    number: 80
    EOF
    

    That should work! But doesn't if you went thru the gitops steps :)

  4. Now, let's make an invalid Ingress using an invalid name. It should fail!

    cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: invalid-ingress
      namespace: sample-tenant
    spec:
      rules:
        - host: invalid.localhost
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: sample
                  port: 
                    number: 80
    EOF
    

    You should see an error message!

    Error from server ([sample-tenant] Unauthorized host on Ingress - "invalid.localhost"): error when creating "STDIN": admission webhook "validation.gatekeeper.sh" denied the request: [sample-tenant] Unauthorized host on Ingress - "invalid.localhost"
    

Deploying the Platform Policies locally#

If you want to test the policies locally, you can install the Helm chart that defines the policies yourself!

  1. Install the Helm chart using the following commands:

    helm repo add gatekeeper-policies https://code.vt.edu/api/v4/projects/7857/packages/helm/stable
    helm repo update
    helm install gatekeeper-policies gatekeeper-policies/constraint-templates --namespace platform-gatekeeper-system
    
  2. Now, you can use any of the templates to test the policies being used in the platform. For now, let's define a policy that limits the use of privileged pods in the sample-tenant namespace.

    cat <<EOF | kubectl apply -f -
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: PssPrivilegedPod
    metadata:
      name: sample-tenant
    spec:
      match:
        namespaces:
          - sample-tenant
        kinds:
          - apiGroups: [""]
            kinds: ["Pod"]
          - apiGroups: ["apps"]
            kinds: ["DaemonSet", "ReplicaSet", "Deployment", "StatefulSet"]
          - apiGroups: ["batch"]
            kinds: ["Job", "CronJob"]
    EOF
    

    If this command fails indicating no matches for kind "PssPrivilegedPod", wait a moment and try again. Gatekeeper simply hasn't completed processing the ConstraintTemplate to create the custom resources.

  3. Now, if we try to create a Pod (or anything that uses the Pod spec) with a privileged container, it'll be blocked! Let's test that out.

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: sample-privileged-pod
      namespace: sample-tenant
    spec:
      containers:
      - name: nginx
        image: nginx
        securityContext:
          privileged: true
    EOF
    

    And... it should fail!

    Error from server ([sample-tenant] Container 'nginx' of Pod 'sample-privileged-pod' should set 'securityContext.privileged' to false): error when creating "STDIN": admission webhook "validation.gatekeeper.sh" denied the request: [sample-tenant] Container 'nginx' of Pod 'sample-privileged-pod' should set 'securityContext.privileged' to false
    

What's next?#

Now that we have policy enforcement plugged in, how do we actually grant access to the tenant resources to the proper users? In the next section, we'll talk about user authentication.

Go to the User Authentication subsystem now!

Common Troubleshooting Tips#

Coming soon!