
HTTP Routing and TLS#

Subsystem Goal#

This subsystem is responsible for routing all HTTP-based traffic, ensuring each request reaches the correct application. By providing automatic certificate provisioning, we can require HTTPS for all requests.

Components in Use#

While working on this subsystem, we will introduce the following components:

  • Traefik - a cloud-native proxy that can be configured to watch Ingress objects to update its routing config
  • Cert Manager - a Kubernetes-based certificate manager that can leverage LetsEncrypt (or other ACME providers) or use an in-cluster CA to provide certificates

Background#

Understanding Ingress#

In Kubernetes, Ingress is the standardized method to define HTTP routing rules. These rules provide the ability to route based on hostname or path and provide TLS configuration. Below is an example:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world
spec:
  rules:
    - host: docs-getting-started.tenants.platform.it.vt.edu
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: hello-world
              port: 
                number: 3000
  tls:
    - hosts:
        - docs-getting-started.tenants.platform.it.vt.edu
      secretName: hello-world-tls-cert  

This object indicates that requests sent to docs-getting-started.tenants.platform.it.vt.edu should be forwarded to a Service named hello-world on its port 3000. It also indicates that the TLS key and certificate for the same host are stored in a secret called hello-world-tls-cert (more on that in a moment).

Ingress Controllers#

It's important to note that defining an Ingress does nothing on its own, as these are simply objects defining desired state. In order to actually perform routing, you must deploy an ingress controller. These controllers watch the Ingress objects and update their routing config. They are typically exposed as a LoadBalancer or NodePort Service and serve as the entry point for all HTTP requests into the cluster. This is where Traefik plugs in.

Traefik is configured to watch all of the Ingress objects and update its routing rules accordingly. It is the single reverse proxy that sits at the edge of the cluster and performs routing to all of the applications.

graph LR
  A[Browser] --> C[Load Balancer]
  C --> D1[Traefik]
  C --> D2[Traefik]
  subgraph Cluster
    D1 -->|app1.example.com| E[App 1]
    D1 --> F[App 2]
    D2 --> E[App 1]
    D2 -->|app2.example.com| F[App 2]
  end
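
To see this in practice, you can list the controller's Service and the Ingress objects it watches. The commands below are a sketch that assumes Traefik is installed into the platform-traefik namespace, as we do later in this page:

    # The controller's Service (typically type LoadBalancer) is the cluster's HTTP entry point
    kubectl get svc -n platform-traefik

    # All Ingress objects the controller watches, across all namespaces
    kubectl get ingress --all-namespaces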

Certificate Management#

As seen earlier, the Ingress object allows TLS configuration to be defined and simply references a Kubernetes secret that contains a private key and certificate. That secret must be of type kubernetes.io/tls and have the following structure:

apiVersion: v1
kind: Secret
metadata:
  name: secret-tls
type: kubernetes.io/tls
data:
  # the data is abbreviated in this example
  tls.crt: |
        MIIC2DCCAcCgAwIBAgIBATANBgkqh ...
  tls.key: |
        MIIEpgIBAAKCAQEA7yn3bRHQ5FHMQ ...

While tenants are certainly welcome to provide their own TLS certificates, we wanted to remove the burden of provisioning and renewing certificates. Fortunately, the ACME protocol makes it possible to automate this process (read this blog post to better understand the protocol). By plugging in an ACME client, we can automatically provision and renew certificates and store them as secrets where an ingress controller can use them.

Cert Manager provides a Kubernetes-based approach to managing certificates. It introduces a Certificate object, which serves as a request for a TLS certificate. By defining the various ways certs can be issued (via ClusterIssuer objects), we can support LetsEncrypt or a variety of other issuers. Once a Certificate is defined, the user never needs to worry about expired certs, as cert-manager will automatically renew soon-to-expire certificates!
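
As an illustration, a ClusterIssuer backed by LetsEncrypt's production ACME endpoint might look like the sketch below. The issuer name, contact email, and account-key secret name are assumptions for this example, and the solver assumes Traefik handles the challenge Ingresses; the in-cluster CA issuer we actually use locally is defined later in this page.

    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-prod                  # assumed name for this sketch
    spec:
      acme:
        server: https://acme-v02.api.letsencrypt.org/directory
        email: admin@example.com              # replace with a real contact address
        privateKeySecretRef:
          name: letsencrypt-account-key       # secret that will hold the ACME account key
        solvers:
          - http01:
              ingress:
                class: traefik                # challenge Ingresses handled by Traefik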

Deploying it Yourself#

In this example, we're going to deploy Traefik and two simple applications onto our local minikube environment.

Configuring minikube ingress#

  1. Enable ingress addon

    minikube addons enable ingress
    
  2. Add hostname to /etc/hosts

    cat <<EOF | sudo tee -a /etc/hosts
    $(minikube ip) app1.localhost
    $(minikube ip) app2.localhost
    $(minikube ip) sample-app.localhost
    $(minikube ip) flux-webhooks.localhost
    $(minikube ip) k8s-api.localhost.vt.edu
    EOF
    
  3. You will need to run a minikube tunnel in order to connect to the cluster's LoadBalancer services

    minikube tunnel
    

    You will need to leave it running.

Deploying an Ingress Controller#

  1. Deploy Traefik by installing the Helm chart.

    helm repo add traefik https://helm.traefik.io/traefik
    helm repo update
    helm install traefik traefik/traefik --namespace=platform-traefik --create-namespace --set ports.websecure.tls.enabled=true --set 'additionalArguments[0]=--serverstransport.insecureskipverify'
    

    By specifying ports.websecure.tls.enabled=true, we enable TLS on all routes. Otherwise, every Ingress would need an additional annotation to enable TLS.

    The --serverstransport.insecureskipverify argument lets Traefik accept self-signed certs when talking to backend pods (which will be important in later subsystems). An equivalent values.yaml sketch appears after this list.

  2. After a moment, you should see the Traefik pod running in the platform-traefik namespace.

    > kubectl get pods -n platform-traefik
    NAME                       READY   STATUS    RESTARTS   AGE
    traefik-76fbffb484-xnwls   1/1     Running   0          24s
    
  3. If you want to open the Traefik dashboard, you can use port-forwarding!

    kubectl port-forward -n platform-traefik <pod-name> 9000:9000
    

    And then open your browser to http://localhost:9000/dashboard/.
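
If you would rather keep these settings in a values file than pass --set flags, the equivalent of the flags used above is sketched here (the file name values.yaml is just a convention for this example):

    # values.yaml -- equivalent of the --set flags shown in step 1
    ports:
      websecure:
        tls:
          enabled: true
    additionalArguments:
      - "--serverstransport.insecureskipverify"

You would then install with helm install traefik traefik/traefik --namespace=platform-traefik --create-namespace -f values.yaml.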

Deploying a Sample Application#

  1. Now, let's create a namespace to run our simple apps.

    kubectl create namespace ingress-demo
    
  2. Let's deploy an app with a Service! In this case, we'll simply use nginx.

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: app1
      namespace: ingress-demo
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: app1
      namespace: ingress-demo
    spec:
      selector:
        app: nginx
      ports:
        - port: 80
    EOF
    
  3. Let's deploy another app. This time, we'll use a slightly different image: nginxdemos/hello

    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: app2
      namespace: ingress-demo
      labels:
        app: nginx-hello-world
    spec:
      containers:
        - name: nginx
          image: nginxdemos/hello
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: app2
      namespace: ingress-demo
    spec:
      selector:
        app: nginx-hello-world
      ports:
        - port: 80
    EOF
    
  4. Now, let's define two different Ingress objects for each of the apps. We'll put the first app at app1.localhost and the second at app2.localhost.

    cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: app1
      namespace: ingress-demo
    spec:
      rules:
        - host: app1.localhost
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app1
                  port: 
                    number: 80
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: app2
      namespace: ingress-demo
    spec:
      rules:
        - host: app2.localhost
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app2
                  port: 
                    number: 80
    EOF
    
  5. Now, open your browser to http://app1.localhost. You should see the default nginx landing page. But, if you open http://app2.localhost, you should see a different app! That's because Traefik observed the routing rules and sent the traffic to the correct pod based on the host header.
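
If you prefer the command line, you can confirm the host-based routing by fetching both hostnames and comparing the page titles. This assumes the /etc/hosts entries and minikube tunnel from earlier are still in place:

    # The two titles should differ, matching what you saw in the browser
    curl -s http://app1.localhost | grep -i '<title>'
    curl -s http://app2.localhost | grep -i '<title>'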

Deploying a Self-Signed In-Cluster CA#

Now that we have two applications up and running, let's deploy Cert Manager and configure it to use an in-cluster CA (since we can't easily satisfy LetsEncrypt challenges locally).

  1. Install Cert Manager by installing its Helm chart:

    helm repo add jetstack https://charts.jetstack.io
    helm repo update
    helm install cert-manager jetstack/cert-manager --namespace platform-cert-manager --create-namespace --set installCRDs=true
    

    After a moment, you should see the cert-manager pods running in the platform-cert-manager namespace.

    > kubectl get pods -n platform-cert-manager
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-6d6bb4f487-xffkx              1/1     Running   0          54s
    cert-manager-cainjector-7d55bf8f78-wpr5z   1/1     Running   0          54s
    cert-manager-webhook-6bc54c7c69-9v99p      1/1     Running   0          54s
    
  2. To issue certificates, we need to define a ClusterIssuer. The issuer provides the configuration needed to satisfy a certificate request. For example, we can have one issuer that uses an in-cluster CA while another uses LetsEncrypt.

    Run the following to create a root CA and an issuer that can be used to secure our applications (we'll verify both issuers are ready just after this step):

    cat <<EOF | kubectl apply -f -
    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: platform-internal-root
    spec:
      selfSigned: {}
    ---
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: platform-internal-ca
      namespace: platform-cert-manager
    spec:
      secretName: platform-internal-ca
      duration: 43800h # 5y
      issuerRef:
        kind: ClusterIssuer
        name: platform-internal-root
      commonName: "ca.platform.cert-manager"
      isCA: true
    ---
    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: platform-internal-ca
    spec:
      ca:
        secretName: platform-internal-ca
    EOF
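
Before moving on, you can confirm that both issuers were accepted; cert-manager exposes a Ready column on these resources, and both platform-internal-root and platform-internal-ca should report True before you request certificates:

    kubectl get clusterissuers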
    

Creating Certificates for our Apps#

Now that we have a ClusterIssuer defined, let's request some Certificates and use them for our two sample apps!

  1. Run the following command to define a Certificate for app1. Note that we indicate the TLS details should be stored in a secret named app1-tls.

    cat <<EOF | kubectl apply -f -
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: app1
      namespace: ingress-demo
    spec:
      commonName: app1.localhost
      dnsNames:
        - app1.localhost
      secretName: app1-tls
      issuerRef:
        kind: ClusterIssuer
        name: platform-internal-ca
    EOF
    
  2. After a few seconds, you should see the Certificate is ready for use:

    kubectl get certificates -n ingress-demo
    

    And you should see output that looks something like this:

    NAME   READY   SECRET     AGE
    app1   True    app1-tls   8s
    

    And if we look at the app1-tls secret, we'll see it both exists and is populated:

    kubectl get secret -o yaml -n ingress-demo app1-tls
    

    With output:

    apiVersion: v1
    data:
      ca.crt: LS0tLS...
      tls.crt: LS0tLS...
      tls.key: LS0tLS...
    kind: Secret
    metadata:
      annotations:
        cert-manager.io/alt-names: app1.localhost
        cert-manager.io/certificate-name: app1
        cert-manager.io/common-name: app1.localhost
        cert-manager.io/ip-sans: ""
        cert-manager.io/issuer-group: ""
        cert-manager.io/issuer-kind: ClusterIssuer
        cert-manager.io/issuer-name: platform-internal-ca
        cert-manager.io/uri-sans: ""
      creationTimestamp: "2022-02-11T14:59:41Z"
      name: app1-tls
      namespace: ingress-demo
      resourceVersion: "1031225"
      uid: 9069ad14-234d-4ca0-99de-8bf59bd16ca9
    type: kubernetes.io/tls
    
  3. Now, we're going to update our Ingress object for app1 to include the TLS configuration. We simply need to point it to the correct secret.

    cat <<EOF | kubectl apply -f -
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: app1
      namespace: ingress-demo
    spec:
      rules:
        - host: app1.localhost
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app1
                  port: 
                    number: 80
      tls:
        - hosts: ["app1.localhost"]
          secretName: app1-tls
    EOF
    
  4. You can try to open your browser to https://app1.localhost, but it might fail because Traefik uses HSTS, which prevents the browser from letting you click through the warning for an untrusted certificate.

    So, we can just use cURL! If you fetch the site, you should see a certificate with the correct name, issued by our internal CA!

    curl https://app1.localhost --resolve app1.localhost:443:127.0.0.1 -vk
    

    You should see output that looks something like this...

    ...
    * Server certificate:
    *  subject: CN=app1.localhost
    *  start date: Feb 11 14:59:41 2022 GMT
    *  expire date: May 12 14:59:41 2022 GMT
    *  issuer: CN=ca.platform.cert-manager
    ...
    <!DOCTYPE html>
    <html>
    ...
    
  5. If you want, you can do the same thing for app2!
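
For reference, here is a sketch of the equivalent objects for app2; it simply mirrors the app1 example with the names, hostname, and secret swapped:

    cat <<EOF | kubectl apply -f -
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: app2
      namespace: ingress-demo
    spec:
      commonName: app2.localhost
      dnsNames:
        - app2.localhost
      secretName: app2-tls
      issuerRef:
        kind: ClusterIssuer
        name: platform-internal-ca
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: app2
      namespace: ingress-demo
    spec:
      rules:
        - host: app2.localhost
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app2
                  port:
                    number: 80
      tls:
        - hosts: ["app2.localhost"]
          secretName: app2-tls
    EOF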

What's next?#

Now that we have an ingress controller and certificate management deployed, let's make it so we can deploy an actual application using a tenant workflow!

Go to the GitOps subsystem now!

Common Troubleshooting Notes#

Why is my certificate not provisioning?

The cert-manager tool has a troubleshooting guide, which is a good starting point. There are also additional steps for troubleshooting ACME certificates.

Beyond those, here are a few issues we've run into ourselves.

  • Ensure DNS resolves to the platform. If the hostname on the requested certificate doesn't resolve to the platform, the ACME challenge can't be satisfied and the certificate won't be issued.
  • DNS caching. If a DNS change was made recently, there's a chance the LetsEncrypt challenge verifier still has the old record cached. Make the DNS change well before requesting the certificate; otherwise, you'll simply have to wait out the DNS TTL.
  • Certificate is just stuck. On rare occasions, we've been able to validate the Order completed, but the Certificate is simply never provisioned. It's unclear why, but in every case, manually removing the Certificate and re-requesting it has worked.
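
When digging into a stuck certificate, it usually helps to walk the chain of resources cert-manager creates. A rough sequence, using the names from this page, is sketched below; note that Order and Challenge objects only exist for ACME issuers such as LetsEncrypt, so the in-cluster CA issuer used above stops at the CertificateRequest:

    kubectl describe certificate app1 -n ingress-demo
    kubectl get certificaterequests -n ingress-demo
    kubectl get orders,challenges -n ingress-demo
    kubectl describe challenge <challenge-name> -n ingress-demo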