Custom Policy Rules MADE EASY

If you use Kubernetes a lot, you have likely come across Custom Resource Definitions (CRDs). Much of the power of Kubernetes lies in the Kubernetes API, which lets cloud native applications extend the built-in resources with custom resources. In principle, this lets you model almost anything as a Kubernetes resource.

While this is a pretty dope feature, it also comes with a few downsides. For instance, anything that you customize will not have the same community support as a tool that you use "out of the box".

You want to prevent people who have access to your Kubernetes cluster, e.g. team members, from installing malicious resources. You also want them to follow best practices and, potentially, to be constrained in the types of resources they can create on the cluster. Best practices can be enforced through policies. However, custom resources don't come with out-of-the-box policies that you can apply.

In this blog post, I am going to introduce you to Datree custom rules. Datree makes it possible to write custom rules that check your Kubernetes custom resources for misconfigurations.

If you prefer the video tutorial, you can find it linked below:

For this, we are going to write Policies as Code. If you are completely new to Datree and/or Policies as Code, I highly suggest you have a look at my previous video on Datree that showcases how you can get started with Policies as Code.

Prerequisites

To follow this tutorial, the one thing that you must have enabled is Policy as Code in the Datree settings menu. Once you have enabled it, you will be able to install the policy resources. To enable Policy as Code, follow these steps:

Access Account

If you do not have Datree installed yet, go ahead and install the CLI by running the following command in your terminal:

curl https://get.datree.io | /bin/bash

For different installation instructions, please have a look at the Datree documentation.

There are two ways to access the Dashboard. The first is to run the example test command from the documentation, which prints a link to your Dashboard:

datree test ~/.datree/k8s-demo.yaml

Alternatively, you can go to app.datree.io and you will be able to access your Dashboard.

Enable Policy as Code

To enable Policy as Code, head over to Your Account (top left) > Settings; next:

  1. Enable Policy as Code
  2. Download the policies.yaml file

This file will contain all the default policies. You can also see an example in the documentation.

Custom Policies

The Datree documentation provides a great overview of how to write custom rules. You can write the custom rules either in JSON or YAML format. In my case, I am going to go with the YAML format since this is what I am already used to.

We are going to start off with our policies.yaml file from the previous example. Currently, we are not making use of any custom rules.

apiVersion: v1
policies:
  - name: Staging
    isDefault: true
    rules:
      - identifier: CONTAINERS_MISSING_IMAGE_VALUE_VERSION
        messageOnFailure: Incorrect value for key `image` - specify an image version to avoid unpleasant “version surprises” in the future
      - identifier: CRONJOB_INVALID_SCHEDULE_VALUE
        messageOnFailure: 'Incorrect value for key `schedule` - the (cron) schedule expression is not valid and therefore will not work as expected'
      - identifier: WORKLOAD_INVALID_LABELS_VALUE
        messageOnFailure: Incorrect value for key(s) under `labels` - the values syntax is not valid so it will not be accepted by the Kubernetes engine
      - identifier: WORKLOAD_INCORRECT_RESTARTPOLICY_VALUE_ALWAYS
        messageOnFailure: Incorrect value for key `restartPolicy` - any other value than `Always` is not supported by this resource
      - identifier: HPA_MISSING_MAXREPLICAS_KEY
        messageOnFailure: Missing property object `maxReplicas` - the value should be within the accepted boundaries recommended by the organization
      - identifier: WORKLOAD_INCORRECT_NAMESPACE_VALUE_DEFAULT
        messageOnFailure: Incorrect value for key `namespace` - use an explicit namespace instead of the default one (`default`)
      - identifier: DEPLOYMENT_INCORRECT_REPLICAS_VALUE
        messageOnFailure: Incorrect value for key `replicas` - don't rely on a single pod to do all of the work. Running 2 or more replicas will increase the availability of the service
      - identifier: K8S_DEPRECATED_APIVERSION_1.16
        messageOnFailure: Incorrect value for key `apiVersion` - the version you are trying to use is not supported by the Kubernetes cluster version (>=1.16)
      - identifier: K8S_DEPRECATED_APIVERSION_1.17
        messageOnFailure: Incorrect value for key `apiVersion` - the version you are trying to use is not supported by the Kubernetes cluster version (>=1.17)
      - identifier: CONTAINERS_INCORRECT_PRIVILEGED_VALUE_TRUE
        messageOnFailure: Incorrect value for key `privileged` - this mode will allow the container the same access as processes running on the host
      - identifier: CRONJOB_MISSING_CONCURRENCYPOLICY_KEY
        messageOnFailure: Missing property object `concurrencyPolicy` - the behavior will be more deterministic if jobs won't run concurrently
      - identifier: HPA_MISSING_MINREPLICAS_KEY
        messageOnFailure: 'Missing property object `minReplicas` - the value should be within 3 to 5 pods. Please contact @AnaisUrlichs for further information. '
      - identifier: SERVICE_INCORRECT_TYPE_VALUE_NODEPORT
        messageOnFailure: 'Incorrect value for key `type` - `NodePort` will open a port on all nodes where it can be reached by the network external to the cluster. Please use ClusterIP instead. Contact @AnaisUrlichs for further information. '
  - name: Prod
    rules:
      - identifier: CONTAINERS_MISSING_IMAGE_VALUE_VERSION
        messageOnFailure: Incorrect value for key `image` - specify an image version to avoid unpleasant “version surprises” in the future
      - identifier: K8S_INCORRECT_KIND_VALUE_POD
        messageOnFailure: Incorrect value for key `kind` - raw pod won't be rescheduled in the event of a node failure
      - identifier: CONTAINERS_MISSING_MEMORY_REQUEST_KEY
        messageOnFailure: Missing property object `requests.memory` - value should be within the accepted boundaries recommended by the organization.
      - identifier: CONTAINERS_MISSING_CPU_REQUEST_KEY
        messageOnFailure: Missing property object `requests.cpu` - value should be within the accepted boundaries recommended by the organization
      - identifier: CONTAINERS_MISSING_MEMORY_LIMIT_KEY
        messageOnFailure: Missing property object `limits.memory` - value should be within the accepted boundaries recommended by the organization
      - identifier: CONTAINERS_MISSING_CPU_LIMIT_KEY
        messageOnFailure: Missing property object `limits.cpu` - value should be within the accepted boundaries recommended by the organization

As you can see in the file, we currently specify the rules for two different environments, "Prod" and "Staging". However, the existing rules provided by Datree will not cover our custom resources. Thus, once we create custom resources, we will want to add custom rules.
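
To make the structure of the file a bit more concrete, here is a small Python sketch that mirrors a trimmed-down version of it as a dictionary and picks the policy to apply. The helper name `select_policy` is hypothetical, and the sketch assumes that the policy marked `isDefault: true` is the one used when no policy name is given. It is an illustration of the file layout, not how Datree itself is implemented.

```python
# Illustrative only: a trimmed-down policies.yaml expressed as a Python dict.
config = {
    "apiVersion": "v1",
    "policies": [
        {
            "name": "Staging",
            "isDefault": True,
            "rules": [
                {"identifier": "CONTAINERS_MISSING_IMAGE_VALUE_VERSION"},
                {"identifier": "SERVICE_INCORRECT_TYPE_VALUE_NODEPORT"},
            ],
        },
        {
            "name": "Prod",
            "rules": [
                {"identifier": "K8S_INCORRECT_KIND_VALUE_POD"},
            ],
        },
    ],
}


def select_policy(config, name=None):
    """Return the named policy, or the one flagged isDefault if no name is given.

    Hypothetical helper; the fallback to isDefault is an assumption.
    """
    policies = config["policies"]
    if name is not None:
        for policy in policies:
            if policy["name"] == name:
                return policy
        raise KeyError(f"no policy named {name!r}")
    for policy in policies:
        if policy.get("isDefault"):
            return policy
    raise LookupError("no policy is marked as default")


for chosen in (select_policy(config), select_policy(config, "Prod")):
    identifiers = [rule["identifier"] for rule in chosen["rules"]]
    print(chosen["name"], identifiers)
```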

Example

Next, we need an example custom resource for which we can write our custom rules. You can use any Kubernetes custom resource for this.

I am going to create custom policies for my cluster.yaml resource, which I hand to Crossplane so that it creates a Kubernetes cluster for me.

If you are completely new to Crossplane, have a look at my previous video.

Crossplane allows us to specify what kind of resources we can create on a cloud provider through custom resources.

My cluster.yaml file looks something along these lines:

kind: CivoKubernetes
apiVersion: cluster.civo.crossplane.io/v1alpha1
metadata: 
  name: second-crossplane
spec:
  name: second-crossplane
  instances: 5
  size: g3.k3s.large
  applications: 
    - "prometheus-operator"
    - "istio:Istio v1.10.1"
  connectionDetails:
    connectionSecretNamePrefix: "cluster-details"
    connectionSecretNamespace: "default"
  providerConfigRef:
    name: civo-provider

This file is specifically written to create a Civo Kubernetes cluster through the Civo Crossplane provider. To learn more about both, have a look at the following blog post: https://www.civo.com/learn/civo-crossplane-provider

Next, I want to create several custom rules. These rules will be used to check whether the cluster.yaml file is configured correctly.

customRules:
  - identifier: CUSTOM_CLUSTER_SIZE
    name: Ensure correct cluster size is used [CUSTOM RULE]
    defaultMessageOnFailure: Use only small or medium clusters 
    schema:
      properties:
        spec:
          properties:
            size:
              enum:
                - g3.k3s.small
                - g3.k3s.medium
          required:
            - size
  - identifier: CUSTOM_CLUSTER_INSTANCES
    name: Ensure only a limited number of instances are used [CUSTOM RULE]
    defaultMessageOnFailure: Use only 2,3,4 instance sizes
    schema:
      properties:
        spec:
          properties:
            instances:
              type: number
              enum:
                - 2
                - 3
                - 4
  - identifier: CUSTOM_WORKLOAD_INCORRECT_ENVIRONMENT_LABELS
    name: Ensure correct environment labels are used [CUSTOM RULE]
    defaultMessageOnFailure: Use only approved environment labels (`prod`, `staging` and `test`)
    schema:
      properties:
        metadata:
          properties:
            labels:
              properties:
                environment:
                  enum:
                    - prod
                    - staging
                    - test
              required:
              - environment
          required:
          - labels

Again, please have a look at the Datree documentation to see how you can set up these custom rules.
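
The `schema` section of a custom rule uses JSON Schema keywords such as `properties`, `enum` and `required`. To show how such a schema is evaluated, here is a minimal Python sketch that supports only the handful of keywords used in this post. The function `violations` is a hypothetical helper and not Datree's implementation. The schema is written in the spirit of CUSTOM_CLUSTER_SIZE, with `required` placed next to `properties` under `spec` so that a missing `size` key is reported.

```python
# Minimal sketch of JSON Schema style checks. Supports only `properties`,
# `required`, `enum` and `type: number`. This is NOT Datree's implementation.

def violations(schema, value, path="$"):
    """Return a list of human-readable violations of `schema` by `value`."""
    found = []
    if "enum" in schema and value not in schema["enum"]:
        found.append(f"{path}: {value!r} is not one of {schema['enum']}")
    if schema.get("type") == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            found.append(f"{path}: {value!r} is not a number")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                found.append(f"{path}: missing required key {key!r}")
        # As in JSON Schema, a property is only checked if it is present.
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                found.extend(violations(subschema, value[key], f"{path}.{key}"))
    return found


# A schema modelled on CUSTOM_CLUSTER_SIZE, written as a Python dict.
size_rule = {
    "properties": {
        "spec": {
            "properties": {
                "size": {"enum": ["g3.k3s.small", "g3.k3s.medium"]},
            },
            "required": ["size"],
        }
    }
}

large = {"spec": {"size": "g3.k3s.large", "instances": 5}}
small = {"spec": {"size": "g3.k3s.small", "instances": 3}}

print(violations(size_rule, large))  # one violation for spec.size
print(violations(size_rule, small))  # prints []
```

Note that `required` only has an effect when it sits on the object that should contain the key. Placing it inside the `size` property itself would silently check nothing.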

Next, we have to reference the custom rules in our policies.yaml file. Thus, we are going to create a new section within "policies" that will reference our custom rules:

  - name: Crossplane
    rules:
      - identifier: CUSTOM_CLUSTER_SIZE
        messageOnFailure: Ensure correct cluster size is used [CUSTOM RULE]
      - identifier: CUSTOM_CLUSTER_INSTANCES
        messageOnFailure: Ensure only a limited number of instances are used [CUSTOM RULE]
      - identifier: CUSTOM_WORKLOAD_INCORRECT_ENVIRONMENT_LABELS
        messageOnFailure: Please choose either staging, test, or prod environments

My entire policies.yaml file will now look as follows:

apiVersion: v1
policies:
  - name: Staging
    isDefault: true
    rules:
      - identifier: CONTAINERS_MISSING_IMAGE_VALUE_VERSION
        messageOnFailure: Incorrect value for key `image` - specify an image version to avoid unpleasant “version surprises” in the future
      - identifier: CRONJOB_INVALID_SCHEDULE_VALUE
        messageOnFailure: 'Incorrect value for key `schedule` - the (cron) schedule expression is not valid and therefore will not work as expected'
      - identifier: WORKLOAD_INVALID_LABELS_VALUE
        messageOnFailure: Incorrect value for key(s) under `labels` - the values syntax is not valid so it will not be accepted by the Kubernetes engine
      - identifier: WORKLOAD_INCORRECT_RESTARTPOLICY_VALUE_ALWAYS
        messageOnFailure: Incorrect value for key `restartPolicy` - any other value than `Always` is not supported by this resource
      - identifier: HPA_MISSING_MAXREPLICAS_KEY
        messageOnFailure: Missing property object `maxReplicas` - the value should be within the accepted boundaries recommended by the organization
      - identifier: WORKLOAD_INCORRECT_NAMESPACE_VALUE_DEFAULT
        messageOnFailure: Incorrect value for key `namespace` - use an explicit namespace instead of the default one (`default`)
      - identifier: DEPLOYMENT_INCORRECT_REPLICAS_VALUE
        messageOnFailure: Incorrect value for key `replicas` - don't rely on a single pod to do all of the work. Running 2 or more replicas will increase the availability of the service
      - identifier: K8S_DEPRECATED_APIVERSION_1.16
        messageOnFailure: Incorrect value for key `apiVersion` - the version you are trying to use is not supported by the Kubernetes cluster version (>=1.16)
      - identifier: K8S_DEPRECATED_APIVERSION_1.17
        messageOnFailure: Incorrect value for key `apiVersion` - the version you are trying to use is not supported by the Kubernetes cluster version (>=1.17)
      - identifier: CONTAINERS_INCORRECT_PRIVILEGED_VALUE_TRUE
        messageOnFailure: Incorrect value for key `privileged` - this mode will allow the container the same access as processes running on the host
      - identifier: CRONJOB_MISSING_CONCURRENCYPOLICY_KEY
        messageOnFailure: Missing property object `concurrencyPolicy` - the behavior will be more deterministic if jobs won't run concurrently
      - identifier: HPA_MISSING_MINREPLICAS_KEY
        messageOnFailure: 'Missing property object `minReplicas` - the value should be within 3 to 5 pods. Please contact @AnaisUrlichs for further information. '
      - identifier: SERVICE_INCORRECT_TYPE_VALUE_NODEPORT
        messageOnFailure: 'Incorrect value for key `type` - `NodePort` will open a port on all nodes where it can be reached by the network external to the cluster. Please use ClusterIP instead. Contact @AnaisUrlichs for further information. '
  - name: Prod
    rules:
      - identifier: CONTAINERS_MISSING_IMAGE_VALUE_VERSION
        messageOnFailure: Incorrect value for key `image` - specify an image version to avoid unpleasant “version surprises” in the future
      - identifier: K8S_INCORRECT_KIND_VALUE_POD
        messageOnFailure: Incorrect value for key `kind` - raw pod won't be rescheduled in the event of a node failure
      - identifier: CONTAINERS_MISSING_MEMORY_REQUEST_KEY
        messageOnFailure: Missing property object `requests.memory` - value should be within the accepted boundaries recommended by the organization.
      - identifier: CONTAINERS_MISSING_CPU_REQUEST_KEY
        messageOnFailure: Missing property object `requests.cpu` - value should be within the accepted boundaries recommended by the organization
      - identifier: CONTAINERS_MISSING_MEMORY_LIMIT_KEY
        messageOnFailure: Missing property object `limits.memory` - value should be within the accepted boundaries recommended by the organization
      - identifier: CONTAINERS_MISSING_CPU_LIMIT_KEY
        messageOnFailure: Missing property object `limits.cpu` - value should be within the accepted boundaries recommended by the organization
  - name: Crossplane
    rules:
      - identifier: CUSTOM_CLUSTER_SIZE
        messageOnFailure: Ensure correct cluster size is used [CUSTOM RULE]
      - identifier: CUSTOM_CLUSTER_INSTANCES
        messageOnFailure: Ensure only a limited number of instances are used [CUSTOM RULE]
      - identifier: CUSTOM_WORKLOAD_INCORRECT_ENVIRONMENT_LABELS
        messageOnFailure: Please choose either staging, test, or prod environments
customRules:
  - identifier: CUSTOM_CLUSTER_SIZE
    name: Ensure correct cluster size is used [CUSTOM RULE]
    defaultMessageOnFailure: Use only small or medium clusters 
    schema:
      properties:
        spec:
          properties:
            size:
              enum:
                - g3.k3s.small
                - g3.k3s.medium
          required:
            - size
  - identifier: CUSTOM_CLUSTER_INSTANCES
    name: Ensure only a limited number of instances are used [CUSTOM RULE]
    defaultMessageOnFailure: Use only 2,3,4 instance sizes
    schema:
      properties:
        spec:
          properties:
            instances:
              type: number
              enum:
                - 2
                - 3
                - 4
  - identifier: CUSTOM_WORKLOAD_INCORRECT_ENVIRONMENT_LABELS
    name: Ensure correct environment labels are used [CUSTOM RULE]
    defaultMessageOnFailure: Use only approved environment labels (`prod`, `staging` and `test`)
    schema:
      properties:
        metadata:
          properties:
            labels:
              properties:
                environment:
                  enum:
                    - prod
                    - staging
                    - test
              required:
              - environment
          required:
          - labels

As you can see, the file now contains:

  1. My environment-specific rules that make use of the existing rules provided by Datree
  2. A new environment, called "Crossplane"
  3. My custom rules, defined at the end of the file at the same level as "apiVersion"
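
Since the policies and the custom rule definitions live in two different sections of the file, it is easy to reference an identifier that was never defined. The following Python sketch checks for that on a dictionary that mirrors the file structure. The helper `undefined_custom_rules` is hypothetical, and treating every identifier that starts with `CUSTOM_` as a custom rule is an assumption based on the naming used in this post.

```python
def undefined_custom_rules(config):
    """Map policy name -> referenced CUSTOM_ identifiers missing from customRules.

    Hypothetical helper. Assumes custom rule identifiers start with "CUSTOM_".
    """
    defined = {rule["identifier"] for rule in config.get("customRules", [])}
    missing = {}
    for policy in config.get("policies", []):
        for rule in policy.get("rules", []):
            identifier = rule["identifier"]
            if identifier.startswith("CUSTOM_") and identifier not in defined:
                missing.setdefault(policy["name"], []).append(identifier)
    return missing


# Trimmed-down mirror of policies.yaml with one definition left out on purpose.
config = {
    "apiVersion": "v1",
    "policies": [
        {
            "name": "Crossplane",
            "rules": [
                {"identifier": "CUSTOM_CLUSTER_SIZE"},
                {"identifier": "CUSTOM_CLUSTER_INSTANCES"},
            ],
        },
    ],
    "customRules": [
        {"identifier": "CUSTOM_CLUSTER_SIZE"},
    ],
}

print(undefined_custom_rules(config))
# prints {'Crossplane': ['CUSTOM_CLUSTER_INSTANCES']}
```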

Next, I want to publish my policies to Datree so that I can make use of my custom rules:

datree publish policies.yaml

Once I publish my policies, I will see my new "Crossplane" environment appear in the Datree UI:

Note: You can access the Datree UI by running "datree test". This will provide you with a link to your UI.

Scrolling down through my policies, I can then see that my custom rules have been activated:

Test your Custom Resource Definitions

Next, I am going to test my cluster.yaml file to see whether or not my new Crossplane-specific policies pass:

datree test ./cluster.yaml -p Crossplane --ignore-missing-schemas

Note that we require the flag `--ignore-missing-schemas` since we are using a custom schema for Crossplane.
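
If you want to run the same check from a script, for example as part of a pipeline, a thin wrapper around the command above could look like the following Python sketch. The function names are hypothetical, and the sketch assumes that the CLI exits with a non-zero status when the check does not pass.

```python
import shutil
import subprocess


def build_test_command(manifest, policy):
    """Build the argument list for the `datree test` call shown above."""
    return ["datree", "test", manifest, "-p", policy, "--ignore-missing-schemas"]


def run_policy_check(manifest, policy):
    """Return True/False for pass/fail, or None if the datree CLI is not installed.

    Assumption: a non-zero exit status means the check did not pass.
    """
    if shutil.which("datree") is None:
        return None
    result = subprocess.run(build_test_command(manifest, policy))
    return result.returncode == 0


print(" ".join(build_test_command("./cluster.yaml", "Crossplane")))
```

Calling `run_policy_check("./cluster.yaml", "Crossplane")` then runs the command and reports the outcome as a boolean.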

As you can see, two of our policies are currently not passing. Thus, we have to update the custom resource for our cluster. The updated version will be the following:

kind: CivoKubernetes
apiVersion: cluster.civo.crossplane.io/v1alpha1
metadata: 
  name: second-crossplane
spec:
  name: second-crossplane
  instances: 3
  size: g3.k3s.small
  applications: 
    - "OpenFaaS"
    - "prometheus-operator"
    - "istio:Istio v1.10.1"
  connectionDetails:
    connectionSecretNamePrefix: "cluster-details"
    connectionSecretNamespace: "default"
  providerConfigRef:
    name: civo-provider

Once we test the cluster.yaml file again, we can now see that all of our custom rules for Crossplane are passing:

What's next?

In this blog post, I showed you:

  • The magic of Custom Resource Definitions
  • Why you want to consider using custom policies
  • How you can get started with custom rules in Datree

If you would like to see more on optimising your deployments, custom rules, or even more security related topics, please let me know.

Lastly, make sure to give Datree a GitHub star and share this blog post so that others will see it too.

Until next time 👋🏼