Chapter 4. Developing Operators
4.1. Getting started with the Operator SDK
This guide outlines the basics of the Operator SDK and walks Operator authors with cluster administrator access to a Kubernetes-based cluster (such as OpenShift Container Platform) through an example of building a simple Go-based Memcached Operator and managing its lifecycle from installation to upgrade.
This is accomplished using two centerpieces of the Operator Framework: Operator SDK (the operator-sdk CLI tool and controller-runtime library API) and Operator Lifecycle Manager (OLM).
OpenShift Container Platform 4.5 supports Operator SDK v0.17.2.
4.1.1. Architecture of the Operator SDK
The Operator Framework is an open source toolkit to manage Kubernetes native applications, called Operators, in an effective, automated, and scalable way. Operators take advantage of Kubernetes extensibility to deliver the automation advantages of cloud services like provisioning, scaling, and backup and restore, while being able to run anywhere that Kubernetes can run.
Operators make it easy to manage complex, stateful applications on top of Kubernetes. However, writing an Operator today can be difficult because of challenges such as using low-level APIs, writing boilerplate, and a lack of modularity, which leads to duplication.
The Operator SDK is a framework designed to make writing Operators easier by providing:
- High-level APIs and abstractions to write the operational logic more intuitively
- Tools for scaffolding and code generation to quickly bootstrap a new project
- Extensions to cover common Operator use cases
4.1.1.1. Workflow
The Operator SDK provides the following workflow to develop a new Operator:
- Create a new Operator project using the Operator SDK command line interface (CLI).
- Define new resource APIs by adding custom resource definitions (CRDs).
- Specify resources to watch using the Operator SDK API.
- Define the Operator reconciling logic in a designated handler and use the Operator SDK API to interact with resources.
- Use the Operator SDK CLI to build and generate the Operator deployment manifests.
Figure 4.1. Operator SDK workflow
At a high level, an Operator using the Operator SDK processes events for watched resources in an Operator author-defined handler and takes actions to reconcile the state of the application.
4.1.1.2. Manager file
The main program for the Operator is the manager file at cmd/manager/main.go. The manager automatically registers the scheme for all custom resources (CRs) defined under pkg/apis/ and runs all controllers under pkg/controller/.
The manager can restrict the namespace that all controllers watch for resources:
mgr, err := manager.New(cfg, manager.Options{Namespace: namespace})
By default, this is the namespace that the Operator is running in. To watch all namespaces, you can leave the namespace option empty:
mgr, err := manager.New(cfg, manager.Options{Namespace: ""})
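The effect of the Namespace option can be pictured with a small, self-contained sketch. It does not use controller-runtime: inWatchScope is a hypothetical helper that only models the rule described above, where an empty namespace means every namespace is watched and any other value restricts the watch to that single namespace.

```go
package main

import "fmt"

// inWatchScope is a hypothetical helper (not part of controller-runtime)
// that models the rule above: an empty watch namespace matches every
// namespace, while a non-empty one matches only itself.
func inWatchScope(watchNamespace, objectNamespace string) bool {
	if watchNamespace == "" {
		return true
	}
	return watchNamespace == objectNamespace
}

func main() {
	// Restricted to "default": only objects in "default" are in scope.
	fmt.Println(inWatchScope("default", "default"))
	fmt.Println(inWatchScope("default", "kube-system"))
	// Empty namespace: objects in any namespace are in scope.
	fmt.Println(inWatchScope("", "kube-system"))
}
```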
4.1.1.3. Prometheus Operator support
Prometheus is an open-source systems monitoring and alerting toolkit. The Prometheus Operator creates, configures, and manages Prometheus clusters running on Kubernetes-based clusters, such as OpenShift Container Platform.
Helper functions exist in the Operator SDK by default to automatically set up metrics in any generated Go-based Operator for use on clusters where the Prometheus Operator is deployed.
4.1.2. Installing the Operator SDK CLI
The Operator SDK has a CLI tool that assists developers in creating, building, and deploying a new Operator project. You can install the SDK CLI on your workstation so you are prepared to start authoring your own Operators.
4.1.2.1. Installing from GitHub release
You can download and install a pre-built release binary of the Operator SDK CLI from the project on GitHub.
Prerequisites
- Go v1.13+
- docker v17.03+, podman v1.2.0+, or buildah v1.7+
- OpenShift CLI (oc) v4.5+ installed
- Access to a cluster based on Kubernetes v1.12.0+
- Access to a container registry
Procedure
Set the release version variable:
$ RELEASE_VERSION=v0.17.2
Download the release binary.
For Linux:
$ curl -OJL https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
For macOS:
$ curl -OJL https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin
Verify the downloaded release binary.
Download the provided .asc file.
For Linux:
$ curl -OJL https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu.asc
For macOS:
$ curl -OJL https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin.asc
Place the binary and corresponding .asc file into the same directory and run the following command to verify the binary:
For Linux:
$ gpg --verify operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu.asc
For macOS:
$ gpg --verify operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin.asc
If you do not have the public key of the maintainer on your workstation, you will get the following error:
Example output with error
gpg: assuming signed data in 'operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin'
gpg: Signature made Fri Apr 5 20:03:22 2019 CEST
gpg: using RSA key <key_id>
gpg: Can't check signature: No public key
In this output, <key_id> is the RSA key string.
To download the key, run the following command, replacing <key_id> with the RSA key string provided in the output of the previous command:
$ gpg [--keyserver keys.gnupg.net] --recv-key "<key_id>"
If you do not have a key server configured, specify one with the --keyserver option.
Install the release binary in your PATH:
For Linux:
$ chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
$ sudo cp operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk
$ rm operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
For macOS:
$ chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin
$ sudo cp operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin /usr/local/bin/operator-sdk
$ rm operator-sdk-${RELEASE_VERSION}-x86_64-apple-darwin
Verify that the CLI tool was installed correctly:
$ operator-sdk version
4.1.2.2. Installing from Homebrew
You can install the SDK CLI using Homebrew.
Prerequisites
- Homebrew
- docker v17.03+, podman v1.2.0+, or buildah v1.7+
- OpenShift CLI (oc) v4.5+ installed
- Access to a cluster based on Kubernetes v1.12.0+
- Access to a container registry
Procedure
Install the SDK CLI using the brew command:
$ brew install operator-sdk
Verify that the CLI tool was installed correctly:
$ operator-sdk version
4.1.2.3. Compiling and installing from source
You can obtain the Operator SDK source code to compile and install the SDK CLI.
Prerequisites
Procedure
Clone the operator-sdk repository:
$ mkdir -p $GOPATH/src/github.com/operator-framework
$ cd $GOPATH/src/github.com/operator-framework
$ git clone https://github.com/operator-framework/operator-sdk
$ cd operator-sdk
Check out the desired release branch:
$ git checkout master
Compile and install the SDK CLI:
$ make dep
$ make install
This installs the CLI binary operator-sdk at $GOPATH/bin.
Verify that the CLI tool was installed correctly:
$ operator-sdk version
4.1.3. Building a Go-based Operator using the Operator SDK
The Operator SDK makes it easier to build Kubernetes native applications, a process that can require deep, application-specific operational knowledge. The SDK not only lowers that barrier, but it also helps reduce the amount of boilerplate code needed for many common management capabilities, such as metering or monitoring.
This procedure walks through an example of building a simple Memcached Operator using tools and libraries provided by the SDK.
Prerequisites
- Operator SDK CLI installed on the development workstation
- Operator Lifecycle Manager (OLM) installed on a Kubernetes-based cluster (v1.8 or above to support the apps/v1beta2 API group), for example OpenShift Container Platform 4.5
- Access to the cluster using an account with cluster-admin permissions
- OpenShift CLI (oc) v4.5+ installed
Procedure
Create a new project.
Use the CLI to create a new memcached-operator project:
$ mkdir -p $GOPATH/src/github.com/example-inc/
$ cd $GOPATH/src/github.com/example-inc/
$ operator-sdk new memcached-operator
$ cd memcached-operator
Add a new custom resource definition (CRD).
Use the CLI to add a new CRD API called Memcached, with APIVersion set to cache.example.com/v1alpha1 and Kind set to Memcached:
$ operator-sdk add api \
    --api-version=cache.example.com/v1alpha1 \
    --kind=Memcached
This scaffolds the Memcached resource API under pkg/apis/cache/v1alpha1/.
Modify the spec and status of the Memcached custom resource (CR) at the pkg/apis/cache/v1alpha1/memcached_types.go file:
type MemcachedSpec struct {
	// Size is the size of the memcached deployment
	Size int32 `json:"size"`
}
type MemcachedStatus struct {
	// Nodes are the names of the memcached pods
	Nodes []string `json:"nodes"`
}
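As a rough illustration of how the json struct tags on these types map to the spec and status fields of a Memcached CR, the following standalone sketch redeclares the two structs and serializes them with the standard library encoding/json package. The memcachedView wrapper, the render helper, and the sample pod names are assumptions made for this example; the scaffolded Memcached type also embeds Kubernetes metadata, which is omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MemcachedSpec and MemcachedStatus repeat the fields shown above.
type MemcachedSpec struct {
	// Size is the size of the memcached deployment
	Size int32 `json:"size"`
}

type MemcachedStatus struct {
	// Nodes are the names of the memcached pods
	Nodes []string `json:"nodes"`
}

// memcachedView is a hypothetical, trimmed-down stand-in for the
// scaffolded Memcached type (no TypeMeta or ObjectMeta).
type memcachedView struct {
	Spec   MemcachedSpec   `json:"spec"`
	Status MemcachedStatus `json:"status"`
}

// render serializes a spec and status pair the way the json tags dictate.
func render(size int32, nodes []string) (string, error) {
	b, err := json.Marshal(memcachedView{
		Spec:   MemcachedSpec{Size: size},
		Status: MemcachedStatus{Nodes: nodes},
	})
	return string(b), err
}

func main() {
	out, err := render(3, []string{"pod-a", "pod-b"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints {"spec":{"size":3},"status":{"nodes":["pod-a","pod-b"]}}
}
```

The lowercase keys in the output are the same names that appear under spec and status in the CR YAML later in this procedure.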
After modifying the *_types.go file, always run the following command to update the generated code for that resource type:
$ operator-sdk generate k8s
Optional: Add custom validation to your CRD.
OpenAPI v3.0 schemas are added to CRD manifests in the spec.validation block when the manifests are generated. This validation block allows Kubernetes to validate the properties in a Memcached CR when it is created or updated.
Additionally, a pkg/apis/<group>/<version>/zz_generated.openapi.go file is generated. This file contains the Go representation of this validation block if the +k8s:openapi-gen=true annotation is present above the Kind type declaration, which is present by default. This auto-generated code is the OpenAPI model of your Go Kind type, from which you can create a full OpenAPI Specification and generate a client.
As an Operator author, you can use Kubebuilder markers (annotations) to configure custom validations for your API. These markers must always have a +kubebuilder:validation prefix. For example, adding an enum-type specification can be done by adding the following marker:
// +kubebuilder:validation:Enum=Lion;Wolf;Dragon
type Alias string
Usage of markers in API code is discussed in the Kubebuilder Generating CRDs and Markers for Config/Code Generation documentation. A full list of OpenAPIv3 validation markers is also available in the Kubebuilder CRD Validation documentation.
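To make the effect of the Enum marker concrete, the following sketch mimics the check in plain Go. This is only an approximation for illustration: on a cluster the check is performed by the API server against the generated OpenAPI schema, not by Operator code, and validateAlias and allowedAliases are hypothetical names introduced here.

```go
package main

import "fmt"

// Alias repeats the type from the marker example above.
type Alias string

// allowedAliases mirrors the values listed in the Enum marker.
var allowedAliases = map[Alias]bool{
	"Lion":   true,
	"Wolf":   true,
	"Dragon": true,
}

// validateAlias is a hypothetical local equivalent of the enum check:
// it rejects any value that is not in the allowed set.
func validateAlias(a Alias) error {
	if !allowedAliases[a] {
		return fmt.Errorf("unsupported value %q: must be one of Lion, Wolf, Dragon", a)
	}
	return nil
}

func main() {
	fmt.Println(validateAlias("Wolf"))  // accepted, no error
	fmt.Println(validateAlias("Tiger")) // rejected with an error
}
```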
If you add any custom validations, run the following command to update the OpenAPI validation section in the deploy/crds/cache.example.com_memcacheds_crd.yaml file for the CRD:
$ operator-sdk generate crds
Example generated YAML
spec:
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
            size:
              format: int32
              type: integer
Add a new controller.
Add a new controller to the project to watch and reconcile the Memcached resource:
$ operator-sdk add controller \
    --api-version=cache.example.com/v1alpha1 \
    --kind=Memcached
This scaffolds a new controller implementation under pkg/controller/memcached/.
For this example, replace the generated controller file pkg/controller/memcached/memcached_controller.go with the example implementation.
The example controller executes the following reconciliation logic for each Memcached resource:
- Create a Memcached deployment if it does not exist.
- Ensure that the Deployment size is the same as specified by the Memcached CR spec.
- Update the Memcached resource status with the names of the Memcached pods.
The next two sub-steps inspect how the controller watches resources and how the reconcile loop is triggered. You can skip these steps to go directly to building and running the Operator.
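The three reconciliation steps listed above can be sketched without any Kubernetes dependencies. The following is a simplified model, not the example implementation: cluster, deployment, and memcachedCR are hypothetical in-memory stand-ins, and pod names are fabricated rather than listed from an API server.

```go
package main

import "fmt"

// memcachedCR is a hypothetical stand-in for the Memcached resource.
type memcachedCR struct {
	Name  string
	Size  int32    // corresponds to spec.size
	Nodes []string // corresponds to status.nodes
}

// deployment is a hypothetical stand-in for an apps/v1 Deployment.
type deployment struct {
	Replicas int32
}

// cluster is an in-memory substitute for the API server.
type cluster struct {
	deployments map[string]*deployment
}

// podNames fabricates one pod name per replica of the named deployment.
func (c *cluster) podNames(name string) []string {
	d := c.deployments[name]
	names := make([]string, 0, d.Replicas)
	for i := int32(0); i < d.Replicas; i++ {
		names = append(names, fmt.Sprintf("%s-pod-%d", name, i))
	}
	return names
}

// reconcile applies the three steps and reports whether another pass
// is needed because the cluster state was changed.
func reconcile(c *cluster, cr *memcachedCR) (requeue bool) {
	d, ok := c.deployments[cr.Name]
	if !ok {
		// Step 1: create the deployment if it does not exist.
		c.deployments[cr.Name] = &deployment{Replicas: cr.Size}
		return true
	}
	if d.Replicas != cr.Size {
		// Step 2: make the deployment size match spec.size.
		d.Replicas = cr.Size
		return true
	}
	// Step 3: record the pod names in the status.
	cr.Nodes = c.podNames(cr.Name)
	return false
}

func main() {
	c := &cluster{deployments: map[string]*deployment{}}
	cr := &memcachedCR{Name: "example-memcached", Size: 3}
	for reconcile(c, cr) {
		// Keep reconciling until the state has converged.
	}
	fmt.Println(len(cr.Nodes), cr.Nodes)
}
```

Each pass makes at most one change and then asks to run again, so the function is safe to call repeatedly and converges on the state declared in the CR.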
Inspect the controller implementation at the pkg/controller/memcached/memcached_controller.go file to see how the controller watches resources.
The first watch is for the Memcached type as the primary resource. For each add, update, or delete event, the reconcile loop is sent a reconcile Request (a <namespace>:<name> key) for that Memcached object:
err := c.Watch(
	&source.Kind{Type: &cachev1alpha1.Memcached{}},
	&handler.EnqueueRequestForObject{})
The next watch is for Deployment objects, but the event handler maps each event to a reconcile Request for the owner of the deployment. In this case, this is the Memcached object for which the deployment was created. This allows the controller to watch deployments as a secondary resource:
err := c.Watch(&source.Kind{Type: &appsv1.Deployment{}}, &handler.EnqueueRequestForOwner{
	IsController: true,
	OwnerType:    &cachev1alpha1.Memcached{},
})
Every controller has a Reconciler object with a Reconcile() method that implements the reconcile loop. The reconcile loop is passed the Request argument, which is a <namespace>:<name> key used to look up the primary resource object, Memcached, from the cache:
func (r *ReconcileMemcached) Reconcile(request reconcile.Request) (reconcile.Result, error) {
	// Look up the Memcached instance for this reconcile request
	memcached := &cachev1alpha1.Memcached{}
	err := r.client.Get(context.TODO(), request.NamespacedName, memcached)
	...
}
Based on the return value of the Reconcile() function, the reconcile Request might be requeued, and the loop might be triggered again:
// Reconcile successful - don't requeue
return reconcile.Result{}, nil
// Reconcile failed due to error - requeue
return reconcile.Result{}, err
// Requeue for any reason other than error
return reconcile.Result{Requeue: true}, nil
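The requeue rule implied by these three return values can be modeled with a small work queue. This is a deliberately simplified sketch: result, shouldRequeue, and run are hypothetical names, and the rate limiting and backoff that a real controller work queue applies are replaced here by a plain attempt limit.

```go
package main

import (
	"errors"
	"fmt"
)

// result is a simplified stand-in for the reconcile Result.
type result struct {
	Requeue bool
}

// errTransient is a sample error used by the demonstration in main.
var errTransient = errors.New("transient failure")

// shouldRequeue encodes the rule above: requeue on any error, or when
// the result explicitly asks for it.
func shouldRequeue(res result, err error) bool {
	return err != nil || res.Requeue
}

// run drains a queue of keys, putting a key back when its reconcile call
// asks for a requeue, up to maxAttempts per key. It returns the number
// of attempts made for each key.
func run(keys []string, reconcile func(key string) (result, error), maxAttempts int) map[string]int {
	attempts := make(map[string]int)
	queue := append([]string(nil), keys...)
	for len(queue) > 0 {
		key := queue[0]
		queue = queue[1:]
		attempts[key]++
		res, err := reconcile(key)
		if shouldRequeue(res, err) && attempts[key] < maxAttempts {
			queue = append(queue, key)
		}
	}
	return attempts
}

func main() {
	seen := map[string]bool{}
	attempts := run([]string{"default:example-memcached"}, func(key string) (result, error) {
		if !seen[key] {
			// Fail the first attempt so that the key is requeued.
			seen[key] = true
			return result{}, errTransient
		}
		return result{}, nil
	}, 5)
	fmt.Println(attempts)
}
```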
Build and run the Operator.
Before running the Operator, the CRD must be registered with the Kubernetes API server:
$ oc create -f deploy/crds/cache.example.com_memcacheds_crd.yaml
After registering the CRD, there are two options for running the Operator:
- As a Deployment inside a Kubernetes cluster
- As a Go program outside a cluster
Choose one of the following methods.
Option A: Running as a deployment inside the cluster.
Build the memcached-operator image and push it to a registry:
$ operator-sdk build quay.io/example/memcached-operator:v0.0.1
The deployment manifest is generated at deploy/operator.yaml. Update the deployment image as follows since the default is just a placeholder:
$ sed -i 's|REPLACE_IMAGE|quay.io/example/memcached-operator:v0.0.1|g' deploy/operator.yaml
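The substitution itself is a plain string replacement of the REPLACE_IMAGE placeholder. On workstations where sed -i behaves differently (BSD sed on macOS expects a backup suffix argument after -i), the same edit could be scripted in Go. The sketch below is one possible approach, not part of the SDK: setImage is a hypothetical helper, and the manifest fragment is a shortened stand-in for deploy/operator.yaml.

```go
package main

import (
	"fmt"
	"strings"
)

// setImage is a hypothetical helper that performs the same substitution
// as the sed command above: every REPLACE_IMAGE placeholder in the
// manifest text is replaced with the given image reference.
func setImage(manifest, image string) string {
	return strings.ReplaceAll(manifest, "REPLACE_IMAGE", image)
}

func main() {
	// A shortened fragment standing in for deploy/operator.yaml.
	manifest := "containers:\n- name: memcached-operator\n  image: REPLACE_IMAGE\n"
	fmt.Print(setImage(manifest, "quay.io/example/memcached-operator:v0.0.1"))
}
```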
Ensure you have an account on Quay.io for the next step, or substitute your preferred container registry. On the registry, create a new public image repository named memcached-operator.
Push the image to the registry:
$ podman push quay.io/example/memcached-operator:v0.0.1
Set up RBAC and create the memcached-operator manifests:
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/operator.yaml
Verify that the memcached-operator deployment is up and running:
$ oc get deployment
Example output
NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
memcached-operator   1         1         1            1           1m
Option B: Running locally outside the cluster.
This method is preferred during the development cycle because it makes deploying and testing faster.
Run the Operator locally with the default Kubernetes configuration file present at $HOME/.kube/config:
$ operator-sdk run --local --namespace=default
You can use a specific kubeconfig using the flag --kubeconfig=<path/to/kubeconfig>.
Verify that the Operator can deploy a Memcached application by creating a Memcached CR.
Create the example Memcached CR that was generated at deploy/crds/cache_v1alpha1_memcached_cr.yaml.
View the file:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
Example output
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 3
Create the object:
$ oc apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
Ensure that memcached-operator creates the deployment for the CR:
$ oc get deployment
Example output
NAME                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
memcached-operator   1         1         1            1           2m
example-memcached    3         3         3            3           1m
Check the pods and CR to confirm the CR status is updated with the pod names:
$ oc get pods
Example output
NAME                                  READY   STATUS    RESTARTS   AGE
example-memcached-6fd7c98d8-7dqdr     1/1     Running   0          1m
example-memcached-6fd7c98d8-g5k7v     1/1     Running   0          1m
example-memcached-6fd7c98d8-m7vn7     1/1     Running   0          1m
memcached-operator-7cc7cfdf86-vvjqk   1/1     Running   0          2m
$ oc get memcached/example-memcached -o yaml
Example output
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  clusterName: ""
  creationTimestamp: 2018-03-31T22:51:08Z
  generation: 0
  name: example-memcached
  namespace: default
  resourceVersion: "245453"
  selfLink: /apis/cache.example.com/v1alpha1/namespaces/default/memcacheds/example-memcached
  uid: 0026cc97-3536-11e8-bd83-0800274106a1
spec:
  size: 3
status:
  nodes:
  - example-memcached-6fd7c98d8-7dqdr
  - example-memcached-6fd7c98d8-g5k7v
  - example-memcached-6fd7c98d8-m7vn7
Verify that the Operator can manage a deployed Memcached application by updating the size of the deployment.
Change the spec.size field in the memcached CR from 3 to 4:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
Example output
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 4
Apply the change:
$ oc apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
Confirm that the Operator changes the deployment size:
$ oc get deployment
Example output
NAME                DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
example-memcached   4         4         4            4           5m
Clean up the resources:
$ oc delete -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
$ oc delete -f deploy/crds/cache.example.com_memcacheds_crd.yaml
$ oc delete -f deploy/operator.yaml
$ oc delete -f deploy/role.yaml
$ oc delete -f deploy/role_binding.yaml
$ oc delete -f deploy/service_account.yaml
Additional resources
- For more information about OpenAPI v3.0 validation schemas in CRDs, refer to the Kubernetes documentation.
4.1.4. Managing a Go-based Operator using Operator Lifecycle Manager
The previous section covered running an Operator manually. The following sections explore Operator Lifecycle Manager (OLM), which enables a more robust deployment model for Operators that run in production environments.
OLM helps you to install, update, and generally manage the lifecycle of all of the Operators and their associated services on a Kubernetes cluster. It runs as a Kubernetes extension and lets you use oc for all the lifecycle management functions without any additional tools.
Prerequisites
- OLM installed on a Kubernetes-based cluster (v1.8 or above to support the apps/v1beta2 API group), for example OpenShift Container Platform 4.5
- Memcached Operator built
Procedure
Generate an Operator manifest.
An Operator manifest describes how to display, create, and manage the application, in this case Memcached, as a whole. It is defined by a ClusterServiceVersion (CSV) object and is required for OLM to function.
From the memcached-operator/ directory that was created when you built the Memcached Operator, generate the CSV manifest:
$ operator-sdk generate csv --csv-version 0.0.1
Note: See Building a CSV for the Operator Framework for more information on manually defining a manifest file.
Create an Operator group that specifies the namespaces that the Operator will target. Create the following Operator group in the namespace where you will create the CSV. In this example, the default namespace is used:
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: memcached-operator-group
  namespace: default
spec:
  targetNamespaces:
  - default
Deploy the Operator. Use the files that were generated into the deploy/ directory by the Operator SDK when you built the Memcached Operator.
Edit the generated CSV manifest file by adding displayName fields for each custom resource definition (CRD) kind in the spec.customresourcedefinitions.owned section:
deploy/olm-catalog/memcached-operator/0.0.1/memcached-operator.v0.0.1.clusterserviceversion.yaml file
...
spec:
  customresourcedefinitions:
    owned:
    - kind: Memcached
      name: memcacheds.cache.example.com
      version: v1alpha1
      description: Memcached is the Schema for the memcacheds API
      displayName: Memcached
...
The displayName field specifies a display name for the CRD.
Apply the CSV manifest to the specified namespace in the cluster:
$ oc apply -f deploy/olm-catalog/memcached-operator/0.0.1/memcached-operator.v0.0.1.clusterserviceversion.yaml
When you apply this manifest, the cluster does not immediately update because it does not yet meet the requirements specified in the manifest.
Create the role, role binding, and service account to grant resource permissions to the Operator, and the custom resource definition (CRD) to create the Memcached custom resource that the Operator manages:
$ oc create -f deploy/crds/cache.example.com_memcacheds_crd.yaml
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
Because OLM creates Operators in a particular namespace when a manifest is applied, administrators can leverage the native Kubernetes RBAC permission model to restrict which users are allowed to install Operators.
Create an application instance.
The Memcached Operator is now running in the default namespace. Users interact with Operators via instances of custom resources; in this case, the resource has the kind Memcached. Native Kubernetes RBAC also applies to custom resources, providing administrators control over who can interact with each Operator.
Creating instances of Memcached objects in this namespace will now trigger the Memcached Operator to instantiate pods running the memcached server that are managed by the Operator. The more custom resources you create, the more unique Memcached application instances are managed by the Memcached Operator running in this namespace.
$ cat <<EOF | oc apply -f -
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "memcached-for-wordpress"
spec:
  size: 1
EOF
$ cat <<EOF | oc apply -f -
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "memcached-for-drupal"
spec:
  size: 1
EOF
$ oc get Memcached
Example output
NAME                      AGE
memcached-for-drupal      22s
memcached-for-wordpress   27s
$ oc get pods
Example output
NAME                                       READY   STATUS    RESTARTS   AGE
memcached-app-operator-66b5777b79-pnsfj    1/1     Running   0          14m
memcached-for-drupal-5476487c46-qbd66      1/1     Running   0          3s
memcached-for-wordpress-65b75fd8c9-7b9x7   1/1     Running   0          8s
4.1.5. Additional resources
- See Appendices to learn about the project directory structures created by the Operator SDK.
- Operator Development Guide for Red Hat Partners