Chapter 4. Additional Concepts


4.1. Networking

4.1.1. Overview

Kubernetes ensures that pods are able to network with each other, and allocates each pod an IP address from an internal network. This ensures all containers within the pod behave as if they were on the same host. Giving each pod its own IP address means that pods can be treated like physical hosts or virtual machines in terms of port allocation, networking, naming, service discovery, load balancing, application configuration, and migration.

Creating links between pods is unnecessary. However, it is not recommended that you have a pod talk to another directly by using the IP address. Instead, we recommend that you create a service, then interact with the service.

4.1.2. OpenShift Container Platform DNS

If you are running multiple services, such as frontend and backend services for use with multiple pods, in order for the frontend pods to communicate with the backend services, environment variables are created for user names, service IP, and more. If the service is deleted and recreated, a new IP address can be assigned to the service, and requires the frontend pods to be recreated in order to pick up the updated values for the service IP environment variable. Additionally, the backend service has to be created before any of the frontend pods to ensure that the service IP is generated properly and that it can be provided to the frontend pods as an environment variable.

For this reason, OpenShift Container Platform has a built-in DNS so that the services can be reached by the service DNS as well as the service IP/port. OpenShift Container Platform supports split DNS by running SkyDNS on the master that answers DNS queries for services. The master listens to port 53 by default.

When the node starts, the following message indicates the Kubelet is correctly resolved to the master:

0308 19:51:03.118430    4484 node.go:197] Started Kubelet for node
openshiftdev.local, server at 0.0.0.0:10250
I0308 19:51:03.118459    4484 node.go:199]   Kubelet is setting 10.0.2.15 as a
DNS nameserver for domain "local"

If the second message does not appear, the Kubernetes service may not be available.

On a node host, each container’s nameserver has the master name added to the front, and the default search domain for the container will be .<pod_namespace>.cluster.local. The container will then direct any nameserver queries to the master before any other nameservers on the node, which is the default behavior for Docker-formatted containers. The master will answer queries on the .cluster.local domain that have the following form:

Table 4.1. DNS Example Names
Object TypeExample

Default

<pod_namespace>.cluster.local

Services

<service>.<pod_namespace>.svc.cluster.local

Endpoints

<name>.<namespace>.endpoints.cluster.local

This prevents having to restart frontend pods in order to pick up new services, which creates a new IP for the service. This also removes the need to use environment variables, as pods can use the service DNS. Also, as the DNS does not change, you can reference database services as db.local in config files. Wildcard lookups are also supported, as any lookups resolve to the service IP, and removes the need to create the backend service before any of the frontend pods, since the service name (and hence DNS) is established upfront.

This DNS structure also covers headless services, where a portal IP is not assigned to the service and the kube-proxy does not load-balance or provide routing for its endpoints. Service DNS can still be used and responds with multiple A records, one for each pod of the service, allowing the client to round-robin between each pod.

4.1.3. Network Plug-ins

OpenShift Container Platform supports the same plug-in model as Kubernetes for networking pods. The following network plug-ins are currently supported by OpenShift Container Platform.

4.1.4. OpenShift Container Platform SDN

OpenShift Container Platform deploys a software-defined networking (SDN) approach for connecting pods in an OpenShift Container Platform cluster. The OpenShift SDN connects all pods across all node hosts, providing a unified cluster network.

OpenShift SDN is automatically installed and configured as part of the Ansible-based installation procedure. See the OpenShift Container Platform SDN section for more information.

4.1.4.1. Nuage SDN for OpenShift Container Platform

Nuage Networks' SDN solution delivers highly scalable, policy-based overlay networking for pods in an OpenShift Container Platform cluster. Nuage SDN can be installed and configured as a part of the Ansible-based installation procedure. See the Advanced Installation section for information on how to install and deploy OpenShift Container Platform with Nuage SDN.

Nuage Networks provides a highly scalable, policy-based SDN platform called Virtualized Services Platform (VSP). Nuage VSP uses an SDN Controller, along with the open source Open vSwitch for the data plane.

Nuage uses overlays to provide policy-based networking between OpenShift Container Platform and other environments consisting of VMs and bare metal servers. The platform’s real-time analytics engine enables visibility and security monitoring for OpenShift Container Platform applications.

Nuage VSP integrates with OpenShift Container Platform to allows business applications to be quickly turned up and updated by removing the network lag faced by DevOps teams.

Figure 4.1. Nuage VSP Integration with OpenShift Container Platform

Nuage VSP Integration with OpenShift Container Platform

There are two specific components responsible for the integration.

  1. The nuage-openshift-monitor service, which runs as a separate service on the OpenShift Container Platform master node.
  2. The vsp-openshift plug-in, which is invoked by the OpenShift Container Platform runtime on each of the nodes of the cluster.

Nuage Virtual Routing and Switching software (VRS) is based on open source Open vSwitch and is responsible for the datapath forwarding. The VRS runs on each node and gets policy configuration from the controller.

Nuage VSP Terminology

Figure 4.2. Nuage VSP Building Blocks

Nuage VSP Building Blocks
  1. Domains: An organization contains one or more domains. A domain is a single "Layer 3" space. In standard networking terminology, a domain maps to a VRF instance.
  2. Zones: Zones are defined under a domain. A zone does not map to anything on the network directly, but instead acts as an object with which policies are associated such that all endpoints in the zone adhere to the same set of policies.
  3. Subnets: Subnets are defined under a zone. A subnet is a specific Layer 2 subnet within the domain instance. A subnet is unique and distinct within a domain, that is, subnets within a Domain are not allowed to overlap or to contain other subnets in accordance with the standard IP subnet definitions.
  4. VPorts: A VPort is a new level in the domain hierarchy, intended to provide more granular configuration. In addition to containers and VMs, VPorts are also used to attach Host and Bridge Interfaces, which provide connectivity to Bare Metal servers, Appliances, and Legacy VLANs.
  5. Policy Group: Policy Groups are collections of VPorts.

Mapping of Constructs

Many OpenShift Container Platform concepts have a direct mapping to Nuage VSP constructs:

Figure 4.3. Nuage VSP and OpenShift Container Platform mapping

Nuage VSP and OpenShift Container Platform mapping

A Nuage subnet is not mapped to an OpenShift Container Platform node, but a subnet for a particular project can span multiple nodes in OpenShift Container Platform.

A pod spawning in OpenShift Container Platform translates to a virtual port being created in VSP. The vsp-openshift plug-in interacts with the VRS and gets a policy for that virtual port from the VSD via the VSC. Policy Groups are supported to group multiple pods together that must have the same set of policies applied to them. Currently, pods can only be assigned to policy groups using the operations workflow where a policy group is created by the administrative user in VSD. The pod being a part of the policy group is specified by means of nuage.io/policy-group label in the specification of the pod.

4.1.4.1.1. Integration Components

Nuage VSP integrates with OpenShift Container Platform using two main components:

  1. nuage-openshift-monitor
  2. vsp-openshift plugin

nuage-openshift-monitor

nuage-openshift-monitor is a service that monitors the OpenShift Container Platform API server for creation of projects, services, users, user-groups, etc.

Note

In case of a Highly Available (HA) OpenShift Container Platform cluster with multiple masters, nuage-openshift-monitor process runs on all the masters independently without any change in functionality.

For the developer workflow, nuage-openshift-monitor also auto-creates VSD objects by exercising the VSD REST API to map OpenShift Container Platform constructs to VSP constructs. Each cluster instance maps to a single domain in Nuage VSP. This allows a given enterprise to potentially have multiple cluster installations - one per domain instance for that Enterprise in Nuage. Each OpenShift Container Platform project is mapped to a zone in the domain of the cluster on the Nuage VSP. Whenever nuage-openshift-monitor sees an addition or deletion of the project, it instantiates a zone using the VSDK APIs corresponding to that project and allocates a block of subnet for that zone. Additionally, the nuage-openshift-monitor also creates a network macro group for this project. Likewise, whenever nuage-openshift-monitor sees an addition ordeletion of a service, it creates a network macro corresponding to the service IP and assigns that network macro to the network macro group for that project (user provided network macro group using labels is also supported) to enable communication to that service.

For the developer workflow, all pods that are created within the zone get IPs from that subnet pool. The subnet pool allocation and management is done by nuage-openshift-monitor based on a couple of plug-in specific parameters in the master-config file. However the actual IP address resolution and vport policy resolution is still done by VSD based on the domain/zone that gets instantiated when the project is created. If the initial subnet pool is exhausted, nuage-openshift-monitor carves out an additional subnet from the cluster CIDR to assign to a given project.

For the operations workflow, the users specify Nuage recognized labels on their application or pod specification to resolve the pods into specific user-defined zones and subnets. However, this cannot be used to resolve pods in the zones or subnets created via the developer workflow by nuage-openshift-monitor.

Note

In the operations workflow, the administrator is responsible for pre-creating the VSD constructs to map the pods into a specific zone/subnet as well as allow communication between OpenShift entities (ACL rules, policy groups, network macros, and network macro groups). Detailed description of how to use Nuage labels is provided in the Nuage VSP Openshift Integration Guide.

vsp-openshift Plug-in

The vsp-openshift networking plug-in is called by the OpenShift Container Platform runtime on each OpenShift Container Platform node. It implements the network plug-in init and pod setup, teardown, and status hooks. The vsp-openshift plug-in is also responsible for allocating the IP address for the pods. In particular, it communicates with the VRS (the forwarding engine) and configures the IP information onto the pod.

4.2. OpenShift SDN

4.2.1. Overview

OpenShift Container Platform uses a software-defined networking (SDN) approach to provide a unified cluster network that enables communication between pods across the OpenShift Container Platform cluster. This pod network is established and maintained by the OpenShift Container Platform SDN, which configures an overlay network using Open vSwitch (OVS).

OpenShift Container Platform SDN provides two SDN plug-ins for configuring the pod network:

  • The ovs-subnet plug-in is the original plug-in which provides a "flat" pod network where every pod can communicate with every other pod and service.
  • The ovs-multitenant plug-in provides OpenShift Container Platform project level isolation for pods and services. Each project receives a unique Virtual Network ID (VNID) that identifies traffic from pods assigned to the project. Pods from different projects cannot send packets to or receive packets from pods and services of a different project.

    However, projects which receive VNID 0 are more privileged in that they are allowed to communicate with all other pods, and all other pods can communicate with them. In OpenShift Container Platform clusters, the default project has VNID 0. This facilitates certain services like the load balancer, etc. to communicate with all other pods in the cluster and vice versa.

Following is a detailed discussion of the design and operation of OpenShift Container Platform SDN, which may be useful for troubleshooting.

Note

Information on configuring the SDN on masters and nodes is available in Configuring the SDN.

4.2.2. Design on Masters

On an OpenShift Container Platform master, OpenShift Container Platform SDN maintains a registry of nodes, stored in etcd. When the system administrator registers a node, OpenShift Container Platform SDN allocates an unused subnet from the cluster network and stores this subnet in the registry. When a node is deleted, OpenShift Container Platform SDN deletes the subnet from the registry and considers the subnet available to be allocated again.

In the default configuration, the cluster network is the 10.128.0.0/14 network (i.e. 10.128.0.0 - 10.131.255.255), and nodes are allocated /23 subnets (i.e., 10.128.0.0/23, 10.128.2.0/23, 10.128.4.0/23, and so on). This means that the cluster network has 512 subnets available to assign to nodes, and a given node is allocated 510 addresses that it can assign to the containers running on it. The size and address range of the cluster network are configurable, as is the host subnet size.

Note that OpenShift Container Platform SDN on a master does not configure the local (master) host to have access to any cluster network. Consequently, a master host does not have access to pods via the cluster network, unless it is also running as a node.

When using the ovs-multitenant plug-in, the OpenShift Container Platform SDN master also watches for the creation and deletion of projects, and assigns VXLAN VNIDs to them, which will be used later by the nodes to isolate traffic correctly.

4.2.3. Design on Nodes

On a node, OpenShift Container Platform SDN first registers the local host with the SDN master in the aforementioned registry so that the master allocates a subnet to the node.

Next, OpenShift Container Platform SDN creates and configures six network devices:

  • br0, the OVS bridge device that pod containers will be attached to. OpenShift Container Platform SDN also configures a set of non-subnet-specific flow rules on this bridge. The ovs-multitenant plug-in does this immediately.
  • lbr0, a Linux bridge device, which is configured as the Docker service’s bridge and given the cluster subnet gateway address (eg, 10.128.x.1/23).
  • tun0, an OVS internal port (port 2 on br0). This also gets assigned the cluster subnet gateway address, and is used for external network access. OpenShift Container Platform SDN configures netfilter and routing rules to enable access from the cluster subnet to the external network via NAT.
  • vlinuxbr and vovsbr, two Linux peer virtual Ethernet interfaces. vlinuxbr is added to lbr0 and vovsbr is added to br0 (port 9 with the ovs-subnet plug-in and port 3 with the ovs-multitenant plug-in) to provide connectivity for containers created directly with the Docker service outside of OpenShift Container Platform.
  • vxlan0, the OVS VXLAN device (port 1 on br0), which provides access to containers on remote nodes.

Each time a pod is started on the host, OpenShift Container Platform SDN:

  1. moves the host side of the pod’s veth interface pair from the lbr0 bridge (where the Docker service placed it when starting the container) to the OVS bridge br0.
  2. adds OpenFlow rules to the OVS database to route traffic addressed to the new pod to the correct OVS port.
  3. in the case of the ovs-multitenant plug-in, adds OpenFlow rules to tag traffic coming from the pod with the pod’s VNID, and to allow traffic into the pod if the traffic’s VNID matches the pod’s VNID (or is the privileged VNID 0). Non-matching traffic is filtered out by a generic rule.

The pod is allocated an IP address in the cluster subnet by the Docker service itself because the Docker service is told to use the lbr0 bridge, which OpenShift Container Platform SDN has assigned the cluster gateway address (eg. 10.128.x.1/23). Note that the tun0 is also assigned the cluster gateway IP address because it is the default gateway for all traffic destined for external networks, but these two interfaces do not conflict because the lbr0 interface is only used for IPAM and no OpenShift Container Platform SDN pods are connected to it.

OpenShift Container Platform SDN nodes also watch for subnet updates from the SDN master. When a new subnet is added, the node adds OpenFlow rules on br0 so that packets with a destination IP address in the remote subnet go to vxlan0 (port 1 on br0) and thus out onto the network. The ovs-subnet plug-in sends all packets across the VXLAN with VNID 0, but the ovs-multitenant plug-in uses the appropriate VNID for the source container.

4.2.4. Packet Flow

Suppose you have two containers, A and B, where the peer virtual Ethernet device for container A’s eth0 is named vethA and the peer for container B’s eth0 is named vethB.

Note

If the Docker service’s use of peer virtual Ethernet devices is not already familiar to you, review Docker’s advanced networking documentation.

Now suppose first that container A is on the local host and container B is also on the local host. Then the flow of packets from container A to container B is as follows:

eth0 (in A’s netns) vethA br0 vethB eth0 (in B’s netns)

Next, suppose instead that container A is on the local host and container B is on a remote host on the cluster network. Then the flow of packets from container A to container B is as follows:

eth0 (in A’s netns) vethA br0 vxlan0 network [1] vxlan0 br0 vethB eth0 (in B’s netns)

Finally, if container A connects to an external host, the traffic looks like:

eth0 (in A’s netns) vethA br0 tun0 (NAT) eth0 (physical device) Internet

Almost all packet delivery decisions are performed with OpenFlow rules in the OVS bridge br0, which simplifies the plug-in network architecture and provides flexible routing. In the case of the ovs-multitenant plug-in, this also provides enforceable network isolation.

4.2.5. Network Isolation

You can use the ovs-multitenant plug-in to achieve network isolation. When a packet exits a pod assigned to a non-default project, the OVS bridge br0 tags that packet with the project’s assigned VNID. If the packet is directed to another IP address in the node’s cluster subnet, the OVS bridge only allows the packet to be delivered to the destination pod if the VNIDs match.

If a packet is received from another node via the VXLAN tunnel, the Tunnel ID is used as the VNID, and the OVS bridge only allows the packet to be delivered to a local pod if the tunnel ID matches the destination pod’s VNID.

Packets destined for other cluster subnets are tagged with their VNID and delivered to the VXLAN tunnel with a tunnel destination address of the node owning the cluster subnet.

As described before, VNID 0 is privileged in that traffic with any VNID is allowed to enter any pod assigned VNID 0, and traffic with VNID 0 is allowed to enter any pod. Only the default OpenShift Container Platform project is assigned VNID 0; all other projects are assigned unique, isolation-enabled VNIDs. Cluster administrators can optionally control the pod network for the project using the administrator CLI.

4.3. Authentication

4.3.1. Overview

The authentication layer identifies the user associated with requests to the OpenShift Container Platform API. The authorization layer then uses information about the requesting user to determine if the request should be allowed.

As an administrator, you can configure authentication using a master configuration file.

4.3.2. Users and Groups

A user in OpenShift Container Platform is an entity that can make requests to the OpenShift Container Platform API. Typically, this represents the account of a developer or administrator that is interacting with OpenShift Container Platform.

A user can be assigned to one or more groups, each of which represent a certain set of users. Groups are useful when managing authorization policies to grant permissions to multiple users at once, for example allowing access to objects within a project, versus granting them to users individually.

In addition to explicitly defined groups, there are also system groups, or virtual groups, that are automatically provisioned by OpenShift. These can be seen when viewing cluster bindings.

In the default set of virtual groups, note the following in particular:

Virtual GroupDescription

system:authenticated

Automatically associated with all authenticated users.

system:authenticated:oauth

Automatically associated with all users authenticated with an OAuth access token.

system:unauthenticated

Automatically associated with all unauthenticated users.

4.3.3. API Authentication

Requests to the OpenShift Container Platform API are authenticated using the following methods:

OAuth Access Tokens
  • Obtained from the OpenShift Container Platform OAuth server using the <master>/oauth/authorize and <master>/oauth/token endpoints.
  • Sent as an Authorization: Bearer…​ header or an access_token=…​ query parameter
X.509 Client Certificates
  • Requires a HTTPS connection to the API server.
  • Verified by the API server against a trusted certificate authority bundle.
  • The API server creates and distributes certificates to controllers to authenticate themselves.

Any request with an invalid access token or an invalid certificate is rejected by the authentication layer with a 401 error.

If no access token or certificate is presented, the authentication layer assigns the system:anonymous virtual user and the system:unauthenticated virtual group to the request. This allows the authorization layer to determine which requests, if any, an anonymous user is allowed to make.

4.3.3.1. Impersonation

A request to the OpenShift Container Platform API may include an Impersonate-User header, which indicates that the requester wants to have the request handled as though it came from the specified user. This can be done on the command line by passing the --as=username flag.

Before User A is allowed to impersonate User B, User A is first authenticated. Then, an authorization check occurs to ensure that User A is allowed to impersonate the user named User B. If User A is requesting to impersonate a service account (system:serviceaccount:namespace:name), OpenShift Container Platform checks to ensure that User A can impersonate the serviceaccount named name in namespace. If the check fails, the request fails with a 403 (Forbidden) error code.

By default, project administrators and editors are allowed to impersonate service accounts in their namespace. The sudoers role allows a user to impersonate system:admin, which in turn has cluster administrator permissions. This grants some protection against typos (but not security) for someone administering the cluster. For example, oc delete nodes --all would be forbidden, but oc delete nodes --all --as=system:admin would be allowed. You can add a user to that group using oadm policy add-cluster-role-to-user sudoer <username>.

4.3.4. OAuth

The OpenShift Container Platform master includes a built-in OAuth server. Users obtain OAuth access tokens to authenticate themselves to the API.

When a person requests a new OAuth token, the OAuth server uses the configured identity provider to determine the identity of the person making the request.

It then determines what user that identity maps to, creates an access token for that user, and returns the token for use.

OAuth Clients

Every request for an OAuth token must specify the OAuth client that will receive and use the token. The following OAuth clients are automatically created when starting the OpenShift Container Platform API:

OAuth ClientUsage

openshift-web-console

Requests tokens for the web console.

openshift-browser-client

Requests tokens at <master>/oauth/token/request with a user-agent that can handle interactive logins.

openshift-challenging-client

Requests tokens with a user-agent that can handle WWW-Authenticate challenges.

To register additional clients:

$ oc create -f <(echo '
kind: OAuthClient
apiVersion: v1
metadata:
 name: demo 1
secret: "..." 2
redirectURIs:
 - "http://www.example.com/" 3
grantMethod: prompt 4
')
1
The name of the OAuth client is used as the client_id parameter when making requests to <master>/oauth/authorize and <master>/oauth/token.
2
The secret is used as the client_secret parameter when making requests to <master>/oauth/token.
3
The redirect_uri parameter specified in requests to <master>/oauth/authorize and <master>/oauth/token must be equal to (or prefixed by) one of the URIs in redirectURIs.
4
The grantMethod is used to determine what action to take when this client requests tokens and has not yet been granted access by the user. Uses the same values seen in Grant Options.

Service Accounts as OAuth Clients

A service account can be used as a constrained form of OAuth client. Service accounts can only request a subset of scopes that allow access to some basic user information and role-based power inside of the service account’s own namespace:

  • user:info
  • user:check-access
  • role:<any role>:<serviceaccount namespace>
  • role:<any role>:<serviceaccount namespace>:!

When using a service account as an OAuth client:

  • client_id is system:serviceaccount:<serviceaccount namespace>:<serviceaccount name>.
  • client_secret can be any of the API tokens for that service account. For example:

    $ oc sa get-token <serviceaccount name>
  • redirect_uri must match an annotation on the service account that has the prefix serviceaccounts.openshift.io/oauth-redirecturi. For example, serviceaccounts.openshift.io/oauth-redirecturi.one.
  • To get WWW-Authenticate challenges, set an serviceaccounts.openshift.io/oauth-want-challenges annotation on the service account to true.

Integrations

All requests for OAuth tokens involve a request to <master>/oauth/authorize. Most authentication integrations place an authenticating proxy in front of this endpoint, or configure OpenShift Container Platform to validate credentials against a backing identity provider. Requests to <master>/oauth/authorize can come from user-agents that cannot display interactive login pages, such as the CLI. Therefore, OpenShift Container Platform supports authenticating using a WWW-Authenticate challenge in addition to interactive login flows.

If an authenticating proxy is placed in front of the <master>/oauth/authorize endpoint, it should send unauthenticated, non-browser user-agents WWW-Authenticate challenges, rather than displaying an interactive login page or redirecting to an interactive login flow.

Note

To prevent cross-site request forgery (CSRF) attacks against browser clients, Basic authentication challenges should only be sent if a X-CSRF-Token header is present on the request. Clients that expect to receive Basic WWW-Authenticate challenges should set this header to a non-empty value.

If the authenticating proxy cannot support WWW-Authenticate challenges, or if OpenShift Container Platform is configured to use an identity provider that does not support WWW-Authenticate challenges, users can visit <master>/oauth/token/request using a browser to obtain an access token manually.

Obtaining OAuth Tokens

The OAuth server supports standard authorization code grant and the implicit grant OAuth authorization flows.

Run the following command to request an OAuth token by using the authorization code grant method:

$ curl -H "X-Remote-User: <username>" \
     --cacert /etc/origin/master/ca.crt \
     --cert /etc/origin/master/admin.crt \
     --key /etc/origin/master/admin.key \
     -I https://<master-address>/oauth/authorize?response_type=token\&client_id=openshift-challenging-client | grep -oP "access_token=\K[^&]*"

When requesting an OAuth token using the implicit grant flow (response_type=token) with a client_id configured to request WWW-Authenticate challenges (like openshift-challenging-client), these are the possible server responses from /oauth/authorize, and how they should be handled:

StatusContentClient response

302

Location header containing an access_token parameter in the URL fragment (RFC 4.2.2)

Use the access_token value as the OAuth token

302

Location header containing an error query parameter (RFC 4.1.2.1)

Fail, optionally surfacing the error (and optional error_description) query values to the user

302

Other Location header

Follow the redirect, and process the result using these rules

401

WWW-Authenticate header present

Respond to challenge if type is recognized (e.g. Basic, Negotiate, etc), resubmit request, and process the result using these rules

401

WWW-Authenticate header missing

No challenge authentication is possible. Fail and show response body (which might contain links or details on alternate methods to obtain an OAuth token)

Other

Other

Fail, optionally surfacing response body to the user

To request an OAuth token using the implicit grant flow:

$ curl -u <username>:<password>
'https://<master-address>:8443/oauth/authorize?client_id=openshift-challenging-client&response_type=token' -skv / 1
/ -H "X-CSRF-Token: xxx" 2
*   Trying 10.64.33.43...
* Connected to 10.64.33.43 (10.64.33.43) port 8443 (#0)
* found 148 certificates in /etc/ssl/certs/ca-certificates.crt
* found 592 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
*        server certificate verification SKIPPED
*        server certificate status verification SKIPPED
*        common name: 10.64.33.43 (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: CN=10.64.33.43
*        start date: Thu, 09 Aug 2018 04:00:39 GMT
*        expire date: Sat, 08 Aug 2020 04:00:40 GMT
*        issuer: CN=openshift-signer@1531109367
*        compression: NULL
* ALPN, server accepted to use http/1.1
* Server auth using Basic with user 'developer'
> GET /oauth/authorize?client_id=openshift-challenging-client&response_type=token HTTP/1.1
> Host: 10.64.33.43:8443
> Authorization: Basic ZGV2ZWxvcGVyOmRzc2Zkcw==
> User-Agent: curl/7.47.0
> Accept: */*
> X-CSRF-Token: xxx
>
< HTTP/1.1 302 Found
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Expires: Fri, 01 Jan 1990 00:00:00 GMT
< Location:
https://10.64.33.43:8443/oauth/token/implicit#access_token=gzTwOq_mVJ7ovHliHBTgRQEEXa1aCZD9lnj7lSw3ekQ&expires_in=86400&scope=user%3Afull&token_type=Bearer 3
< Pragma: no-cache
< Set-Cookie: ssn=MTUzNTk0OTc1MnxIckVfNW5vNFlLSlF5MF9GWEF6Zm55Vl95bi1ZNE41S1NCbFJMYnN1TWVwR1hwZmlLMzFQRklzVXRkc0RnUGEzdnBEa0NZZndXV2ZUVzN1dmFPM2dHSUlzUmVXakQ3Q09rVXpxNlRoVmVkQU5DYmdLTE9SUWlyNkJJTm1mSDQ0N2pCV09La3gzMkMzckwxc1V1QXpybFlXT2ZYSmI2R2FTVEZsdDBzRjJ8vk6zrQPjQUmoJCqb8Dt5j5s0b4wZlITgKlho9wlKAZI=; Path=/; HttpOnly; Secure
< Date: Mon, 03 Sep 2018 04:42:32 GMT
< Content-Length: 0
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host 10.64.33.43 left intact
1
client-id is set to openshift-challenging-client and response-type is set to token.
2
Set X-CSRF-Token header to a non-empty value.
3
The token is returned in the Location header of the 302 response as access_token=gzTwOq_mVJ7ovHliHBTgRQEEXa1aCZD9lnj7lSw3ekQ.

To view only the OAuth token value, run the following command:

$ curl -u <username>:<password>
'https://<master-address>:8443/oauth/authorize?client_id=openshift-challenging-client&response_type=token' 1
-skv -H "X-CSRF-Token: xxx" --stderr - |  grep -oP "access_token=\K[^&]*" 2

hvqxe5aMlAzvbqfM2WWw3D6tR0R2jCQGKx0viZBxwmc
1
client-id is set to openshift-challenging-client and response-type is set to token.
2
Set X-CSRF-Token header to a non-empty value.

You can also use the Code Grant method to request a token

4.4. Authorization

4.4.1. Overview

Authorization policies determine whether a user is allowed to perform a given action within a project. This allows platform administrators to use the cluster policy to control who has various access levels to the OpenShift Container Platform platform itself and all projects. It also allows developers to use local policy to control who has access to their projects. Note that authorization is a separate step from authentication, which is more about determining the identity of who is taking the action.

Authorization is managed using:

Rules

Sets of permitted verbs on a set of objects. For example, whether something can create pods.

Roles

Collections of rules. Users and groups can be associated with, or bound to, multiple roles at the same time.

Bindings

Associations between users and/or groups with a role.

Cluster administrators can visualize rules, roles, and bindings using the CLI. For example, consider the following excerpt from viewing a policy, showing rule sets for the admin and basic-user default roles:

admin			Verbs					Resources															Resource Names	Extension
			[create delete get list update watch]	[projects resourcegroup:exposedkube resourcegroup:exposedopenshift resourcegroup:granter secrets]				[]
			[get list watch]			[resourcegroup:allkube resourcegroup:allkube-status resourcegroup:allopenshift-status resourcegroup:policy]			[]
basic-user		Verbs					Resources															Resource Names	Extension
			[get]					[users]																[~]
			[list]					[projectrequests]														[]
			[list]					[projects]															[]
			[create]				[subjectaccessreviews]														[]		IsPersonalSubjectAccessReview

The following excerpt from viewing policy bindings shows the above roles bound to various users and groups:

RoleBinding[admins]:
				Role:	admin
				Users:	[alice system:admin]
				Groups:	[]
RoleBinding[basic-user]:
				Role:	basic-user
				Users:	[joe]
				Groups:	[devel]

The relationships between the the policy roles, policy bindings, users, and developers are illustrated below.

OpenShift Container Platform Authorization Policy

4.4.2. Evaluating Authorization

Several factors are combined to make the decision when OpenShift Container Platform evaluates authorization:

Identity

In the context of authorization, both the user name and list of groups the user belongs to.

Action

The action being performed. In most cases, this consists of:

Project

The project being accessed.

Verb

Can be get, list, create, update, delete, deletecollection or watch.

Resource Name

The API endpoint being accessed.

Bindings

The full list of bindings.

OpenShift Container Platform evaluates authorizations using the following steps:

  1. The identity and the project-scoped action is used to find all bindings that apply to the user or their groups.
  2. Bindings are used to locate all the roles that apply.
  3. Roles are used to find all the rules that apply.
  4. The action is checked against each rule to find a match.
  5. If no matching rule is found, the action is then denied by default.

4.4.3. Cluster Policy and Local Policy

There are two levels of authorization policy:

Cluster policy

Roles and bindings that are applicable across all projects. Roles that exist in the cluster policy are considered cluster roles. Cluster bindings can only reference cluster roles.

Local policy

Roles and bindings that are scoped to a given project. Roles that exist only in a local policy are considered local roles. Local bindings can reference both cluster and local roles.

This two-level hierarchy allows re-usability over multiple projects through the cluster policy while allowing customization inside of individual projects through local policies.

During evaluation, both the cluster bindings and the local bindings are used. For example:

  1. Cluster-wide "allow" rules are checked.
  2. Locally-bound "allow" rules are checked.
  3. Deny by default.

4.4.4. Roles

Roles are collections of policy rules, which are sets of permitted verbs that can be performed on a set of resources. OpenShift Container Platform includes a set of default roles that can be added to users and groups in the cluster policy or in a local policy.

Default RoleDescription

admin

A project manager. If used in a local binding, an admin user will have rights to view any resource in the project and modify any resource in the project except for role creation and quota. If the cluster-admin wants to allow an admin to modify roles, the cluster-admin must create a project-scoped Policy object using JSON.

basic-user

A user that can get basic information about projects and users.

cluster-admin

A super-user that can perform any action in any project. When granted to a user within a local policy, they have full control over quota and roles and every action on every resource in the project.

cluster-status

A user that can get basic cluster status information.

edit

A user that can modify most objects in a project, but does not have the power to view or modify roles or bindings.

self-provisioner

A user that can create their own projects.

view

A user who cannot make any modifications, but can see most objects in a project. They cannot view or modify roles or bindings.

Tip

Remember that users and groups can be associated with, or bound to, multiple roles at the same time.

Cluster administrators can visualize these roles, including a matrix of the verbs and resources each are associated using the CLI to view the cluster roles. Additional system: roles are listed as well, which are used for various OpenShift Container Platform system and component operations.

By default in a local policy, only the binding for the admin role is immediately listed when using the CLI to view local bindings. However, if other default roles are added to users and groups within a local policy, they become listed in the CLI output, as well.

If you find that these roles do not suit you, a cluster-admin user can create a policyBinding object named <projectname>:default with the CLI using a JSON file. This allows the project admin to bind users to roles that are defined only in the <projectname> local policy.

Important

The cluster- role assigned by the project administrator is limited in a project. It is not the same cluster- role granted by the cluster-admin or system:admin.

Cluster roles are roles defined at the cluster level, but can be bound either at the cluster level or at the project level.

Learn how to create a local role for a project.

4.4.4.1. Updating Cluster Roles

After any OpenShift Container Platform cluster upgrade, the recommended default roles may have been updated. See Updating Policy Definitions for instructions on getting to the new recommendations using:

$ oadm policy reconcile-cluster-roles

4.4.5. Security Context Constraints

In addition to authorization policies that control what a user can do, OpenShift Container Platform provides security context constraints (SCC) that control the actions that a pod can perform and what it has the ability to access. Administrators can manage SCCs using the CLI.

SCCs are also very useful for managing access to persistent storage.

SCCs are objects that define a set of conditions that a pod must run with in order to be accepted into the system. They allow an administrator to control the following:

  1. Running of privileged containers.
  2. Capabilities a container can request to be added.
  3. Use of host directories as volumes.
  4. The SELinux context of the container.
  5. The user ID.
  6. The use of host namespaces and networking.
  7. Allocating an FSGroup that owns the pod’s volumes
  8. Configuring allowable supplemental groups
  9. Requiring the use of a read only root file system
  10. Controlling the usage of volume types
  11. Configuring allowable seccomp profiles

Seven SCCs are added to the cluster by default, and are viewable by cluster administrators using the CLI:

$ oc get scc
NAME               PRIV      CAPS      SELINUX     RUNASUSER          FSGROUP     SUPGROUP    PRIORITY   READONLYROOTFS   VOLUMES
anyuid             false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    10         false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
hostaccess         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath persistentVolumeClaim secret]
hostmount-anyuid   false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath persistentVolumeClaim secret]
hostnetwork        false     []        MustRunAs   MustRunAsRange     MustRunAs   MustRunAs   <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
nonroot            false     []        MustRunAs   MustRunAsNonRoot   RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
privileged         true      []        RunAsAny    RunAsAny           RunAsAny    RunAsAny    <none>     false            [*]
restricted         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
Important

Do not modify the default SCCs. Customizing the default SCCs can lead to issues when OpenShift Container Platform is upgraded. Instead, create new SCCs.

The definition for each SCC is also viewable by cluster administrators using the CLI. For example, for the privileged SCC:

# oc export scc/privileged

allowHostDirVolumePlugin: true
allowHostIPC: true
allowHostNetwork: true
allowHostPID: true
allowHostPorts: true
allowPrivilegedContainer: true
allowedCapabilities: null
apiVersion: v1
defaultAddCapabilities: null
fsGroup: 1
  type: RunAsAny
groups: 2
- system:cluster-admins
- system:nodes
kind: SecurityContextConstraints
metadata:
  annotations:
    kubernetes.io/description: 'privileged allows access to all privileged and host
      features and the ability to run as any user, any group, any fsGroup, and with
      any SELinux context.  WARNING: this is the most relaxed SCC and should be used
      only for cluster administration. Grant with caution.'
  creationTimestamp: null
  name: privileged
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities: null
runAsUser: 3
  type: RunAsAny
seLinuxContext: 4
  type: RunAsAny
supplementalGroups: 5
  type: RunAsAny
users: 6
- system:serviceaccount:default:registry
- system:serviceaccount:default:router
- system:serviceaccount:openshift-infra:build-controller
volumes:
- '*'
1
The FSGroup strategy which dictates the allowable values for the Security Context
2
The groups that have access to this SCC
3
The run as user strategy type which dictates the allowable values for the Security Context
4
The SELinux context strategy type which dictates the allowable values for the Security Context
5
The supplemental groups strategy which dictates the allowable supplemental groups for the Security Context
6
The users who have access to this SCC

The users and groups fields on the SCC control which SCCs can be used. By default, cluster administrators, nodes, and the build controller are granted access to the privileged SCC. All authenticated users are granted access to the restricted SCC.

The privileged SCC:

  • allows privileged pods.
  • allows host directories to be mounted as volumes.
  • allows a pod to run as any user.
  • allows a pod to run with any MCS label.
  • allows a pod to use the host’s IPC namespace.
  • allows a pod to use the host’s PID namespace.
  • allows a pod to use any FSGroup.
  • allows a pod to use any supplemental group.

The restricted SCC:

  • ensures pods cannot run as privileged.
  • ensures pods cannot use host directory volumes.
  • requires that a pod run as a user in a pre-allocated range of UIDs.
  • requires that a pod run with a pre-allocated MCS label.
  • allows a pod to use any FSGroup.
  • allows a pod to use any supplemental group.
Note

For more information about each SCC, see the kubernetes.io/description annotation available on the SCC.

SCCs are comprised of settings and strategies that control the security features a pod has access to. These settings fall into three categories:

Controlled by a boolean

Fields of this type default to the most restrictive value. For example, AllowPrivilegedContainer is always set to false if unspecified.

Controlled by an allowable set

Fields of this type are checked against the set to ensure their value is allowed.

Controlled by a strategy

Items that have a strategy to generate a value provide:

  • A mechanism to generate the value, and
  • A mechanism to ensure that a specified value falls into the set of allowable values.

4.4.5.1. SCC Strategies

4.4.5.1.1. RunAsUser
  1. MustRunAs - Requires a runAsUser to be configured. Uses the configured runAsUser as the default. Validates against the configured runAsUser.
  2. MustRunAsRange - Requires minimum and maximum values to be defined if not using pre-allocated values. Uses the minimum as the default. Validates against the entire allowable range.
  3. MustRunAsNonRoot - Requires that the pod be submitted with a non-zero runAsUser or have the USER directive defined in the image. No default provided.
  4. RunAsAny - No default provided. Allows any runAsUser to be specified.
4.4.5.1.2. SELinuxContext
  1. MustRunAs - Requires seLinuxOptions to be configured if not using pre-allocated values. Uses seLinuxOptions as the default. Validates against seLinuxOptions.
  2. RunAsAny - No default provided. Allows any seLinuxOptions to be specified.
4.4.5.1.3. SupplementalGroups
  1. MustRunAs - Requires at least one range to be specified if not using pre-allocated values. Uses the minimum value of the first range as the default. Validates against all ranges.
  2. RunAsAny - No default provided. Allows any supplementalGroups to be specified.
4.4.5.1.4. FSGroup
  1. MustRunAs - Requires at least one range to be specified if not using pre-allocated values. Uses the minimum value of the first range as the default. Validates against the first ID in the first range.
  2. RunAsAny - No default provided. Allows any fsGroup ID to be specified.

4.4.5.2. Controlling Volumes

The usage of specific volume types can be controlled by setting the volumes field of the SCC. The allowable values of this field correspond to the volume sources that are defined when creating a volume:

  • azureFile
  • flocker
  • flexVolume
  • hostPath
  • emptyDir
  • gcePersistentDisk
  • awsElasticBlockStore
  • gitRepo
  • secret
  • nfs
  • iscsi
  • glusterfs
  • persistentVolumeClaim
  • rbd
  • cinder
  • cephFS
  • downwardAPI
  • fc
  • configMap
  • *

The recommended minimum set of allowed volumes for new SCCs are configMap, downwardAPI, emptyDir, persistentVolumeClaim, and secret.

Note

* is a special value to allow the use of all volume types.

Note

For backwards compatibility, the usage of allowHostDirVolumePlugin overrides settings in the volumes field. For example, if allowHostDirVolumePlugin is set to false but allowed in the volumes field, then the hostPath value will be removed from volumes.

4.4.5.3. Seccomp

SeccompProfiles lists the allowed profiles that can be set for the pod or container’s seccomp annotations. An unset (nil) or empty value means that no profiles are specified by the pod or container. Use the wildcard * to allow all profiles. When used to generate a value for a pod, the first non-wildcard profile is used as the default.

Refer to the seccomp documentation for more information about configuring and using custom profiles.

4.4.5.4. Admission

Admission control with SCCs allows for control over the creation of resources based on the capabilities granted to a user.

In terms of the SCCs, this means that an admission controller can inspect the user information made available in the context to retrieve an appropriate set of SCCs. Doing so ensures the pod is authorized to make requests about its operating environment or to generate a set of constraints to apply to the pod.

The set of SCCs that admission uses to authorize a pod are determined by the user identity and groups that the user belongs to. Additionally, if the pod specifies a service account, the set of allowable SCCs includes any constraints accessible to the service account.

Admission uses the following approach to create the final security context for the pod:

  1. Retrieve all SCCs available for use.
  2. Generate field values for security context settings that were not specified on the request.
  3. Validate the final settings against the available constraints.

If a matching set of constraints is found, then the pod is accepted. If the request cannot be matched to an SCC, the pod is rejected.

A pod must validate every field against the SCC. The following are examples for just two of the fields that must be validated:

Note

These examples are in the context of a strategy using the preallocated values.

A FSGroup SCC Strategy of MustRunAs

If the pod defines a fsGroup ID, then that ID must equal the default fsGroup ID. Otherwise, the pod is not validated by that SCC and the next SCC is evaluated.

If the SecurityContextConstraints.fsGroup field has value RunAsAny and the pod specification omits the Pod.spec.securityContext.fsGroup, then this field is considered valid. Note that it is possible that during validation, other SCC settings will reject other pod fields and thus cause the pod to fail.

A SupplementalGroups SCC Strategy of MustRunAs

If the pod specification defines one or more supplementalGroups IDs, then the pod’s IDs must equal one of the IDs in the namespace’s openshift.io/sa.scc.supplemental-groups annotation. Otherwise, the pod is not validated by that SCC and the next SCC is evaluated.

If the SecurityContextConstraints.supplementalGroups field has value RunAsAny and the pod specification omits the Pod.spec.securityContext.supplementalGroups, then this field is considered valid. Note that it is possible that during validation, other SCC settings will reject other pod fields and thus cause the pod to fail.

4.4.5.4.1. SCC Prioritization

SCCs have a priority field that affects the ordering when attempting to validate a request by the admission controller. A higher priority SCC is moved to the front of the set when sorting. When the complete set of available SCCs are determined they are ordered by:

  1. Highest priority first, nil is considered a 0 priority
  2. If priorities are equal, the SCCs will be sorted from most restrictive to least restrictive
  3. If both priorities and restrictions are equal the SCCs will be sorted by name

By default, the anyuid SCC granted to cluster administrators is given priority in their SCC set. This allows cluster administrators to run pods as any user by without specifying a RunAsUser on the pod’s SecurityContext. The administrator may still specify a RunAsUser if they wish.

4.4.5.4.2. Understanding Pre-allocated Values and Security Context Constraints

The admission controller is aware of certain conditions in the security context constraints that trigger it to look up pre-allocated values from a namespace and populate the security context constraint before processing the pod. Each SCC strategy is evaluated independently of other strategies, with the pre-allocated values (where allowed) for each policy aggregated with pod specification values to make the final values for the various IDs defined in the running pod.

The following SCCs cause the admission controller to look for pre-allocated values when no ranges are defined in the pod specification:

  1. A RunAsUser strategy of MustRunAsRange with no minimum or maximum set. Admission looks for the openshift.io/sa.scc.uid-range annotation to populate range fields.
  2. An SELinuxContext strategy of MustRunAs with no level set. Admission looks for the openshift.io/sa.scc.mcs annotation to populate the level.
  3. A FSGroup strategy of MustRunAs. Admission looks for the openshift.io/sa.scc.supplemental-groups annotation.
  4. A SupplementalGroups strategy of MustRunAs. Admission looks for the openshift.io/sa.scc.supplemental-groups annotation.

During the generation phase, the security context provider will default any values that are not specifically set in the pod. Defaulting is based on the strategy being used:

  1. RunAsAny and MustRunAsNonRoot strategies do not provide default values. Thus, if the pod needs a field defined (for example, a group ID), this field must be defined inside the pod specification.
  2. MustRunAs (single value) strategies provide a default value which is always used. As an example, for group IDs: even if the pod specification defines its own ID value, the namespace’s default field will also appear in the pod’s groups.
  3. MustRunAsRange and MustRunAs (range-based) strategies provide the minimum value of the range. As with a single value MustRunAs strategy, the namespace’s default value will appear in the running pod. If a range-based strategy is configurable with multiple ranges, it will provide the minimum value of the first configured range.
Note

FSGroup and SupplementalGroups strategies fall back to the openshift.io/sa.scc.uid-range annotation if the openshift.io/sa.scc.supplemental-groups annotation does not exist on the namespace. If neither exist, the SCC will fail to create.

Note

By default, the annotation-based FSGroup strategy configures itself with a single range based on the minimum value for the annotation. For example, if your annotation reads 1/3, the FSGroup strategy will configure itself with a minimum and maximum of 1. If you want to allow more groups to be accepted for the FSGroup field, you can configure a custom SCC that does not use the annotation.

Note

The openshift.io/sa.scc.supplemental-groups annotation accepts a comma delimited list of blocks in the format of <start>/<length or <start>-<end>. The openshift.io/sa.scc.uid-range annotation accepts only a single block.

4.4.6. Determining What You Can Do as an Authenticated User

From within your OpenShift Container Platform project, you can determine what verbs you can perform against all namespace-scoped resources (including third-party resources). Run:

$ oc policy can-i --list --loglevel=8

The output will help you to determine what API request to make to gather the information.

To receive information back in a user-readable format, run:

$ oc policy can-i --list

The output will provide a full list.

To determine if you can perform specific verbs, run:

$ oc policy can-i <verb> <resource>

User scopes can provide more information about a given scope. For example:

$ oc policy can-i <verb> <resource> --scopes=user:info

4.5. Persistent Storage

4.5.1. Overview

Managing storage is a distinct problem from managing compute resources. OpenShift Container Platform leverages the Kubernetes persistent volume (PV) framework to allow administrators to provision persistent storage for a cluster. Using persistent volume claims (PVCs), developers can request PV resources without having specific knowledge of the underlying storage infrastructure.

PVCs are specific to a project and are created and used by developers as a means to use a PV. PV resources on their own are not scoped to any single project; they can be shared across the entire OpenShift Container Platform cluster and claimed from any project. After a PV has been bound to a PVC, however, that PV cannot then be bound to additional PVCs. This has the effect of scoping a bound PV to a single namespace (that of the binding project).

PVs are defined by a PersistentVolume API object, which represents a piece of existing networked storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plug-ins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. PV objects capture the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.

Important

High-availability of storage in the infrastructure is left to the underlying storage provider.

PVCs are defined by a PersistentVolumeClaim API object, which represents a request for storage by a developer. It is similar to a pod in that pods consume node resources and PVCs consume PV resources. For example, pods can request specific levels of resources (e.g., CPU and memory), while PVCs can request specific storage capacity and access modes (e.g, they can be mounted once read/write or many times read-only).

4.5.2. Lifecycle of a Volume and Claim

PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs have the following lifecycle.

4.5.2.1. Provisioning

A cluster administrator creates some number of PVs. They carry the details of the real storage that is available for use by cluster users. They exist in the API and are available for consumption.

4.5.2.2. Binding

A user creates a PersistentVolumeClaim with a specific amount of storage requested and with certain access modes. A control loop in the master watches for new PVCs, finds a matching PV (if possible), and binds them together. The user will always get at least what they asked for, but the volume may be in excess of what was requested. To minimize the excess, OpenShift Container Platform binds to the smallest PV that matches all other criteria.

Claims remain unbound indefinitely if a matching volume does not exist. Claims are bound as matching volumes become available. For example, a cluster provisioned with many 50Gi volumes would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.

4.5.2.3. Using

Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a pod. For those volumes that support multiple access modes, the user specifies which mode is desired when using their claim as a volume in a pod.

Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it. Users schedule pods and access their claimed PVs by including a persistentVolumeClaim in their pod’s volumes block. See below for syntax details.

4.5.2.4. Releasing

When a user is done with a volume, they can delete the PVC object from the API which allows reclamation of the resource. The volume is considered "released" when the claim is deleted, but it is not yet available for another claim. The previous claimant’s data remains on the volume which must be handled according to policy.

4.5.2.5. Reclaiming

The reclaim policy of a PersistentVolume tells the cluster what to do with the volume after it is released. Currently, volumes can either be Retain or Recycle.

Retain allows for manual reclamation of the resource. For those volume plug-ins that support it, recycling performs a basic scrub on the volume (e.g., rm -rf /<volume>/*) and makes it available again for a new claim.

4.5.3. Persistent Volumes

Each PV contains a spec and status, which is the specification and status of the volume.

Example 4.1. Persistent Volume Object Definition

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: pv0003
  spec:
    capacity:
      storage: 5Gi
    accessModes:
      - ReadWriteOnce
    persistentVolumeReclaimPolicy: Recycle
    nfs:
      path: /tmp
      server: 172.17.0.2

4.5.3.1. Types of Persistent Volumes

OpenShift Container Platform supports the following PersistentVolume plug-ins:

4.5.3.2. Capacity

Generally, a PV will have a specific storage capacity. This is set using the PV’s capacity attribute. See the Kubernetes Resource Model to understand the units expected by capacity.

Currently, storage capacity is the only resource that can be set or requested. Future attributes may include IOPS, throughput, etc.

4.5.3.3. Access Modes

A PersistentVolume can be mounted on a host in any way supported by the resource provider. Providers will have different capabilities and each PV’s access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV’s capabilities.

Claims are matched to volumes with similar access modes. The only two matching criteria are access modes and size. A claim’s access modes represent a request. Therefore, the user may be granted more, but never less. For example, if a claim requests RWO, but the only volume available was an NFS PV (RWO+ROX+RWX), the claim would match NFS because it supports RWO.

Direct matches are always attempted first. The volume’s modes must match or contain more modes than you requested. The size must be greater than or equal to what is expected. If two types of volumes (NFS and iSCSI, for example) both have the same set of access modes, then either of them will match a claim with those modes. There is no ordering between types of volumes and no way to choose one type over another.

All volumes with the same modes are grouped, then sorted by size (smallest to largest). The binder gets the group with matching modes and iterates over each (in size order) until one size matches.

The access modes are:

Access ModeCLI AbbreviationDescription

ReadWriteOnce

RWO

The volume can be mounted as read-write by a single node.

ReadOnlyMany

ROX

The volume can be mounted read-only by many nodes.

ReadWriteMany

RWX

The volume can be mounted as read-write by many nodes.

Important

A volume’s AccessModes are descriptors of the volume’s capabilities. They are not enforced constraints. The storage provider is responsible for runtime errors resulting from invalid use of the resource.

For example, a GCE Persistent Disk has AccessModes ReadWriteOnce and ReadOnlyMany. The user must mark their claims as read-only if they want to take advantage of the volume’s ability for ROX. Errors in the provider show up at runtime as mount errors.

iSCSI and Fibre Channel volumes do not have any fencing mechanisms yet. You must ensure the volumes are only used by one node at a time. In certain situations, such as draining a node, the volumes may be used simultaneously by two nodes. Before draining the node, first ensure the pods that use these volumes are deleted.

The table below lists the access modes supported by different persistent volumes:

Table 4.2. Supported Access Modes for Persistent Volumes
Volume PluginReadWriteOnceReadOnlyManyReadWriteMany

AWS EBS

X

-

-

Azure Disk

X

-

-

Ceph RBD

X

X

-

Fiber Channel

X

X

-

GCE Persistent Disk

X

-

-

GlusterFS

X

X

X

HostPath

X

-

-

iSCSI

X

X

-

NFS

X

X

X

Openstack Cinder

X

-

-

Note
  • If pods rely on AWS EBS, GCE Persistent Disks, or Openstack Cinder PVs, use a recreate deployment strategy
  • Azure Disk does not support dynamic provisioning.

4.5.3.4. Reclaim Policy

The current reclaim policies are:

Reclaim PolicyDescription

Retain

Manual reclamation

Recycle

Basic scrub (e.g, rm -rf /<volume>/*)

Note

Currently, only NFS and HostPath support the 'Recycle' reclaim policy.

4.5.3.5. Phase

A volumes can be found in one of the following phases:

PhaseDescription

Available

A free resource that is not yet bound to a claim.

Bound

The volume is bound to a claim.

Released

The claim has been deleted, but the resource is not yet reclaimed by the cluster.

Failed

The volume has failed its automatic reclamation.

The CLI shows the name of the PVC bound to the PV.

4.5.4. Persistent Volume Claims

Each PVC contains a spec and status, which is the specification and status of the claim.

Persistent Volume Claim Object Definition

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

4.5.4.1. Storage Class

Claims can optionally request a specific StorageClass by specifying its name in the storageClassName attribute. Only PVs of the requested class, ones with the same storageClassName as the PVC, can be bound to the PVC. The cluster administrator can configure dynamic provisioners to service one or more storage classes. They create a PV on demand that matches the specifications in the PVC, if they are able.

The cluster administrator can also set a default StorageClass for all PVCs. When a default storage class is configured, the PVC must explicitly ask for StorageClass or storageClassName annotations set to "" to get bound to a PV with a no storage class.

4.5.4.2. Access Modes

Claims use the same conventions as volumes when requesting storage with specific access modes.

4.5.4.3. Resources

Claims, like pods, can request specific quantities of a resource. In this case, the request is for storage. The same resource model applies to both volumes and claims.

4.5.4.4. Claims As Volumes

Pods access storage by using the claim as a volume. Claims must exist in the same namespace as the pod using the claim. The cluster finds the claim in the pod’s namespace and uses it to get the PersistentVolume backing the claim. The volume is then mounted to the host and into the pod:

kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: dockerfile/nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim

4.6. Remote Commands

4.6.1. Overview

OpenShift Container Platform takes advantage of a feature built into Kubernetes to support executing commands in containers. This is implemented using HTTP along with a multiplexed streaming protocol such as SPDY or HTTP/2.

Developers can use the CLI to execute remote commands in containers.

4.6.2. Server Operation

The Kubelet handles remote execution requests from clients. Upon receiving a request, it upgrades the response, evaluates the request headers to determine what streams (stdin, stdout, and/or stderr) to expect to receive, and waits for the client to create the streams.

After the Kubelet has received all the streams, it executes the command in the container, copying between the streams and the command’s stdin, stdout, and stderr, as appropriate. When the command terminates, the Kubelet closes the upgraded connection, as well as the underlying one.

Architecturally, there are options for running a command in a container. The supported implementation currently in OpenShift Container Platform invokes nsenter directly on the node host to enter the container’s namespaces prior to executing the command. However, custom implementations could include using docker exec, or running a "helper" container that then runs nsenter so that nsenter is not a required binary that must be installed on the host.

4.7. Port Forwarding

4.7.1. Overview

OpenShift Container Platform takes advantage of a feature built into Kubernetes to support port forwarding to pods. This is implemented using HTTP along with a multiplexed streaming protocol such as SPDY or HTTP/2.

Developers can use the CLI to port forward to a pod. The CLI listens on each local port specified by the user, forwarding via the described protocol.

4.7.2. Server Operation

The Kubelet handles port forward requests from clients. Upon receiving a request, it upgrades the response and waits for the client to create port forwarding streams. When it receives a new stream, it copies data between the stream and the pod’s port.

Architecturally, there are options for forwarding to a pod’s port. The supported implementation currently in OpenShift Container Platform invokes nsenter directly on the node host to enter the pod’s network namespace, then invokes socat to copy data between the stream and the pod’s port. However, a custom implementation could include running a "helper" pod that then runs nsenter and socat, so that those binaries are not required to be installed on the host.

4.8. Source Control Management

OpenShift Container Platform takes advantage of preexisting source control management (SCM) systems hosted either internally (such as an in-house Git server) or externally (for example, on GitHub, Bitbucket, etc.). Currently, OpenShift Container Platform only supports Git solutions.

SCM integration is tightly coupled with builds, the two points being:

  • Creating a BuildConfig using a repository, which allows building your application inside of OpenShift Container Platform. You can create a BuildConfigmanually or let OpenShift Container Platform create it automatically by inspecting your repository.
  • Triggering a build upon repository changes.

4.9. Admission Controllers

4.9.1. Overview

Admission control plug-ins intercept requests to the master API prior to persistence of a resource, but after the request is authenticated and authorized.

Each admission control plug-in is run in sequence before a request is accepted into the cluster. If any plug-in in the sequence rejects the request, the entire request is rejected immediately, and an error is returned to the end-user.

Admission control plug-ins may modify the incoming object in some cases to apply system configured defaults. In addition, admission control plug-ins may modify related resources as part of request processing to do things such as incrementing quota usage.

Warning

The OpenShift Container Platform master has a default list of plug-ins that are enabled by default for each type of resource (Kubernetes and OpenShift Container Platform). These are required for the proper functioning of the master. Modifying these lists is not recommended unless you strictly know what you are doing. Future versions of the product may use a different set of plug-ins and may change their ordering. If you do override the default list of plug-ins in the master configuration file, you are responsible for updating it to reflect requirements of newer versions of the OpenShift Container Platform master.

4.9.2. General Admission Rules

Starting in 3.3, OpenShift Container Platform uses a single admission chain for Kubernetes and OpenShift Container Platform resources. This changed from 3.2, and before where we had separate admission chains. This means that the top-level admissionConfig.pluginConfig element can now contain the admission plug-in configuration, which used to be contained in kubernetesMasterConfig.admissionConfig.pluginConfig.

The kubernetesMasterConfig.admissionConfig.pluginConfig should be moved and merged into admissionConfig.pluginConfig.

Also, starting in 3.3, all the supported admission plug-ins are ordered in the single chain for you. You should no longer set admissionConfig.pluginOrderOverride or the kubernetesMasterConfig.admissionConfig.pluginOrderOverride. Instead, you should enable plug-ins that are off by default by either adding their plug-in-specific configuration, or adding a DefaultAdmissionConfig stanza like this:

admissionConfig:
  pluginConfig:
    AlwaysPullImages: 1
      configuration:
        kind: DefaultAdmissionConfig
        apiVersion: v1
        disable: false 2
1
Admission plug-in name.
2
Indicates that a plug-in should be enabled. It is optional and shown here only for reference.

Setting disable to true will disable an admission plug-in that defaults to on.

Warning

Admission plug-ins are commonly used to help enforce security on the API server. Be careful when disabling them.

Note

If you were previously using admissionConfig elements that cannot be safely combined into a single admission chain, you will get a warning in your API server logs and your API server will start with two separate admission chains for legacy compatibility. Update your admissionConfig to resolve the warning.

4.9.3. Customizable Admission Plug-ins

Cluster administrators can configure some admission control plug-ins to control certain behavior, such as:

4.9.4. Admission Controllers Using Containers

Admission controllers using containers also support init containers.

4.10. Other API Objects

4.10.1. LimitRange

A limit range provides a mechanism to enforce min/max limits placed on resources in a Kubernetes namespace.

By adding a limit range to your namespace, you can enforce the minimum and maximum amount of CPU and Memory consumed by an individual pod or container.

See the Kubernetes documentation for more information.

4.10.2. ResourceQuota

Kubernetes can limit both the number of objects created in a namespace, and the total amount of resources requested across objects in a namespace. This facilitates sharing of a single Kubernetes cluster by several teams, each in a namespace, as a mechanism of preventing one team from starving another team of cluster resources.

See Cluster Administration and Kubernetes documentation for more information on ResourceQuota.

4.10.3. Resource

A Kubernetes Resource is something that can be requested by, allocated to, or consumed by a pod or container. Examples include memory (RAM), CPU, disk-time, and network bandwidth.

See the Developer Guide and Kubernetes documentation for more information.

4.10.4. Secret

Secrets are storage for sensitive information, such as keys, passwords, and certificates. They are accessible by the intended pod(s), but held separately from their definitions.

4.10.5. PersistentVolume

A persistent volume is an object (PersistentVolume) in the infrastructure provisioned by the cluster administrator. Persistent volumes provide durable storage for stateful applications.

See the Kubernetes documentation for more information.

4.10.6. PersistentVolumeClaim

A PersistentVolumeClaim object is a request for storage by a pod author. Kubernetes matches the claim against the pool of available volumes and binds them together. The claim is then used as a volume by a pod. Kubernetes makes sure the volume is available on the same node as the pod that requires it.

See the Kubernetes documentation for more information.

4.10.7. OAuth Objects

4.10.7.1. OAuthClient

An OAuthClient represents an OAuth client, as described in RFC 6749, section 2.

The following OAuthClient objects are automatically created:

openshift-web-console

Client used to request tokens for the web console

openshift-browser-client

Client used to request tokens at /oauth/token/request with a user-agent that can handle interactive logins

openshift-challenging-client

Client used to request tokens with a user-agent that can handle WWW-Authenticate challenges

Example 4.2. OAuthClient Object Definition

kind: "OAuthClient"
apiVersion: "v1"
metadata:
  name: "openshift-web-console" 1
  selflink: "/osapi/v1/oAuthClients/openshift-web-console"
  resourceVersion: "1"
  creationTimestamp: "2015-01-01T01:01:01Z"
respondWithChallenges: false 2
secret: "45e27750-a8aa-11e4-b2ea-3c970e4b7ffe" 3
redirectURIs:
  - "https://localhost:8443" 4
1
The name is used as the client_id parameter in OAuth requests.
2
When respondWithChallenges is set to true, unauthenticated requests to /oauth/authorize will result in WWW-Authenticate challenges, if supported by the configured authentication methods.
3
The value in the secret parameter is used as the client_secret parameter in an authorization code flow.
4
One or more absolute URIs can be placed in the redirectURIs section. The redirect_uri parameter sent with authorization requests must be prefixed by one of the specified redirectURIs.

4.10.7.2. OAuthClientAuthorization

An OAuthClientAuthorization represents an approval by a User for a particular OAuthClient to be given an OAuthAccessToken with particular scopes.

Creation of OAuthClientAuthorization objects is done during an authorization request to the OAuth server.

Example 4.3. OAuthClientAuthorization Object Definition

---
kind: "OAuthClientAuthorization"
apiVersion: "v1"
metadata:
  name: "bob:openshift-web-console"
  resourceVersion: "1"
  creationTimestamp: "2015-01-01T01:01:01-00:00"
clientName: "openshift-web-console"
userName: "bob"
userUID: "9311ac33-0fde-11e5-97a1-3c970e4b7ffe"
scopes: []
----

4.10.7.3. OAuthAuthorizeToken

An OAuthAuthorizeToken represents an OAuth authorization code, as described in RFC 6749, section 1.3.1.

An OAuthAuthorizeToken is created by a request to the /oauth/authorize endpoint, as described in RFC 6749, section 4.1.1.

An OAuthAuthorizeToken can then be used to obtain an OAuthAccessToken with a request to the /oauth/token endpoint, as described in RFC 6749, section 4.1.3.

Example 4.4. OAuthAuthorizeToken Object Definition

kind: "OAuthAuthorizeToken"
apiVersion: "v1"
metadata:
  name: "MDAwYjM5YjMtMzM1MC00NDY4LTkxODItOTA2OTE2YzE0M2Fj" 1
  resourceVersion: "1"
  creationTimestamp: "2015-01-01T01:01:01-00:00"
clientName: "openshift-web-console" 2
expiresIn: 300 3
scopes: []
redirectURI: "https://localhost:8443/console/oauth" 4
userName: "bob" 5
userUID: "9311ac33-0fde-11e5-97a1-3c970e4b7ffe" 6
1
name represents the token name, used as an authorization code to exchange for an OAuthAccessToken.
2
The clientName value is the OAuthClient that requested this token.
3
The expiresIn value is the expiration in seconds from the creationTimestamp.
4
The redirectURI value is the location where the user was redirected to during the authorization flow that resulted in this token.
5
userName represents the name of the User this token allows obtaining an OAuthAccessToken for.
6
userUID represents the UID of the User this token allows obtaining an OAuthAccessToken for.

4.10.7.4. OAuthAccessToken

An OAuthAccessToken represents an OAuth access token, as described in RFC 6749, section 1.4.

An OAuthAccessToken is created by a request to the /oauth/token endpoint, as described in RFC 6749, section 4.1.3.

Access tokens are used as bearer tokens to authenticate to the API.

Example 4.5. OAuthAccessToken Object Definition

kind: "OAuthAccessToken"
apiVersion: "v1"
metadata:
  name: "ODliOGE5ZmMtYzczYi00Nzk1LTg4MGEtNzQyZmUxZmUwY2Vh" 1
  resourceVersion: "1"
  creationTimestamp: "2015-01-01T01:01:02-00:00"
clientName: "openshift-web-console" 2
expiresIn: 86400 3
scopes: []
redirectURI: "https://localhost:8443/console/oauth" 4
userName: "bob" 5
userUID: "9311ac33-0fde-11e5-97a1-3c970e4b7ffe" 6
authorizeToken: "MDAwYjM5YjMtMzM1MC00NDY4LTkxODItOTA2OTE2YzE0M2Fj" 7
1
name is the token name, which is used as a bearer token to authenticate to the API.
2
The clientName value is the OAuthClient that requested this token.
3
The expiresIn value is the expiration in seconds from the creationTimestamp.
4
The redirectURI is where the user was redirected to during the authorization flow that resulted in this token.
5
userName represents the User this token allows authentication as.
6
userUID represents the User this token allows authentication as.
7
authorizeToken is the name of the OAuthAuthorizationToken used to obtain this token, if any.

4.10.8. User Objects

4.10.8.1. Identity

When a user logs into OpenShift Container Platform, they do so using a configured identity provider. This determines the user’s identity, and provides that information to OpenShift Container Platform.

OpenShift Container Platform then looks for a UserIdentityMapping for that Identity:

  • If the Identity already exists, but is not mapped to a User, login fails.
  • If the Identity already exists, and is mapped to a User, the user is given an OAuthAccessToken for the mapped User.
  • If the Identity does not exist, an Identity, User, and UserIdentityMapping are created, and the user is given an OAuthAccessToken for the mapped User.

Example 4.6. Identity Object Definition

kind: "Identity"
apiVersion: "v1"
metadata:
  name: "anypassword:bob" 1
  uid: "9316ebad-0fde-11e5-97a1-3c970e4b7ffe"
  resourceVersion: "1"
  creationTimestamp: "2015-01-01T01:01:01-00:00"
providerName: "anypassword" 2
providerUserName: "bob" 3
user:
  name: "bob" 4
  uid: "9311ac33-0fde-11e5-97a1-3c970e4b7ffe" 5
1
The identity name must be in the form providerName:providerUserName.
2
providerName is the name of the identity provider.
3
providerUserName is the name that uniquely represents this identity in the scope of the identity provider.
4
The name in the user parameter is the name of the user this identity maps to.
5
The uid represents the UID of the user this identity maps to.

4.10.8.2. User

A User represents an actor in the system. Users are granted permissions by adding roles to users or to their groups.

User objects are created automatically on first login, or can be created via the API.

Note

OpenShift Container Platform user names containing /, :, and % are not supported.

Example 4.7. User Object Definition

kind: "User"
apiVersion: "v1"
metadata:
  name: "bob" 1
  uid: "9311ac33-0fde-11e5-97a1-3c970e4b7ffe"
  resourceVersion: "1"
  creationTimestamp: "2015-01-01T01:01:01-00:00"
identities:
  - "anypassword:bob" 2
fullName: "Bob User" 3

<1> `name` is the user name used when adding roles to a user.
<2> The values in `identities` are Identity objects that map to this user. May be `null` or empty for users that cannot log in.
<3> The `fullName` value is an optional display name of user.

4.10.8.3. UserIdentityMapping

A UserIdentityMapping maps an Identity to a User.

Creating, updating, or deleting a UserIdentityMapping modifies the corresponding fields in the Identity and User objects.

An Identity can only map to a single User, so logging in as a particular identity unambiguously determines the User.

A User can have multiple identities mapped to it. This allows multiple login methods to identify the same User.

Example 4.8. UserIdentityMapping Object Definition

kind: "UserIdentityMapping"
apiVersion: "v1"
metadata:
  name: "anypassword:bob" 1
  uid: "9316ebad-0fde-11e5-97a1-3c970e4b7ffe"
  resourceVersion: "1"
identity:
  name: "anypassword:bob"
  uid: "9316ebad-0fde-11e5-97a1-3c970e4b7ffe"
user:
  name: "bob"
  uid: "9311ac33-0fde-11e5-97a1-3c970e4b7ffe"

<1> `*UserIdentityMapping*` name matches the mapped `*Identity*` name

4.10.8.4. Group

A Group represents a list of users in the system. Groups are granted permissions by adding roles to users or to their groups.

Example 4.9. Group Object Definition

kind: "Group"
apiVersion: "v1"
metadata:
  name: "developers" 1
  creationTimestamp: "2015-01-01T01:01:01-00:00"
users:
  - "bob" 2
1 1 1
name is the group name used when adding roles to a group.
2 2
The values in users are the names of User objects that are members of this group.


[1] After this point, device names refer to devices on container B’s host.
Red Hat logoGithubRedditYoutubeTwitter

Learn

Try, buy, & sell

Communities

About Red Hat Documentation

We help Red Hat users innovate and achieve their goals with our products and services with content they can trust.

Making open source more inclusive

Red Hat is committed to replacing problematic language in our code, documentation, and web properties. For more details, see the Red Hat Blog.

About Red Hat

We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

© 2024 Red Hat, Inc.