Chapter 14. Troubleshooting a service network
Typically, you can create a service network without referencing this troubleshooting guide. However, this guide provides some tips for situations when the service network does not perform as expected.
See Section 14.8, “Resolving common problems” if you have encountered a specific issue using the skupper CLI.
A typical troubleshooting workflow is to check all the sites and create debug tar files.
14.1. Checking sites
The skupper command-line interface (CLI) provides a simple way to get started with troubleshooting Skupper.
Procedure
Check the site status:
$ skupper status --namespace west
Skupper is enabled for namespace "west" in interior mode. It is connected to 2 other sites. It has 1 exposed services.
The output shows:
- A site exists in the specified namespace.
- A link exists to two other sites.
- A service is exposed on the service network and is accessible from this namespace.
Check the service network:
Note: If the output is not what you expected, you might want to check links before proceeding.
The output shows:
- There are 3 sites on the service network, vm-user-c3d98, east and west.
- Details for each site, for example the namespace names.
Check the status of services exposed on the service network (-v is only available on Kubernetes):
The output shows the backend service and the related target of that service.
Note: As part of the output, each site reports the status of the policy system on that cluster.
List the Skupper events for a site:
The output shows sites being linked and a service being exposed on a service network. However, this output is most useful when reporting an issue and is included in the Skupper debug tar file.
List the Kubernetes events for a site:
The output shows events relating to Kubernetes resources.
14.2. Checking links
You must link sites before you can expose services on the service network.
By default, tokens expire after 15 minutes and you can only use a token once. Generate a new token if the link is not connected. You can also generate permanent, reusable tokens using the --token-type cert option.
This section outlines some advanced options for checking links.
Check the link status:
$ skupper link status --namespace east
Links created from this site:
-------------------------------
Link link1 is connected
A link exists from the specified site to another site, meaning a token from another site was applied to the specified site.
Note: Running skupper link status on a connected site produces output only if a token was used to create a link.
If you use this command on a site where you did not create the link, but there is an incoming link to the site:
Check the verbose link status:
The output shows detail about the link, including a timestamp of when the link was created and the associated relative cost of using the link.
The status of the link must be Connected to allow service traffic.
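The check above can be scripted. The following is a minimal sketch, assuming each link is reported on a line of the form "Link <name> is connected" as in the sample output; the function name links_not_connected, the link name link2 and the wording of its unconnected state are hypothetical.

```shell
# links_not_connected: read `skupper link status` output on stdin and print the
# name of every "Link ..." line that does not end in "is connected".
# Assumption: the line layout matches the sample output in this section.
links_not_connected() {
  awk '$1 == "Link" && !/ is connected[[:space:]]*$/ { print $2 }'
}

# Sample text only; link2 and its state are made up for illustration.
sample='Links created from this site:
-------------------------------
Link link1 is connected
Link link2 not connected'
printf '%s\n' "$sample" | links_not_connected
```

On a real site you might pipe the command output instead, for example skupper link status --namespace east | links_not_connected. An empty result suggests every listed link is connected.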
14.3. Checking gateways
By default, skupper gateway creates a service type gateway and these gateways run properly after a machine restart.
However, if you create a docker or podman type gateway, check that the container is running after a machine restart. For example:
Check the status of Skupper gateways:
This shows a podman type gateway.
Check that the container is running:
$ podman ps
CONTAINER ID  IMAGE                               COMMAND               CREATED         STATUS             PORTS  NAMES
4e308ef8ee58  quay.io/skupper/skupper-router:1.9  /home/skrouterd/b...  26 seconds ago  Up 27 seconds ago         machine-user
This shows the container running.
Note: To view stopped containers, use podman ps -a or docker ps -a.
Start the container if necessary:
$ podman start machine-user
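After a restart, the two steps above (check, then start if needed) can be combined. This is a rough sketch rather than a supported procedure: the helper name container_is_running is hypothetical, and it assumes the gateway container is named machine-user as in the example.

```shell
# container_is_running NAME: succeed if NAME appears, as a whole line, in the
# list of running container names read from stdin.
container_is_running() {
  grep -Fxq -- "$1"
}

# Intended use on the gateway host (shown as a comment, not executed here):
#   podman ps --format '{{.Names}}' | container_is_running machine-user \
#     || podman start machine-user
```

Matching the whole line avoids treating a container such as machine-user-2 as the gateway container.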
14.4. Checking policies
As a developer you might not be aware of the Skupper policy applied to your site. Follow this procedure to explore the policies applied to the site.
Procedure
- Log in to a namespace where a Skupper site has been initialized.
Check whether incoming links are permitted:
$ kubectl exec deploy/skupper-service-controller -- get policies incominglink
ALLOWED POLICY ENABLED ERROR                                                   ALLOWED BY
false   true           Policy validation error: incoming links are not allowed
In this example incoming links are not allowed by policy.
Check other policies:
As shown, there are commands to check each policy type by specifying what you want to do, for example, to check if you can expose an nginx deployment:
$ kubectl exec deploy/skupper-service-controller -- get policies expose deployment nginx
ALLOWED POLICY ENABLED ERROR                                                       ALLOWED BY
false   true           Policy validation error: deployment/nginx cannot be exposed
If you allowed an nginx deployment, the same command shows that the resource is allowed and displays the name of the policy CR that enabled it:
$ kubectl exec deploy/skupper-service-controller -- get policies expose deployment nginx
ALLOWED POLICY ENABLED ERROR ALLOWED BY
true    true                 allowedexposedresources
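If you want to use a policy check in a script, one option is to read the ALLOWED column of the first data row. The sketch below assumes the two-line layout shown in the examples above (a header row followed by one data row); the function name policy_allowed is hypothetical.

```shell
# policy_allowed: read `get policies ...` output on stdin and succeed only if
# the first field of the second line (the ALLOWED column) is "true".
# Assumption: the output has one header row and one data row, as shown above.
policy_allowed() {
  awk 'NR == 2 { found = ($1 == "true") } END { exit !found }'
}

# Intended use (shown as a comment, not executed here):
#   kubectl exec deploy/skupper-service-controller -- \
#     get policies expose deployment nginx | policy_allowed && echo "allowed"
```

The function fails when the data row is missing, so empty output is treated as not allowed.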
14.5. Creating a Skupper debug tar file
The debug tar file contains all the logs from the Skupper components for a site and provides detailed information to help debug issues.
Create the debug tar file:
$ skupper debug dump my-site
Skupper dump details written to compressed archive: `my-site.tar.gz`
You can expand the archive with the tar command.
These files are mainly useful when requesting support for Skupper, but you can also check some items yourself:
- versions: See *versions.txt for the versions of various components.
- ingress: See skupper-site-configmap.yaml to determine the ingress type for the site.
- linking and services: See the skupper-service-controller-*-events.txt file to view details of token usage and service exposure.
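To locate the files listed above after extraction, something like the following might help. It is a sketch under assumptions: the helper name list_dump_files and the destination directory my-site-dump are hypothetical, and the archive is assumed to be a gzip-compressed tar file, as the .tar.gz name suggests.

```shell
# list_dump_files ARCHIVE DEST: extract a debug archive into DEST and print the
# paths of the version, site ConfigMap and service controller event files.
list_dump_files() {
  mkdir -p "$2" || return 1
  tar -xzf "$1" -C "$2" || return 1
  find "$2" -type f \( -name '*versions.txt' \
    -o -name 'skupper-site-configmap.yaml' \
    -o -name 'skupper-service-controller-*-events.txt' \) | sort
}

# Intended use (shown as a comment, not executed here):
#   list_dump_files my-site.tar.gz my-site-dump
```

Extracting into a separate directory keeps the dump contents apart from other files in the working directory.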
14.6. Understanding Skupper sizing
In September 2023, a number of tests were performed to explore Skupper performance at varying allocations of router CPU. You can view the results in the sizing guide.
The conclusions for router CPU and memory are shown below.
Router CPU
The primary factor to consider when scaling Skupper for your workload is router CPU. (Note that due to the nature of cluster ingress and connection routing, it is important to focus on scaling the router vertically, not horizontally.)
Two CPU cores (2,000 millicores) per router is a good starting point. It includes some headroom and provides low latencies for a large set of workloads.
If the peak throughput required by your workload is low, it is possible to achieve satisfactory latencies with less router CPU.
Some workloads are sensitive to network latency. In these cases, the overhead introduced by the router can limit the achievable throughput. This is when CPU amounts higher than two cores per router may be required.
On the flip side, some workloads are tolerant of network latency. In these cases, one core or less may be sufficient.
These benchmark results are not the last word. They depend on the specifics of our test environment. To get a better idea of how Skupper performs in your environment, you can run these benchmarks yourself.
Router memory
Router memory use scales with the number of open connections. In general, a good starting point is 4G.
Memory | Concurrent open connections
512M | 8,192
1G | 16,384
2G | 32,768
4G | 65,536
8G | 131,072
16G | 262,144
32G | 524,288
64G | 1,048,576
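The table doubles memory and connection count together, which works out to about 64 KiB per open connection if M and G are read as binary units (an assumption; the table does not say). A small sketch that picks the smallest listed allocation for a target connection count, with the hypothetical function name router_memory_mib:

```shell
# router_memory_mib CONNECTIONS: print the smallest memory allocation from the
# table, in MiB, whose connection count covers CONNECTIONS.
# Assumption: 1G = 1024M. The result is capped at the largest row (64G).
router_memory_mib() {
  conns=$1
  mem=512     # smallest row in the table, in MiB
  cap=8192    # concurrent open connections listed for that row
  while [ "$cap" -lt "$conns" ] && [ "$mem" -lt 65536 ]; do
    mem=$((mem * 2))
    cap=$((cap * 2))
  done
  echo "$mem"
}

router_memory_mib 50000   # falls in the 4G row (65,536 connections)
```

This only restates the table as arithmetic; it does not account for anything else that might affect router memory in your environment.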
14.7. Improving Skupper router performance
If you encounter Skupper router performance issues, you can scale the Skupper router to address those concerns.
Currently, you must delete and recreate a site to reconfigure the Skupper router.
For example, use this procedure to increase throughput and, if you have many clients, to improve latency.
Delete your site or create a new site in a different namespace.
Record all configuration, then delete your existing site:
$ skupper delete
As an alternative, you can create a new namespace and configure a new site with optimized Skupper router performance. After validating the performance improvement, you can delete and recreate your original site.
Create a site with optimal performance CPU settings:
$ skupper init --router-cpu 5
Recreate your configuration from step 1, recreating links and services.
While you can address availability concerns by scaling the number of routers, typically this is not necessary.
14.8. Resolving common problems
The following issues and workarounds might help you debug simple scenarios when evaluating Skupper.
Cannot initialize skupper
If the skupper init command fails, consider the following options:
Check the load balancer.
If you are evaluating Skupper on minikube, use the following command to create a load balancer:
$ minikube tunnel
For other Kubernetes flavors, see the documentation from your provider.
Initialize without ingress.
This option prevents other sites from linking to this site, but linking outwards is supported. Once a link is established, traffic can flow in either direction. Enter the following command:
$ skupper init --ingress none
Note: See the Skupper Podman CLI reference documentation for skupper init.
Cannot link sites
To link two sites, one site must be accessible from the other site. For example, if one site is behind a firewall and the other site is on an AWS cluster, you must:
- Create a token on the AWS cluster site.
- Create the link on the site inside the firewall.
By default, a token is valid for only 15 minutes and can only be used once. See Using Skupper tokens for more information on creating different types of tokens.
Cannot access Skupper console
Starting with Skupper release 1.3, the console is not enabled by default. To use the new console, see Using the console.
Use skupper status to find the console URL.
Use the following command to display the password for the admin user:
$ kubectl get secret/skupper-console-users -o jsonpath={.data.admin} | base64 -d
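The base64 -d step is needed because Kubernetes stores Secret values base64-encoded, so the jsonpath query alone prints the encoded form. A minimal illustration of the decode step, using a made-up sample value and the hypothetical helper name decode_secret_value:

```shell
# decode_secret_value: decode one base64-encoded Secret value read from stdin.
decode_secret_value() {
  base64 -d
}

# "c2VjcmV0" is a made-up sample value, not a real console password.
printf '%s' 'c2VjcmV0' | decode_secret_value
```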
Cannot create a token for linking clusters
There are several reasons why you might have difficulty creating tokens:
- Site not ready
After creating a site, you might see the following message when creating a token:
Error: Failed to create token: Policy validation error: Skupper is not enabled in namespace
Use skupper status to verify the site is working and try to create the token again.
- No ingress
You might see the following note after using the skupper token create command:
Token written to <path> (Note: token will only be valid for local cluster)
This output indicates that the site was deployed without an ingress option, for example skupper init --ingress none. You must specify an ingress to allow sites on other clusters to link to your site.
You can also use the skupper token create command to check if an ingress was specified when the site was created.