
Chapter 13. Troubleshooting a service network


Typically, you can create a service network without referencing this troubleshooting guide. However, this guide provides some tips for situations when the service network does not perform as expected.

See Section 13.8, “Resolving common problems” if you have encountered a specific issue using the skupper CLI.

A typical troubleshooting workflow is to check all the sites and create debug tar files.

13.1. Checking sites

The skupper command-line interface (CLI) provides a simple way to start troubleshooting Skupper.

Procedure

  1. Check the site status:

    $ skupper status --namespace west
    
    Skupper is enabled for namespace "west" in interior mode. It is connected to 2 other sites. It has 1 exposed services.

    The output shows:

    • A site exists in the specified namespace.
    • Links exist to two other sites.
    • A service is exposed on the service network and is accessible from this namespace.
  2. Check the service network:

    $ skupper network status
    Sites:
    ├─ [local] a960b766-20bd-42c8-886d-741f3a9f6aa2(west)
    │  │ namespace: west
    │  │ site name: west
    │  │ version: 1.5.1
    │  ╰─ Linked sites:
    │     ├─ 496ca1de-0c80-4e70-bbb4-d0d6ec2a09c0(east)
    │     │  direction: outgoing
    │     ╰─ 484cccc3-401c-4c30-a6ed-73382701b18a()
    │        direction: incoming
    ├─ [remote] 496ca1de-0c80-4e70-bbb4-d0d6ec2a09c0(east)
    │  │ namespace: east
    │  │ site name: east
    │  │ version: 1.5.1
    │  ╰─ Linked sites:
    │     ╰─ a960b766-20bd-42c8-886d-741f3a9f6aa2(west)
    │        direction: incoming
    ╰─ [remote] 484cccc3-401c-4c30-a6ed-73382701b18a()
       │ site name: vm-user-c3d98
       │ version: 1.5.1
       ╰─ Linked sites:
          ╰─ a960b766-20bd-42c8-886d-741f3a9f6aa2(west)
             direction: outgoing
    Note

    If the output is not what you expected, you might want to check links before proceeding.

    The output shows:

    • There are 3 sites on the service network, vm-user-c3d98, east and west.
    • Details for each site, for example the namespace names.
  3. Check the status of services exposed on the service network (-v is only available on Kubernetes):

    $ skupper service status -v
    Services exposed through Skupper:
    ╰─ backend:8080 (tcp)
       ╰─ Sites:
          ├─ 4d80f485-52fb-4d84-b10b-326b96e723b2(west)
          │  policy: disabled
          ╰─ 316fbe31-299b-490b-9391-7b46507d76f1(east)
             │ policy: disabled
             ╰─ Targets:
                ╰─ backend:8080 name=backend-9d84544df-rbzjx

    The output shows the backend service and the related target of that service.

    Note

    As part of the output, each site reports the status of the policy system on that cluster.

  4. List the Skupper events for a site:

    $ skupper debug events
    NAME                         COUNT                                                          AGE
    GatewayQueryRequest          3                                                              9m12s
                                 3     gateway request                                          9m12s
    SiteQueryRequest             3                                                              9m12s
                                 3     site data request                                        9m12s
    ServiceControllerEvent       9                                                              10m24s
                                 2     service event for west/frontend                          10m24s
                                 1     service event for west/backend                           10m26s
                                 1     Checking service for: backend                            10m26s
                                 2     Service definitions have changed                         10m26s
                                 1     service event for west/skupper-router                    11m4s
    DefinitionMonitorEvent       15                                                             10m24s
                                 2     service event for west/frontend                          10m24s
                                 1     service event for west/backend                           10m26s
                                 1     Service definitions have changed                         10m26s
                                 5     deployment event for west/frontend                       10m34s
                                 1     deployment event for west/skupper-service-controller     11m4s
    ServiceControllerUpdateEvent 1                                                              10m26s
                                 1     Updating skupper-internal                                10m26s
    ServiceSyncEvent             3                                                              10m26s
                                 1     Service interface(s) added backend                       10m26s
                                 1     Service sync sender connection to                        11m4s
                                       amqps://skupper-router-local.west.svc.cluster.local:5671
                                       established
                                 1     Service sync receiver connection to                      11m4s
                                       amqps://skupper-router-local.west.svc.cluster.local:5671
                                       established
    IpMappingEvent               5                                                              10m34s
                                 1     172.17.0.7 mapped to frontend-6b4688bf56-rp9hc           10m34s
                                 2      mapped to frontend-6b4688bf56-rp9hc                     10m54s
                                 1     172.17.0.4 mapped to                                     11m4s
                                       skupper-service-controller-6c97c5cf5d-6nzph
                                 1     172.17.0.3 mapped to skupper-router-547dffdcbf-l8pdc     11m4s
    TokenClaimVerification       1                                                              10m59s
                                 1     Claim for efe3a241-3e4f-11ed-95d0-482ae336eb38 succeeded 10m59s

    The output shows sites being linked and a service being exposed on a service network. However, this output is most useful when reporting an issue and is included in the Skupper debug tar file.

  5. List the Kubernetes events for a site:

    $ kubectl get events | grep "deployment/skupper-service-controller"
    10m         Normal    ServiceSyncEvent               deployment/skupper-service-controller   Service sync receiver connection to amqps://skupper-router-local.private1.svc.cluster.local:5671 established
    10m         Normal    ServiceSyncEvent               deployment/skupper-service-controller   Service sync sender connection to amqps://skupper-router-local.private1.svc.cluster.local:5671 established
    10m         Normal    ServiceControllerCreateEvent   deployment/skupper-service-controller   Creating service productcatalogservice
    7m59s       Normal    TokenHandler                   deployment/skupper-service-controller   Connecting using token link1
    7m54s       Normal    TokenHandler                   deployment/skupper-service-controller   Connecting using token link2

    The output shows events relating to Kubernetes resources.
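The first check in this procedure can be scripted when you have several sites to inspect. The following is a minimal sketch, not part of the skupper CLI: the function names linked_site_count and check_site are hypothetical, and the parsing assumes the sentence format shown in step 1 ("It is connected to 2 other sites"), which might differ between Skupper versions.

```shell
#!/bin/sh
# Sketch only: parses the human-readable sentence printed by `skupper status`.
# The sentence format is an assumption taken from the example output above.

# linked_site_count reads status text on stdin and prints the number of
# linked sites. It returns 1 if the expected sentence is not found.
linked_site_count() {
    count=$(sed -n 's/.*connected to \([0-9][0-9]*\) other site.*/\1/p' | head -n 1)
    [ -n "$count" ] || return 1
    printf '%s\n' "$count"
}

# check_site NAMESPACE EXPECTED runs the status command from step 1 and
# compares the reported number of linked sites with EXPECTED.
check_site() {
    ns=$1
    expected=$2
    if ! actual=$(skupper status --namespace "$ns" | linked_site_count); then
        echo "$ns: no link count found in skupper status output" >&2
        return 2
    fi
    if [ "$actual" -eq "$expected" ]; then
        echo "$ns: linked to $actual other sites, as expected"
    else
        echo "$ns: linked to $actual other sites, expected $expected" >&2
        return 1
    fi
}
```

For example, check_site west 2 succeeds for the site shown in step 1. If the count is lower than expected, check the links before looking at services.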


13.3. Checking gateways

By default, skupper gateway creates a service-type gateway, and these gateways run properly after a machine restart.

However, if you create a docker or podman type gateway, check that the container is running after a machine restart. For example:

  1. Check the status of Skupper gateways:

    $ skupper gateway status
    
    Gateway Definition:
    ╰─ machine-user type:podman version:1.5
       ╰─ Bindings:
          ╰─ mydb:3306 tcp mydb:3306 localhost 3306

    This shows a podman type gateway.

  2. Check that the container is running:

    $ podman ps
    CONTAINER ID  IMAGE                                           COMMAND               CREATED         STATUS             PORTS                   NAMES
    4e308ef8ee58  quay.io/skupper/skupper-router:1.5             /home/skrouterd/b...  26 seconds ago  Up 27 seconds ago                          machine-user

    This shows the container running.

    Note

    To view stopped containers, use podman ps -a or docker ps -a.

  3. Start the container if necessary:

    $ podman start machine-user
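Steps 2 and 3 can be combined into one check that you run after a restart. This is a sketch under assumptions: ensure_gateway_running is a hypothetical helper, the container name matches the gateway name as in the example above, and the runtime accepts ps --format '{{.Names}}' (both podman and docker do).

```shell
#!/bin/sh
# Sketch only: start a podman or docker type gateway container if it is
# not already running. The helper name is hypothetical.

# ensure_gateway_running NAME [RUNTIME]
#   NAME     container name, for example machine-user
#   RUNTIME  podman (default) or docker
ensure_gateway_running() {
    name=$1
    runtime=${2:-podman}
    # List the names of running containers, one per line, and look for an
    # exact, whole-line match.
    if "$runtime" ps --format '{{.Names}}' | grep -Fqx -- "$name"; then
        echo "$name is running"
        return 0
    fi
    echo "$name is not running; starting it"
    "$runtime" start "$name"
}
```

For example, ensure_gateway_running machine-user checks the gateway shown above and starts it only when needed.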

13.4. Checking policies

As a developer you might not be aware of the Skupper policy applied to your site. Follow this procedure to explore the policies applied to the site.

Procedure

  1. Log into a namespace where a Skupper site has been initialized.
  2. Check whether incoming links are permitted:

    $ kubectl exec deploy/skupper-service-controller -- get policies incominglink
    
    ALLOWED POLICY ENABLED ERROR                                                   ALLOWED BY
    false   true           Policy validation error: incoming links are not allowed

    In this example incoming links are not allowed by policy.

  3. Check other policies:

    $ kubectl exec deploy/skupper-service-controller -- get policies
    Validates existing policies
    
    Usage:
      get policies [command]
    
    Available Commands:
      expose       Validates if the given resource can be exposed
      incominglink Validates if incoming links can be created
      outgoinglink Validates if an outgoing link to the given hostname is allowed
      service      Validates if service can be created or imported

    As shown, there are commands to check each policy type by specifying what you want to do, for example, to check if you can expose an nginx deployment:

    $ kubectl exec deploy/skupper-service-controller -- get policies expose deployment nginx
    ALLOWED POLICY ENABLED ERROR                                                       ALLOWED BY
    false   true           Policy validation error: deployment/nginx cannot be exposed

    If you allowed an nginx deployment, the same command shows that the resource is allowed and displays the name of the policy CR that enabled it:

    $ kubectl exec deploy/skupper-service-controller -- get policies expose deployment nginx
    ALLOWED POLICY ENABLED ERROR                                                       ALLOWED BY
    true    true                                                                       allowedexposedresources
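If you want to use these policy checks in a script, you can read the ALLOWED column of the output. The following sketch assumes the two-line table format shown above (a header row starting with ALLOWED, then one data row). The function names policy_allowed and can_expose are hypothetical.

```shell
#!/bin/sh
# Sketch only: interpret the table printed by `get policies ...`.
# Assumes the header row starts with ALLOWED and is followed by one data row.

# policy_allowed reads the command output on stdin.
# Exit status: 0 if ALLOWED is true, 1 if it is not, 2 if no data row is found.
policy_allowed() {
    awk '
        NF == 0 || $1 == "ALLOWED" { next }          # skip blank lines and the header
        { found = 1; exit (($1 == "true") ? 0 : 1) } # first data row decides
        END { if (!found) exit 2 }
    '
}

# can_expose TYPE NAME wraps the expose check from this procedure,
# for example: can_expose deployment nginx
can_expose() {
    kubectl exec deploy/skupper-service-controller -- get policies expose "$1" "$2" | policy_allowed
}
```

For example, can_expose deployment nginx returns a non-zero status for the first nginx example above and zero for the second.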

13.5. Creating a Skupper debug tar file

The debug tar file contains all the logs from the Skupper components for a site and provides detailed information to help debug issues.

  1. Create the debug tar file:

    $ skupper debug dump my-site
    
    Skupper dump details written to compressed archive:  `my-site.tar.gz`
  2. You can expand the file using the following command:

    $ tar -xvf my-site.tar.gz
    
    k8s-versions.txt
    skupper-versions.txt
    skupper-router-deployment.yaml
    skupper-router-867f5ddcd8-plrcg-skstat-g.txt
    skupper-router-867f5ddcd8-plrcg-skstat-c.txt
    skupper-router-867f5ddcd8-plrcg-skstat-l.txt
    skupper-router-867f5ddcd8-plrcg-skstat-n.txt
    skupper-router-867f5ddcd8-plrcg-skstat-e.txt
    skupper-router-867f5ddcd8-plrcg-skstat-a.txt
    skupper-router-867f5ddcd8-plrcg-skstat-m.txt
    skupper-router-867f5ddcd8-plrcg-skstat-p.txt
    skupper-router-867f5ddcd8-plrcg-router-logs.txt
    skupper-router-867f5ddcd8-plrcg-config-sync-logs.txt
    skupper-service-controller-deployment.yaml
    skupper-service-controller-7485756984-gvrf6-events.txt
    skupper-service-controller-7485756984-gvrf6-service-controller-logs.txt
    skupper-site-configmap.yaml
    skupper-services-configmap.yaml
    skupper-internal-configmap.yaml
    skupper-sasl-config-configmap.yaml

    These files are intended to help with Skupper support requests; however, you can check some items yourself:

    versions
    See *versions.txt for the versions of various components.
    ingress
    See skupper-site-configmap.yaml to determine the ingress type for the site.
    linking and services
    See the skupper-service-controller-*-events.txt file to view details of token usage and service exposure.
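To find the files named in this checklist without extracting the whole archive, you can filter the archive listing. This is a minimal sketch: summarize_dump is a hypothetical helper, and the file name patterns are taken from the example listing above, so they might not match every Skupper version.

```shell
#!/bin/sh
# Sketch only: list the debug dump entries that the checklist above refers to.

# summarize_dump ARCHIVE prints the names of the version files, the site
# ConfigMap, and the service controller events file found in ARCHIVE.
summarize_dump() {
    archive=$1
    # -t lists entries without extracting them; -z reads a gzip archive.
    tar -tzf "$archive" |
        grep -E -- '(versions\.txt|skupper-site-configmap\.yaml|-events\.txt)$'
}
```

For example, summarize_dump my-site.tar.gz prints the four matching file names from the listing above.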

13.6. Understanding Skupper sizing

In September 2023, a number of tests were performed to explore Skupper performance at varying allocations of router CPU. You can view the results in the sizing guide.

The conclusions for router CPU and memory are shown below.

Router CPU

The primary factor to consider when scaling Skupper for your workload is router CPU. (Note that due to the nature of cluster ingress and connection routing, it is important to focus on scaling the router vertically, not horizontally.)

Two CPU cores (2,000 millicores) per router is a good starting point. It includes some headroom and provides low latencies for a large set of workloads.

If the peak throughput required by your workload is low, it is possible to achieve satisfactory latencies with less router CPU.

Some workloads are sensitive to network latency. In these cases, the overhead introduced by the router can limit the achievable throughput. This is when CPU amounts higher than two cores per router may be required.

On the flip side, some workloads are tolerant of network latency. In these cases, one core or less may be sufficient.

These benchmark results are not the last word. They depend on the specifics of our test environment. To get a better idea of how Skupper performs in your environment, you can run these benchmarks yourself.

Router memory

Router memory use scales with the number of open connections. In general, a good starting point is 4G.

Memory    Concurrent open connections
512M      8,192
1G        16,384
2G        32,768
4G        65,536
8G        131,072
16G       262,144
32G       524,288
64G       1,048,576
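The memory figures double along with the connection count, which works out to 64K of router memory per open connection (512M divided by 8,192). As a worked example of reading the table, the following sketch picks the smallest listed allocation that covers a given number of concurrent connections. The function name router_memory_for is hypothetical, and the result is only a starting point for your own testing.

```shell
#!/bin/sh
# Sketch only: choose the smallest memory tier from the sizing table that
# covers the requested number of concurrent open connections.

# router_memory_for CONNECTIONS prints a tier such as 512M or 4G.
# It returns 1 if CONNECTIONS exceeds the largest tier in the table (64G).
router_memory_for() {
    want=$1
    mem_mb=512     # first row of the table: 512M
    conns=8192     # covers 8,192 connections
    while [ "$conns" -lt "$want" ]; do
        if [ "$mem_mb" -ge 65536 ]; then
            echo "more than 64G: beyond the published table" >&2
            return 1
        fi
        # Each following row doubles both memory and connections.
        mem_mb=$((mem_mb * 2))
        conns=$((conns * 2))
    done
    if [ "$mem_mb" -ge 1024 ]; then
        echo "$((mem_mb / 1024))G"
    else
        echo "${mem_mb}M"
    fi
}
```

For example, router_memory_for 50000 prints 4G, because 32,768 connections (2G) is too few and 65,536 (4G) is the next row.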

 

13.7. Improving Skupper router performance

If you encounter Skupper router performance issues, you can scale the Skupper router to address those concerns.

Note

Currently, you must delete and recreate a site to reconfigure the Skupper router.

For example, use this procedure to increase throughput and, if you have many clients, to reduce latency.

  1. Delete your site or create a new site in a different namespace.

    Note all configuration and delete your existing site:

    $ skupper delete

    As an alternative, you can create a new namespace and configure a new site with optimized Skupper router performance. After validating the performance improvement, you can delete and recreate your original site.

  2. Create a site with optimal performance CPU settings:

    $ skupper init --router-cpu 5
  3. Recreate your configuration from step 1, recreating links and services.
Note

While you can address availability concerns by scaling the number of routers, typically this is not necessary.

13.8. Resolving common problems

The following issues and workarounds might help you debug simple scenarios when evaluating Skupper.

Cannot initialize skupper

If the skupper init command fails, consider the following options:

  • Check the load balancer.

    If you are evaluating Skupper on minikube, use the following command to create a load balancer:

    $ minikube tunnel

    For other Kubernetes flavors, see the documentation from your provider.

  • Initialize without ingress.

    This option prevents other sites from linking to this site, but linking outwards is supported. Once a link is established, traffic can flow in either direction. Enter the following command:

    $ skupper init --ingress none
    Note

    See the Skupper Podman CLI reference documentation for skupper init.

Cannot link sites

To link two sites, one site must be accessible from the other site. For example, if one site is behind a firewall and the other site is on an AWS cluster, you must:

  1. Create a token on the AWS cluster site.
  2. Create the link on the site inside the firewall.
Note

By default, a token is valid for only 15 minutes and can only be used once. See Using Skupper tokens for more information on creating different types of tokens.

Cannot access Skupper console

Starting with Skupper release 1.3, the console is not enabled by default. To use the new console, see Using the console.

Use skupper status to find the console URL.

Use the following command to display the password for the admin user:

$ kubectl get secret/skupper-console-users -o jsonpath='{.data.admin}' | base64 -d

Cannot create a token for linking clusters

There are several reasons why you might have difficulty creating tokens:

Site not ready

After creating a site, you might see the following message when creating a token:

Error: Failed to create token: Policy validation error: Skupper is not enabled in namespace

Use skupper status to verify the site is working and try to create the token again.

No ingress

You might see the following note after using the skupper token create command:

Token written to <path> (Note: token will only be valid for local cluster)

This output indicates that the site was deployed without an ingress option, for example with skupper init --ingress none. You must specify an ingress to allow sites on other clusters to link to your site.

You can also use the skupper token create command to check if an ingress was specified when the site was created.
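If you create tokens from a script, you can detect the local-only note automatically. The following sketch assumes the note text shown above and that skupper token create takes the output path as its argument. The function names token_is_local_only and create_token_checked are hypothetical.

```shell
#!/bin/sh
# Sketch only: warn when a newly created token is valid only inside the
# local cluster. Relies on the note text quoted in this section.

# token_is_local_only reads command output on stdin and succeeds if it
# contains the local-cluster note.
token_is_local_only() {
    grep -q 'token will only be valid for local cluster'
}

# create_token_checked PATH creates a token and inspects the output.
# Exit status: 0 token usable from other clusters, 1 local-only token,
# 2 token creation failed (for example, the site is not ready).
create_token_checked() {
    path=$1
    if ! output=$(skupper token create "$path" 2>&1); then
        printf '%s\n' "$output" >&2
        return 2
    fi
    if printf '%s\n' "$output" | token_is_local_only; then
        echo "site has no ingress: $path works only inside this cluster" >&2
        return 1
    fi
    printf '%s\n' "$output"
}
```

A status of 1 suggests that the site was created without an ingress. A status of 2 covers failures such as the "Site not ready" error described earlier.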

13.9. Deleting services from the service network

This section describes how services can be disabled for a service network.

Prerequisites

  • A service network
  • An exposed service

Procedure

  1. Navigate to the context where the service was exposed.
  2. Delete the service:

    $ skupper service delete <service-name>

    where service-name is the name of the service you want to remove.

    Note

    TCP connections can remain active for an extended duration. After you delete the service, existing connections continue to communicate until the TCP connection is terminated. For example, if you exposed a database over the service network and then delete the service, new database connections cannot be established, but existing connections are not affected. To terminate an existing connection, manually stop the database connection.

  3. Check that the service is removed.

    $ skupper service status

    The service should not be listed.

    Note

    Typically, if the service is still listed, it is because you issued the delete command from the wrong site context. By default, when you expose a service from a site, that service becomes available on all other sites; however, you can delete the service only from the original site context.
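The check in step 3 can be scripted as follows. This is a sketch, not a skupper feature: service_listed and confirm_service_deleted are hypothetical helpers, and the match assumes that each service appears in the skupper service status output as name:port, as in the backend:8080 example earlier in this chapter.

```shell
#!/bin/sh
# Sketch only: confirm that a service no longer appears on the service network.

# service_listed NAME succeeds if `skupper service status` shows NAME:<port>.
service_listed() {
    skupper service status | grep -q -- " $1:[0-9]"
}

# confirm_service_deleted NAME reports whether the deletion took effect.
confirm_service_deleted() {
    if service_listed "$1"; then
        echo "$1 is still listed; run the delete from the site that exposed it" >&2
        return 1
    fi
    echo "$1 is no longer listed"
}
```

For example, run confirm_service_deleted backend after skupper service delete backend.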
