Chapter 2. Scaling the tested deployment models
You can scale your Ansible Automation Platform tested deployment models to meet your workload requirements. This document refers to the following scaling methods:
- Vertical scaling
- Horizontal scaling
2.1. Vertical scaling for performance
Vertical scaling increases the physical resources available to a service, including the CPU, memory, disk volume, and disk Input/Output Operations per Second (IOPS). Use vertical scaling for deployments with high resource utilization or workload demand.
2.1.1. Benefits of vertical scaling
- Relieves resource contention: Applications gain access to more resources, which can relieve resource contention or exhaustion.
2.1.2. Limitations of vertical scaling
- Extensive testing required: The installer attempts to tune application and system configurations to leverage additional resources, but not all components of the application automatically scale in relation to machine size. Manually tuning each variable requires extensive testing. For this reason, after an instance size has been verified for an environment, horizontal scaling by adding more instances of the same size is recommended.
- Application-level limitations: For VM-based or containerized deployments, instances with more than 64 CPU cores and 128 GB of RAM might not scale linearly due to system- and application-level limits.
- Resource overcommit: Overcommitting virtual machine resources (for example, allocating more virtual CPU/RAM to guests than is physically available on the host) leads to unpredictable performance.
- CPU throttling: In OpenShift Container Platform, setting a CPU `limit` without an equivalent `request` can lead to CPU throttling, even if the node has spare CPU capacity. This throttling negatively impacts API latency.
  - To mitigate this, always set CPU `requests` equal to CPU `limits`.
  - Monitor CPU throttling using the `container_cpu_cfs_throttled_seconds_total` metric.
- Database limitations: Scaling the application increases the maximum potential number of database connections from worker processes, and the overall memory, I/O, and CPU utilization of the PostgreSQL instance. As you scale past the tested deployment models, deploy a separate PostgreSQL instance for each component (platform gateway, Event-Driven Ansible, automation controller, automation hub).
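The requests-equal-to-limits guidance above can be sketched as a container resource stanza. This is an illustrative fragment, not a tuned recommendation; the CPU and memory values are placeholders:

```yaml
# Illustrative container resources fragment for an OpenShift pod spec.
# Setting cpu requests equal to cpu limits avoids CFS throttling below
# the limit and yields predictable scheduling (values are examples only).
resources:
  requests:
    cpu: "2"        # equal to the limit to prevent throttling
    memory: 4Gi
  limits:
    cpu: "2"
    memory: 4Gi
```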
2.2. Horizontal scaling for performance
Horizontal scaling involves increasing the number of replicas (pods or virtual machines) for a given service. Like vertical scaling, use this approach for deployments with high resource utilization or growing workload demand.
2.2.1. Benefits of horizontal scaling
- Improved availability: Distributes load across more instances to reduce the impact of a single slow or failing node.
- Redundancy: Provides extra capacity, allowing individual service nodes to recover or cool-off without impacting overall availability.
- Increased authentication capacity: Each platform gateway pod includes its own authentication service, so scaling the platform gateway directly increases the platform’s authentication throughput.
- Repeatable scaling procedure: After the instance size and configuration are verified for your environment, deploy identical instances to scale.
2.2.2. Limitations of horizontal scaling
- Database limitations: Scaling the application increases the maximum potential number of database connections from worker processes, and the overall memory, I/O, and CPU utilization of the PostgreSQL instance. As you scale past the tested deployment models, deploy a separate PostgreSQL instance for each component (platform gateway, Event-Driven Ansible, automation controller, automation hub).
- Health check overhead: In a mesh architecture, each Envoy proxy sends health checks to every other cluster member. Horizontal scaling increases this baseline traffic, adding to system overhead.
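On OpenShift Container Platform, horizontal scaling of this kind is typically expressed as a replica count on the component's custom resource. A minimal sketch, assuming an AutomationController custom resource with a `replicas` field; field names and the API version may differ across operator releases, so verify against your installed operator:

```yaml
# Hypothetical AutomationController custom resource fragment:
# raising replicas distributes load across more identical pods.
apiVersion: automationcontroller.ansible.com/v1beta1
kind: AutomationController
metadata:
  name: controller
spec:
  replicas: 3   # scale from 1 to 3 pods of the same size
```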
2.3. Scaling and management operations by installer type
Considerations for scaling and managing each Ansible Automation Platform component differ based on the installer type. The following table provides information on scaling and management operations for each installer type, as well as other common operations to consider when planning your deployment:
Red Hat OpenShift Container Platform-based deployments provide the most flexibility and customizability, which enables you to adapt the deployment as needs change. OpenShift Container Platform also provides fine-grained observability through its built-in metrics capabilities and integrations, with log aggregation tools that capture all logs from all pods in the Ansible Automation Platform deployment.
| Task | OpenShift Container Platform | VM-based installation | Containerized Deployments (Podman based) |
|---|---|---|---|
| Horizontally scale up | Scale control, execution, automation hub, and Event-Driven Ansible components independently by adjusting replicas. Expanding total capacity by adding worker nodes to OpenShift Container Platform does not disrupt Ansible Automation Platform operation. | Requires updating inventory file and re-running entire installation, which restarts and requires halting use of the platform. | Requires updating inventory file and re-running entire installation, which restarts and requires halting use of the platform. |
| Horizontally scale down | Reduces replicas handled by the operator. For scaling down automation controller task pods, usage of … | Requires updating the inventory file and re-running the entire installation. This restarts the platform, halting use. | Requires updating the inventory file and re-running the entire installation. This restarts the platform, halting use. |
| Vertically scale up or down | Increases or decreases requests and limits on individual deployment types. The operator deploys a new pod with these resources, and previous pods scale down. For automation controller task pods, usage of … | Depending on your virtualization provider, the virtual machine may require shutdown to resize. Attaining the full benefit of vertical scaling requires re-running the installer, which restarts and halts use of the platform. Running the installer adapts any settings that were tuned based on the number of available cores and RAM. | Depending on your virtualization provider, the virtual machine may require shutdown to resize. Attaining the full benefit of vertical scaling requires re-running the installer, which restarts and halts use of the platform. Running the installer adapts any settings that were tuned based on the number of available cores and RAM. |
| Installation | Utilizes OpenShift Container Platform Operators for automated deployment and management. | Ansible Playbook-based installer installs RPMs and configures the platform. | Ansible Playbook-based installer that configures platform services in podman containers, which are managed by systemd. |
| Upgrade | Handled by OpenShift Container Platform Operators with automated rolling updates. Usage of … | Requires running the installer and restarting services, which halts use of the platform. | Requires running the installer and restarting services, which halts use of the platform. |
| Aggregating Application Logs | Centralized logging through OpenShift Container Platform’s built-in logging stack or integrations with external aggregators. | Requires external log aggregation solutions (e.g., ELK stack, Splunk) to collect logs from individual nodes. | Requires external log aggregation solutions (e.g., ELK stack, Splunk) to collect logs from container instances. |
| Monitoring Resource Utilization | Comprehensive monitoring with OpenShift Container Platform’s built-in Prometheus and Grafana dashboards, providing detailed pod and node metrics. | Requires external monitoring tools to collect and visualize resource metrics from nodes. | Requires external monitoring tools to collect and visualize resource metrics from nodes. |
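As a concrete illustration of the OpenShift Container Platform column, vertical scaling of controller task pods is typically expressed directly in the custom resource. This sketch assumes the operator exposes a `task_resource_requirements` field (present in recent operator versions, but confirm against your operator's spec before use):

```yaml
# Hypothetical vertical-scaling fragment for automation controller.
apiVersion: automationcontroller.ansible.com/v1beta1
kind: AutomationController
metadata:
  name: controller
spec:
  # The operator redeploys task pods with the new resources and scales
  # down the previous pods, so no full reinstall is required.
  task_resource_requirements:
    requests:
      cpu: "4"
      memory: 8Gi
    limits:
      cpu: "4"
      memory: 8Gi
```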
2.4. Motivations for migrating to an enterprise topology
Growth topologies model a minimal Ansible Automation Platform installation and are suited to proof-of-concept deployments, small-scale environments, or preliminary evaluations. A growth topology simplifies initial setup for your Ansible Automation Platform deployment, but it has inherent limitations.
2.4.1. Inherent limitations of growth topologies
Growth topologies include single points of failure, such as a single platform gateway, and colocate other critical components, such as the control plane, execution plane, and web services. These components often share resources on the same node, resulting in resource contention under increasing load. As workloads grow, specific services, such as job processing or API responsiveness, can become bottlenecks due to co-location or single-node capacity limits. Consequently, growth topologies generally do not offer robust high-availability capabilities.
For VM-based installation and containerized deployments of Ansible Automation Platform, you can marginally increase possible workloads by vertically scaling the virtual machines or physical hosts within the growth deployment. However, vertical scaling capabilities within a growth topology are limited.
2.4.2. Use cases for migrating to an enterprise topology
To scale beyond the limitations within growth topologies, you can migrate to an enterprise topology. Migrating to an enterprise topology may be relevant in the following use cases:
- Vertically scaling a growth topology becomes impractical due to cost or availability.
- The growth topology cannot satisfy high availability and disaster recovery requirements.
- You must scale Ansible Automation Platform services independently, such as API handling, job execution, and database capacity.
- Workload demands consistently overwhelm the capacity of vertically scaled growth topologies.
- You require more complex network architectures, such as segmented networks.
2.4.3. Recommended enterprise topology
To maximize flexibility, resilience, and scalability, migrate to the OpenShift Container Platform-based enterprise topology. This migration includes integration with an externally managed, enterprise-grade PostgreSQL database. Operator-based installation offers greater flexibility to scale individual services and adapt the deployment to specific requirements. It also enhances the ability to scale the deployment up and down with reduced downtime, and to customize workload placement with labels, taints, tolerations, and topology constraints. Operator-based installation also benefits from resilience features, such as automatic service re-creation if underlying worker nodes experience failure or resource exhaustion.
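The workload placement controls mentioned above map to standard Kubernetes scheduling fields. The following is a generic pod-template sketch, not a literal Ansible Automation Platform spec; the node label and taint values are hypothetical, and the operator surfaces equivalent options through its own fields:

```yaml
# Generic Kubernetes scheduling fragment illustrating placement controls.
nodeSelector:
  node-role.kubernetes.io/aap: ""      # hypothetical dedicated node pool
tolerations:
  - key: "dedicated"                   # hypothetical taint key/value
    operator: "Equal"
    value: "aap"
    effect: "NoSchedule"
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: controller
```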
2.5. Motivations for customizing enterprise topologies
Enterprise topologies provide a pattern for scalability and resilience. Organizations typically evolve the tested deployment models into custom deployments, tailoring service configurations and scaling to their specific workflows and performance needs within Red Hat Ansible Automation Platform.
An organization’s unique use of Ansible Automation Platform determines which components require scaling, moving from a generic enterprise topology to a workload-tuned deployment. For example, infrequent automation hub use, numerous small jobs across distributed regions, or API-heavy integrations necessitate different scaling priorities for each component, such as the API service or execution plane.
Motivations for customizing the documented enterprise deployment models include achieving high availability, enabling independent scaling of components (such as automation controller API versus execution capacity) to match actual demand, and supporting workload growth or specific SLAs. This requires custom resource allocation and performance tuning based on identified needs, rather than adherence to a general pattern.
Before customizing and scaling, you must identify specific bottlenecks within your Ansible Automation Platform environment (for example, in API response, job processing, database performance, or Event-Driven Ansible event handling). Use platform monitoring tools and analytics to identify bottlenecks. After bottlenecks are identified, you can scale each component vertically or horizontally.
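As one example of turning monitoring into bottleneck detection, the CPU-throttling metric discussed earlier can drive an alert. A hedged PrometheusRule sketch, assuming the Prometheus Operator is available; the rule name, namespace label, and thresholds are illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: aap-cpu-throttling            # hypothetical name
spec:
  groups:
    - name: aap.rules
      rules:
        - alert: AAPCPUThrottlingHigh
          # Fires when containers spend sustained time throttled by the
          # CFS quota; investigate CPU limits versus requests.
          expr: rate(container_cpu_cfs_throttled_seconds_total{namespace="aap"}[5m]) > 0.5
          for: 10m
          labels:
            severity: warning
```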