
Chapter 2. Scaling the tested deployment models


You can scale your Ansible Automation Platform tested deployment models to meet your workload requirements. This document refers to the following scaling methods:

  • Vertical scaling
  • Horizontal scaling

2.1. Vertical scaling for performance

Vertical scaling increases the physical resources available to a service, including the CPU, memory, disk volume, and disk Input/Output Operations per Second (IOPS). Use vertical scaling for deployments with high resource utilization or workload demand.

2.1.1. Benefits and limitations of vertical scaling

  • Relieves resource contention: Applications have access to more resources, which can relieve resource contention or exhaustion.
  • Extensive testing required: The installer attempts to tune application and system configurations to leverage additional resources, but not all components of the application automatically scale in relation to machine size. Manually tuning each variable requires extensive testing. For this reason, after an instance size has been verified for an environment, horizontal scaling by adding more instances of the same size is recommended.
  • Application-level limitations: For VM-based or containerized deployments, instances with more than 64 CPU cores and 128 GB of RAM might not scale linearly due to system and application-level limits.
  • Resource overcommit: Overcommitting virtual machine resources (for example, allocating more virtual CPU/RAM to guests than is physically available on the host) leads to unpredictable performance.
  • CPU throttling: In OpenShift Container Platform, setting a CPU limit without an equivalent request can lead to CPU throttling, even if the node has spare CPU capacity. This throttling negatively impacts API latency.

    • To mitigate this, always set CPU requests equal to CPU limits (see the sketch after this list).
    • Monitor CPU throttling using the container_cpu_cfs_throttled_seconds_total metric.
  • Database limitations: Scaling the application increases the maximum potential number of database connections from worker processes and the overall memory, I/O, and CPU utilization of the PostgreSQL instance. As you scale past the tested deployment models, deploy separate PostgreSQL instances per component (platform gateway, Event-Driven Ansible, automation controller, automation hub).
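
For OpenShift Container Platform deployments, the following is a minimal sketch of pinning CPU requests to the same value as CPU limits on the automation controller's web and task containers. The web_resource_requirements and task_resource_requirements field names follow the automation controller operator's convention and the values are placeholders; verify both against the operator version in your cluster.

    # Illustrative sketch only: setting CPU requests equal to CPU limits, as
    # recommended above, to reduce CPU throttling of the web and task pods.
    # Field names and values are assumptions; confirm them for your operator.
    apiVersion: automationcontroller.ansible.com/v1beta1
    kind: AutomationController
    metadata:
      name: controller
      namespace: aap
    spec:
      web_resource_requirements:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          cpu: "2"
          memory: 4Gi
      task_resource_requirements:
        requests:
          cpu: "2"
          memory: 4Gi
        limits:
          cpu: "2"
          memory: 4Gi

After applying a change like this, watch the container_cpu_cfs_throttled_seconds_total metric for the affected pods to confirm that throttling subsides.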

2.2. Horizontal scaling for performance

Horizontal scaling involves increasing the number of replicas (pods or virtual machines) for a given service. Like vertical scaling, use this approach for deployments with high resource utilization or growing workload demand.

2.2.1. Benefits and limitations of horizontal scaling

  • Improved availability: Distributes load across more instances to reduce the impact of a single slow or failing node.
  • Redundancy: Provides extra capacity, allowing individual service nodes to recover or cool off without impacting overall availability.
  • Increased authentication capacity: Each platform gateway pod includes its own authentication service, so scaling the platform gateway directly increases the platform’s authentication throughput.
  • Repeatable scaling procedure: After the instance size and configuration are verified for your environment, deploy identical instances to scale. On OpenShift Container Platform, this amounts to raising replica counts, as shown in the sketch after this list.
  • Database limitations: Scaling the application increases the maximum potential number of database connections from worker processes, and the overall memory, I/O, and CPU utilization of the PostgreSQL instance. As you scale past the tested deployment models, deploy separate PostgreSQL instances per component (platform gateway, Event-Driven Ansible, automation controller, automation hub).
  • Health check overhead: In a mesh architecture, each Envoy proxy sends health checks to every other cluster member. Horizontal scaling increases this baseline traffic, adding to system overhead.
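
The following is a minimal sketch of horizontal scaling on OpenShift Container Platform by raising replica counts on an operator-managed custom resource. It assumes the automation controller operator's web_replicas and task_replicas fields; confirm the field names and choose replica counts that match the instance size you have verified for your environment.

    # Illustrative sketch only: scaling automation controller web (API/UI) and
    # task (job dispatch) pods by raising replica counts on the operator-managed
    # custom resource. Field names and values are assumptions; verify them
    # against your operator version.
    apiVersion: automationcontroller.ansible.com/v1beta1
    kind: AutomationController
    metadata:
      name: controller
      namespace: aap
    spec:
      web_replicas: 3    # additional API/UI capacity
      task_replicas: 3   # additional job dispatch capacity

Scaling automation hub or Event-Driven Ansible components follows the same pattern of adjusting replicas on their respective custom resources.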

Considerations for scaling and managing each Ansible Automation Platform component differ based on the installer type. The following table provides information on scaling and management operations for each installer type, as well as other common operations to consider when planning your deployment:

Note

Red Hat OpenShift Container Platform-based deployments provide the most flexibility and customizability, which enables you to adapt the deployment as needs change. OpenShift Container Platform also provides fine-grained observability through its built-in metrics capabilities and integrations with log aggregation tools that capture logs from all pods in the Ansible Automation Platform deployment.

Table 2.1. Comparison of Scaling and Management Operations by Installer Type

Horizontally scale up

  • OpenShift Container Platform: Scale control, execution, automation hub, and Event-Driven Ansible components independently by adjusting replicas. Expanding total capacity by adding worker nodes to OpenShift Container Platform does not disrupt Ansible Automation Platform operation.
  • VM-based installation: Requires updating the inventory file and re-running the entire installation, which restarts the platform and halts its use (see the example inventory sketch after this table).
  • Containerized deployments (Podman-based): Requires updating the inventory file and re-running the entire installation, which restarts the platform and halts its use.

Horizontally scale down

  • OpenShift Container Platform: Reduces replicas handled by the operator. For scaling down automation controller task pods, termination_grace_period_seconds allows the scale-down to occur after jobs are drained from the task pod.
  • VM-based installation: Requires updating the inventory file and re-running the entire installation. This restarts the platform, halting use.
  • Containerized deployments (Podman-based): Requires updating the inventory file and re-running the entire installation. This restarts the platform, halting use.

Vertically scale up or down

  • OpenShift Container Platform: Increases or decreases requests and limits on individual deployment types. The operator deploys a new pod with these resources, and previous pods scale down. For automation controller task pods, termination_grace_period_seconds allows the old replicas to scale down after jobs are drained from task pods.
  • VM-based installation: Depending on your virtualization provider, the virtual machine may require shutdown to resize. Attaining the full benefit of vertical scaling requires re-running the installer, which restarts the platform and halts its use. Running the installer adapts any settings that were tuned based on the number of available cores and RAM.
  • Containerized deployments (Podman-based): Depending on your virtualization provider, the virtual machine may require shutdown to resize. Attaining the full benefit of vertical scaling requires re-running the installer, which restarts the platform and halts its use. Running the installer adapts any settings that were tuned based on the number of available cores and RAM.

Installation

  • OpenShift Container Platform: Uses OpenShift Container Platform Operators for automated deployment and management.
  • VM-based installation: Ansible Playbook-based installer installs RPMs and configures the platform.
  • Containerized deployments (Podman-based): Ansible Playbook-based installer configures platform services in Podman containers, which are managed by systemd.

Upgrade

  • OpenShift Container Platform: Handled by OpenShift Container Platform Operators with automated rolling updates. Using termination_grace_period_seconds allows upgrades without downtime and without halting job execution.
  • VM-based installation: Requires running the installer and restarting services, which halts use of the platform.
  • Containerized deployments (Podman-based): Requires running the installer and restarting services, which halts use of the platform.

Aggregating application logs

  • OpenShift Container Platform: Centralized logging through OpenShift Container Platform’s built-in logging stack or integrations with external aggregators.
  • VM-based installation: Requires external log aggregation solutions (for example, the ELK stack or Splunk) to collect logs from individual nodes.
  • Containerized deployments (Podman-based): Requires external log aggregation solutions (for example, the ELK stack or Splunk) to collect logs from container instances.

Monitoring resource utilization

  • OpenShift Container Platform: Comprehensive monitoring with OpenShift Container Platform’s built-in Prometheus and Grafana dashboards, providing detailed pod and node metrics.
  • VM-based installation: Requires external monitoring tools to collect and visualize resource metrics from nodes.
  • Containerized deployments (Podman-based): Requires external monitoring tools to collect and visualize resource metrics from nodes.
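
For VM-based and containerized deployments, horizontal scaling is driven by the installer inventory. The following is a minimal, hypothetical sketch of adding a second execution node before re-running the installer; the group and host names are illustrative, so use the group names from the tested deployment model inventory that matches your installer version.

    # Illustrative sketch only: an Ansible inventory fragment (YAML format) that
    # adds a second execution node. Group and host names are hypothetical.
    # Re-running the installer with the updated inventory restarts the platform.
    all:
      children:
        automationcontroller:
          hosts:
            controller1.example.org:
        execution_nodes:
          hosts:
            exec1.example.org:
            exec2.example.org:   # newly added execution capacity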

2.4. Motivations for migrating to an enterprise topology

Growth topologies model a minimal Ansible Automation Platform installation and are suited to proof-of-concept deployments, small-scale environments, or preliminary evaluations. A growth topology simplifies initial setup for your Ansible Automation Platform deployment, but it has inherent limitations.

2.4.1. Inherent limitations of growth topologies

Growth topologies include single points of failure, such as a single platform gateway and single instances of other critical components, including the control plane, execution plane, and web services. These components often share resources on the same node, resulting in resource contention under increasing load. As workloads grow, specific services, such as job processing or API responsiveness, can become bottlenecks due to co-location or single-node capacity limits. Consequently, growth topologies generally do not offer robust, high-availability capabilities. For VM-based installation and containerized deployments of Ansible Automation Platform, you can marginally increase possible workloads by vertically scaling the virtual machines or physical hosts within the growth deployment. However, vertical scaling capabilities within a growth topology are limited.

2.4.2. Use cases for migrating to an enterprise topology

To scale beyond the limitations of growth topologies, you can migrate to an enterprise topology. Migrating to an enterprise topology may be relevant in the following use cases:

  • Vertically scaling a growth topology becomes impractical due to cost or availability.
  • The growth topology cannot satisfy high availability and disaster recovery requirements.
  • You must scale Ansible Automation Platform services independently, such as API handling, job execution, and database capacity.
  • Workload demands consistently overwhelm the capacity of vertically scaled growth topologies.
  • You require more complex network architectures, such as segmented networks.

2.5. Motivations for customizing enterprise topologies

Enterprise topologies provide a pattern for scalability and resilience. Organizations typically evolve the tested deployment models into custom deployments, tailoring service configurations and scaling to their specific workflows and performance needs within Red Hat Ansible Automation Platform.

An organization’s unique use of Ansible Automation Platform determines which components require scaling, moving from a generic enterprise topology to a workload-tuned deployment. For example, infrequent automation hub use, numerous small jobs across distributed regions, or API-heavy integrations each call for different scaling priorities for individual components, such as the API service or execution plane.

Motivations for customizing the documented enterprise deployment models include achieving high availability, enabling independent scaling of components (for example, automation controller API capacity versus execution capacity) to match actual demand, and supporting workload growth or specific SLAs. This requires custom resource allocation and performance tuning based on identified needs, rather than adherence to a general pattern.

Before customizing and scaling, identify specific bottlenecks within your Ansible Automation Platform environment, for example in API response, job processing, database performance, or Event-Driven Ansible event handling. Use platform monitoring tools and analytics to identify bottlenecks. After bottlenecks are identified, you can scale each component vertically or horizontally.
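
As one example of bottleneck detection on OpenShift Container Platform, the following is a minimal sketch of a Prometheus alerting rule that flags sustained CPU throttling of Ansible Automation Platform containers. The namespace, alert name, and threshold are assumptions, and the sketch presumes a Prometheus stack (such as OpenShift user workload monitoring) is available to evaluate it.

    # Illustrative sketch only: alert when containers in the (assumed) "aap"
    # namespace spend a sustained share of time CPU-throttled, which commonly
    # surfaces as degraded API latency. Names and threshold are placeholders.
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: aap-cpu-throttling
      namespace: aap
    spec:
      groups:
        - name: aap-bottlenecks
          rules:
            - alert: AAPContainerCPUThrottled
              expr: |
                sum by (pod) (
                  rate(container_cpu_cfs_throttled_seconds_total{namespace="aap"}[5m])
                ) > 0.5
              for: 15m
              labels:
                severity: warning
              annotations:
                summary: "Pod {{ $labels.pod }} is experiencing CPU throttling"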
