Accelerate enterprise software development with NVIDIA and MaaS

Optimize private app development using NVIDIA Nemotron models through Models-as-a-Service on your own multi-tenant infrastructure in Red Hat AI.


This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.



Detailed description

Developing software with speed and efficiency is a competitive necessity. Developers are often overwhelmed and slowed down by repetitive code, complicated debugging and testing, and the constant need to learn new technologies. AI-powered coding assistance can help, but how do you leverage it securely and cost-effectively?

For organizations bound by strict data privacy requirements, regulations, or specific performance needs, publicly hosted AI services are often not an option. As usage expands, cost efficiency also becomes a concern. Models as a Service (MaaS) solves this by enabling centralized IT teams to host and manage private models that remote teams can consume easily and securely. This keeps proprietary data within the organization’s boundaries while giving developers access to the generative AI technology they need. Because access to the models is granted through API tokens, administrators can also enforce per-team rate limits and quotas. This approach doesn’t just simplify access and usage; it allows organizations to monitor metrics, forecast capacity and compute needs, and manage chargebacks with precision.
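For example, a developer holding an API token might call a served model through the OpenAI-compatible chat completions API that vLLM exposes. The endpoint URL, token, and model ID below are placeholders; substitute the values issued by your MaaS administrator.

```shell
# Placeholder values -- your MaaS administrator issues the real ones.
MAAS_ENDPOINT="https://maas.apps.example.com/v1"   # hypothetical route
MAAS_TOKEN="sk-example-token"                      # per-user API token

# Standard OpenAI-compatible chat completions request; rate limits and
# quotas are enforced server-side based on the token's tier.
curl -s "${MAAS_ENDPOINT}/chat/completions" \
  -H "Authorization: Bearer ${MAAS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "nemotron-3-nano-30b-a3b-fp8",
        "messages": [{"role": "user", "content": "Write a unit test for this function."}]
      }'
```

Because every request carries a token, usage can be attributed to a user and tier for metering and chargeback.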

This quickstart demonstrates how you can easily deploy a private AI code assistant powered by NVIDIA Nemotron models and delivered through Red Hat AI's integrated Models as a Service (MaaS) offering. Developers access the assistant through OpenShift Dev Spaces, a containerized cloud-native IDE included with OpenShift.

Architecture diagrams

Code Assistant w/ MaaS Architecture

This diagram illustrates a Models-as-a-Service architecture on Red Hat AI, including the model deployments and the code assistant application delivered through OpenShift Dev Spaces.

  • Orchestration: Red Hat AI Enterprise — container orchestration and a comprehensive AI platform.
  • Inference: vLLM and llm-d — high-performance inference engine for generative AI model deployment, with Kubernetes-native distributed inference provided by llm-d.
  • LLM: nemotron-3-nano-30b-a3b-fp8 — a quantized 30B-parameter hybrid Mamba-Transformer MoE model with a 1M-token context window, designed for efficient reasoning, chat, and agentic AI applications.
  • Models-as-a-Service: Red Hat AI Enterprise — integrated LLM governance layer that provides rate-limited model access with usage tracking and chargeback across teams.
  • GPU acceleration: NVIDIA GPU Operator — enables GPUs and manages drivers, DCGM, the container toolkit, and MIG capabilities.
  • Development environment: OpenShift Dev Spaces — provides IDE instances so development teams can develop and deploy on the same cluster.
  • Observability: Prometheus Operator — monitors model inference metrics and GPU telemetry.
  • Dashboard: Grafana — surfaces metrics scraped by Prometheus in custom Grafana dashboards.

Requirements

Minimum hardware requirements

  • One NVIDIA GPU node with 48GB VRAM for Nemotron model
  • One NVIDIA GPU node with 48GB VRAM for gpt-oss model

Note: Models in this quickstart were tested with 2 L40S GPU instances on AWS (instance type g6e.2xlarge).

Minimum software requirements

  • Red Hat OpenShift 4.20
  • Red Hat OpenShift AI 3.2
  • Helm CLI
  • OpenShift Client CLI
  • Bash shell available in PATH
  • sed available in PATH

Required user permissions

  • Regular user permissions are sufficient to use the Models-as-a-Service endpoint, access a Dev Spaces workspace, and view usage data in the Grafana dashboard.
  • Cluster admin access is needed for any changes to model deployments or MaaS configurations.

Deploy

The following instructions deploy the quickstart to your Red Hat AI environment using a script-based, all-in-one installation. The script configures the necessary prerequisites for your environment and wires everything together, removing the need for additional configuration.

Please see the advanced deployment section for details on setting up your own prerequisites and deploying the quickstart with more control.

Prerequisites

  • OpenShift cluster (specific version is specified in the software requirements section)
    • Optional: certificates managed for the OpenShift Router
  • OpenShift cluster has GPUs available
  • The NVIDIA GPU Operator is installed and configured with a ClusterPolicy to configure the driver
  • You do not have other workloads or configurations in the cluster, such as:
    • An identity provider deployed and configured
    • Red Hat OpenShift AI installed
    • Red Hat Connectivity Link deployed and configured
    • Red Hat OpenShift Dev Spaces deployed

Installation Steps

  1. Clone the quickstart repository:
git clone https://github.com/rh-ai-quickstart/maas-code-assistant.git
  2. Change into the directory:
cd maas-code-assistant
  3. Ensure you’re logged into your cluster as a cluster-admin user, such as kube:admin or system:admin:
oc whoami
  4. Run all-in-one.sh and enter passwords for the admin and user accounts when prompted:
./all-in-one.sh

Delete

To remove the core quickstart components (models, Dev Spaces workspaces, etc.) run the following:

helm uninstall maas-code-assistant

To remove the Developer Preview of MaaS, run this afterwards:

oc delete -k ./dev-preview

To clean up other dependencies, such as Red Hat Connectivity Link and OpenShift AI, follow their documented uninstallation procedures: remove their operands first, allow the operators to reconcile and complete the removal, and then uninstall the operators themselves.
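As a sketch of that operand-first pattern, cleanup might look like the following. The resource kinds and namespaces are assumptions based on each operator's defaults; verify the actual operand names against your cluster before running anything.

```shell
# Operand-first cleanup sketch (resource kinds and namespaces assumed
# from operator defaults -- verify against your cluster first):

# 1. Delete the operands and let each operator reconcile the removal
oc delete kuadrant --all -n kuadrant-system
oc delete datasciencecluster --all
oc delete checluster --all -n openshift-devspaces

# 2. Only after the operands are gone, uninstall the operators themselves,
#    e.g. by removing their Subscription and ClusterServiceVersion
```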

References

  • vLLM: The High-Throughput and Memory-Efficient inference and serving engine for LLMs.
  • llm-d: a Kubernetes-native high-performance distributed LLM inference framework.
  • Red Hat OpenShift Dev Spaces: a container-based, in-browser development environment offered by Red Hat that facilitates cloud-native development directly within the OpenShift ecosystem. Included within the OpenShift product offering.
  • NVIDIA Nemotron: a family of open models with open weights, training data, and recipes, delivering leading efficiency and accuracy for building specialized AI agents.
  • NVIDIA GPU Operator: uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPU.

Advanced Deployment

This advanced deployment option will allow you to control the deployment of all prerequisites separately and tailor it to your specific environment.

Use this deployment path if you:

  • Have a configured cluster with some or all of the prerequisites already deployed.
  • Prefer a different configuration path than the defaults set in the quickstart repository installation script.
  • Are using the cluster for other workloads and therefore need to customize the installation to avoid conflict with existing cluster resources.

Prerequisites

The following prerequisites are required in your environment to prevent any conflicts with the quickstart:

  • Users have been configured with OpenShift OAuth, backed by OIDC or some other auth method such as htpasswd, as documented.
  • OpenShift cluster and user-workload monitoring is configured, as documented.
  • Grafana is deployed and managed through the Grafana Operator, in the grafana namespace.
    • An example Grafana operand, with all RBAC and resources wired up to User Workload Monitoring, is available in docs/examples/grafana.yaml. It expects a namespace-scoped Grafana Operator installation in the grafana namespace and a configured in-cluster registry.
  • Red Hat OpenShift Dev Spaces is deployed, as documented.
    • A basic CheCluster resource is configured, as in steps 2 and 3 of the above.
  • Red Hat OpenShift AI version 3.2.0 has been deployed from the fast-3.x channel, as documented.
    • A Data Science Cluster has been created that enables at least the Dashboard, KServe, and Llama Stack Operator components, as documented.
    • Note that using Manual approval mode with the startingCSV set to rhods-operator.3.2.0 is recommended to stay on the version tested with this code base.
  • Red Hat Connectivity Link has been deployed from the stable channel, as documented.
    • A Kuadrant resource has been installed in the kuadrant-system namespace, as documented.
  • You have created the openshift-default GatewayClass object for Gateway API in OpenShift, and are able to create Gateway instances using your cluster's load balancer and infrastructure configuration. See the documentation for more details about Gateway API in OpenShift.
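If you still need to create the GatewayClass, a minimal manifest might look like this. The controllerName shown is the value used by the OpenShift-managed Gateway API implementation in recent releases; confirm it against the OpenShift documentation for your version.

```shell
# Minimal GatewayClass sketch; the controllerName is assumed from recent
# OpenShift releases -- confirm it for your version before applying.
oc apply -f - <<'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: openshift-default
spec:
  controllerName: openshift.io/gateway-controller/v1
EOF
```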

Installation Steps

  1. Ensure you’re logged into your cluster as a cluster-admin user:
oc whoami
oc get nodes
  2. Install the developer preview release of Models as a Service.

    1. Create a namespace for the developer preview:

      oc create ns maas-api
      
    2. From the root of the cloned repository, run the following and ensure the rendered values look correct for your cluster:

      ./dev-preview/render.sh
      
    3. Apply the rendered developer preview overlay with the following:

      oc apply -k ./dev-preview
      
  3. Copy charts/maas-code-assistant/values.yaml to edit it:

cp charts/maas-code-assistant/values.yaml environment.yaml
  4. Edit the file and update the following sections to match your environment:

    1. global.wildcardDomain and global.wildcardCertName

      1. You can recover the proper values by running the following:
      oc get ingresscontroller -n openshift-ingress-operator default -ojsonpath='{.status.domain}{"\n"}'
      oc get ingresscontroller -n openshift-ingress-operator default -ojsonpath='{.spec.defaultCertificate.name}{"\n"}'
      
    2. grafana.namespace and grafana.selectors

      1. Use the Namespace of your Grafana resource for the Grafana Operator.
      2. Set selectors to match labels on your Grafana instance. For example, if you get the following output:
      oc get grafana grafana -n grafana -ojsonpath='{.metadata.labels}' | jq .
      

      {
        "app": "grafana"
      }

      You should set selectors to app: grafana.

    3. If you have deployed the openshift-default GatewayClass, as instructed above, configure it to not be managed by the chart by setting openshift-ai.gatewayClass.create to false.

  5. Update the tiers section to map your users to the default MaaS tiers as desired.

    1. For example, if you have users named “bob,” “sue,” and “tom,” and would like them all to be in the enterprise tier, with user “sally” in the premium tier and “frank” in the free tier, use the following value for tiers:
    tiers:
      free:
        users:
          - frank
      premium:
        users:
          - sally
      enterprise:
        users:
          - bob
          - sue
          - tom
    
  6. Make any tweaks necessary to the models array to ensure the workloads will be scheduled on your GPU-enabled nodes. This may involve changing the tolerations, adjusting the resources, or adding a nodeSelector field to each model and configuring it with a valid node selector for the pod template.

  7. Install the quickstart with Helm:

helm install maas-code-assistant ./charts/maas-code-assistant -f environment.yaml

Note that, depending on your environment, the openshift-ai-inference Gateway may already be deployed in your cluster, producing error output such as Error: INSTALLATION FAILED: Unable to continue with install: Gateway "openshift-ai-inference" in namespace "openshift-ingress" exists and cannot be imported into the current release. If this is the case, update your environment.yaml to set openshift-ai.gateway.create to false.
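Rather than editing environment.yaml, the same value can be passed on the command line when retrying the install; the following sketch assumes the release and chart paths used in the steps above.

```shell
# Retry the install with the chart told not to manage the existing Gateway.
# --set openshift-ai.gateway.create=false is equivalent to adding that
# value to environment.yaml.
helm upgrade --install maas-code-assistant ./charts/maas-code-assistant \
  -f environment.yaml \
  --set openshift-ai.gateway.create=false
```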
