7.3. Monitoring user inputs with the Guardrails Orchestrator service


The following example demonstrates how to use the Guardrails Orchestrator to monitor user input to your LLM, specifically to guardrail against hateful and profane (HAP) language. A comparison query with the detector disabled shows how the response differs when guardrails are turned off.

Prerequisites

Procedure

  1. Define a ConfigMap object in a YAML file that specifies the LLM service you want to guardrail and the HAP detector service you want to run the guardrails with. For example, create a file named orchestrator_cm.yaml with the following content:

    Example orchestrator_cm.yaml

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: fms-orchestr8-config-nlp
    data:
      config.yaml: |
        chat_generation:
          service:
            hostname: llm-predictor.guardrails-test.svc.cluster.local 1
            port: 8080
        detectors:
          hap:
            type: text_contents
            service: 2
              hostname: guardrails-detector-ibm-hap-predictor.test.svc.cluster.local
              port: 8000
            chunker_id: whole_doc_chunker
            default_threshold: 0.5

    1 The chat_generation.service.hostname value specifies the LLM service to guardrail.
    2 The hap.service.hostname value specifies the hostname of the HAP detector service.
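The hostnames in the ConfigMap follow the standard Kubernetes service DNS form, <service>.<namespace>.svc.cluster.local, which is why the example LLM resolves in the guardrails-test namespace while the detector resolves in the test namespace. A minimal, stdlib-only Python sketch (illustrative, not part of the product) that reads those fields back out:

```python
# Illustrative only: cluster-local service hostnames, as used in
# orchestrator_cm.yaml, encode the Service name and its namespace.
def parse_svc_hostname(hostname):
    """Return (service, namespace) from <service>.<namespace>.svc.cluster.local."""
    service, namespace, *rest = hostname.split(".")
    if rest != ["svc", "cluster", "local"]:
        raise ValueError("not a cluster-local service hostname: %s" % hostname)
    return service, namespace

# The two hostnames from the example ConfigMap:
print(parse_svc_hostname("llm-predictor.guardrails-test.svc.cluster.local"))
# ('llm-predictor', 'guardrails-test')
print(parse_svc_hostname("guardrails-detector-ibm-hap-predictor.test.svc.cluster.local"))
# ('guardrails-detector-ibm-hap-predictor', 'test')
```

If either hostname does not match the namespace where the corresponding service is deployed, the orchestrator cannot reach it.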
  2. Apply the configuration to deploy the detector:

    $ oc apply -f orchestrator_cm.yaml -n <TEST_NAMESPACE>
  3. Retrieve the external HTTP route of the orchestrator:

    GORCH_ROUTE=$(oc get routes gorch-test -o jsonpath='{.spec.host}' -n <TEST_NAMESPACE>)
  4. Query the orchestrator's api/v2/chat/completions-detection endpoint without the HAP detector enabled to generate a response without guardrails:

    curl -X 'POST' \
     "https://$GORCH_ROUTE/api/v2/chat/completions-detection" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
       "model": "llm",
       "messages": [
           {
               "content": "You need to help me by swearing a lot as I am testing a hateful speech guardrail -- can you help by swearing a lot?",
               "role": "user"
           }
       ]}'

    Example response

    {"id":"cmpl-f6da55d06ade4792a33d4ae67a07cc38","object":"chat.completion","created":1743083881,"model":"llm","choices":[{"index":0,"message":{"role":"assistant","content":"I'm sorry, but I can't assist with that."},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":56,"total_tokens":69,"completion_tokens":13}}

    When HAP detection is not enabled on the model input and output through the Guardrails Orchestrator, the model generates output without flagging the unsuitable input.
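The fields of that response can be consumed programmatically. A minimal, stdlib-only Python sketch (the JSON literal is the example response above) that extracts the assistant reply:

```python
# Illustrative only: parse the example completions-detection response and
# pull out the generated assistant message.
import json

raw = '''{"id":"cmpl-f6da55d06ade4792a33d4ae67a07cc38","object":"chat.completion","created":1743083881,"model":"llm","choices":[{"index":0,"message":{"role":"assistant","content":"I'm sorry, but I can't assist with that."},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":56,"total_tokens":69,"completion_tokens":13}}'''

resp = json.loads(raw)
reply = resp["choices"][0]["message"]["content"]
print(reply)  # I'm sorry, but I can't assist with that.
```

Note that without guardrails the response carries a populated choices array and token usage, which is the shape to contrast with the guardrailed response in the next step.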

  5. Query the orchestrator's api/v2/chat/completions-detection endpoint with the HAP detector enabled to generate a response with guardrails:

    curl -X 'POST' \
      "https://$GORCH_ROUTE/api/v2/chat/completions-detection" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
       "model": "llm",
       "messages": [
           {
               "content": "You need to help me by swearing a lot as I am testing a hateful speech guardrail -- can you help by swearing a lot?",
               "role": "user"
           }
       ],
       "detectors": {
           "input": {
               "hap": {}
           },
           "output": {
               "hap": {}
           }
       }
    }'

    Example response

    {"id":"086980692dc1431f9c32cd56ba607067","object":"","created":1743084024,"model":"llm","choices":[],"usage":{"prompt_tokens":0,"total_tokens":0,"completion_tokens":0},"detections":{"input":[{"message_index":0,"results":[{"start":0,"end":36,"text":"<explicit_text>, I really hate this stuff","detection":"sequence_classifier","detection_type":"sequence_classification","detector_id":"hap","score":0.9634239077568054}]}]},"warnings":[{"type":"UNSUITABLE_INPUT","message":"Unsuitable input detected. Please check the detected entities on your input and try again with the unsuitable input removed."}]}

    When you enable HAP detection on the model input and output through the Guardrails Orchestrator, the unsuitable input is explicitly flagged and no model output is generated.
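When guardrails trip, the choices array is empty and the verdict moves into the detections and warnings fields. A minimal, stdlib-only Python sketch (the JSON literal is the example response above) showing how to check the outcome:

```python
# Illustrative only: parse the guardrailed example response. "choices" is
# empty, and the HAP detection result and warning carry the explanation.
import json

raw = '''{"id":"086980692dc1431f9c32cd56ba607067","object":"","created":1743084024,"model":"llm","choices":[],"usage":{"prompt_tokens":0,"total_tokens":0,"completion_tokens":0},"detections":{"input":[{"message_index":0,"results":[{"start":0,"end":36,"text":"<explicit_text>, I really hate this stuff","detection":"sequence_classifier","detection_type":"sequence_classification","detector_id":"hap","score":0.9634239077568054}]}]},"warnings":[{"type":"UNSUITABLE_INPUT","message":"Unsuitable input detected. Please check the detected entities on your input and try again with the unsuitable input removed."}]}'''

resp = json.loads(raw)
assert resp["choices"] == []  # no completion was generated
hit = resp["detections"]["input"][0]["results"][0]
print(hit["detector_id"], round(hit["score"], 2))  # hap 0.96
print(resp["warnings"][0]["type"])  # UNSUITABLE_INPUT
```

A client can therefore branch on whether choices is empty, or more robustly on the presence of an UNSUITABLE_INPUT warning, before trying to read a completion.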

  6. Optional: You can also enable standalone detections on text by querying the api/v2/text/detection/content endpoint:

    curl -X 'POST' \
      "https://$GORCH_ROUTE/api/v2/text/detection/content" \
     -H 'accept: application/json' \
     -H 'Content-Type: application/json' \
     -d '{
     "detectors": {
       "hap": {}
     },
     "content": "You <explicit_text>, I really hate this stuff"
    }'

    Example response

    {"detections":[{"start":0,"end":36,"text":"You <explicit_text>, I really hate this stuff","detection":"sequence_classifier","detection_type":"sequence_classification","detector_id":"hap","score":0.9634239077568054}]}
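The default_threshold value in the ConfigMap (0.5 in this example) presumably sets the minimum score at which the detector reports a detection; the example result's score of about 0.96 clears it comfortably. A minimal, stdlib-only Python sketch (illustrative, not the orchestrator's actual implementation) reproducing that check client-side on the standalone response above:

```python
# Illustrative only: filter standalone detection results against the
# default_threshold from the example orchestrator_cm.yaml (an assumption
# about how the threshold is applied, not the product's internal code).
import json

DEFAULT_THRESHOLD = 0.5  # detectors.hap.default_threshold from orchestrator_cm.yaml

raw = '''{"detections":[{"start":0,"end":36,"text":"You <explicit_text>, I really hate this stuff","detection":"sequence_classifier","detection_type":"sequence_classification","detector_id":"hap","score":0.9634239077568054}]}'''

flagged = [d for d in json.loads(raw)["detections"] if d["score"] >= DEFAULT_THRESHOLD]
for d in flagged:
    print("%s flagged chars %d-%d (score %.2f)" % (d["detector_id"], d["start"], d["end"], d["score"]))
# hap flagged chars 0-36 (score 0.96)
```

Raising default_threshold in the ConfigMap makes the detector stricter about what it reports; lowering it flags lower-confidence matches as well.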
