Before creating a BYOK RAG image, collect and prepare your organization's documentation in supported file formats.
Prerequisites
- Access to your organization's documentation repository.
- The conversion tool that is appropriate for your file format is installed.
- Write permissions for a local directory to store the prepared documents.
Procedure
- Collect your knowledge sources:
- Identify all documents you want to include in the BYOK knowledge base.
- Create a staging directory for your knowledge sources:
mkdir -p knowledge-sources/staging
- Copy or download all source documents to the staging directory.
- If you have content that is not in Markdown (.md) or plain text (.txt) format, convert it to Markdown or plain text before adding it to the BYOK RAG image:
- PDF conversion: Use a tool such as docling.
- AsciiDoc conversion: Use custom scripts or the Pandoc tool.
- HTML conversion: Use the Pandoc tool.
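The HTML case can be sketched as a small shell helper. This is a minimal sketch, not a prescribed workflow: it assumes Pandoc is installed and on your PATH, the function name `convert_html_dir` is hypothetical, and GitHub-flavored Markdown (`gfm`) is one reasonable output format among several.

```shell
# Hypothetical helper: convert every .html file in a directory to Markdown.
# Assumes Pandoc is installed; the gfm output format is an illustrative choice.
convert_html_dir() {
    src_dir=$1
    for html in "$src_dir"/*.html; do
        [ -e "$html" ] || continue   # skip when the glob matches nothing
        # Write the result next to the source file, with a .md extension.
        pandoc --from=html --to=gfm --output="${html%.html}.md" "$html"
    done
}
```

For example, `convert_html_dir knowledge-sources/staging` would write a `.md` file beside each `.html` file in the staging directory. Review the converted output before using it, because tables and nested lists do not always convert cleanly.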
- Follow these guidelines when structuring your converted documents for optimal indexing. You can also find examples of each guideline in this collections repository.
- Use clear file names: Name files descriptively to improve searchability.
- Organize by topic: Group related documents in directories by subject area (for example, security-policies, deployment-procedures, compliance).
- Use consistent formatting: Apply uniform heading structures and Markdown conventions across all documents.
- Include metadata: Add document titles and source URLs when configuring the metadata processor.
- Remove sensitive information: Exclude credentials, API keys, or proprietary details before creating the RAG image.
- Test document quality: Verify that documents are readable and technically accurate.
- Optimize storage volume: Keep the BYOK data between 1 Mi and 999 Mi for best performance. The system supports allocations of up to 2 Gi, but smaller volumes are recommended.
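Two of these guidelines, removing sensitive information and keeping the data volume in range, lend themselves to a quick pre-flight check. The following is a rough sketch under stated assumptions: the function names are hypothetical, the secret patterns are illustrative and far from exhaustive, and "Mi" is assumed to mean mebibytes (1,048,576 bytes). The size check sums file contents, not on-disk usage.

```shell
# Hypothetical helper: list text files that contain strings resembling secrets.
# The patterns below are examples only; extend them for your environment.
find_possible_secrets() {
    grep -rIlE -i '(api[_-]?key|password|secret|BEGIN [A-Z ]*PRIVATE KEY)' "$1"
}

# Hypothetical helper: succeed only if the total size of all files in the
# directory falls within the 1 Mi to 999 Mi guideline (Mi assumed to be MiB).
check_size() {
    bytes=$(find "$1" -type f -exec cat {} + | wc -c | tr -d ' ')
    min=$((1024 * 1024))
    max=$((999 * 1024 * 1024))
    if [ "$bytes" -ge "$min" ] && [ "$bytes" -le "$max" ]; then
        echo "OK: $1 holds $bytes bytes"
    else
        echo "WARN: $1 holds $bytes bytes, outside the 1 Mi to 999 Mi guideline"
        return 1
    fi
}
```

Running `find_possible_secrets knowledge-sources/staging` prints the path of each file that matches a pattern, so you can inspect it manually. A match is only a hint, and the absence of matches does not prove that the documents are free of sensitive data.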
- Copy the converted documents to a final directory for processing with the rag-content tool.
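The final copy step might look like the following sketch, which copies only Markdown and plain text files and preserves the topic subdirectories. The function name `copy_prepared_docs` and the destination directory name in the usage example are assumptions for illustration, not names required by the rag-content tool.

```shell
# Hypothetical helper: copy only .md and .txt files from a staging directory
# to a final directory, preserving the subdirectory layout.
copy_prepared_docs() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir"
    dest_abs=$(cd "$dest_dir" && pwd)   # absolute path survives the cd below
    (
        cd "$src_dir" || exit 1
        find . -type f \( -name '*.md' -o -name '*.txt' \) |
        while IFS= read -r file; do
            mkdir -p "$dest_abs/$(dirname "$file")"
            cp "$file" "$dest_abs/$file"
        done
    )
}
```

For example, `copy_prepared_docs knowledge-sources/staging knowledge-sources/final` leaves unconverted originals such as PDF or HTML files behind in the staging directory.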
After preparing your knowledge sources, use the rag-content tool to create a BYOK RAG image. The BYOK RAG image acts as a searchable knowledge base drawn from your documentation.