
Graph RAG

Beta Feature

Graph RAG is currently in beta. Functionality, endpoints, and behavior may change in future releases.

Overview

Graph RAG extends traditional RAG by combining semantic search with knowledge graph retrieval, giving the agent access to both unstructured document content and structured entity/relationship data when generating responses. Under the hood, Graph RAG is powered by the Enterprise Context Engine (ECE).

The agent includes built-in hallucination detection that automatically validates responses against retrieved context and retries when hallucinations are detected.

Configuration

Create a Graph RAG agent:

{
  "name": "KnowledgeAssistant",
  "description": "Assistant with knowledge graph and document retrieval",
  "agentType": "graphRag",
  "notes": "Uses ECE for enriched answers with semantic and graph context",
  "config": {
    "llmModelId": "anthropic.claude-haiku-4-5-20251001-v1:0",
    "knowledgeGraphDomainId": "7a0e6c74-7c62-4636-adbf-857340244e7d",
    "systemPrompt": "Context information is below.\n---------------------\n{context_str}\n---------------------\nGiven the context information and not prior knowledge, answer the query.\nQuery: {query_str}\nAnswer: ",
    "inferenceConfig": {
      "maxTokens": 4000,
      "temperature": 0.7
    },
    "hxqlQuery": "SELECT * FROM SysContent",
    "hybridSearch": true,
    "limit": 5,
    "adjacentChunkRange": 1,
    "adjacentChunkMerge": true,
    "guardrails": ["HAIP-Profanity", "HAIP-Insults-High"]
  }
}

Configuration Parameters

| Parameter | Description | Required | Example |
|---|---|---|---|
| agentType | Must be "graphRag" | Yes | "graphRag" |
| knowledgeGraphDomainId | Knowledge Graph domain ID | Yes | "7a0e6c74-7c62-4636-adbf-857340244e7d" |
| llmModelId | LLM model identifier | Yes | "anthropic.claude-haiku-4-5-20251001-v1:0" |
| systemPrompt | Custom answer generation prompt (see below) | No | "Context: {context_str}\nQuery: {query_str}\nAnswer: " |
| inferenceConfig | LLM inference parameters | No | {"maxTokens": 4000, "temperature": 0.7} |
| hxqlQuery | HxQL filter for Content Lake retrieval | No | "SELECT * FROM SysContent" |
| hybridSearch | Enable hybrid search (embeddings + full-text) | No | true |
| limit | Maximum number of chunks to retrieve | No | 5 |
| adjacentChunkRange | Number of adjacent chunks to fetch around each result (0 = disabled) | No | 1 |
| adjacentChunkMerge | Merge adjacent chunks into the parent chunk in document order | No | true |
| guardrails | List of guardrail names to apply | No | ["HAIP-Profanity"] |
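The two adjacency parameters work together: adjacentChunkRange pulls in neighboring chunks around each search hit, and adjacentChunkMerge joins contiguous runs back into one block in document order. The merging itself happens server-side; the sketch below is only an illustration of the documented behavior, with hypothetical names:

```python
def expand_and_merge(chunks, hit_indices, adjacent_range=1, merge=True):
    """Illustrative sketch: expand each hit index by +/- adjacent_range,
    then (when merge is on) join contiguous chunks in document order."""
    wanted = set()
    for i in hit_indices:
        for j in range(i - adjacent_range, i + adjacent_range + 1):
            if 0 <= j < len(chunks):
                wanted.add(j)
    ordered = sorted(wanted)
    if not ordered:
        return []
    if not merge:
        return [chunks[j] for j in ordered]
    merged, run = [], [ordered[0]]
    for j in ordered[1:]:
        if j == run[-1] + 1:
            run.append(j)  # still contiguous: extend the current run
        else:
            merged.append(" ".join(chunks[k] for k in run))
            run = [j]
    merged.append(" ".join(chunks[k] for k in run))
    return merged

# A document split into five chunks; semantic search matched chunk 2.
chunks = ["c0", "c1", "c2", "c3", "c4"]
print(expand_and_merge(chunks, [2], adjacent_range=1, merge=True))
```

With adjacentChunkRange = 1 and merging enabled, a hit on chunk 2 comes back as a single merged block spanning chunks 1-3.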

System Prompt (Synthesis Prompt)

The systemPrompt field controls how the agent generates answers from retrieved context. It uses two template placeholders:

| Placeholder | Description |
|---|---|
| {context_str} | Replaced with the retrieved context (semantic and graph sources) |
| {query_str} | Replaced with the user's question |

When systemPrompt is provided, it entirely replaces the default answer generation prompt. When omitted, the agent uses a built-in prompt optimized for grounded, factual responses.

Example custom prompt:

Context information is below.
---------------------
{context_str}
---------------------
You are a domain expert. Using ONLY the context above, provide a detailed answer.
If the context does not contain enough information, say so clearly.

Query: {query_str}
Answer:
Tip

This follows the same pattern as the RAG agent's systemPrompt. If you already have prompts configured for RAG agents, they work identically with Graph RAG.
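To see what the agent ultimately sends to the LLM, you can substitute the two placeholders yourself. This is plain string formatting of the example prompt above; the actual substitution happens inside the platform:

```python
# The same synthesis prompt as in the configuration example above.
template = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)

# Both placeholders are filled at answer-generation time.
rendered = template.format(
    context_str="Project Alpha is led by the finance team.",
    query_str="Who leads Project Alpha?",
)
print(rendered)
```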

Retrieval Parameters

These can be passed at invocation time to override agent defaults:

| Parameter | Description | Default |
|---|---|---|
| hxqlQuery | HxQL query to filter Content Lake documents | Config default |
| hybridSearch | Enable hybrid search (embeddings + full-text) | Config default |
| limit | Number of chunks to retrieve | Config default |
| adjacentChunkRange | Number of adjacent chunks to fetch around each result | Config default |
| adjacentChunkMerge | Merge adjacent chunks into parent chunk | Config default |
| rerankerTopN | Number of results to keep after reranking | Config default |

Invocation

Non-Streaming

curl -X POST "/v1/agents/{agent_id}/versions/latest/invoke" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What entities are related to Project Alpha?"
      }
    ]
  }'

Invocation with Retrieval Overrides

curl -X POST "/v1/agents/{agent_id}/versions/latest/invoke" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Summarize the HR onboarding process"
      }
    ],
    "hxqlQuery": "SELECT * FROM SysContent WHERE department = '\''HR'\''",
    "hybridSearch": true,
    "limit": 10,
    "adjacentChunkRange": 2
  }'

Multi-Turn Conversation

Graph RAG agents support multi-turn conversations by passing previous messages in the messages array. The last message must always have role: "user".
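A client can enforce that invariant before sending the request. The sketch below builds a multi-turn payload and rejects histories that do not end with a user message; the helper name is hypothetical, not part of the platform:

```python
def validate_messages(messages):
    """Enforce the multi-turn invariant: non-empty, last message from the user."""
    if not messages:
        raise ValueError("messages must not be empty")
    if messages[-1].get("role") != "user":
        raise ValueError('the last message must have role "user"')
    return messages

# Prior turns, oldest first, followed by the new user question.
history = [
    {"role": "user", "content": "What entities are related to Project Alpha?"},
    {"role": "assistant", "content": "Project Alpha is related to..."},
]
follow_up = {"role": "user", "content": "Which of those are in the finance department?"}
payload = {"messages": validate_messages(history + [follow_up])}
```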

Session-Based Memory

Graph RAG agents support optional session-based short-term memory. Add an X-Session-ID header with a consistent UUID to maintain conversation context across multiple invocations. The platform does not auto-generate session IDs — you must generate and manage them yourself (any valid UUID).

curl -X POST "/v1/agents/{agent_id}/versions/latest/invoke" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -H "X-Session-ID: 550e8400-e29b-41d4-a716-446655440000" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What entities are related to Project Alpha?"
      },
      {
        "role": "assistant",
        "content": "Project Alpha is related to..."
      },
      {
        "role": "user",
        "content": "Which of those are in the finance department?"
      }
    ]
  }'

Response Format

The response includes the generated answer along with source information from both retrieval sources:

{
  "model": "anthropic.claude-haiku-4-5-20251001-v1:0",
  "createdAt": 1709654321,
  "output": [
    {
      "type": "text",
      "content": [
        {
          "text": "Based on the retrieved documents and knowledge graph..."
        }
      ]
    }
  ],
  "customOutputs": {
    "sourceNodes": [
      {
        "docId": "doc-123",
        "chunkId": "chunk-456",
        "score": 0.95,
        "text": "Relevant document excerpt..."
      },
      {
        "docId": "entity-789",
        "chunkId": null,
        "score": null,
        "text": "Knowledge graph entity data..."
      }
    ]
  }
}

The sourceNodes array contains entries from both semantic search and knowledge graph sources.
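Judging from the example response above, graph-sourced entries arrive with null chunkId and score, so a client can split the two source types on that field. This is an inference from the sample, not a documented contract:

```python
# A trimmed response in the shape shown above.
response = {
    "customOutputs": {
        "sourceNodes": [
            {"docId": "doc-123", "chunkId": "chunk-456", "score": 0.95,
             "text": "Relevant document excerpt..."},
            {"docId": "entity-789", "chunkId": None, "score": None,
             "text": "Knowledge graph entity data..."},
        ]
    }
}

nodes = response["customOutputs"]["sourceNodes"]
# Semantic hits carry a chunkId and score; graph entries have null for both.
semantic = [n for n in nodes if n["chunkId"] is not None]
graph = [n for n in nodes if n["chunkId"] is None]
```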

Hallucination Detection

Graph RAG includes automatic hallucination detection:

  • The generated answer is validated against the retrieved context
  • If the answer contains information not supported by the context, it is flagged as hallucinated
  • The agent retries answer generation (default: up to 2 retries)
  • If retries are exhausted, the last generated answer is returned

This helps keep responses grounded in the actual retrieved data rather than relying on the LLM's training knowledge.
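The validate-and-retry loop described above can be sketched as follows. The function and callback names are hypothetical stand-ins for the platform's internal generation and validation steps:

```python
def answer_with_validation(generate, is_grounded, context, query, max_retries=2):
    """Sketch of the documented loop: generate an answer, validate it
    against the retrieved context, retry up to max_retries times, and
    fall back to the last generated answer if every attempt fails."""
    answer = generate(context, query)
    for _ in range(max_retries):
        if is_grounded(answer, context):
            return answer
        answer = generate(context, query)  # flagged as hallucinated: retry
    return answer  # retries exhausted: return the last generated answer
```

With the default of 2 retries, a response that is still flagged after the final attempt is returned as-is rather than erroring out.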

Differences from Standard RAG

| Feature | RAG | Graph RAG |
|---|---|---|
| Semantic search | Yes | Yes |
| Knowledge graph retrieval | No | Yes |
| Hallucination detection | Configurable | Built-in |
| Streaming support | Yes | Not yet |
| Session-based memory | Yes | Yes |

Best Practices

  1. Data Preparation

    • Ensure your content is ingested into Content Lake with relevant entities and relationships
    • Graph RAG uses ECE to provide structured context that complements unstructured document search
  2. System Prompt Design

    • Include both {context_str} and {query_str} placeholders
    • Instruct the LLM to use only the provided context
    • Consider mentioning both semantic and graph sources in your prompt
  3. Retrieval Tuning

    • Use hxqlQuery to scope retrieval to relevant document sets
    • Adjust limit based on your use case (more chunks = more context but higher latency)
    • Use rerankerTopN to control how many chunks survive reranking
  4. Guardrails

    • Use the /v1/guardrails endpoint to retrieve available guardrails
    • Guardrails can be set at agent creation time or per invocation