RAG tool
Introduced 2.13
The RAGTool performs retrieval-augmented generation (RAG). For more information about RAG, see Conversational search.
RAG calls a large language model (LLM) and supplements its knowledge by providing relevant OpenSearch documents along with the user question. To retrieve relevant documents from an OpenSearch index, you’ll need a text embedding model that facilitates vector search.
The RAG tool supports the following search methods:
- Neural search: Dense vector retrieval, which uses a text embedding model.
- Neural sparse search: Sparse vector retrieval, which uses a sparse encoding model.
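The search method is selected with the tool's `query_type` parameter. As a sketch, switching to neural sparse search means setting `query_type` to `neural_sparse` and pointing `embedding_model_id` at a deployed sparse encoding model (the model ID below is a placeholder):

```json
{
  "type": "RAGTool",
  "parameters": {
    "query_type": "neural_sparse",
    "embedding_model_id": "<sparse encoding model ID>",
    "index": "my_test_data",
    "embedding_field": "embedding",
    "source_field": ["text"]
  }
}
```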
Before you start
To register and deploy a text embedding model and an LLM, and to ingest data into an index, perform Steps 1–5 of the Agents and tools tutorial.
The following example uses neural search. To configure neural sparse search and deploy a sparse encoding model, see Neural sparse search.
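As a shortcut to the tutorial's model-setup steps, a pretrained text embedding model can be registered and deployed in one request along these lines (the model name and version below are illustrative; check the OpenSearch pretrained model list for supported values):

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L6-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```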
Step 1: Register a flow agent that will run the RAGTool
A flow agent runs a sequence of tools in order and returns the last tool’s output. To create a flow agent, send the following request, providing the text embedding model ID in the embedding_model_id parameter and the LLM model ID in the inference_model_id parameter:
```json
POST /_plugins/_ml/agents/_register
{
  "name": "Test_Agent_For_RagTool",
  "type": "flow",
  "description": "this is a test flow agent",
  "tools": [
    {
      "type": "RAGTool",
      "description": "A description of the tool",
      "parameters": {
        "embedding_model_id": "Hv_PY40Bk4MTqircAVmm",
        "inference_model_id": "SNzSY40B_1JGmyB0WbfI",
        "index": "my_test_data",
        "embedding_field": "embedding",
        "query_type": "neural",
        "source_field": [
          "text"
        ],
        "input": "${parameters.question}",
        "prompt": "\n\nHuman:You are a professional data analyst. You will always answer question based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say don't know. \n\n Context:\n${parameters.output_field}\n\nHuman:${parameters.question}\n\nAssistant:"
      }
    }
  ]
}
```
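OpenSearch responds to the registration request with an `agent_id`. To run the agent, substitute that ID into an execute request; the question is passed through the `input` parameter defined in the tool (the question below is only an example):

```json
POST /_plugins/_ml/agents/<agent_id>/_execute
{
  "parameters": {
    "question": "What is the population increase of Seattle from 2021 to 2023?"
  }
}
```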