Summary

Overview

This repository supports knowledge retrieval for CIN Knowledge Discovery. It contains several services, including the Semantic API, the Agent API, and the Ingestion Processor Lambda, which work together to process, analyze, and retrieve data.

Prerequisites

.NET
Terraform
Docker
Node.js and npm (for k6 testing)

Architecture

For details on the design and functionality of the services, see the Knowledge Retrieval Design page on Confluence.

Components

Provisioning

Provisioner: Orchestrates the provisioning of Hxp environment resources using Terraform Cloud.

Ingestion

Ingestion Events Writer Lambda: Delivers incoming events, which contain objects along with their metadata, to an S3 bucket.
Objects Ingestor Lambda: Imports data from Stage into the objects tables in Snowflake.
Embeddings Ingestor Lambda: Imports embedding data from Stage into the embeddings table in Snowflake.

Knowledge Retrieval (aka RAG pipeline)

Prompt Processor Lambda: Processes the user's question and generates an answer.
Semantic API: Provides semantic search capabilities over the Snowflake vector database.
Agent API: Allows defining and configuring agents and submitting queries.
QnA API: Provides access to the questions submitted by users and the answers generated for them.
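
To make the semantic search step more concrete, here is a minimal TypeScript sketch of ranking stored embeddings against a query embedding. It is illustrative only: the `EmbeddedChunk` shape, the `cosineSimilarity` and `topK` helpers, and the choice of cosine similarity are assumptions, not the Semantic API's actual implementation, which runs against Snowflake rather than in memory.

```typescript
// Hypothetical shape of a stored chunk; the real Snowflake schema is not described here.
interface EmbeddedChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // treat zero vectors as dissimilar to everything
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding, best match first.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Toy data with 3-dimensional embeddings (real embeddings have far more dimensions).
const chunks: EmbeddedChunk[] = [
  { id: "a", text: "invoice policy", embedding: [1, 0, 0] },
  { id: "b", text: "travel policy", embedding: [0, 1, 0] },
  { id: "c", text: "invoice approval", embedding: [0.9, 0.1, 0] },
];

const results = topK([1, 0, 0], chunks, 2);
console.log(results.map((c) => c.id));
```

In a RAG pipeline, the top-ranked chunks would typically be passed as context to the step that generates the answer; how the Prompt Processor Lambda does this is covered in the design page rather than here.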

Testing

Parquet Generator: Generates Parquet files for testing embedding ingestion.
Ingestion Generator: Generates ingestion events for testing ingestion.

API Testing

The api-scenarios folder contains various scenarios that demonstrate how to interact with the APIs. These scenarios can be used to manually test the functionality of the APIs.

The tests location also contains automated tests written with Playwright (API, e2e, and smoke tests) and k6 (performance tests).

Performance Tests

The performance-tests folder contains performance tests that can be run as k6 scripts.
