Qdrant (Beta)

The Qdrant Micro-Integration publishes events from a PubSub+ event broker to a Qdrant vector database. It converts event data into vectors using an embedding model. A vector is a list of numbers that represents the semantic meaning of a chunk of text. An embedding model is the machine learning model that performs this conversion.

You can configure Qdrant only as a target.

Qdrant Target Parameters

The following table describes the parameters for configuring Qdrant as a target.

Field Description
Qdrant Target Connection Details

Select Content Splitter

Select the method for splitting incoming content for vectorization. Valid values are:

  • JSON Document Splitter—Splits JSON data. Additional parameters are required. For more information, see JSON Document Splitter Parameters.

  • Character Document Splitter—Splits text data using a single separator. Additional parameters are required. For more information, see Character Document Splitter Parameters.

  • Structured Text Splitter—Splits structured text using one or more separators. Additional parameters are required. For more information, see Structured Text Splitter Parameters.

  • No Splitter to Use with Pre-Split Unstructured Data—Does not split incoming data.

OpenAI-Compatible Embedding Model Provider

Endpoint URL

The API endpoint URL for the embedding model provider. For example: https://api.openai.com/v1/embeddings

Model ID

The specific version or name of the embedding model to use, following the <access-protocol-provider>/<model-id> format, for example, openai/text-embedding-3-large.

The access protocol provider specifies which service to use to access the model. The model ID specifies the embedding model within that provider's catalog.

The value of <access-protocol-provider> must be openai.

Secret Access Key

The API Secret Access Key for the embedding model provider.

Organization ID

The OpenAI Organization ID.

Chunk Size

The size of the text segments into which a document is divided for embedding and storage in the vector database. The default is 1000.

Embedding Model Context Length

The maximum number of tokens that the embedding model embeds at once. The default is 8191.

Number of Embedding Dimensions

The number of dimensions in the resulting output embeddings.

Qdrant Vector Database

Endpoint URL

The API endpoint URL for the Qdrant Vector Database. For example: https://xyz-example.eu-central.aws.cloud.qdrant.io:6333

Secret Access Key

The API Secret Access Key for the Qdrant provider.

Collection Name

The name of the Qdrant collection.

A collection in Qdrant is a container that stores vectors. All vectors in a collection must have the same number of dimensions and must have been created with the same model. Collections allow you to efficiently search for and retrieve vectors that are similar to a vector you specify in a query. For more information, see Collections in the Qdrant documentation.

If the specified collection does not exist, the Micro-Integration creates a new empty basic collection with the specified name.
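
The connection parameters above can be pictured as two HTTP requests: one to the embedding endpoint and one that creates the collection when it is missing. The following Python sketch only builds the requests and does not send them. It is illustrative rather than a description of the Micro-Integration's internals: the function names are hypothetical, the header and body fields follow the public OpenAI embeddings API and Qdrant REST API, and both the stripping of the openai/ prefix and the Cosine distance metric are assumptions.

```python
import json
import urllib.request


def parse_model_id(model_id: str) -> tuple[str, str]:
    """Split '<access-protocol-provider>/<model-id>' and check the provider."""
    provider, sep, model = model_id.partition("/")
    if not sep or not model:
        raise ValueError(f"expected <provider>/<model-id>, got {model_id!r}")
    if provider != "openai":
        raise ValueError(f"unsupported access protocol provider: {provider!r}")
    return provider, model


def embedding_request(endpoint_url, model_id, api_key, texts,
                      dimensions=None, organization_id=None):
    """Build (but do not send) a POST request for an OpenAI-compatible endpoint."""
    # Assumption: the provider prefix is removed before the model name is sent.
    _, model = parse_model_id(model_id)
    body = {"model": model, "input": texts}
    if dimensions is not None:
        body["dimensions"] = dimensions  # Number of Embedding Dimensions
    headers = {
        "Authorization": f"Bearer {api_key}",  # Secret Access Key
        "Content-Type": "application/json",
    }
    if organization_id:
        headers["OpenAI-Organization"] = organization_id  # Organization ID
    return urllib.request.Request(
        endpoint_url, data=json.dumps(body).encode("utf-8"),
        headers=headers, method="POST")


def create_collection_request(endpoint_url, api_key, collection_name,
                              dimensions, distance="Cosine"):
    """Build (but do not send) a PUT request that creates a Qdrant collection."""
    url = f"{endpoint_url.rstrip('/')}/collections/{collection_name}"
    # Assumption: a single unnamed vector configuration with Cosine distance.
    body = {"vectors": {"size": dimensions, "distance": distance}}
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"),
        headers=headers, method="PUT")
```

Passing either request object to urllib.request.urlopen would send it. The vector size used for the collection has to match the number of dimensions that the embedding model returns.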

JSON Document Splitter Parameters

Field Description

Max Chunk Size

The maximum number of tokens per text chunk during document splitting. This value controls the size of the segments created during text ingestion for vector storage. The default is 2000.

Larger chunks preserve more context but may reduce retrieval precision. Smaller chunks improve granular search but may lose contextual relationships.

Character Document Splitter Parameters

Field Description

Chunk Size

The maximum number of characters allowed in each text segment when splitting a document. The default is 4000.

Chunk Overlap

The number of characters that overlap between consecutive chunks. Overlap helps to maintain context between split segments. The default is 200.

Separator

The character sequence used to split text into chunks, allowing control over split points at natural boundaries like paragraph breaks. The default is "\n\n".
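
The interaction of Chunk Size, Chunk Overlap, and Separator can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Micro-Integration's implementation: the function name is hypothetical, overlap is carried forward as whole pieces rather than exact character counts, and a piece longer than the chunk size is emitted unchanged because a single separator offers no finer split point.

```python
def split_by_separator(text: str, chunk_size: int = 4000,
                       chunk_overlap: int = 200,
                       separator: str = "\n\n") -> list[str]:
    """Split on one separator, then merge pieces into overlapping chunks."""
    if not separator:
        raise ValueError("separator must not be empty")
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    pieces = [p for p in text.split(separator) if p]
    chunks, window = [], []
    for piece in pieces:
        if window and len(separator.join(window + [piece])) > chunk_size:
            chunks.append(separator.join(window))
            # Keep only as many trailing pieces as fit in the overlap budget.
            while window and len(separator.join(window)) > chunk_overlap:
                window.pop(0)
            # Drop more if the carried-over text would overflow the next chunk.
            while window and len(separator.join(window + [piece])) > chunk_size:
                window.pop(0)
        window.append(piece)
    if window:
        chunks.append(separator.join(window))
    return chunks
```

For example, splitting "aaaa\n\nbbbb\n\ncccc\n\ndddd" with a chunk size of 10 and an overlap of 4 yields three chunks, where each chunk repeats the last piece of the previous one.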

Structured Text Splitter Parameters

Field Description

Chunk Size

The maximum number of characters allowed in each text segment when splitting a document. The default is 4000.

Chunk Overlap

The number of characters that overlap between consecutive chunks. Overlap helps to maintain context between split segments. The default is 200.

Separators

The character sequences used to split text into chunks based on text structure, which allows control over split points at natural boundaries such as paragraph breaks. Specify a comma-separated list of separators, for example: "separator1","separator2". Each separator is applied individually. The default is "\n\n", "\n", " ", "".
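
One common way to use a list of separators is to try them in order, falling back to the next separator only for pieces that are still too large. The Python sketch below shows that approach. It is an assumption about how the separators interact, not a statement of the Micro-Integration's behavior: the function names are hypothetical, an empty separator is taken to mean splitting at any character, and overlap is taken from the last characters of the previous chunk.

```python
def _split_recursive(text: str, chunk_size: int, separators: list[str]) -> list[str]:
    """Return pieces of at most chunk_size characters whose concatenation is text."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    if sep not in text:
        return _split_recursive(text, chunk_size, rest)
    parts = text.split(sep)
    pieces = []
    for i, part in enumerate(parts):
        # Keep the separator attached so that no text is lost.
        piece = part + sep if i < len(parts) - 1 else part
        pieces.extend(_split_recursive(piece, chunk_size, rest))
    return pieces


def split_structured(text: str, chunk_size: int = 4000, chunk_overlap: int = 200,
                     separators=("\n\n", "\n", " ", "")) -> list[str]:
    """Split on the coarsest separator that works, then merge small pieces."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    chunks, current = [], ""
    for piece in _split_recursive(text, chunk_size, list(separators)):
        if current and len(current) + len(piece) > chunk_size:
            chunks.append(current)
            # Carry over a tail of the previous chunk without exceeding chunk_size.
            budget = min(chunk_overlap, chunk_size - len(piece))
            current = current[-budget:] if budget > 0 else ""
        current += piece
    if current:
        chunks.append(current)
    return chunks
```

With an overlap of 0, concatenating the returned chunks reproduces the input text exactly, which is a convenient check when experimenting with separator lists.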