# Embeddings

Model name: `embeddings`

Model aliases:

- `openai_embeddings`
- `nim_embeddings`
## About embeddings

`embeddings` is a model provider that enables use of any OpenAI API-compatible text-embedding model. It's suitable for text classification, clustering, and other text-embedding tasks.

Based on the name under which the model is invoked, the model provider sets defaults accordingly:

- When invoked as `embeddings` or `openai_embeddings`, the model provider defaults to using the OpenAI API.
- When invoked as `nim_embeddings`, the model provider defaults to using the NVIDIA NIM API.
## Supported aidb operations

- `encode_text`
- `encode_text_batch`
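As a sketch of how these operations are invoked, assuming a model named `my_openai_embeddings` has already been created (see the creation examples below; the exact argument shapes shown here are illustrative assumptions):

```sql
-- Encode a single text value with the model created earlier.
SELECT aidb.encode_text('my_openai_embeddings', 'Hello, world!');

-- Encode several texts in one call; batching behavior is governed by
-- the max_batch_size and max_concurrent_requests settings described below.
SELECT aidb.encode_text_batch('my_openai_embeddings',
                              ARRAY['first text', 'second text']);
```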
## Supported models

- Any text-embedding model that's supported by the provider.

### Supported OpenAI models

- Any text-embedding model that's supported by OpenAI, including `text-embedding-3-small` and `text-embedding-ada-002`. See OpenAI's documentation for a full list of supported models.
- The default is `text-embedding-3-small`.

### Supported NIM models

- `nvidia/nv-embedqa-e5-v5` (default)
## Creating the default model with OpenAI

```sql
SELECT aidb.create_model('my_openai_embeddings', 'openai_embeddings',
       credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB);
```

Because this example uses the default model, `text-embedding-3-small`, you don't need to specify the model in the configuration. But you do need to pass an OpenAI API key in the credentials, and since the configuration argument is omitted, you must pass the credentials as a named parameter.
## Creating a specific model

You can create any supported OpenAI embedding model using the `aidb.create_model` function. This example creates a `text-embedding-3-small` model with the name `my_openai_model`:

```sql
SELECT aidb.create_model(
    'my_openai_model',
    'openai_embeddings',
    '{"model": "text-embedding-3-small"}'::JSONB,
    '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);
```

Because this example passes both the configuration options and the credentials positionally, unlike the previous example, you don't need to pass the credentials as a named parameter.
## Model configuration settings

The following configuration settings are available for OpenAI models:

- `model` — The OpenAI model to use.
- `url` — The URL of the model to use. This value is optional and can be used to specify a custom model URL.
    - If `openai_embeddings` (or `embeddings`) is the model, `url` defaults to `https://api.openai.com/v1/embeddings`.
    - If `nim_embeddings` is the model, `url` defaults to `https://integrate.api.nvidia.com/v1/embeddings`.
- `max_concurrent_requests` — The maximum number of concurrent requests to make to the OpenAI model. The default is `25`.
- `max_batch_size` — The maximum number of records to send to the model in a single request. The default is `50,000`.
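For example, these settings can be supplied in the configuration argument at creation time. This is a sketch; the endpoint URL and model name below are placeholder assumptions for a self-hosted OpenAI API-compatible service:

```sql
SELECT aidb.create_model(
    'my_custom_embeddings',
    'openai_embeddings',
    '{"model": "text-embedding-3-small",
      "url": "https://models.example.internal/v1/embeddings",
      "max_concurrent_requests": 10,
      "max_batch_size": 5000}'::JSONB,
    '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);
```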
## Batch and parallel processing

The model providers for `embeddings`, `openai_embeddings`, and `nim_embeddings` support sending batch requests as well as concurrent requests. The two settings `max_concurrent_requests` and `max_batch_size` control this behavior. When a model provider receives a set of records (for example, from a knowledge base pipeline), the following happens:

- Assume the knowledge base pipeline is configured with batch size 10,000.
- And the model provider is configured with `max_batch_size=1000` and `max_concurrent_requests=5`.
- The provider collects up to 1000 records and sends them in a single request to the model.
- It sends 5 such large requests concurrently, until no more input records are left.
- So in this example, the provider needs to send and receive 10 batches in total.
- After sending the first 5, it waits for the responses to return.
- Once a response is received, another request can be sent.
- This means the provider doesn't wait for all 5 responses before sending off the next 5. Instead, it always keeps up to 5 requests in flight.
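The arithmetic in the walkthrough above can be sketched as follows (a standalone illustration, not aidb code):

```python
import math

def plan_requests(total_records: int, max_batch_size: int, max_concurrent: int):
    """Split a set of records into provider-sized batches and report
    how many requests are needed and the peak number in flight."""
    num_requests = math.ceil(total_records / max_batch_size)
    peak_in_flight = min(max_concurrent, num_requests)
    return num_requests, peak_in_flight

# The example above: a 10,000-record pipeline batch with max_batch_size=1000
# and max_concurrent_requests=5 yields 10 requests, at most 5 in flight.
print(plan_requests(10_000, 1000, 5))  # (10, 5)
```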
Note

The settings `max_concurrent_requests` and `max_batch_size` can have a significant impact on model performance, but the optimal values depend heavily on your hardware and infrastructure. We recommend leaving the defaults in place and tuning performance via the knowledge base pipeline batch size. The default `max_batch_size` of 50,000 is intentionally high to allow the pipeline to control the actual size of the batches.
## Model credentials

The following credentials may be required by the service providing these models. Note: `api_key` and `basic_auth` are mutually exclusive. Only one of these two options can be used.

- `api_key` — The API key to use for Bearer token authentication. The `api_key` is sent in a header field as `Authorization: Bearer <api_key>`.
- `basic_auth` — Credentials for HTTP Basic authentication. The credentials provided here are sent verbatim as `Authorization: Basic <basic_auth>`. If your server requires a username and password for HTTP Basic authentication, use the syntax `username:password`. If only a token is required, provide only the token.