Preparer Chaining Example

This example chains multiple preparers together with auto processing, using the ChunkText and SummarizeText operations in AI Accelerator.

Create the first Preparer to chunk text

-- Create source test table
CREATE TABLE source_table__1321
(
    id      INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    content TEXT NOT NULL
);

SELECT aidb.create_table_preparer(
    name => 'chunking_preparer__1321',
    operation => 'ChunkText',
    source_table => 'source_table__1321',
    source_key_column => 'id',
    source_data_column => 'content',
    destination_table => 'chunked_data__1321',
    destination_data_column => 'chunk',
    destination_key_column => 'id',
    options => '{"desired_length": 1, "max_length": 1000}'::JSONB  -- Configuration for the ChunkText operation
);

Create the second Preparer to summarize the chunked text

-- Create the model. It must support the decode_text and decode_text_batch operations.
SELECT aidb.create_model('model__1321', 't5_local');
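
Before wiring the model into a preparer, you can optionally sanity-check it by calling it directly. This is only a sketch: it assumes the aidb.decode_text model primitive is available in your AI Accelerator version and accepts a model name and an input text.

-- Assumed primitive call: run the model directly on a sample text and return the decoded (summarized) output.
SELECT aidb.decode_text('model__1321', 'A longer passage of text to summarize as a quick test of the model.');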

SELECT aidb.create_table_preparer(
    name => 'summarizing_preparer__1321',
    operation => 'SummarizeText',
    source_table => 'chunked_data__1321',         -- Reference the output from the ChunkText preparer
    source_key_column => 'unique_id',             -- Reference the unique column from the output of the ChunkText preparer
    source_data_column => 'chunk',                -- Reference the output from the ChunkText preparer
    destination_table => 'summarized_data__1321',
    destination_data_column => 'summary',
    destination_key_column => 'chunk_unique_id',
    options => '{"model": "model__1321"}'::JSONB  -- Configuration for the SummarizeText operation
);

Tip: This operation transforms the shape of the data, automatically unnesting collections by introducing a part_id column. See the unnesting concept for more detail.
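
As a minimal illustration of that shape change, each source row fans out into one destination row per chunk, keyed as "<id>.part.<part_id>". Once data has been inserted and processed (shown later in this example), you can confirm the fan-out with a plain query against the destination table:

-- Illustrative only: count the chunk rows produced for each source row.
SELECT id, count(*) AS chunk_count
FROM chunked_data__1321
GROUP BY id;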

Set both Preparers to Live automatic processing

SELECT aidb.set_auto_preparer('chunking_preparer__1321', 'Live');
SELECT aidb.set_auto_preparer('summarizing_preparer__1321', 'Live');
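
Live mode picks up changes as they happen, so rows that were already in the source table before enabling it aren't reprocessed retroactively. As a hedged sketch, if your version provides a bulk-processing call for preparers (the function name aidb.bulk_data_preparer below is an assumption; check the preparer reference for your release), you could run each preparer once over existing data, in pipeline order:

-- Assumed function name: process pre-existing rows once, chunking first, then summarizing.
SELECT aidb.bulk_data_preparer('chunking_preparer__1321');
SELECT aidb.bulk_data_preparer('summarizing_preparer__1321');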

Insert data for processing

Now, when we insert data into the source table, processed results flow through both preparers automatically:

INSERT INTO source_table__1321
VALUES (1, 'This is a significantly longer text example that might require splitting into smaller chunks. The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters. This enables processing or storage of data in manageable parts.');

Chunks calculated automatically:

SELECT * FROM chunked_data__1321;
Output
 id | part_id | unique_id |                                                                       chunk
----+---------+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  |       0 | 1.part.0  | This is a significantly longer text example that might require splitting into smaller chunks.
 1  |       1 | 1.part.1  | The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters.
 1  |       2 | 1.part.2  | This enables processing or storage of data in manageable parts.
(3 rows)

Summaries of the chunks calculated automatically:

SELECT * FROM summarized_data__1321;
Output
 chunk_unique_id |                                               summary
-----------------+------------------------------------------------------------------------------------------------------
 1.part.0        | text example might require splitting into smaller chunks .
 1.part.1        | the purpose of this function is to partition text data into segments of a specified maximum length .
 1.part.2        | enables processing or storage of data in manageable parts .
(3 rows)
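
To see the whole chain in one result, you can join the two destination tables on the key produced by the unnesting step. This is a plain SQL sketch over the tables created above; unique_id on the chunk side matches chunk_unique_id on the summary side:

-- Join each chunk to its summary via the unnesting key.
SELECT c.id, c.part_id, c.chunk, s.summary
FROM chunked_data__1321 c
JOIN summarized_data__1321 s ON s.chunk_unique_id = c.unique_id
ORDER BY c.id, c.part_id;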

The same automatic processing applies to deletions:

DELETE FROM source_table__1321 WHERE id = 1;
SELECT * FROM chunked_data__1321;
Output
 id | part_id | unique_id | chunk
----+---------+-----------+-------
(0 rows)

SELECT * FROM summarized_data__1321;
Output
 chunk_unique_id | summary
-----------------+---------
(0 rows)
