Category: All posts
Aug 12, 2024
Enterprise-ready LLMs from Cohere—now available in the pgai PostgreSQL extension.
Cohere is a leading generative AI company in the field of large language models (LLMs) and retrieval-augmented generation (RAG) systems. What sets Cohere apart from other model developers is its unwavering focus on enterprise needs and support for multiple languages.
Cohere has quickly gained traction among businesses and developers alike, thanks to a suite of models that caters to key steps of the AI application-building process: Embed for text embedding, Rerank for result reranking, and Command for reasoning.
Pgai is an open-source PostgreSQL extension that brings AI models closer to your data, simplifying tasks such as embedding creation, text classification, semantic search, and retrieval-augmented generation on data stored in PostgreSQL.
Pgai supports the entire suite of Cohere Embed, Rerank, and Command models, so developers can now use each of them directly from SQL.
We built pgai to give more PostgreSQL developers AI engineering superpowers. Pgai makes it easier for database developers familiar with PostgreSQL to become "AI engineers" by providing familiar SQL-based interfaces to AI functionalities like embedding creation and model usage.
While Command is the flagship model for building scalable, production-ready AI applications, pgai supports a variety of models in the Cohere lineup. This includes specialized embedding models that support over 100 languages, enabling developers to select the most suitable model for their specific enterprise use case.
Whether it's building internal AI knowledge assistants, security documentation AI, customer feedback analysis, or employee support systems, Cohere's models integrated with pgai and PostgreSQL offer powerful solutions.
Getting started with Cohere models in pgai
Ready to start building with Cohere's suite of powerful language models and PostgreSQL? Pgai is open source under the PostgreSQL License and is available for immediate use in your AI projects. You can find installation instructions on the pgai GitHub repository. You can also access pgai (alongside pgvector and pgvectorscale) on any database service on Timescale’s Cloud PostgreSQL platform. If you’re new to Timescale, you can get started with a free cloud PostgreSQL database here.
Once connected to your database, create the pgai extension by running:
CREATE EXTENSION IF NOT EXISTS ai CASCADE;
Join the Postgres AI Community
Have questions about using Cohere models with pgai? Join the Postgres for AI Discord, where you can share your projects, seek help, and collaborate with a community of peers. You can also open an issue on the pgai GitHub (and while you’re there, stars are always appreciated ⭐).
Next, let's explore the benefits of using Cohere models for building enterprise AI applications. And we’ll close with a real-world example of using Cohere models to build a hybrid search system, combining pgvector semantic search with PostgreSQL full-text search.
Let's dive in...
Cohere's suite of models offers several advantages for enterprise-grade RAG and search applications:
High-quality results: Cohere models excel in understanding business language, providing relevant content, and tackling complex enterprise challenges.
Multi-language support: With support for over 100 languages, Cohere models are ideal for global enterprises. For example, Command R+ is optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. Additionally, pre-training data has been included for the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.
Flexible deployment: Cohere’s flexible deployment options allow businesses to bring Cohere's models to their own data, either on their servers or in private/commercial clouds, addressing critical security and compliance concerns. This flexibility, combined with Cohere's robust API and partnerships with major cloud providers like Amazon Web Services (through the Bedrock AI platform), ensures seamless integration and improved scalability for enterprise applications.
Enterprise focus: Strategic partnerships with industry leaders like Fujitsu, Oracle, and McKinsey & Company underscore Cohere's enterprise-centric approach.
Cohere models enable sophisticated enterprise AI applications such as internal knowledge assistants, customer feedback analysis, and employee support systems.
Cohere's models offer a powerful, scalable, and easily implementable solution for enterprise search and RAG applications, balancing high accuracy with efficient performance.
Let’s look at an example of why Cohere's support in pgai is such a game-changer. We’ll perform a hybrid search over a corpus of news articles, combining PostgreSQL full-text search with pgvector’s semantic search, leveraging Cohere Embed and Rerank models in the process.
By the end of this tutorial, you'll have a powerful hybrid search system that provides highly relevant search results, leveraging both traditional keyword search and semantic search techniques.
To begin, we'll set up a Python virtual environment and install the necessary libraries. This ensures a clean, isolated environment for our project.
mkdir cohere
cd cohere
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
pip install datasets "psycopg[binary]" python-dotenv
This setup installs three libraries:

- datasets: for loading our sample dataset
- psycopg: the PostgreSQL adapter for Python
- python-dotenv: for managing environment variables

Next, create a .env file in your project directory to store your database connection string and Cohere API key:
DB_URL=postgres://username:password@localhost:5432/your_database
COHERE_API_KEY=your_cohere_api_key_here
Replace the placeholders with your actual database credentials and Cohere API key.
Now, let's set up our PostgreSQL database with the necessary extensions and table structure. We'll create a table to hold our news articles. The embedding column will hold the embeddings we'll create using the Cohere Embed model, and the tsv column will hold our full-text-search vectors.
Connect to your PostgreSQL database and run the following SQL commands:
-- create the pgai extension
create extension if not exists ai cascade;
-- create a table for the news articles
-- the embedding column will store embeddings from cohere
-- the tsv column will store the full-text-search vector
create table cnn_daily_mail
( id bigint not null primary key generated by default as identity
, highlights text
, article text
, embedding vector(1024)
, tsv tsvector generated always as (to_tsvector('english', article)) stored
);
-- index the full-text-search vector
create index on cnn_daily_mail using gin (tsv);
This SQL script:

- creates the pgai extension
- creates the cnn_daily_mail table, with an embedding column for the Cohere embeddings and a generated tsv column for the full-text-search vector
- adds a GIN index on the tsv column

Now that our database is set up, we'll load sample data from the CNN/Daily Mail dataset. We'll use Python to download the dataset and efficiently insert it into our PostgreSQL table using the copy command with the binary format.
Create a new file called load_data.py
in your project directory and add the following code:
import os
from dotenv import load_dotenv
from datasets import load_dataset
import psycopg
# Load environment variables
load_dotenv()
DB_URL = os.environ["DB_URL"]
COHERE_API_KEY = os.environ["COHERE_API_KEY"]
# Load and prepare the dataset
dataset = load_dataset("cnn_dailymail", "3.0.0")
test = dataset["test"]
test = test.shuffle(seed=42).select(range(0,1000))
# Insert data into PostgreSQL
with psycopg.connect(DB_URL) as con:
    with con.cursor(binary=True) as cur:
        with cur.copy("copy public.cnn_daily_mail (highlights, article) from stdin (format binary)") as cpy:
            cpy.set_types(['text', 'text'])
            for row in test:
                cpy.write_row((row["highlights"], row["article"]))

print("Data loading complete!")
This script loads the CNN/Daily Mail dataset, shuffles it and selects 1,000 articles, then bulk-inserts them into the cnn_daily_mail table using PostgreSQL's binary copy protocol.
To run the script and load the data:
python load_data.py
This approach is much faster than inserting rows one by one, especially for larger datasets. After running the script, you should have 1000 news articles loaded into your cnn_daily_mail table, ready for the next steps in our tutorial.
Note: Before running this script, make sure your .env file is properly set up with the correct database URL.
In this step, we'll use Cohere's Embed model to generate vector embeddings for our news articles. These embeddings will enable semantic search capabilities.
First, set your Cohere API key in your PostgreSQL session:
select set_config('ai.cohere_api_key', '<YOUR-API-KEY>', false) is not null as set_cohere_api_key;
The pgai extension will use this key by default for the duration of your database session.
We'll explore two methods to generate embeddings: a simple approach and a more production-ready solution. Choose method 1 if you’re just experimenting and method 2 if you want to see how you’d run this in production.
Method 1: Easy mode (for small datasets)
This is the "easy-mode" approach to generating embeddings for each news article.
-- this is the easy way to create the embeddings
-- but it's one LONG-running update statement
update cnn_daily_mail set embedding = cohere_embed('embed-english-v3.0', article, _input_type=>'search_document');
This method is straightforward, but each call to Cohere is relatively slow, so this single update statement becomes long-running on large datasets.
Method 2: Production mode (recommended for larger datasets)
This method processes rows one at a time, allowing for better concurrency and error handling. We select and lock only a single row at a time, and we commit our work as we go along.
-- this is a more production-appropriate way to generate the embeddings
-- we only lock one row at a time and we commit each row immediately
do $$
declare
_id bigint;
_article text;
_embedding vector(1024);
begin
loop
select id, article into _id, _article
from cnn_daily_mail
where embedding is null
for update skip locked
limit 1;
if not found then
exit;
end if;
_embedding = cohere_embed('embed-english-v3.0', _article, _input_type=>'search_document');
update cnn_daily_mail set embedding = _embedding where id = _id;
commit;
end loop;
end;
$$;
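The same claim-one-row, embed, then commit pattern can also live in application code. Here is a minimal Python sketch of that control flow, with the database and Cohere calls abstracted as callables so the shape of the loop is clear. The function names are illustrative, not part of pgai; in a real worker, fetch_next would run the select ... for update skip locked query, embed would call Cohere, and save would update and commit the row.

```python
from typing import Callable, Optional

def embed_missing_rows(
    fetch_next: Callable[[], Optional[tuple[int, str]]],  # returns (id, article) or None when done
    embed: Callable[[str], list[float]],                   # e.g. a call to Cohere Embed
    save: Callable[[int, list[float]], None],              # update + commit a single row
) -> int:
    """Process one row at a time until none remain; returns the number of rows processed."""
    count = 0
    while True:
        row = fetch_next()
        if row is None:
            break
        row_id, article = row
        save(row_id, embed(article))
        count += 1
    return count

# In-memory stand-ins to show the flow (a toy "embedding" of one dimension).
pending = [(1, "first article"), (2, "second article")]
done: dict[int, list[float]] = {}
n = embed_missing_rows(
    fetch_next=lambda: pending.pop(0) if pending else None,
    embed=lambda text: [float(len(text))],
    save=lambda i, e: done.update({i: e}),
)
print(n, done)  # 2 {1: [13.0], 2: [14.0]}
```

Because each row is saved and committed independently, a crash or API error only loses the row in flight, and several workers can run the loop concurrently without stepping on each other.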
Note that both methods above use the cohere_embed() function provided by the pgai extension to generate embeddings for each article in the cnn_daily_mail table using the Cohere Embed v3 embedding model.
After running one of the above methods, verify that all rows have embeddings:
-- this should return 0
select count(*) from cnn_daily_mail where embedding is null;
Create a vector search index
To optimize vector similarity searches, create an index on the embedding column:
-- index the embeddings
create index on cnn_daily_mail using hnsw (embedding vector_cosine_ops);
This index uses the pgvector HNSW (hierarchical navigable small world) algorithm, which is efficient for nearest-neighbor searches in high-dimensional spaces.
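For intuition, the metric behind vector_cosine_ops and the <=> operator is cosine distance. Here is a plain-Python illustration of that computation (a self-contained sketch of the math, not pgvector's implementation):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity, the metric pgvector's <=> operator uses."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Vectors pointing the same direction have distance 0; orthogonal vectors have distance 1.
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Ordering by <=> therefore puts the most semantically similar articles first, since smaller distance means closer meaning.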
With these steps completed, your database is now set up for both keyword-based and semantic searches. In the next section, we'll explore how to perform these searches using pgai and Cohere's models.
PostgreSQL's full-text search capabilities allow us to perform complex keyword searches efficiently. We'll use the tsv
column we created earlier, which contains the tsvector representation of each article.
-- a full-text-search
-- must include "death" OR "kill"
-- must include "police" AND "car" AND "dog"
select article
from cnn_daily_mail
where article @@ to_tsquery('english', '(death | kill) & police & car & dog')
;
Let's break down this query:

- @@ is the text search match operator in PostgreSQL.
- to_tsquery('english', '...') converts our search terms into a tsquery type.

This query will return articles that contain "death" or "kill", and that also contain "police", "car", and "dog".
This method of searching is fast and efficient, especially for exact keyword matches inside the article text. However, it doesn't capture semantic meaning or handle synonyms well. In the next section, we'll explore how to perform semantic searches using the embeddings we created earlier.
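If you assemble such queries in application code, a small helper can compose the tsquery string from an "any of" group and an "all of" group. This helper is hypothetical, shown only to illustrate the tsquery syntax; if the terms come from user input, prefer PostgreSQL's websearch_to_tsquery (or proper escaping) instead of string concatenation:

```python
def build_tsquery(any_of: list[str], all_of: list[str]) -> str:
    """Compose a tsquery string of the form (t1 | t2 | ...) & a1 & a2 & ..."""
    parts = []
    if any_of:
        parts.append("(" + " | ".join(any_of) + ")")
    parts.extend(all_of)
    return " & ".join(parts)

# Reproduces the query string used above.
print(build_tsquery(["death", "kill"], ["police", "car", "dog"]))
# (death | kill) & police & car & dog
```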
Semantic search allows us to find relevant articles based on the meaning of a query rather than just matching keywords. We'll use Cohere's Embed model to convert our search query into a vector embedding, then find the most similar article embeddings using pgvector.
-- get the 15 most relevant stories to our question
with q as
(
select cohere_embed
('embed-english-v3.0'
, 'Show me stories about police reports of deadly happenings involving cars and dogs.'
, _input_type=>'search_query'
) as q
)
select article
from cnn_daily_mail
order by embedding <=> (select q from q limit 1)
limit 15
;
Let's break this query down:

- We use a CTE named q to generate the embedding for our search query using Cohere's model (the resulting column is also named q).
- cohere_embed() is a function provided by pgai that interfaces with Cohere's API to create embeddings. See more in the pgai docs.
- We pass _input_type => 'search_query' to optimize the embedding for search queries.
- We order results by cosine distance (<=>) between each article's embedding and our query embedding.
- The limit 15 clause returns the 15 most semantically similar articles.

This semantic search can find relevant articles even if they don't contain the exact words used in the query. It understands the context and meaning of the search query and matches it with semantically similar content.
Hybrid search combines the strengths of both keyword and semantic searches and then uses Cohere's Rerank model to improve the relevance of the results further. This approach can provide more accurate and contextually relevant results than either method alone.
Here's how to perform a hybrid search with reranking:
with full_text_search as
(
select article
from cnn_daily_mail
where article @@ to_tsquery('english', '(death | kill) & police & car & dog')
limit 15
)
, vector_query as
(
select cohere_embed
('embed-english-v3.0'
, 'Show me stories about police reports of deadly happenings involving cars and dogs.'
, _input_type=>'search_query'
) as query_embedding
)
, vector_search as
(
select article
from cnn_daily_mail
order by embedding <=> (select query_embedding from vector_query limit 1)
limit 15
)
, rerank as
(
select cohere_rerank
( 'rerank-english-v3.0'
, 'Show me stories about police reports of deadly happenings involving cars and dogs.'
, (
select jsonb_agg(x.article)
from
(
select *
from full_text_search
union
select * from vector_search
) x
)
, _top_n => 5
, _return_documents => true
) as response
)
select
x.index
, x.document->>'text' as article
, x.relevance_score
from rerank
cross join lateral jsonb_to_recordset(rerank.response->'results') x(document jsonb, index int, relevance_score float8)
order by relevance_score desc
;
Let's break down this query:

- full_text_search: performs a keyword search using PostgreSQL's full-text search capabilities.
- vector_query: creates an embedding for our search query using Cohere's Embed model.
- vector_search: performs a semantic search using the query embedding.
- rerank: combines the results of both searches with a union, then uses Cohere's Rerank model to reorder them by relevance to the query.
- The final select statement extracts and formats the reranked results.

This hybrid approach combines the recall of keyword search with the semantic understanding of embeddings, then lets the Rerank model surface the most relevant results.
To use this query effectively:

- Adjust the to_tsquery parameters to match your specific keyword search needs.
- Update the query text passed to the cohere_embed and cohere_rerank functions to match your search intent.
- Tune the limit values and the _top_n parameter to balance between recall and precision.

Remember, while this approach can provide highly relevant results, it does involve multiple API calls to Cohere (for embedding and reranking), which may impact performance for large result sets or high-volume applications.
This hybrid search method demonstrates the power of combining traditional database search techniques with advanced AI models, all within your PostgreSQL database using pgai.
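The final select above unpacks the rerank response JSON with jsonb_to_recordset. As a point of comparison, the same unpacking looks like this in plain Python, given a payload with the shape the query consumes (the sample response below is made up for illustration):

```python
import json

# A made-up payload with the same shape the SQL consumes:
# a "results" array with index, relevance_score, and document fields.
response = json.loads("""
{
  "results": [
    {"index": 1, "relevance_score": 0.42, "document": {"text": "second article"}},
    {"index": 0, "relevance_score": 0.91, "document": {"text": "first article"}}
  ]
}
""")

# Sort by relevance, mirroring the query's "order by relevance_score desc".
ranked = sorted(response["results"], key=lambda r: r["relevance_score"], reverse=True)
for r in ranked:
    print(r["index"], r["relevance_score"], r["document"]["text"])
# 0 0.91 first article
# 1 0.42 second article
```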
The integration of Cohere’s Command, Embed, and Rerank models into pgai marks a significant milestone in our vision to help PostgreSQL evolve into an AI database. By bringing these state-of-the-art language models directly into your database environment, you can unlock new levels of efficiency, intelligence, and innovation in your projects.
Pgai is open source under the PostgreSQL License and is available for you to use in your AI projects today. You can find installation instructions on the pgai GitHub repository. You can also access pgai on any database service on Timescale’s cloud PostgreSQL platform. If you’re new to Timescale, you can get started with a free cloud PostgreSQL database here.
Pgai is an effort to enrich the PostgreSQL ecosystem for AI. If you’d like to help, here’s how you can get involved:
We're excited to see what you'll build with PostgreSQL and pgai!