Use Anthropic Claude Sonnet 3.5 in PostgreSQL With Pgai

Anthropic, a leading AI research company known for their commitment to developing safe and ethical AI systems, has been making waves in the field of AI with their Claude family of large language models (LLMs). 

Anthropic has quickly become a favorite among developers and businesses alike thanks to Claude 3.5 Sonnet, their latest and most advanced model. Sonnet 3.5 has been receiving rave reviews from the developer community for its exceptional performance across a wide range of tasks.

Claude’s ability to understand context, generate human-like responses, and tackle complex problems has made it a go-to choice for many AI applications. In fact, Claude 3.5 Sonnet has been so well-received that it has replaced GPT-4o as the default LLM for many developers worldwide (see here for example). Its combination of intelligence, speed, and cost-effectiveness has set a new standard in the AI industry.

Anthropic also has a major partnership with AWS, making Claude a great choice for developers building AI applications on AWS cloud infrastructure. AWS offers Claude models on Amazon Bedrock, a managed service offering foundation models via API. For developers, this means easier access to Claude models through AWS, along with improved scalability and performance thanks to AWS's robust cloud infrastructure.

Today, we're thrilled to announce that PostgreSQL developers can now access the entire family of Anthropic's Claude models directly within their databases using pgai. Pgai is an open-source PostgreSQL extension that brings AI models closer to your data, making tasks like embedding creation and reasoning on data in PostgreSQL much easier.

While Claude 3.5 Sonnet is the star of the show (at least at the time of writing), pgai also supports other models in the Anthropic lineup, including Claude 3 Opus and Claude 3 Haiku. This variety allows developers to choose the most appropriate model for their specific use case, balancing factors like performance, speed, and cost.

Getting started with Anthropic Claude models in pgai

Want to start integrating Anthropic's Claude models into your PostgreSQL workflows today? Pgai is open source under the PostgreSQL License and is available for you to use in your AI projects today. You can find installation instructions on the pgai GitHub repository. You can also access pgai (alongside pgvector and pgvectorscale) on any database service on Timescale’s Cloud PostgreSQL platform.

Once you’ve connected to your database, create the pgai extension by running the following:

CREATE EXTENSION IF NOT EXISTS ai CASCADE;

Join the Postgres for AI community

Have questions about using Claude models with pgai? Join the Postgres for AI Discord, where you can share what you’re working on and help or get helped by a community of peers. You can also open an issue on the pgai GitHub (and while you’re there, stars are always appreciated ⭐).

Next, let’s discuss the benefits of using Anthropic Claude models for building AI applications. We’ll also showcase examples (with code!) of using pgai to reason over a dataset in PostgreSQL to get you familiar with the workflow and syntax to apply to your own projects. Let’s proceed.

Fun fact: The Claude family of models is named after Claude Shannon, widely known as “the father of information theory” and a significant contributor to the field of computer science and mathematics.

Why Use Anthropic Claude to Power AI Applications?

Claude Sonnet 3.5 is Anthropic’s most intelligent model but is also remarkably cost-efficient (image source).

Claude's integration into pgai offers key benefits for PostgreSQL developers:

  1. Unparalleled intelligence: Claude 3.5 Sonnet demonstrates remarkable intelligence across various domains. From complex problem-solving to nuanced language understanding, it excels in tasks that require deep comprehension and analytical thinking. This makes it an invaluable tool for developers working on sophisticated data analysis, natural language processing, and decision-making systems on data housed in PostgreSQL.
  2. Lightning-fast performance: One of the standout features of Claude 3.5 Sonnet is its speed. Anthropic has optimized the model for rapid response times, allowing for near real-time interactions. This speed is crucial for applications that require quick turnaround times, such as chatbots, SQL generation and execution, data analysis, or dynamic content generation directly from your database.
  3. Cost-effectiveness: In today's competitive landscape, balancing performance with cost is essential. Claude 3.5 Sonnet offers an excellent price-to-performance ratio, making it an attractive option for businesses of all sizes. By integrating Claude models through pgai, you can leverage this cost-effectiveness directly within your PostgreSQL workflows, potentially reducing overall operational costs compared to using other models.
  4. Enhanced privacy and security: Anthropic has placed a strong emphasis on user privacy and data security in the development of their models.
Claude 3.5 Sonnet measured on different evaluation datasets (image source).

How Claude Sonnet 3.5 Enables LLM Reasoning on PostgreSQL Data With Pgai

With pgai and Anthropic's Claude models, developers can now perform sophisticated retrieval-augmented generation (RAG) and LLM reasoning tasks directly on data stored in PostgreSQL tables, all without an external data pipeline or the need for data to leave the database. 

Examples of reasoning tasks include:

  • Summarization: Generate concise summaries of large text fields or documents stored in your database.
  • Categorization: Automatically classify and organize data based on content and context.
  • Data enrichment: Enhance existing data with AI-generated insights and additional information.
  • Natural language querying: Interact with your database using natural language queries.

All of these tasks can now be performed using simple SQL queries, eliminating the need for complex data pipelines or external processing!

Example: Movie analysis with pgai and Claude

Let’s look at an example of using Claude Sonnet 3.5 to reason over data stored in a PostgreSQL table. We’ll create a table of movies and ask Claude to generate an idea for a new science fiction movie and write an overview based on the information in the table.

First, let’s install some necessary Python packages to download the dataset with the following command:

pip install pandas pyarrow fsspec huggingface_hub

Run the following Python script to download the parquet file containing the movies and convert it to a CSV file:

import pandas as pd


df = pd.read_parquet("hf://datasets/Cohere/movies/movies.parquet")
df.to_csv('movies.csv')

Next, let’s install the pgai extension on our PostgreSQL database. You can do this in PopSQL, psql, or any Postgres admin tool of your choice. This command installs the pgai extension in PostgreSQL if it's not already installed. The cascade option ensures that any dependencies of the pgai extension are also installed, like pgvector:

create extension if not exists ai cascade;

Next, let’s create a movie table in our Postgres database with columns for ID, title, overview, genres, producer, and cast. And import the data from our CSV into the table:

create table movie
( id int not null primary key
, title text not null
, overview text
, genres text
, producer text
, "cast" text
);


\copy movie from 'movies.csv' with (format csv, header on)

Finally, let’s ask Claude to write and generate an idea for a new science fiction movie by referencing other movies contained in the movie table with the following prompt:

“Given the overviews of some example science fiction movies below, write an overview for a new science fiction movie with a theme similar to the examples”

with overviews as
(
   select string_agg(format(E'* title: %s\n  %s', title, overview), E'\n\n') as overviews
   from movie
   where genres like '%Science Fiction%'
   limit 15
)
select anthropic_generate
( 'claude-3-5-sonnet-20240620'
, jsonb_build_array
 ( jsonb_build_object
   ( 'role', 'user'
   , 'content',
       concat
       ( E'Given the overviews of some example science fiction movies below, write an overview for a new science fiction movie with a theme similar to the examples.\n\n'
       , (select overviews from overviews)
       )
   )
 )
)->'content'->0->>'text' as new_movie_idea
;

The query above does the following:

  1. It selects overviews of up to 15 science fiction movies from the movie table.
  2. It then uses the anthropic_generate function (part of the pgai extension) to generate a new movie idea
  3. The function is called with the Claude 3.5 Sonnet model and a prompt that includes the selected movie overviews.
  4. The result is extracted from the JSON response and aliased as new_movie_idea,.

Here’s an example of the fun results Claude Sonnet 3.5 came up with:

Thank you for providing those movie overviews. Based on the examples, here is an original overview for a new science fiction movie with similar themes:


Title: Nexus


In the year 2157, humanity has established its first interstellar colony on a distant exoplanet. When communication with Earth suddenly ceases, colonial administrator Dr. Elara Chen must uncover the truth behind the silence. As strange phenomena begin occurring on the colony world, Elara and her team race to solve the mystery before it's too late. They soon discover their new home harbors an ancient alien presence that could hold the key to humanity's future - or spell its doom. With limited resources and time running out, Elara must make an impossible choice that will determine the fate of two worlds.

And,

Here's an overview for a new science fiction movie with themes similar to the examples:


Title: Quantum Convergence


In the year 2157, humanity has established colonies across the solar system. Dr. Elara Chen, a brilliant quantum physicist, accidentally discovers a way to create microscopic wormholes. Her groundbreaking technology catches the attention of both the Earth government and a militant separatist group from Mars.


As tensions between Earth and Mars escalate towards war, Dr. Chen realizes her invention could be weaponized to devastating effect. She goes into hiding with a small team of trusted colleagues, racing to refine the technology into a tool for instantaneous communication and transportation that could unite humanity across the vast distances of space.


Meanwhile, covert operatives from Earth and Mars hunt for Dr. Chen, each hoping to claim the quantum breakthrough for their own side. As the conflict intensifies, Dr. Chen and her team find themselves caught in the crossfire, forced to evade capture while working against the clock.


With time running out before all-out war erupts, Dr. Chen must risk everything on one final, ambitious experiment that could either save humanity or tear it apart forever. The fate of Earth, Mars, and all of human civilization hangs in the balance as science, politics, and human nature collide on an interplanetary scale.


Pretty cool, right?! This is just a glimpse into the LLM reasoning capabilities that Anthropic’s Claude models can unlock on your PostgreSQL data, all right inside the database, thanks to pgai.

Get Started With Anthropic Claude and Pgai Today

The integration of Anthropic's Claude models, particularly Claude 3.5 Sonnet, into pgai marks a significant milestone in our vision to help PostgreSQL evolve into an AI database. By bringing these state-of-the-art language models directly into your database environment, you can unlock new levels of efficiency, intelligence, and innovation in your projects.

Pgai is open source under the PostgreSQL License and is available for you to use in your AI projects today. You can find installation instructions on the pgai GitHub repository. You can also access pgai on any database service on Timescale’s Cloud PostgreSQL platform.

Pgai is an effort to enrich the PostgreSQL ecosystem for AI. If you’d like to help, here’s how you can get involved:

  • Got questions about using Anthropic Claude models in pgai? Join the Postgres for AI Discord, a community of developers building AI applications with PostgreSQL. Share what you’re working on, and help or get helped by a community of peers.
  • Share the news with your friends and colleagues: Share our posts about pgai on X/Twitter, LinkedIn, and Threads. We promise to RT back.
  • Submit issues and feature requests: We encourage you to submit issues and feature requests for functionality you’d like to see, bugs you find, and suggestions you think would improve pgai. Head over to the pgai GitHub repo to share your ideas.
  • Make a contribution: We welcome community contributions for pgai. Pgai is written in Python and PL/Python. Let us know which models you want to see supported, particularly for open-source embedding and generation models. See the pgai GitHub for instructions to contribute.
  • Offer the pgai extension on your PostgreSQL cloud: Pgai is an open-source project under the PostgreSQL License. We encourage you to offer pgai on your managed PostgreSQL database-as-a-service platform and can even help you spread the word. Get in touch via our Contact Us form and mention pgai to discuss further.

We're excited to see what you'll build with this powerful combination of PostgreSQL, pgai, and Anthropic's Claude models. Start exploring the possibilities today, and don't hesitate to reach out to the community for support and inspiration.

Let's make Postgres a better database for AI, together!