Jul 30, 2024
Anthropic, a leading AI research company known for their commitment to developing safe and ethical AI systems, has been making waves in the field of AI with their Claude family of large language models (LLMs).
Anthropic has quickly become a favorite among developers and businesses alike thanks to Claude 3.5 Sonnet, their latest and most advanced model. Sonnet 3.5 has been receiving rave reviews from the developer community for its exceptional performance across a wide range of tasks.
Claude’s ability to understand context, generate human-like responses, and tackle complex problems has made it a go-to choice for many AI applications. In fact, Claude 3.5 Sonnet has been so well-received that it has replaced GPT-4o as the default LLM for many developers worldwide (see here for example). Its combination of intelligence, speed, and cost-effectiveness has set a new standard in the AI industry.
Anthropic also has a major partnership with AWS, making Claude a great choice for developers building AI applications on AWS cloud infrastructure. AWS offers Claude models on Amazon Bedrock, a managed service offering foundation models via API. For developers, this means easier access to Claude models through AWS, along with improved scalability and performance thanks to AWS's robust cloud infrastructure.
Today, we're thrilled to announce that PostgreSQL developers can now access the entire family of Anthropic's Claude models directly within their databases using pgai. Pgai is an open-source PostgreSQL extension that brings AI models closer to your data, making tasks like embedding creation and reasoning on data in PostgreSQL much easier.
While Claude 3.5 Sonnet is the star of the show (at least at the time of writing), pgai also supports other models in the Anthropic lineup, including Claude 3 Opus and Claude 3 Haiku. This variety allows developers to choose the most appropriate model for their specific use case, balancing factors like performance, speed, and cost.
Getting started with Anthropic Claude models in pgai
Want to start integrating Anthropic's Claude models into your PostgreSQL workflows today? Pgai is open source under the PostgreSQL License and is available for you to use in your AI projects today. You can find installation instructions on the pgai GitHub repository. You can also access pgai (alongside pgvector and pgvectorscale) on any database service on Timescale’s Cloud PostgreSQL platform.
Once you’ve connected to your database, create the pgai extension by running the following:
CREATE EXTENSION IF NOT EXISTS ai CASCADE;
Join the Postgres for AI community
Have questions about using Claude models with pgai? Join the Postgres for AI Discord, where you can share what you’re working on and help or get helped by a community of peers. You can also open an issue on the pgai GitHub (and while you’re there, stars are always appreciated ⭐).
Next, let’s discuss the benefits of using Anthropic Claude models for building AI applications. We’ll also showcase examples (with code!) of using pgai to reason over a dataset in PostgreSQL to get you familiar with the workflow and syntax to apply to your own projects. Let’s proceed.
Fun fact: The Claude family of models is named after Claude Shannon, widely known as “the father of information theory” and a significant contributor to the field of computer science and mathematics.
Claude's integration into pgai offers key benefits for PostgreSQL developers:
With pgai and Anthropic's Claude models, developers can now perform sophisticated retrieval-augmented generation (RAG) and LLM reasoning tasks directly on data stored in PostgreSQL tables, all without an external data pipeline or the need for data to leave the database.
Examples of reasoning tasks include:
All of these tasks can now be performed using simple SQL queries, eliminating the need for complex data pipelines or external processing!
Example: Movie analysis with pgai and Claude
Let’s look at an example of using Claude Sonnet 3.5 to reason over data stored in a PostgreSQL table. We’ll create a table of movies and ask Claude to generate an idea for a new science fiction movie and write an overview based on the information in the table.
First, let’s install some necessary Python packages to download the dataset with the following command:
pip install pandas pyarrow fsspec huggingface_hub
Run the following Python script to download the parquet file containing the movies and convert it to a CSV file:
import pandas as pd
df = pd.read_parquet("hf://datasets/Cohere/movies/movies.parquet")
df.to_csv('movies.csv')
Next, let’s install the pgai extension on our PostgreSQL database. You can do this in PopSQL, psql, or any Postgres admin tool of your choice. This command installs the pgai extension in PostgreSQL if it's not already installed. The cascade
option ensures that any dependencies of the pgai extension are also installed, like pgvector:
create extension if not exists ai cascade;
Next, let’s create a movie
table in our Postgres database with columns for ID, title, overview, genres, producer, and cast. And import the data from our CSV into the table:
create table movie
( id int not null primary key
, title text not null
, overview text
, genres text
, producer text
, "cast" text
);
\copy movie from 'movies.csv' with (format csv, header on)
Finally, let’s ask Claude to write and generate an idea for a new science fiction movie by referencing other movies contained in the movie
table with the following prompt:
“Given the overviews of some example science fiction movies below, write an overview for a new science fiction movie with a theme similar to the examples”
with overviews as
(
select string_agg(format(E'* title: %s\n %s', title, overview), E'\n\n') as overviews
from movie
where genres like '%Science Fiction%'
limit 15
)
select anthropic_generate
( 'claude-3-5-sonnet-20240620'
, jsonb_build_array
( jsonb_build_object
( 'role', 'user'
, 'content',
concat
( E'Given the overviews of some example science fiction movies below, write an overview for a new science fiction movie with a theme similar to the examples.\n\n'
, (select overviews from overviews)
)
)
)
)->'content'->0->>'text' as new_movie_idea
;
The query above does the following:
movie
table.anthropic_generate
function (part of the pgai extension) to generate a new movie ideanew_movie_idea
,.Here’s an example of the fun results Claude Sonnet 3.5 came up with:
Thank you for providing those movie overviews. Based on the examples, here is an original overview for a new science fiction movie with similar themes: Title: Nexus In the year 2157, humanity has established its first interstellar colony on a distant exoplanet. When communication with Earth suddenly ceases, colonial administrator Dr. Elara Chen must uncover the truth behind the silence. As strange phenomena begin occurring on the colony world, Elara and her team race to solve the mystery before it's too late. They soon discover their new home harbors an ancient alien presence that could hold the key to humanity's future - or spell its doom. With limited resources and time running out, Elara must make an impossible choice that will determine the fate of two worlds. |
And,
Here's an overview for a new science fiction movie with themes similar to the examples: Title: Quantum Convergence In the year 2157, humanity has established colonies across the solar system. Dr. Elara Chen, a brilliant quantum physicist, accidentally discovers a way to create microscopic wormholes. Her groundbreaking technology catches the attention of both the Earth government and a militant separatist group from Mars. As tensions between Earth and Mars escalate towards war, Dr. Chen realizes her invention could be weaponized to devastating effect. She goes into hiding with a small team of trusted colleagues, racing to refine the technology into a tool for instantaneous communication and transportation that could unite humanity across the vast distances of space. Meanwhile, covert operatives from Earth and Mars hunt for Dr. Chen, each hoping to claim the quantum breakthrough for their own side. As the conflict intensifies, Dr. Chen and her team find themselves caught in the crossfire, forced to evade capture while working against the clock. With time running out before all-out war erupts, Dr. Chen must risk everything on one final, ambitious experiment that could either save humanity or tear it apart forever. The fate of Earth, Mars, and all of human civilization hangs in the balance as science, politics, and human nature collide on an interplanetary scale. |
Pretty cool, right?! This is just a glimpse into the LLM reasoning capabilities that Anthropic’s Claude models can unlock on your PostgreSQL data, all right inside the database, thanks to pgai.
The integration of Anthropic's Claude models, particularly Claude 3.5 Sonnet, into pgai marks a significant milestone in our vision to help PostgreSQL evolve into an AI database. By bringing these state-of-the-art language models directly into your database environment, you can unlock new levels of efficiency, intelligence, and innovation in your projects.
Pgai is open source under the PostgreSQL License and is available for you to use in your AI projects today. You can find installation instructions on the pgai GitHub repository. You can also access pgai on any database service on Timescale’s Cloud PostgreSQL platform.
Pgai is an effort to enrich the PostgreSQL ecosystem for AI. If you’d like to help, here’s how you can get involved:
We're excited to see what you'll build with this powerful combination of PostgreSQL, pgai, and Anthropic's Claude models. Start exploring the possibilities today, and don't hesitate to reach out to the community for support and inspiration.
Let's make Postgres a better database for AI, together!