May 22, 2024
Introducing the Timescale Vector Python client library: a new library for storing, indexing, and querying vector embeddings in PostgreSQL. Easily store millions of embeddings using PostgreSQL as a vector database. Complete with optimized schema, batch ingestion, hybrid search, and time-based vector search. Learn more about its key features. And then take it for a spin: try Timescale Vector today, free for 90 days.
Python is the lingua franca of AI. And today, it gets even better for building AI applications with PostgreSQL as a vector database. Introducing the Timescale Vector Python client library, which enables Python developers to easily store, index, and query millions of vector embeddings using PostgreSQL.
The Python client library is the simplest way to integrate Timescale Vector’s best-in-class similarity search and hybrid search performance into your generative AI application.
Here’s an overview of how the Timescale Vector Python client makes it easier than ever to build AI applications with PostgreSQL:

- An optimized schema for storing vector embeddings and their associated metadata.
- Batch ingestion of embeddings in a single method call.
- Fast similarity search indexes: timescale-vector (DiskANN), HNSW, and IVFFlat, with smart defaults.
- Hybrid search that combines vector similarity with metadata filters.
- Time-based vector search powered by automatic time-based partitioning.

In the remainder of this post, we’ll delve into each of these points with code examples!
To get started with the Timescale Vector Python client, sign up for the Timescale cloud PostgreSQL platform, create a new database, and then run the following in your terminal:
pip install timescale_vector
Then follow the up-and-running-with-Timescale-Vector tutorial (be sure to download the .env file with your database credentials; you’ll need it to follow along).
Use the Timescale Vector Python library with a cloud PostgreSQL database, free for 90 days.
The Timescale Vector Python client creates an optimized schema to efficiently store vector embeddings and associated metadata for fast search and retrieval. All you need to create a table is a Timescale service URL, your table name, and the dimension of the vectors you want to store.
from timescale_vector import client

# Table information
TABLE_NAME = "company_documents"
EMBEDDING_DIMENSIONS = 1536

# Create client object (TIMESCALE_SERVICE_URL holds the credentials from your .env file)
vec = client.Async(TIMESCALE_SERVICE_URL,
                   TABLE_NAME,
                   EMBEDDING_DIMENSIONS)

# Create the table; the library handles the schema!
await vec.create_tables()
The create_tables() function will create a table with the following schema:
id | metadata | contents | embedding

- id is the UUID that uniquely identifies each vector.
- metadata is a JSONB column that stores the metadata associated with each vector.
- contents is the text column that stores the content we want vectorized.
- embedding is the vector column that stores the vector embedding representation of the content.

Most Generative AI applications require inserting tens of thousands of records (embeddings plus metadata) into a table at a time. Timescale Vector makes it easy to batch ingest these records without extra configuration using the .upsert() method:
# Batch upsert a list of (id, metadata, contents, embedding) tuples into the table
await vec.upsert(records)
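Each record is a tuple of (id, metadata, contents, embedding). Here’s a minimal sketch of assembling a batch, assuming a hypothetical get_embeddings() helper (not part of the library) that returns a 1,536-dimension embedding for a string:

import uuid

documents = ["Project X shipped version 2.0", "Quarterly results beat expectations"]
# get_embeddings() stands in for a call to your embedding model
records = [(uuid.uuid1(), {"source": "docs"}, doc, get_embeddings(doc)) for doc in documents]
await vec.upsert(records)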
With a single line of code, you can create indices on your vectors to speed up similarity search on millions of embeddings.
The Timescale Vector Python library supports the timescale-vector index, inspired by the DiskANN algorithm, which achieves 3x the search speed of the specialized vector database Weaviate and a 40% to 1,590% performance improvement over pgvector when performing ANN searches on one million OpenAI embeddings.
You can create a timescale-vector (DiskANN) index in a single line of code:
# Create a timescale vector (DiskANN) search index on the embedding column
await vec.create_embedding_index(client.TimescaleVectorIndex())
What’s more, the library also supports pgvector’s HNSW and IVFFlat indexing algorithms, along with smart defaults for all three index types. Advanced users can, of course, specify index parameters when creating an index via the index creation method arguments.
# Create HNSW search index on the embedding column
await vec.create_embedding_index(client.HNSWIndex())
# Create IVFFlat search index on the embedding column
await vec.create_embedding_index(client.IvfflatIndex())
The Timescale Vector Python library provides a method for easy similarity search.
As a refresher, similarity search finds the vectors most similar in meaning to our query vector: more similar vectors are closer to each other, while less similar vectors are further apart in the N-dimensional embedding space. Without an index, this defaults to exact k-nearest neighbor (KNN) search; with one of the indexes discussed above enabled, you’ll perform approximate nearest neighbor (ANN) search.
# Define the search query and compute its embedding
# (get_embeddings() is a helper that calls your embedding model)
query_string = "What's new with Project X"
query_embedding = get_embeddings(query_string)

# Search the table for vectors similar to query_embedding
records = await vec.search(query_embedding)
In addition to simple similarity search (without metadata filters), the Timescale Vector Python library makes it simple to perform hybrid search on your vectors and metadata, where you query not only by vector similarity but also by an additional metadata filter or LIMIT.
Filters can be specified as a dictionary, where all fields and their values are matched exactly. You can also specify a list of dictionaries with OR semantics, such that a row is returned if it matches any of the dictionaries, as shown in the sketch below.
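Here’s a minimal sketch of both filter styles; the metadata keys ("department", "year") are hypothetical placeholders:

# All fields in the dictionary must match exactly
records = await vec.search(query_embedding, limit=5,
    filter={"department": "engineering", "year": "2023"})

# A list of dictionaries uses OR semantics: a row matching any dictionary is returned
records = await vec.search(query_embedding, limit=5,
    filter=[{"department": "engineering"}, {"department": "design"}])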
We also support more advanced metadata filtering using Predicates. (See our documentation for more details.)
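As a rough sketch of what a Predicates-based query can look like (the metadata keys are hypothetical; see the documentation for the full set of supported operators):

# Predicates support comparison operators and can be combined with & (AND) and | (OR)
my_predicates = client.Predicates("department", "==", "engineering") & client.Predicates("year", ">", 2021)
records = await vec.search(query_embedding, limit=5, predicates=my_predicates)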
Our optimized schema design creates a GIN index on the metadata column, which speeds up many kinds of metadata queries.
Timescale Vector optimizes time-based vector search queries, leveraging the automatic time-based partitioning and indexing of Timescale’s hypertables.
Time-based filtering is useful to efficiently find recent embeddings, constrain vector search by a time range or document age, and store and retrieve large language model (LLM) responses and chat history with ease. Time-based semantic search also enables you to use RAG with time-based context retrieval to give users more useful LLM responses.
You can use efficient time-based similarity search via the Timescale Vector Python library by creating your client with the time_partition_interval argument set to the time range you want your data partitioned by, as follows:
from datetime import timedelta
from timescale_vector import client

# Table information
TABLE_NAME = "commit_history"
EMBEDDING_DIMENSIONS = 1536

# Partition interval
TIME_PARTITION_INTERVAL = timedelta(days=7)

# Create client object
vec = client.Async(TIMESCALE_SERVICE_URL,
                   TABLE_NAME,
                   EMBEDDING_DIMENSIONS,
                   time_partition_interval=TIME_PARTITION_INTERVAL)

# Create the table
await vec.create_tables()
In the code block above, we set the time_partition_interval argument in the client creation function to enable automatic time-based partitioning of the table. This partitions the table into time-based chunks and creates indexes on those chunks to speed up time-based queries.
Each partition will consist of data for the specified length of time. We use seven (7) days for simplicity, but you can pick whatever value makes sense for your use case. For example, if you query recent vectors frequently, you might want a smaller time delta, like one (1) day; if you query vectors over a decade-long time period, you might want a larger time delta, like six (6) months or one (1) year.
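As an illustrative sketch (the values below are placeholders, not recommendations):

from datetime import timedelta

# Mostly querying recent vectors: smaller partitions
TIME_PARTITION_INTERVAL = timedelta(days=1)

# Querying across years of history: larger partitions
TIME_PARTITION_INTERVAL = timedelta(days=365)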
Once we’ve created the table with time partitioning enabled, we can perform time-based similarity searches as follows:
from datetime import datetime

# Time filter variables for the query
# Start date = 1 August 2023, 22:10:35
start_date = datetime(2023, 8, 1, 22, 10, 35)
# End date = 30 August 2023, 22:10:35
end_date = datetime(2023, 8, 30, 22, 10, 35)

# Similarity search with the time filter applied
records_time_filtered = await vec.search(query_embedding, limit=3,
    uuid_time_filter=client.UUIDTimeRange(start_date, end_date))
This ensures our similarity search only returns vectors with times between the start_date and end_date.
Here’s some intuition for why Timescale Vector’s time-based partitioning speeds up ANN queries with time-based filters:
Timescale Vector partitions the data by time and creates ANN indexes on each partition individually. Then, during search, we perform a three-step process:

1. Use the query’s time filter to identify the partitions that overlap the requested time range.
2. Perform ANN search on the index of each matching partition.
3. Combine the results from each partition and return the closest matches.
Timescale Vector leverages TimescaleDB’s hypertables, which automatically partition vectors and associated metadata by a timestamp. This enables efficient querying on vectors by both similarity to a query vector and time, as partitions not in the time window of the query are ignored, making the search a lot more efficient by filtering out whole swaths of data in one go.
Timescale Vector uses the DateTime portion of a UUID v1 to determine which partition a given row should be placed in.
There are two ways to take advantage of this:

- You can generate a UUID v1 for each record that you want to insert. A UUID v1 created at ingest time (for example, with Python’s uuid.uuid1()) encodes the current timestamp, so rows are partitioned by insertion time.
- Alternatively, you can use the uuid_from_time() function to generate a UUID v1 from a Python datetime object, and then use that as the id for your vector when you insert it into the PostgreSQL database.
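Here’s a minimal sketch of the first approach, using only Python’s standard library (the metadata, contents, and embedding values are placeholders):

import uuid

# uuid.uuid1() encodes the current time, so the row is partitioned by insertion time
id = uuid.uuid1()
await vec.upsert([(id, {"key": "val"}, "the brown fox", [1.0, 1.2])])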
For the second approach: in this tutorial for Timescale Vector, we extract the dates from our metadata and turn them into UUID v1s, which we then use as the id part of our record when we ingest into the PostgreSQL table:
from datetime import datetime
from timescale_vector import client

# Turn a date from our metadata into a UUID v1 (rather than uuid.uuid1(), which encodes the current time)
id = client.uuid_from_time(datetime(2023, 8, 1))
await vec.upsert([(id, {"key": "val"}, "the brown fox", [1.0, 1.2])])
Let’s put everything together and look at a simplified example of how you can use the Timescale Vector Python library to power retrieval augmented generation where the context retrieved is constrained to a given time range.
Generation: In the example below, we define get_completion_from_messages(), which makes a call to an LLM and returns a completion response for a given prompt.
Time-based context retrieval: We define get_top_similar_docs(), which takes a query embedding and returns the top three most similar rows in our table whose associated times fall between start_date and end_date.
Finally, we put it all together in process_user_message(), which takes a user_input, like a question, as well as a start and end date, and returns a retrieval-augmented response from the LLM using the time-based context retrieved from the records in our table.
import openai

# Make an LLM call and get a completion for a given set of messages
def get_completion_from_messages(messages, model="gpt-4-0613", temperature=0, max_tokens=1000):
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return response.choices[0].message["content"]

# Get the top 3 most similar document sections within a time range
# (must be async because vec.search() is awaited)
async def get_top_similar_docs(query_embedding, start_date, end_date):
    top_docs = await vec.search(query_embedding, limit=3,
        uuid_time_filter=client.UUIDTimeRange(start_date, end_date))
    return top_docs

# Construct the prompt from the retrieved documents and get the LLM's response
async def process_user_message(user_input, all_messages, start_date, end_date):
    # Get documents related to the user input
    related_docs = await get_top_similar_docs(get_embeddings(user_input), start_date, end_date)
    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": f"{user_input}"},
        {"role": "assistant", "content": f"Relevant information: \n {related_docs[0]} \n {related_docs[1]} \n {related_docs[2]}"},
    ]
    final_response = get_completion_from_messages(all_messages + messages)
    return final_response
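To tie it together, here’s a hypothetical invocation (assuming system_message and get_embeddings() are defined as in the surrounding context, and that we’re inside an async function):

from datetime import datetime

response = await process_user_message(
    "What's new with Project X?",
    all_messages=[],
    start_date=datetime(2023, 8, 1),
    end_date=datetime(2023, 8, 30),
)
print(response)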
This is a simple example of a powerful concept: using time-based context retrieval in your RAG applications can help provide more relevant answers to your users. Time-based context retrieval is helpful for any dataset with a natural language and time component.
Timescale Vector uniquely enables this thanks to its efficient time-based similarity search capabilities, and taking advantage of it in your Python applications is easy thanks to the Timescale Vector Python client library.
Now that you’ve learned the foundational concepts of Timescale Vector’s Python library, install it for your next project that uses LLMs in Python:
pip install timescale_vector
And then continue your learning journey with our tutorials and guides.

And a reminder: you can use the Timescale Vector Python library with a cloud PostgreSQL database, free for 90 days.