Start supercharging your PostgreSQL today.
Written by Semab Tariq
PostgreSQL is a powerful open-source relational database widely used by businesses and developers for managing diverse types of data.
One practical use case is managing video content, which involves storing both video files and related metadata such as titles, descriptions, and user engagement data. While databases are crucial for organizing metadata and enabling fast searches, they can also store videos directly. However, handling large video files efficiently requires thoughtful storage strategies.
In this blog, I will walk you through how to store videos in PostgreSQL, explore different approaches, and discuss best practices for efficient video storage.
There are mainly three ways to store videos in a PostgreSQL database:
Storing videos directly in PostgreSQL
Using external storage with database references
Hybrid approach: balancing database and external storage
Storing a video directly in PostgreSQL means saving the video file as binary data within a table. This is done using data types like BYTEA (which stores binary data in a single row) or large objects (LOBs) (which store data in chunks across multiple rows). When a video is stored in the database, it is converted into a binary format and inserted into a column. Later, when needed, the video data can be retrieved, decoded, and played back. This approach allows PostgreSQL to manage the video as part of its structured data, keeping it within the database rather than storing it separately in a filesystem.
Using external storage with database references implies saving only the location (file path or URL) of the video file instead of the actual video data. The video itself is kept in an external filesystem, cloud storage, or a media server. In PostgreSQL, this is done by storing the file path as a TEXT or VARCHAR column. When accessing the video, the application retrieves its path from the database, fetches the file from its storage location, and then processes it as needed, such as displaying the video or performing other operations.
A hybrid approach to storing videos in PostgreSQL involves combining both direct storage and path-based storage. In this method, the database stores essential video metadata along with a small preview or thumbnail using the BYTEA data type. Meanwhile, the full video file is kept in an external filesystem or cloud storage, with its file path saved in a TEXT or VARCHAR column. When needed, the application retrieves the metadata and preview from the database while fetching the actual video from its storage location.
Below is a table outlining the pros and cons of each approach:
Approach | Advantages | Disadvantages |
Storing videos directly in PostgreSQL | Guaranteed data consistency (ACID compliance). Simplified backups (single system). Built-in security/access controls for files. | Significant database bloat (large files increase storage costs). Slow queries (scanning binary data is resource-heavy). Backups become slower and bulkier. |
Using external storage with database references | Minimal database size (only text paths stored). Independent scalability (scale storage separately from the database). Cheaper storage (use cost-effective object storage). | No transactional guarantees for external files. Manual cleanup is required (orphaned files if metadata is deleted). Security depends on external storage configuration. |
Hybrid approach: balancing database and external storage | Performance optimization (store small previews/thumbnails in DB + full videos externally). Partial ACID benefits for critical metadata. Flexibility to balance cost and performance. | Complexity (manage synchronization between DB and external storage). Partial consistency risk (e.g., previews may not match external video versions). Higher maintenance overhead (two systems to monitor). |
To store videos in PostgreSQL, we can use a data type called BYTEA.
BYTEA (short for "byte array") is a PostgreSQL data type designed to store raw binary data, such as images, documents, or videos.
Think of it as a "digital container" that holds data in its purest form: a sequence of bytes.
Each byte is a tiny unit of data represented by a combination of zeros (0) and ones (1)—binary code.
When you save a video to a BYTEA column, here is what happens:
Conversion: Video gets converted into a long string of bytes.
Storage: These bytes are stored directly in the database.
When we fetch a video stored in a BYTEA column, this happens:
PostgreSQL returns the raw bytes exactly as they were stored.
Application (e.g., Python, Java, etc.) is responsible for converting those bytes back into a usable video file.
Note: If images, videos, or audio files are too large to fit within the BYTEA size limit, we can store them directly in PostgreSQL using large objects (LOB).
First, we need to create a database table where we will store the entire video using the BYTEA data type.
CREATE TABLE IF NOT EXISTS videos (
id SERIAL PRIMARY KEY,
time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
tag TEXT NOT NULL,
video BYTEA
);
This creates a table called videos with three columns:
id: A unique identifier (auto-incremented).
time: Time at which the video was inserted
tag: Video tags
video: Binary data of the video file.
Next, we need to write an SQL query to insert the video into the database by converting it into bytes and then storing it in the table.
INSERT INTO videos (tag, video)
VALUES ('comedy', pg_read_binary_file('/path/to/video/file'));
Here pg_read_binary_file is a function in PostgreSQL that lets you read all or part of a file and returns the content in binary format (BYTEA). Unlike pg_read_file
, it doesn't check text encoding, making it suitable for reading any type of file, including images or videos.
Note: This function is available to only superusers by default, but permissions can be granted to other users.
Let's see what our data looks like in the table:
SELECT * FROM public.video ORDER BY id ASC;
To retrieve and play a video stored in a PostgreSQL table, you need an application to handle the extraction and conversion process. The upcoming code examples will demonstrate how to achieve this.
Next, you will find coding examples in Python, Java, and C++ demonstrating how to complete the following tasks:
Create a table
Insert a video into the table
Retrieve the video from the table
Play the retrieved video
Before running the code, ensure you have the following dependencies installed:
Python: psycopg2
Java: PostgreSQL JDBC Connector
C++: libpqxx
Additionally, update the database credentials to match your system and provide the correct file path for the video on your filesystem.
import psycopg2
import os
import subprocess
# Database connection
connection = psycopg2.connect(
dbname="postgres",
user="postgres",
password="<PASSWORD>",
host="localhost",
port="5432"
)
cursor = connection.cursor()
# Create table if not exists
cursor.execute("""
CREATE TABLE IF NOT EXISTS videos (
id SERIAL PRIMARY KEY,
time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
tag TEXT NOT NULL,
video BYTEA
)
""")
connection.commit()
# Path to video file (Update as needed)
video_path = "<VIDEO_PATH>"
# Ensure file exists before reading
if not os.path.exists(video_path):
print("Error: Video file not found!")
exit()
# Read video as binary
with open(video_path, "rb") as video_file:
video_data = video_file.read()
# Insert into database
cursor.execute("""
INSERT INTO videos (tag, video)
VALUES (%s, %s) RETURNING id
""", ("comedy", video_data))
video_id = cursor.fetchone()[0] # Get inserted row ID
connection.commit()
# Verify video is saved in database
cursor.execute("SELECT COUNT(*) FROM videos WHERE tag = 'comedy'")
count = cursor.fetchone()[0]
if count > 0:
print(f" Video successfully stored in database with ID: {video_id}")
else:
print(" Error: Video was not stored in database!")
exit()
# Retrieve video data from database
cursor.execute("SELECT video FROM videos WHERE id = %s", (video_id,))
video_data = cursor.fetchone()[0]
# Check if video exists in database
if not video_data:
print(" Error: No video found in database!")
exit()
# Save video data to a temporary file
temp_video_path = "<TMP_VIDEO_PATH>"
with open(temp_video_path, "wb") as temp_file:
temp_file.write(video_data)
# Verify if the file was saved
if os.path.exists(temp_video_path):
print(f" Video retrieved and saved as: {temp_video_path}")
else:
print(" Error: Video file could not be created!")
exit()
# Start the video
os.startfile(temp_video_path)
# Cleanup
cursor.close()
connection.close()
import java.sql.*;
import java.io.*;
import java.awt.*;
public class main {
public static void main(String[] args) {
storeVideo();
retrieveVideo();
}
private static void storeVideo() {
try {
Class.forName("org.postgresql.Driver");
Connection connection = DriverManager.getConnection(
"jdbc:postgresql://localhost:5432/postgres", "postgres", "<PASSWORD>");
// Create table if not exists
String createTableQuery = "CREATE TABLE IF NOT EXISTS videos (" +
"id SERIAL PRIMARY KEY, " +
"time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, " +
"tag TEXT NOT NULL, " +
"video BYTEA NOT NULL)";
Statement statement = connection.createStatement();
statement.executeUpdate(createTableQuery);
// Load video file
File videoFile = new File("<VIDEO_PATH>");
if (!videoFile.exists()) {
System.out.println("Error: Video file not found!");
return;
}
FileInputStream fis = new FileInputStream(videoFile);
// Insert video
String insertQuery = "INSERT INTO videos (tag, video) VALUES (?, ?)";
PreparedStatement preparedStatement = connection.prepareStatement(insertQuery);
preparedStatement.setString(1, "comedy");
preparedStatement.setBinaryStream(2, fis, (int) videoFile.length());
if (preparedStatement.executeUpdate() > 0) {
System.out.println("Video stored successfully!");
}
fis.close();
preparedStatement.close();
statement.close();
connection.close();
} catch (Exception e) {
e.printStackTrace();
}
}
private static void retrieveVideo() {
try {
Class.forName("org.postgresql.Driver");
Connection connection = DriverManager.getConnection(
"jdbc:postgresql://localhost:5432/postgres", "postgres", "<PASSWORD>");
// Fetch most recent video
String query = "SELECT video FROM videos WHERE tag = 'comedy' ORDER BY time DESC LIMIT 1";
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(query);
if (resultSet.next()) {
byte[] videoData = resultSet.getBytes("video");
// Save to file
FileOutputStream fos = new FileOutputStream("retrieved_video.mp4");
fos.write(videoData);
fos.close();
// Auto-play video
Desktop.getDesktop().open(new File("retrieved_video.mp4"));
System.out.println("Video playing automatically!");
}
resultSet.close();
statement.close();
connection.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
#include <iostream>
#include <pqxx/pqxx>
#include <fstream>
#include <vector>
#include <cstdlib>
using namespace std;
void createTable(pqxx::connection &c) {
pqxx::work w(c);
w.exec(R"(
CREATE TABLE IF NOT EXISTS videos (
id SERIAL PRIMARY KEY,
time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
tag TEXT NOT NULL,
video BYTEA
)
)");
w.commit();
cout << "Table checked/created successfully." << endl;
}
void insertVideo(pqxx::connection &c, const string &videoPath, const string &tag) {
ifstream file(videoPath, ios::binary);
if (!file) {
cerr << "Error: Unable to open file " << videoPath << endl;
return;
}
vector<char> videoData((istreambuf_iterator<char>(file)), istreambuf_iterator<char>());
file.close();
pqxx::work w(c);
w.exec_params(
"INSERT INTO videos (tag, video) VALUES ($1, $2)",
tag,
pqxx::binarystring(videoData.data(), videoData.size()));
w.commit();
cout << "Video inserted successfully." << endl;
}
void retrieveAndPlayVideo(pqxx::connection &c, const string &tag) {
pqxx::work w(c);
pqxx::result r = w.exec_params("SELECT video FROM public.videos WHERE tag = $1 ORDER BY id ASC", tag);
if (r.empty()) {
cout << "No video found with tag: " << tag << endl;
return;
}
pqxx::binarystring videoData(r[0][0]);
ofstream videoFile("output_video.mp4", ios::binary);
videoFile.write(reinterpret_cast<const char*>(videoData.data()), videoData.size());
videoFile.close();
cout << "Video retrieved and saved as output_video.mp4" << endl;
system("mpv output_video.mp4");
}
int main() {
try {
// Connect to PostgreSQL
pqxx::connection c("dbname=postgres user=postgres password=<PASSWORD> host=localhost");
if (!c.is_open()) {
cerr << "Can't open database" << endl;
return 1;
}
cout << "Opened database successfully: " << c.dbname() << endl;
createTable(c);
string videoPath = "<VIDEO_PATH>";
string tag = "comedy";
insertVideo(c, videoPath, tag);
retrieveAndPlayVideo(c, tag);
} catch (const exception &e) {
cerr << "Error: " << e.what() << endl;
return 1;
}
return 0;
}
Note: We are fully aware that this isn't the most efficient process, as you'll need a file reference and will have to rewrite the mp4 file to read it.
Yes, you read that right! We can store video as timestamped data in a PostgreSQL table. Here’s how.:
For this example, I downloaded a random five5-minute and six6-second traffic video from YouTube. We extracted frames at one1-second intervals, resulting in a total of 306 frames. Each frame was then stored as BYTEA in a PostgreSQL table, along with its corresponding timestamp in the video.
Imagine needing to retrieve a specific frame from a video at an exact timestamp. By simply querying the database with the desired timestamp, we can instantly retrieve the relevant frame. This approach can be invaluable in scenarios like accident investigations, where identifying events and culprits at a specific moment is crucial.
import cv2
import psycopg
from datetime import datetime
import os
# Database connection
conn = psycopg.connect("dbname=postgres user=postgres password=<PASSWORD> host=localhost")
cur = conn.cursor()
# Create table if not exists
cur.execute("""
CREATE TABLE IF NOT EXISTS segments (
id SERIAL PRIMARY KEY,
segment BYTEA,
segment_number INT,
created_at TIMESTAMP,
timeline TEXT
)
""")
conn.commit()
# Load the video
video_path = "D:\\traffic.mp4"
cap = cv2.VideoCapture(video_path)
fps = int(cap.get(cv2.CAP_PROP_FPS))
frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
duration = int(frame_count / fps)
frames_per_segment = fps
# Split video into 1-second segments
for segment_number in range(duration):
frames = []
for _ in range(frames_per_segment):
ret, frame = cap.read()
if not ret:
Break
frames.append(frame)
segment_path = f"segment_{segment_number}.mp4"
out = cv2.VideoWriter(
Segment_path,
cv2.VideoWriter_fourcc(*"mp4v"),
fps,
(int(cap.get(3)), int(cap.get(4)))
)
for frame in frames:
out.write(frame)
out.release()
# Check if file was created
if not os.path.exists(segment_path) or os.path.getsize(segment_path) == 0:
print(f"Error: Segment file {segment_path} not created correctly.")
continue
# Read and store the segment in the database
with open(segment_path, "rb") as file:
segment_bytes = file.read()
# Calculate timeline (HH:MM:SS)
hours = segment_number // 3600
minutes = (segment_number % 3600) // 60
seconds = segment_number % 60
timeline = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
cur.execute(
"INSERT INTO segments (segment, segment_number, created_at, timeline) VALUES (%s, %s, %s, %s)",
(segment_bytes, segment_number, datetime.now(), timeline)
)
conn.commit()
os.remove(segment_path)
cap.release()
# Display stored timelines for verification
cur.execute("SELECT timeline FROM segments ORDER BY timeline")
rows = cur.fetchall()
print("Stored timelines:", [row[0] for row in rows])
# Display specific segments by timeline
timeline = "00:01:40"
cur.execute("SELECT segment FROM segments WHERE timeline = %s", (timeline,))
segment_data = cur.fetchone()
if segment_data:
segment_bytes = segment_data[0]
temp_video_path = f"temp_segment_{timeline.replace(':', '-')}.mp4"
with open(temp_video_path, "wb") as temp_file:
temp_file.write(segment_bytes)
print(f"Playing segment at timeline {timeline}...")
os.startfile(temp_video_path) # Opens with default media player
else:
print(f"Segment with timeline {timeline} not found.")
cur.close()
conn.close()
Here are a few reasons you might consider storing videos directly in PostgreSQL, even though it’s generally not recommended for large files.
Avoid managing separate storage systems (e.g., S3, file servers). Everything (data + videos) lives in one place, reducing operational complexity.
Ensure atomicity: If a transaction involves both video metadata and the video itself (e.g., uploading a video and updating user records), PostgreSQL guarantees they succeed or fail together.
Leverage PostgreSQL’s encryption, role-based access control (RBAC), and auditing features to secure videos without configuring external storage permissions.
Backups include videos and metadata in a single snapshot, simplifying disaster recovery compared to syncing databases with external storage.
For tiny videos (e.g., short clips, thumbnails), the overhead of external storage might outweigh the benefits. PostgreSQL’s BYTEA or large objects work for limited-scale needs.
When: Videos are large (e.g., HD/4K) or numerous. Why: External storage (e.g., AWS S3, Google Cloud Storage) scales infinitely for massive files.
When: Fast read/write operations are critical (e.g., streaming platforms). Why: External storage systems handle parallel access and large file transfers more efficiently.
When: You need to manage metadata (e.g., titles, tags, user access) separately from raw video files. Why: PostgreSQL excels at querying structured metadata, while external storage handles unstructured data. This simplifies architecture and maintenance.
When: Advanced video processing (e.g., transcoding, CDN delivery) is required. Why: Tools like Cloudflare, Cloudinary, or AWS MediaConvert integrate seamlessly with external storage, offering features databases can’t provide (e.g., global caching, adaptive streaming).
Storing videos in PostgreSQL using BYTEA provides a convenient way to manage binary data directly within the database. We explored how to achieve this using SQL, Python, Java, and C++, demonstrating different approaches for inserting and retrieving videos.
While this method ensures data integrity and centralized access, it may not be ideal for large-scale video storage due to performance and scalability concerns. In such cases, external storage with metadata in PostgreSQL is a more efficient alternative.
Now that you've seen how to store video data in PostgreSQL, you're ready to tackle other challenges in your applications, such as real-time analytics on time-series data. While storing videos directly in PostgreSQL works for certain use cases, managing high-volume, real-time data requires optimized solutions.
This is where TimescaleDB comes in.
Built on PostgreSQL, TimescaleDB gives you the best of both worlds: the reliability and SQL interface you love about PostgreSQL, with the performance and scalability needed for demanding workloads.
Whether you're building video analytics platforms, IoT applications, or monitoring systems, TimescaleDB helps you:
Store and query billions of data points efficiently
Maintain high write and read performance at scale
Use familiar PostgreSQL syntax and tools
Keep your entire data stack within one database ecosystem
Try TimescaleDB for free, or check out our documentation to see how it can enhance your specific use case.