How to Store Video in PostgreSQL Using BYTEA

Start supercharging your PostgreSQL today.

PostgreSQL is a powerful open-source relational database widely used by businesses and developers for managing diverse types of data.

One practical use case is managing video content, which involves storing both video files and related metadata such as titles, descriptions, and user engagement data. While databases are crucial for organizing metadata and enabling fast searches, they can also store videos directly. However, handling large video files efficiently requires thoughtful storage strategies.

In this blog, I will walk you through how to store videos in PostgreSQL, explore different approaches, and discuss best practices for efficient video storage.

Different Ways to Store Videos in a PostgreSQL Database

There are mainly three ways to store videos in a PostgreSQL database:

Storing videos directly in PostgreSQL
Using external storage with database references
Hybrid approach: balancing database and external storage

Storing videos directly in PostgreSQL

Storing a video directly in PostgreSQL means saving the video file as binary data within a table. This is done using data types like BYTEA (which stores binary data in a single row) or large objects (LOBs) (which store data in chunks across multiple rows). When a video is stored in the database, it is converted into a binary format and inserted into a column. Later, when needed, the video data can be retrieved, decoded, and played back. This approach allows PostgreSQL to manage the video as part of its structured data, keeping it within the database rather than storing it separately in a filesystem.

Using external storage with database references

Using external storage with database references implies saving only the location (file path or URL) of the video file instead of the actual video data. The video itself is kept in an external filesystem, cloud storage, or a media server. In PostgreSQL, this is done by storing the file path as a TEXT or VARCHAR column. When accessing the video, the application retrieves its path from the database, fetches the file from its storage location, and then processes it as needed, such as displaying the video or performing other operations.

Hybrid approach: Balancing database and external storage

A hybrid approach to storing videos in PostgreSQL involves combining both direct storage and path-based storage. In this method, the database stores essential video metadata along with a small preview or thumbnail using the BYTEA data type. Meanwhile, the full video file is kept in an external filesystem or cloud storage, with its file path saved in a TEXT or VARCHAR column. When needed, the application retrieves the metadata and preview from the database while fetching the actual video from its storage location.

Comparison of Video Storage Approaches

Below is a table outlining the pros and cons of each approach:

Approach	Advantages	Disadvantages
Storing videos directly in PostgreSQL	Guaranteed data consistency (ACID compliance). Simplified backups (single system). Built-in security/access controls for files.	Significant database bloat (large files increase storage costs). Slow queries (scanning binary data is resource-heavy). Backups become slower and bulkier.
Using external storage with database references	Minimal database size (only text paths stored). Independent scalability (scale storage separately from the database). Cheaper storage (use cost-effective object storage).	No transactional guarantees for external files. Manual cleanup is required (orphaned files if metadata is deleted). Security depends on external storage configuration.
Hybrid approach: balancing database and external storage	Performance optimization (store small previews/thumbnails in DB + full videos externally). Partial ACID benefits for critical metadata. Flexibility to balance cost and performance.	Complexity (manage synchronization between DB and external storage). Partial consistency risk (e.g., previews may not match external video versions). Higher maintenance overhead (two systems to monitor).

To store videos in PostgreSQL, we can use a data type called BYTEA.

What Is the BYTEA Data Type?

BYTEA (short for "byte array") is a PostgreSQL data type designed to store raw binary data, such as images, documents, or videos.

Think of it as a "digital container" that holds data in its purest form: a sequence of bytes.

Each byte is a tiny unit of data represented by a combination of zeros (0) and ones (1)—binary code.

When you save a video to a BYTEA column, here is what happens:

Conversion: Video gets converted into a long string of bytes.
Storage: These bytes are stored directly in the database.

When we fetch a video stored in a BYTEA column, this happens:

PostgreSQL returns the raw bytes exactly as they were stored.
Application (e.g., Python, Java, etc.) is responsible for converting those bytes back into a usable video file.

Note: If images, videos, or audio files are too large to fit within the BYTEA size limit, we can store them directly in PostgreSQL using large objects (LOB).

Methods to Store Video in PostgreSQL

Storing video using an SQL query

First, we need to create a database table where we will store the entire video using the BYTEA data type.

CREATE TABLE IF NOT EXISTS videos ( id SERIAL PRIMARY KEY, time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, tag TEXT NOT NULL, video BYTEA );

This creates a table called videos with three columns:

id: A unique identifier (auto-incremented).
time: Time at which the video was inserted
tag: Video tags
video: Binary data of the video file.

Next, we need to write an SQL query to insert the video into the database by converting it into bytes and then storing it in the table.

INSERT INTO videos (tag, video) VALUES ('comedy', pg_read_binary_file('/path/to/video/file'));

Here pg_read_binary_file is a function in PostgreSQL that lets you read all or part of a file and returns the content in binary format (BYTEA). Unlike pg_read_file, it doesn't check text encoding, making it suitable for reading any type of file, including images or videos.

Note: This function is available to only superusers by default, but permissions can be granted to other users.

Let's see what our data looks like in the table:

SELECT * FROM public.video ORDER BY id ASC;

To retrieve and play a video stored in a PostgreSQL table, you need an application to handle the extraction and conversion process. The upcoming code examples will demonstrate how to achieve this.

Next, you will find coding examples in Python, Java, and C++ demonstrating how to complete the following tasks:

Create a table
Insert a video into the table
Retrieve the video from the table
Play the retrieved video

Before running the code, ensure you have the following dependencies installed:

Python: psycopg2
Java: PostgreSQL JDBC Connector
C++: libpqxx

Additionally, update the database credentials to match your system and provide the correct file path for the video on your filesystem.

Storing video using Python

import psycopg2 import os import subprocess

# Database connection connection = psycopg2.connect( dbname="postgres", user="postgres", password="<PASSWORD>", host="localhost", port="5432" ) cursor = connection.cursor()

# Create table if not exists cursor.execute(""" CREATE TABLE IF NOT EXISTS videos ( id SERIAL PRIMARY KEY, time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, tag TEXT NOT NULL, video BYTEA ) """) connection.commit()

# Path to video file (Update as needed) video_path = "<VIDEO_PATH>"

# Ensure file exists before reading if not os.path.exists(video_path): print("Error: Video file not found!") exit()

# Read video as binary with open(video_path, "rb") as video_file: video_data = video_file.read()

# Insert into database cursor.execute(""" INSERT INTO videos (tag, video) VALUES (%s, %s) RETURNING id """, ("comedy", video_data)) video_id = cursor.fetchone()[0] # Get inserted row ID connection.commit()

# Verify video is saved in database cursor.execute("SELECT COUNT(*) FROM videos WHERE tag = 'comedy'") count = cursor.fetchone()[0]

if count > 0: print(f" Video successfully stored in database with ID: {video_id}") else: print(" Error: Video was not stored in database!") exit()

# Retrieve video data from database cursor.execute("SELECT video FROM videos WHERE id = %s", (video_id,)) video_data = cursor.fetchone()[0]

# Check if video exists in database if not video_data: print(" Error: No video found in database!") exit()

# Save video data to a temporary file temp_video_path = "<TMP_VIDEO_PATH>" with open(temp_video_path, "wb") as temp_file: temp_file.write(video_data)

# Verify if the file was saved if os.path.exists(temp_video_path): print(f" Video retrieved and saved as: {temp_video_path}") else: print(" Error: Video file could not be created!") exit()

# Start the video os.startfile(temp_video_path)

# Cleanup cursor.close() connection.close()

Storing video using Java

import java.sql.*; import java.io.*; import java.awt.*; public class main { public static void main(String[] args) { storeVideo(); retrieveVideo(); } private static void storeVideo() { try { Class.forName("org.postgresql.Driver"); Connection connection = DriverManager.getConnection( "jdbc:postgresql://localhost:5432/postgres", "postgres", "<PASSWORD>"); // Create table if not exists String createTableQuery = "CREATE TABLE IF NOT EXISTS videos (" + "id SERIAL PRIMARY KEY, " + "time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, " + "tag TEXT NOT NULL, " + "video BYTEA NOT NULL)"; Statement statement = connection.createStatement(); statement.executeUpdate(createTableQuery); // Load video file File videoFile = new File("<VIDEO_PATH>"); if (!videoFile.exists()) { System.out.println("Error: Video file not found!"); return; } FileInputStream fis = new FileInputStream(videoFile); // Insert video String insertQuery = "INSERT INTO videos (tag, video) VALUES (?, ?)"; PreparedStatement preparedStatement = connection.prepareStatement(insertQuery); preparedStatement.setString(1, "comedy"); preparedStatement.setBinaryStream(2, fis, (int) videoFile.length()); if (preparedStatement.executeUpdate() > 0) { System.out.println("Video stored successfully!"); } fis.close(); preparedStatement.close(); statement.close(); connection.close(); } catch (Exception e) { e.printStackTrace(); } } private static void retrieveVideo() { try { Class.forName("org.postgresql.Driver"); Connection connection = DriverManager.getConnection( "jdbc:postgresql://localhost:5432/postgres", "postgres", "<PASSWORD>"); // Fetch most recent video String query = "SELECT video FROM videos WHERE tag = 'comedy' ORDER BY time DESC LIMIT 1"; Statement statement = connection.createStatement(); ResultSet resultSet = statement.executeQuery(query); if (resultSet.next()) { byte[] videoData = resultSet.getBytes("video"); // Save to file FileOutputStream fos = new FileOutputStream("retrieved_video.mp4"); fos.write(videoData); fos.close(); // Auto-play video Desktop.getDesktop().open(new File("retrieved_video.mp4")); System.out.println("Video playing automatically!"); } resultSet.close(); statement.close(); connection.close(); } catch (Exception e) { e.printStackTrace(); } } }

Storing video using C++

#include <iostream> #include <pqxx/pqxx> #include <fstream> #include <vector> #include <cstdlib> using namespace std;

void createTable(pqxx::connection &c) { pqxx::work w(c); w.exec(R"( CREATE TABLE IF NOT EXISTS videos ( id SERIAL PRIMARY KEY, time TIMESTAMP DEFAULT CURRENT_TIMESTAMP, tag TEXT NOT NULL, video BYTEA ) )"); w.commit(); cout << "Table checked/created successfully." << endl; }

void insertVideo(pqxx::connection &c, const string &videoPath, const string &tag) { ifstream file(videoPath, ios::binary); if (!file) { cerr << "Error: Unable to open file " << videoPath << endl; return; } vector<char> videoData((istreambuf_iterator<char>(file)), istreambuf_iterator<char>()); file.close();

pqxx::work w(c); w.exec_params( "INSERT INTO videos (tag, video) VALUES ($1, $2)", tag, pqxx::binarystring(videoData.data(), videoData.size())); w.commit(); cout << "Video inserted successfully." << endl; }

void retrieveAndPlayVideo(pqxx::connection &c, const string &tag) { pqxx::work w(c); pqxx::result r = w.exec_params("SELECT video FROM public.videos WHERE tag = $1 ORDER BY id ASC", tag);

if (r.empty()) { cout << "No video found with tag: " << tag << endl; return; }

pqxx::binarystring videoData(r[0][0]); ofstream videoFile("output_video.mp4", ios::binary); videoFile.write(reinterpret_cast<const char*>(videoData.data()), videoData.size()); videoFile.close();

cout << "Video retrieved and saved as output_video.mp4" << endl; system("mpv output_video.mp4"); }

int main() { try { // Connect to PostgreSQL pqxx::connection c("dbname=postgres user=postgres password=<PASSWORD> host=localhost"); if (!c.is_open()) { cerr << "Can't open database" << endl; return 1; } cout << "Opened database successfully: " << c.dbname() << endl;

createTable(c); string videoPath = "<VIDEO_PATH>"; string tag = "comedy"; insertVideo(c, videoPath, tag); retrieveAndPlayVideo(c, tag); } catch (const exception &e) { cerr << "Error: " << e.what() << endl; return 1; } return 0; } Note: This approach involves writing the video to a file every time it's read from the database, which can be inefficient due to repeated disk writes and file dependency.

Storing Video as Timestamped Data in PostgreSQL

Yes, you read that right! We can store video as timestamped data in a PostgreSQL table. Here’s how.:

For this example, I downloaded a random five5-minute and six6-second traffic video from YouTube. We extracted frames at one1-second intervals, resulting in a total of 306 frames. Each frame was then stored as BYTEA in a PostgreSQL table, along with its corresponding timestamp in the video.

Why is this useful?

Imagine needing to retrieve a specific frame from a video at an exact timestamp. By simply querying the database with the desired timestamp, we can instantly retrieve the relevant frame. This approach can be invaluable in scenarios like accident investigations, where identifying events and culprits at a specific moment is crucial.

import cv2 import psycopg from datetime import datetime import os

# Database connection conn = psycopg.connect("dbname=postgres user=postgres password=<PASSWORD> host=localhost") cur = conn.cursor()

# Create table if not exists cur.execute(""" CREATE TABLE IF NOT EXISTS segments ( id SERIAL PRIMARY KEY, segment BYTEA, segment_number INT, created_at TIMESTAMP, timeline TEXT ) """) conn.commit()

# Load the video video_path = "D:\\traffic.mp4" cap = cv2.VideoCapture(video_path) fps = int(cap.get(cv2.CAP_PROP_FPS)) frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) duration = int(frame_count / fps) frames_per_segment = fps # Split video into 1-second segments for segment_number in range(duration): frames = [] for _ in range(frames_per_segment): ret, frame = cap.read() if not ret: Break frames.append(frame) segment_path = f"segment_{segment_number}.mp4" out = cv2.VideoWriter( Segment_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (int(cap.get(3)), int(cap.get(4))) ) for frame in frames: out.write(frame) out.release()

# Check if file was created if not os.path.exists(segment_path) or os.path.getsize(segment_path) == 0: print(f"Error: Segment file {segment_path} not created correctly.") continue

# Read and store the segment in the database with open(segment_path, "rb") as file: segment_bytes = file.read()

# Calculate timeline (HH:MM:SS) hours = segment_number // 3600 minutes = (segment_number % 3600) // 60 seconds = segment_number % 60 timeline = f"{hours:02d}:{minutes:02d}:{seconds:02d}" cur.execute( "INSERT INTO segments (segment, segment_number, created_at, timeline) VALUES (%s, %s, %s, %s)", (segment_bytes, segment_number, datetime.now(), timeline) ) conn.commit() os.remove(segment_path) cap.release()

# Display stored timelines for verification cur.execute("SELECT timeline FROM segments ORDER BY timeline") rows = cur.fetchall() print("Stored timelines:", [row[0] for row in rows])

# Display specific segments by timeline timeline = "00:01:40" cur.execute("SELECT segment FROM segments WHERE timeline = %s", (timeline,)) segment_data = cur.fetchone() if segment_data: segment_bytes = segment_data[0] temp_video_path = f"temp_segment_{timeline.replace(':', '-')}.mp4" with open(temp_video_path, "wb") as temp_file: temp_file.write(segment_bytes) print(f"Playing segment at timeline {timeline}...") os.startfile(temp_video_path) # Opens with default media player else: print(f"Segment with timeline {timeline} not found.") cur.close() conn.close()

When to Store Videos in PostgreSQL

Here are a few reasons you might consider storing videos directly in PostgreSQL, even though it’s generally not recommended for large files.

Simplified architecture

Avoid managing separate storage systems (e.g., S3, file servers). Everything (data + videos) lives in one place, reducing operational complexity.

Transactional consistency

Ensure atomicity: If a transaction involves both video metadata and the video itself (e.g., uploading a video and updating user records), PostgreSQL guarantees they succeed or fail together.

Built-in security

Leverage PostgreSQL’s encryption, role-based access control (RBAC), and auditing features to secure videos without configuring external storage permissions.

Backup/restore simplicity

Backups include videos and metadata in a single snapshot, simplifying disaster recovery compared to syncing databases with external storage.

Small file use cases

For tiny videos (e.g., short clips, thumbnails), the overhead of external storage might outweigh the benefits. PostgreSQL’s BYTEA or large objects work for limited-scale needs.

When to Store Videos Externally and Keep Only Metadata in PostgreSQL

Scalability for large volumes

When: Videos are large (e.g., HD/4K) or numerous. Why: External storage (e.g., AWS S3, Google Cloud Storage) scales infinitely for massive files.

Optimized performance

When: Fast read/write operations are critical (e.g., streaming platforms). Why: External storage systems handle parallel access and large file transfers more efficiently.

Separation of concerns

When: You need to manage metadata (e.g., titles, tags, user access) separately from raw video files. Why: PostgreSQL excels at querying structured metadata, while external storage handles unstructured data. This simplifies architecture and maintenance.

Leverage specialized features

When: Advanced video processing (e.g., transcoding, CDN delivery) is required. Why: Tools like Cloudflare, Cloudinary, or AWS MediaConvert integrate seamlessly with external storage, offering features databases can’t provide (e.g., global caching, adaptive streaming).

Conclusion

Storing videos in PostgreSQL using BYTEA provides a convenient way to manage binary data directly within the database. We explored how to achieve this using SQL, Python, Java, and C++, demonstrating different approaches for inserting and retrieving videos.

While this method ensures data integrity and centralized access, it may not be ideal for large-scale video storage due to performance and scalability concerns. In such cases, external storage with metadata in PostgreSQL is a more efficient alternative.

Now that you've seen how to store video data in PostgreSQL, you're ready to tackle other challenges in your applications, such as real-time analytics on time-series data. While storing videos directly in PostgreSQL works for certain use cases, managing high-volume, real-time data requires optimized solutions.

This is where TimescaleDB comes in.

Built on PostgreSQL, TimescaleDB gives you the best of both worlds: the reliability and SQL interface you love about PostgreSQL, with the performance and scalability needed for demanding workloads.

Whether you're building video analytics platforms, IoT applications, or monitoring systems, TimescaleDB helps you:

Store and query billions of data points efficiently
Maintain high write and read performance at scale
Use familiar PostgreSQL syntax and tools
Keep your entire data stack within one database ecosystem

Try TimescaleDB for free, or check out our documentation to see how it can enhance your specific use case.