Psycopg2 vs Psycopg3 Performance Benchmark
Psycopg3, the new PostgreSQL adapter for Python, brings significant improvements over its predecessor, psycopg2. It introduces built-in async support for more efficient asynchronous workloads, along with an improved connection pooling feature. These enhancements make interacting with a PostgreSQL database smoother for developers.
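To give a sense of what that built-in async support looks like in practice, here is a minimal sketch of connecting and running a query with psycopg3's asynchronous API; the connection string is a placeholder, so adjust it for your own database:

```python
# Minimal psycopg3 async sketch; the DSN below is a placeholder.
import asyncio
import psycopg  # psycopg3 installs and imports as "psycopg"

async def main() -> None:
    # Async connections and cursors mirror the blocking API one-to-one.
    async with await psycopg.AsyncConnection.connect(
        "postgresql://localhost/postgres"
    ) as conn:
        async with conn.cursor() as cur:
            await cur.execute("SELECT version()")
            print(await cur.fetchone())

asyncio.run(main())
```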
We commented on the latest version's supercharged performance in a previous blog post about building applications using PostgreSQL and psycopg3. In this blog post, we will actually test and compare the performance of psycopg2 and psycopg3 using various benchmarks. Let’s dive into it.
Psycopg3 vs. Psycopg2 Benchmark Configurations
Environment
- Cloud Provider: AWS
- EC2 Instance Type: t2.xlarge
- CPU Configuration: 4 cores
- RAM Allocation: 16 GB
- Hard Drive Capacity: 200 GB
- Hard Drive Performance: 3000 IOPS
- Operating System: Ubuntu 22.04 LTS
- Filesystem: ext4
Framework
- Tool: MagicStack/pgbench
- Pull Request: https://github.com/MagicStack/pgbench/pull/10
Versions
- PostgreSQL = 16.2
- Psycopg2-binary = 2.9.9
- Psycopg[binary] = 3.1.18
Test Cases and Results
pg_type
A broad query retrieving all entries from the pg_type table (approximately 613 rows), similar to an average application query. This benchmark evaluates overall data decoding efficiency.
The difference is staggering: psycopg3 retrieves approximately half a million rows per second, while psycopg2 handles around 150,000.
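As an illustration of the access pattern this benchmark measures (not the actual MagicStack/pgbench harness), a simple fetch-and-decode loop against pg_type could look like this, with the DSN and iteration count as placeholders:

```python
# Rough illustration of a decode-heavy fetch loop; not the pgbench harness itself.
import time
import psycopg

with psycopg.connect("postgresql://localhost/postgres") as conn:
    with conn.cursor() as cur:
        iterations = 100  # arbitrary, for illustration only
        start = time.perf_counter()
        total_rows = 0
        for _ in range(iterations):
            cur.execute("SELECT * FROM pg_type")
            total_rows += len(cur.fetchall())
        elapsed = time.perf_counter() - start
        print(f"{total_rows / elapsed:,.0f} rows/sec")
```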
generate_series
A query that generates 1,000 rows containing a single integer. This benchmark is used to test the overhead of creating and returning result records.
Psycopg3 does it again with approximately 1.25 million rows per second, double psycopg2's throughput. However, the true winner is the asynchronous version, which edges slightly ahead at nearly 1.3 million.
large_object
A query returning 100 rows, each containing a 1 KB binary blob. This benchmark is used to stress the I/O and read buffers in particular and is one of three cases in our tests where psycopg2 fared better.
arrays
A query returning 100 rows, each containing an array of 100 integers. This benchmark is designed to test the performance of array decoding. Even though its margin is slim, psycopg3 async is once again the winner.
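For a sense of what array decoding involves, here is a hedged sketch that produces a comparable result shape (100 rows of 100-integer arrays); the SQL is an approximation, not necessarily the exact query the benchmark runs:

```python
# Hedged sketch of array decoding; the SQL approximates "100 rows x 100-int arrays".
import psycopg

with psycopg.connect("postgresql://localhost/postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT array_agg(i) "
            "FROM generate_series(1, 100) AS i, generate_series(1, 100) AS r "
            "GROUP BY r"
        )
        rows = cur.fetchall()
        # Each PostgreSQL int[] value is decoded into a Python list of ints.
        assert len(rows) == 100 and len(rows[0][0]) == 100
```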
copyfrom
This benchmark measures the performance of copying data from standard input (STDIN) into a PostgreSQL table using the COPY command, with fixed row data and a large number of iterations (10,000). Psycopg3 async also performed better in this benchmark.
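For reference, this is roughly what the COPY-from-STDIN pattern looks like with psycopg3's Copy context manager; the table, columns, and row contents here are placeholders rather than the benchmark's exact data:

```python
# Hedged sketch of COPY FROM STDIN with psycopg3; table and data are placeholders.
import psycopg

with psycopg.connect("postgresql://localhost/postgres") as conn:
    with conn.cursor() as cur:
        cur.execute("CREATE TEMP TABLE copy_target (id int, payload text)")
        # psycopg3 exposes COPY as a context manager; write_row() handles the encoding.
        with cur.copy("COPY copy_target (id, payload) FROM STDIN") as copy:
            for i in range(10_000):
                copy.write_row((i, f"row-{i}"))
    # The connection context manager commits on a clean exit.
```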
batch
This benchmark assesses the efficiency of batch-inserting data into a PostgreSQL table. We conducted 10,000 iterations to monitor the performance of the batch insertion process, and here psycopg2 inserted more rows per second than either psycopg3 variant.
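As a point of reference, a batch insert with psycopg3 typically goes through executemany(), which since psycopg 3.1 can send statements over a pipeline; the table and row data below are placeholders, not the benchmark's exact workload:

```python
# Hedged sketch of a batch insert via executemany(); table and rows are placeholders.
import psycopg

rows = [(i, f"value-{i}") for i in range(10_000)]

with psycopg.connect("postgresql://localhost/postgres") as conn:
    with conn.cursor() as cur:
        cur.execute("CREATE TEMP TABLE batch_target (id int, payload text)")
        # psycopg 3.1+ can batch these INSERTs in pipeline mode when the server allows it.
        cur.executemany(
            "INSERT INTO batch_target (id, payload) VALUES (%s, %s)", rows
        )
```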
oneplusone
This benchmark simply executes the query "SELECT 1+1," measuring the per-query overhead of issuing a trivial statement. Again, psycopg2 performed better.
Detailed Psycopg Benchmark Data
Here’s the detailed breakdown of results, where we can compare psycopg2 versus psycopg3 versus psycopg3 async.
| python-psycopg2 | Geometric mean | 1-pg_type.json x10 | 2-generate_series.json x10 | 3-large_object.json x10 | 4-arrays.json x10 | 5-copyfrom.json x10 | 6-batch.json x10 | 7-oneplusone.json x10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Queries/sec | 409.39 | 246.18 | 626.67 | 2099.09 | 1091.84 | 44.44 | 13.26 | 9249.65 |
| Rows/sec | 102359.51 | 150417.57 | 626673.62 | 209908.72 | 109184.26 | 444406.71 | 13257.48 | 9249.65 |
| Min latency | 4.85ms | 5.4ms | 1.91ms | 0.84ms | 1.72ms | 52.76ms | 617.62ms | 0.13ms |
| Mean latency | 24.366ms | 40.55ms | 15.936ms | 4.755ms | 9.147ms | 224.094ms | 753.025ms | 1.075ms |
| Max latency | 149.216ms | 276.6ms | 326.68ms | 29.21ms | 111.82ms | 765.09ms | 808.67ms | 9.02ms |
| Latency variation | 11.976ms (49.15%) | 33.858ms (83.5%) | 25.804ms (161.93%) | 3.31ms (69.61%) | 8.706ms (95.17%) | 74.057ms (33.05%) | 24.015ms (3.19%) | 0.789ms (73.34%) |
| python-psycopg3 | Geometric mean | 1-pg_type.json x10 | 2-generate_series.json x10 | 3-large_object.json x10 | 4-arrays.json x10 | 5-copyfrom.json x10 | 6-batch.json x10 | 7-oneplusone.json x10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Queries/sec | 374.57 | 804.45 | 1250.97 | 1013.91 | 1015.23 | 45.66 | 4.93 | 4436.81 |
| Rows/sec | 93665.83 | 491518.87 | 1250967.46 | 101390.58 | 101522.77 | 456625.68 | 4932.76 | 4436.81 |
| Min latency | 5.473ms | 2.03ms | 1.06ms | 1.94ms | 1.76ms | 64.86ms | 1715.61ms | 0.18ms |
| Mean latency | 26.527ms | 12.418ms | 7.983ms | 9.853ms | 9.839ms | 218.195ms | 1962.827ms | 2.246ms |
| Max latency | 80.116ms | 41.09ms | 37.2ms | 21.35ms | 34.99ms | 897.54ms | 2139.78ms | 9.66ms |
| Latency variation | 7.384ms (27.83%) | 5.155ms (41.51%) | 4.073ms (51.03%) | 2.355ms (23.9%) | 4.097ms (41.64%) | 108.724ms (49.83%) | 50.299ms (2.56%) | 1.08ms (48.09%) |
| python-psycopg3-async | Geometric mean | 1-pg_type.json x10 | 2-generate_series.json x10 | 3-large_object.json x10 | 4-arrays.json x10 | 5-copyfrom.json x10 | 6-batch.json x10 | 7-oneplusone.json x10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Queries/sec | 474.94 | 749.26 | 1286.34 | 1330.1 | 1139.06 | 47.03 | 11.04 | 7189.77 |
| Rows/sec | 118755.18 | 457796.17 | 1286341.35 | 133010.47 | 113906.13 | 470290.05 | 11041.44 | 7189.77 |
| Min latency | 9.724ms | 5.92ms | 2.7ms | 5.68ms | 3.07ms | 60.8ms | 881.92ms | 0.55ms |
| Mean latency | 21.022ms | 13.336ms | 7.765ms | 7.51ms | 8.771ms | 212.059ms | 905.604ms | 1.385ms |
| Max latency | 37.043ms | 14.33ms | 18.98ms | 9.94ms | 21.82ms | 829.92ms | 962.93ms | 2.03ms |
| Latency variation | 1.136ms (5.4%) | 0.243ms (1.82%) | 1.938ms (24.96%) | 0.143ms (1.91%) | 0.524ms (5.97%) | 84.791ms (39.98%) | 18.099ms (2.0%) | 0.045ms (3.25%) |
Next Steps
If you are wondering whether to keep using psycopg2 for your PostgreSQL database interactions, we hope these results convince you to upgrade. We know it’s hard to part ways with a decade-long friend, but the performance gains brought by psycopg3 will make a difference in your development, especially if you're interacting asynchronously with your database.
To learn more about leveraging PostgreSQL and Python, read the following resources:
- How to Work With Time Series in Python?
- Tools for Working With Time-Series Analysis in Python
- Jupyter Notebook Tutorial: Setting Up Python & Jupyter Notebooks on macOS for OpenAI Exploration
- A Python Library for Using PostgreSQL as a Vector Database in AI Applications
Create a free Timescale account today to experiment and start building with Python and our supercharged PostgreSQL.