High CPU Usage in PostgreSQL Service

aviral.singh21 · July 12, 2024, 8:49am

Hello,
I have a docker swarm architecture, where 2 different database service is running.
One is TimescaleDB HA Service with 3 Patroni Cluster and 3 ETCD Cluster, one-one container deployed on each node. Another is PostgreSQL HA Service with 2 nodes of Repmgr deployed on each node respectively.

Version Details for TimescaleDB Service:
PostgreSQL: 13.8
TimescaleDB: 2.7.1
Installation Method: Docker(timescale/timescaledb-ha:pg13.7-ts-2.7.1-latest)
Setup: 3 Node of Patroni-HA - ETCD and one PgPool using Docker Swarm.

Version Details for PostgreSQL Service:
PostgreSQL Version: 12.4
Installation Method: Docker(bitnami/postgresql-repmgr:12)
Setup: 2 Node of Repmgr and one PgPool using Docker Swarm.

image (1)

Issue:
There was high CPU Usage in Node-1 which was increasing consistently. On checking specifically for service consuming more CPU, it was observed that it was consumed by PostgreSQL Service.
Then I fetched the connections stats in database, there I found 5 sessions of idle in transaction. And all 5 sessions had same query. Whose state_change time was matching approximately with the time of increase in CPU Usage of PostgreSQL docker service in grafana graph. So I concluded, these 5 sessions were the reason behind cpu consumption. But when I checked the status of resource consumption using process (ps aux | grep postgres) command and top command. In both cases, idle in transaction sessions cpu consumption was 0%.

Sample ps aux details which I ran inside the postgresql docker container:

USER PID %CPU %MEM VSZ RSS TTY 1001 25202 0.0 0.0 2255512 13584 ? 1001 34045 0.0 0.0 2278996 40328 ? 1001 884810 0.0 0.2 2256700 110588 ? 1001 885089 0.0 0.1 2255740 76812 ? 1001 885353 0.0 0.2 2256988 125592 ? 1001 889714 0.0 0.1 2255640 52960 ? 1001 893820 0.0 0.0 2255768 47100 ? 1001 904151 0.0 0.0 2257116 25592 ? 1001 1078593 0.7 0.1 2275896 65060 ? 1001 1084060 0.0 0.1 2256760 96624 ? 1001 1084606 0.0 0.0 2255500 17304 ? 1001 1084918 9.9 3.4 2267276 1683660 ? 1001 1085106 9.3 3.4 2265728 1683328 ? 1001 1085123 0.0 0.0 2262752 47592 ? 1001 1085579 12.9 3.4 2265652 1668496 ? 1001 1085583 0.0 0.0 2255500 17244 ? 1001 1085804 0.0 0.0 2255500 17260 ? 1001 1085812 14.9 3.3 2264092 1629084 ? 1001 1085816 0.0 0.0 2255500 22908 ? STAT START TIME COMMAND
Ss Jun21 0:00 postgres: postgres postgres 10.10.1.195(33692) idle
Ss Jun21 0:13 postgres: asm_user tsdb 10.10.1.195(51664) idle
Ss Jun26 0:00 postgres: asm_user tsdb 10.10.1.53(50158) idle in transaction
Ss Jun26 0:00 postgres: asm_user tsdb 10.10.1.195(59626) idle in transaction
Ss Jun26 0:01 postgres: asm_user tsdb 10.10.1.195(49802) idle in transaction
Ss Jun26 0:00 postgres: asm_user tsdb 10.10.1.53(57404) idle in transaction
Ss Jun26 0:00 postgres: asm_user tsdb 10.10.1.195(47092) idle in transaction
Ss Jun26 0:00 postgres: postgres tsdb 10.10.1.195(46746) idle
Ss 05:29 0:29 postgres: asm_user tsdb 10.10.1.53(36738) idle
Ss 06:04 0:00 postgres: asm_user tsdb 10.10.1.53(35812) idle
Ss 06:08 0:00 postgres: asm_user tsdb 10.10.1.53(43742) idle
Ss 06:10 2:19 postgres: asm_user tsdb 10.10.1.53(43276) idle
Ss 06:11 2:03 postgres: asm_user tsdb 10.10.1.195(58606) idle
Ss 06:11 0:00 postgres: asm_user tsdb 10.10.1.53(43920) idle
Ss 06:14 2:27 postgres: asm_user tsdb 10.10.1.195(45850) idle
Ss 06:14 0:00 postgres: asm_user tsdb 10.10.1.195(45868) idle
Ss 06:16 0:00 postgres: asm_user tsdb 10.10.1.53(33508) idle
Ss 06:16 2:37 postgres: asm_user tsdb 10.10.1.53(33528) idle
Ss 06:16 0:00 postgres: asm_user tsdb 10.10.1.195(40346) idle

Sample result of top command ran inside the postgresql docker container:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1085106 1001 20 0 2265728 1.6g 1.6g R 63.9 3.4 2:07.26 postgres
1086334 1001 20 0 2263284 1.2g 1.2g R 52.3 2.5 6:27.49 postgres
1086745 1001 20 0 2263412 1.1g 1.1g S 45.7 2.4 3:20.88 postgres
1086426 1001 20 0 2263416 1.5g 1.5g R 35.1 3.2 3:51.12 postgres
1086444 1001 20 0 2263292 1.1g 1.1g R 24.2 2.4 2:22.83 postgres
1085812 1001 20 0 2264092 1.6g 1.6g R 23.2 3.3 2:49.42 postgres
1086337 1001 20 0 2263416 979.7m 977.0m R 22.2 2.0 2:08.84 postgres
1086468 1001 20 0 2263416 1.4g 1.4g R 20.2 3.0 2:37.97 postgres
1086460 1001 20 0 2264260 1.5g 1.5g S 18.2 3.1 1:15.70 postgres
1085579 1001 20 0 2265652 1.6g 1.6g R 13.9 3.4 2:35.56 postgres
1085980 1001 20 0 2263416 914716 911908 R 12.6 1.9 2:02.52 postgres
1086819 1001 20 0 2263308 994216 991636 S 8.3 2.0 1:20.39 postgres
1085710 1001 20 0 2266812 1.5g 1.5g S 6.0 3.3 1:44.21 postgres
1085334 1001 20 0 2256728 300696 298464 S 3.0 0.6 0:18.52 postgres
1085295 1001 20 0 2256448 293596 291516 S 2.6 0.6 0:11.88 postgres
200 1001 20 0 2255668 4444 3856 S 1.7 0.0 450:37.32 postgres
193 1001 20 0 2255652 4392 3792 S 1.3 0.0 460:05.24 postgres
1085392 1001 20 0 2256716 96512 94148 S 0.3 0.2 0:00.32 postgres
1085816 1001 20 0 2255500 22908 21460 S 0.3 0.0 0:00.16 postgres
1085989 1001 20 0 2263932 1.5g 1.5g S 0.3 3.3 2:52.27 postgres
1 1001 20 0 15468 1420 1124 S 0.0 0.0 9:10.56 repmgrd
131 1001 20 0 2254288 31960 31544 S 0.0 0.1 88:03.58 postgres
132 1001 20 0 66048 1496 1308 S 0.0 0.0 0:00.81 postgres
138 1001 20 0 2254540 1.6g 1.6g S 0.0 3.5 7:16.90 postgres
139 1001 20 0 2254516 60196 59820 S 0.0 0.1 1:29.27 postgres
140 1001 20 0 2254288 19028 18692 S 0.0 0.0 5:57.71 postgres
141 1001 20 0 2255124 5784 5224 S 0.0 0.0 1:07.43 postgres

I terminated those session and immediately CPU Usage was back to normal. Which I felt I handled it well and resolved it.

But the main issue occur now:

On reporting this issue to developers, they told that there are 3 different queries in a single transaction, which is executed through an application.
Below I have mentioned the overview of sql queries in order they are executed:

SELECT Query from tsdb_bi.bi_eq_ign (PostgreSQL Service)
SELECT Query from mpr, mte (TimescaleDB Service)
INSERT/UPDATE tsdb_bi.bi_eq_ign_stat (PostgreSQL Service)

According to developers, 1st query is executed successfully, but problem is in 2nd query, that query results from timescaledb are not returned properly. And they are suspecting that there is some problem in infra. Developers suspect that there is a problem with timescaledb’s session or that connection information has been lost.

Please someone help me with a way to identify the actual problem.

jonatasdp · July 12, 2024, 3:32pm

can you run explain analyze on the second query?

SELECT Query from mpr, mte (TimescaleDB Service)

For getting more info during your debugging,
I’d also encourage you to set client_min_messages to DEBUG1; and run everything in the same session.

it will output all information int he terminal and help you better understand where it stuck.

aviral.singh21 · July 29, 2024, 2:49pm

Hello @jonatasdp,
Sorry to reply it late as I was occupied in another db version upgrade issues.
As you suggested to run the EXPLAIN ANALYZE on the TSDB queries. I fetched the query plan for both the mpr and mte queries.

For mte Table:

**Finalize Aggregate:**
- This is the last step in the aggregation process where the final result is computed.
- Cost: 299254.80..299254.81, indicating the estimated start-up and total cost.
- Actual time: 821.627..853.743 ms, indicating the actual time taken for this step.

**Gather:**
- Collects the results from parallel workers.
- Cost: 299254.18..299254.79
- Actual time: 819.239..853.731 ms
- Workers Planned: 6, Workers Launched: 6

**Partial Aggregate:**
- Each worker performs a partial aggregation before the results are gathered.
- Cost: 298254.18..298254.19
- Actual time: 782.331..782.605 ms

**Parallel Append:**
- Combines results from multiple child nodes that can run in parallel.
- Cost: 0.00..298254.17
- Actual time: 782.327..782.349 ms

**Parallel Index Scan and Parallel Seq Scan:**
- Several chunks (hypertables in TimescaleDB) are scanned in parallel.
- Index Scans use indices to find relevant rows quickly.
- Sequential Scans check each row in the chunks.
-Conditions involve timestamp ranges and various filters on the columns like create_time, teid, eqnm, eqty, and contents.

For mpr table:

**Finalize Aggregate** (cost=608.64..608.65 rows=1 width=8) (actual time=6.169..6.373 rows=1 loops=1):
- This is the final step in aggregating results from multiple sources or partial aggregates. It combines the results to produce a single output.

**Append** (cost=5.01..608.33 rows=122 width=8) (actual time=0.112..6.350 rows=122 loops=1):
- The Append node indicates that results from multiple child nodes (subqueries or scans) are being combined. The cost, rows, and width indicate the estimated and actual execution metrics.

**Partial Aggregate** (cost=5.01..5.02 rows=1 width=8) (actual time=0.111..0.113 rows=1 loops=1):
- This node represents a partial aggregation of data before it is combined in the Finalize Aggregate step.

**Custom Scan** (DecompressChunk) on _hyper_13_1813_chunk (cost=2.51..2.51 rows=1000 width=0) (actual time=0.106..0.108 rows=0 loops=1):
- The DecompressChunk node indicates that the query is reading from a compressed chunk of the hypertable. The decompress operation is necessary because the data is stored in a compressed format.
- The actual time indicates how long this step took during execution.

**Index Scan** using compress_hyper_22_1924_chunk__compressed_hypertable_22_eqnm__ts on compress_hyper_22_1924_chunk (cost=0.28..2.51 rows=1 width=96) (actual time=0.104..0.105 rows=0 loops=1):
- The Index Scan node shows that an index is being used to locate the relevant rows in the compressed chunk. Index scans are usually faster than sequential scans because they use an index to quickly locate rows matching the conditions.
- The conditions include filtering by timestamps and specific text values for eqnm, teid, and eqty.

**Filter Conditions:**
- The Filter conditions applied at various stages ensure that only rows matching specific criteria (e.g., create_time, to_timestamp(contents ->> 'UTCT', 'YYMMDD HH24MISS.00'), eqnm, teid, eqty) are included in the result set.

Currently both the tables fetches 0 records.

But at the time of error:

Records fetched from mte table: 0
Records fetched from mpr table: 7012238

Also I appreciate for the suggestion related to client_min_messages to DEBUG1. But since I applied the setting of idle_in_transaction_timeout parameter. This issue didn’t occur again.