What is the recommended wal_segment_size parameter for TimescaleDB?

I use PG v15.4 with TimescaleDB v2.16.0, self-hosted on-prem.

I am looking at the pg_wal directory and I see that an individual WAL file is 16 MB in size. I am reading the (free to download) book “PostgreSQL 14 Internals”, where it is written: “On disk, the WAL is stored in the PGDATA/pg_wal directory as separate files, or segments. Their size is shown by the read-only 16MB “wal_segment_size” parameter. For high-load systems, it makes sense to increase the segment size since it may reduce the overhead.”

As I understand it, when a 16 MB WAL file is filled, a new WAL file must be created. Regularly asking the operating system to create a new file takes some time. Under high load this may not be negligible, so increasing the WAL segment size may help write transactions faster.

TimescaleDB systems are by definition high-load systems.

QUESTIONS:

  1. What is the recommended wal_segment_size parameter for TimescaleDB?
  2. Does anyone have experience with this: does increasing the parameter to 32 MB (or more) improve performance?
  3. What are the disadvantages of larger WAL segments?

Hi @ts101, have you checked how timescaledb-tune calculates the WAL settings?

I think we already try to automate it, so if you just run the tool, you’ll see the recommendation in your configuration file.

Remember that larger WAL files can lead to more efficiency because there will be fewer checkpoints, but if something goes wrong it will take more time to recover. The same applies to replicas. Also, larger segments give you coarser-grained PITR recovery points and use more memory.

@jonatasdp, thank you for your answer. I have looked at the code (not being much of a programmer myself), but it looks to me like this part of the code is for the max_wal_size parameter. It is not documented clearly enough in the source code to be sure. What I am asking about is the wal_segment_size parameter.

I may not have been clear enough. If we look into the pg_wal directory we can see WAL files (or “WAL segments”, as they are officially named). The total size of all WAL files together is controlled by max_wal_size (this is actually a soft limit; PostgreSQL can decide to use a little bit more). The source code you are referring to is most probably about this parameter, the total size of all WAL files.
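
To check it quickly (the default is shown here; your value may differ):

    psql -c "SHOW max_wal_size;"    # soft limit on total WAL size, default 1GB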

If we look at an individual file in the pg_wal directory, we can see that it is 16 MB in size. This is the default size of one individual WAL file. To check the size we can execute show wal_segment_size. The parameter can be set only at cluster creation with initdb --wal-segsize=X, or later with pg_resetwal --wal-segsize=X (this command should be run only on a cluster that was properly shut down, and should never be run on a running or crashed server). The “X” is in megabytes and must be a power of two between 1 and 1024, for example 1, 2, 4, 8, 16 (the default), 32 or 64.
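
To put the commands in one place (the paths below are placeholders, not my actual setup):

    # check the current segment size on a running server
    psql -c "SHOW wal_segment_size;"              # e.g. 16MB

    # set it when creating a new cluster
    initdb --wal-segsize=32 -D /path/to/new/data

    # or change it on an existing, cleanly shut down cluster
    pg_resetwal --wal-segsize=32 /path/to/data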

As I understand it, under heavy load (like a time-series environment) transactions can easily fill an individual 16 MB WAL file. When this happens, PostgreSQL must create a new file (usually it renames and reuses an old WAL file that the database no longer requires, because deleting and creating files is expensive from a performance point of view). But on heavy-load systems even this “switching” to a new WAL file can get expensive. For example, in our production environment I can see in the pg_wal directory that more than 60 files are filled every single minute (timestamps from the ls -l command).
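
A rough way to measure the switch rate is to count recently touched segments (a sketch; run it in your own data directory):

    # count WAL segment files modified within the last minute inside $PGDATA
    find pg_wal -maxdepth 1 -type f -mmin -1 | wc -l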

That is why I am asking if someone has changed this parameter (with initdb or pg_resetwal). Does anyone have experience with it?


Also, in the source code you are referring to, there is this comment:

// WAL segments are 16MB, so it doesn’t make sense not to round up to the nearest 16MB boundary.

WAL segments are 16 MB by default, but the size can be changed (with the initdb or pg_resetwal command). The assumption in the source code is wrong.

The wal_segment_size parameter is probably not as well known as other parameters, and if source code comments like this one assume a fixed 16 MB segment size, I wonder whether changing this parameter is a smart move at all, or whether I would fall into some trap laid by programmers who expect a fixed 16 MB size.

I think timescaledb-tune recommends settings for the average case, but if you have a very large workload, increasing it will probably make things more efficient.

Remember the potential risks:

  • As you correctly pointed out, some parts of the PostgreSQL codebase might assume the default 16 MB size. This could potentially lead to unexpected behavior or bugs.
  • Larger segment sizes mean longer recovery times if a crash occurs mid-segment.
  • Larger segments may affect replication lag in streaming replication setups.

A good way to validate this recommendation is to run a benchmark yourself, simulating your production workload, and check how it goes. For example, try to load one day of data, then change the configuration a few times to see which setting is closest to your reality.
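
Something along these lines could work as a starting point (a rough sketch using pgbench; its synthetic workload is not time-series ingest, so replace it with your real load if you can; the port, paths, and scale are placeholders):

    # one throwaway cluster per segment size; compare the reported tps
    for SEGSIZE in 16 32 64; do
        DATADIR=/tmp/bench_${SEGSIZE}
        initdb --wal-segsize=${SEGSIZE} -D ${DATADIR}
        pg_ctl -D ${DATADIR} -o "-p 5499" -l ${DATADIR}/server.log start
        createdb -p 5499 bench
        pgbench -p 5499 -i -s 100 bench            # load sample data
        pgbench -p 5499 -c 16 -j 4 -T 300 bench    # 5-minute write-heavy run
        pg_ctl -D ${DATADIR} stop
    done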

Also, instead of benchmarking only wal_segment_size, you might consider tuning checkpoint_timeout and max_wal_size to better suit your workload.
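
Illustrative starting points only, not recommendations (the PostgreSQL defaults are in the comments):

    # postgresql.conf values to experiment with
    checkpoint_timeout = 15min    # default is 5min
    max_wal_size = 8GB            # default is 1GB (a soft limit, as you noted)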

Every workload has its own characteristics and its own purpose.

Also, if you have any ideas on how we could improve timescaledb-tune, let us know. It would be interesting to collect numbers that diverge from the current suggestion and perform better! We could find a way to offer alternative settings and let the user decide, too.