Oct 26, 2021
A selection of FAQs from Timescale Office Hours. During our last session we discussed chunk size, indexing, space partitioning, and more.
Last week, we had a very engaged group join for office hours. There were TimescaleDB users from a variety of industries including fintech, IoT, and manufacturing.
As a reminder, we run open office hours a few times a month to give users an opportunity to ask any questions they might have about TimescaleDB. If you are interested in joining to discuss your use case with our engineers and learn more from other community members, you can sign up for our next session here.
Below, we will share a selection of questions from our last office hours.
Yes, this is fully supported by TimescaleDB. If a new chunk needs to be created under the hood, it will. However, there can be some performance implications. For example, if you are importing all of your recent data, Postgres can store this in memory and speed it up, but if you are entering it randomly in time it won’t be as efficient (but it will still work).
With partial indexes, having duplicated data could help you avoid performance issues. However, using separate tables is probably the way to go.
For sizing, it’s generally based on memory. If most of your data coming in is recent data, you want the entire chunk and its indexes to be stored in memory in the Postgres shared buffers. The chunk size should be 20-25% including indexes. We have a function called chunk_relation_size
which will give you the full size of the table including all the indexes. It’s generally not a problem to go a little over or under the recommended size, but you don’t want a table that takes up all your memory and requires the index to swap in and out of RAM.
For new chunks you can change the chunk size interval. For historic chunks, you can’t change the size without dumping the data.
Yes. Indexes can be huge. The characteristics of indexes in TimescaleDB are pretty similar to those in Postgres. But, the main difference is that TimescaleDB doesn’t create global indexes across all the chunks. Instead, the index is local to that chunk and that’s how we are able to scale. Within TimescaleDB, there’s not a restriction on indexes or how they behave. (Learn more about using composite indexes to speed up time-series queries here.)
HA in Postgres and TimescaleDB is pretty similar since we inherit many characteristics from Postgres. Postgres has the option to do hot standbys or synchronous replicas, and you can run Patroni which watches the health of your databases to determine who is the leader. (More on this topic here.) Internally, we use Kubernetes to manage all of HA. We plan to share more on our methodology for doing this in the upcoming months.
It is possible. However, the challenge here is table management. We’d suggest using a naming convention or metadata tables.
While you can use space partitioning for this use case, you won’t see direct performance benefits. Space partitioning is generally use to get parallelism across multiple disks. On the same topic, TimescaleDB has a reordering policy (docs) which will take your table and reorder it based on an index. This will make your queries a lot faster if you typically query on a specific index.
Stay tuned for more posts like this one and check out our previous recap. We hope to see you at an upcoming office hours!