Nov 23, 2023
Posted by
Mike Freedman
To close out our final week of “Always Be Launching” month, we are announcing “Automated Full Disk Management” for Timescale Cloud, a new capability to ease operational overhead, protect against unforeseen overages, and keep your database up and running.
Throughout this month, we’ve announced a number of new features that improve Timescale Cloud, our cloud-native database service for time-series, which is designed to give developers a worry-free experience – without sacrificing flexibility and control.
These announcements have included major query performance increases in TimescaleDB (8000x for finding distinct values), significant improvements to TimescaleDB’s best-in-class columnar compression, the ability to scale to larger compute instances and greater storage capacity (supporting 100+ TB of pre-compressed data), a new “operations center” for your database, flexible VPC Peering for greater security, and hardened backup/restore mechanisms for ensuring the reliability of your data. And, since Timescale Cloud is powered by TimescaleDB, you get all of these capabilities and use the PostgreSQL environment you know and love. It’s been quite a month!
To continue with this theme of delivering a worry-free platform for time-series data that gives you control when and where you want, we’re launching support for automated full-disk management on Timescale Cloud.
Automated full-disk management will alert you whenever you approach storage limitations on your account, put your service in a read-only state so that your data is not lost, and give you an opportunity to configure your service so that you can resume collecting everything that matters.
Keep reading to learn how we automated full-disk management on Timescale Cloud, the four key components of our approach, and how they work together to give you more peace of mind about your disk usage.
If you’re new to Timescale, create a free account to get started with a fully-managed Timescale Cloud service (100% free for 30 days, no credit card required).
Once you are using TimescaleDB, please join the Timescale community and ask any questions you may have about time-series data, databases, and more.
And, for those who share our mission and want to join our fully remote, global team: we are hiring broadly across many roles.
Finally, special thanks to Ivan Tolstosheyev and the entire Timescale Cloud team behind the development of this capability 🙏.
We’ve all experienced it. You’re trying to take that latest puppy video or photo to share in your group text, but your storage is full. You’re left trying to figure out which caches to clear or which other pictures to delete.
Nobody likes a full disk, including your database and the operating system it uses. You try to insert more data into your database, and the “write-ahead-log” (WAL) it uses to ensure all data is reliably and atomically written has no place to write its log. Try to add an index, and there’s no place on disk to store the index pages. And, even if you don’t directly write any new data to the database, things are happening (or, perhaps more accurately, not happening) in the background. Temp files can’t be written. File system blocks can't be allocated. Unexpected things go wrong.
A cloud-native platform like Timescale Cloud circumvents these issues, providing built-in safety mechanisms for your TimescaleDB database (or, as we like to say, a “worry-free” experience). Why should disk resources be finite and fixed, rather than scale as needed over time? Why shouldn’t you have an “escape hatch” to safely recover some space when needed and return user instances to healthy states?
Many of these mechanisms can and should be transparent and “just work” (as one example, Timescale Cloud employs system balloon files for an additional layer of “defense in depth” to full disks). But today, we wanted to share a bit more about new and existing user-facing capabilities for full disk management.
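Timescale Cloud's automated full-disk management performs four key tasks in sequential order: it continuously monitors storage consumption, sends email notifications as usage crosses set thresholds, places the database in a read-only state when the disk is nearly full, and lets you resolve the situation by resizing storage or shrinking your data.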
Let’s walk through the components that underpin Timescale Cloud’s automated full disk management:
Timescale Cloud continuously monitors the health and resource consumption of all database services. This real-time data is always available in the “metrics” tab of your cloud console (and also monitored by our 24/7 operations team). When the database’s storage consumption exceeds certain thresholds of available resources, the platform triggers automated actions. (This includes both user-facing actions and behind-the-scenes ones, as described below.)
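The console metrics remain the reference for disk usage, but you can also get a rough view of storage consumption from inside the database. The queries below are a minimal sketch, assuming TimescaleDB 2.x and a hypertable named purchases (the example table used in this post). They only read data, so they should also work while a service is in a read-only state.

```sql
-- Total on-disk size of the current database (tables, indexes, TOAST).
SELECT pg_size_pretty(pg_database_size(current_database())) AS database_size;

-- Size of a single hypertable, including all of its chunks and indexes.
-- 'purchases' is an assumed example table name.
SELECT pg_size_pretty(hypertable_size('purchases')) AS hypertable_size;
```

Note that pg_database_size does not count the write-ahead log, so these numbers will usually be somewhat lower than the disk usage the platform measures.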
Timescale Cloud automatically triggers email notifications when your storage exceeds 75%, 85%, and 95% capacity. But, because we’re developers too and know excessive emails are quickly lost or ignored, we’ve added a few parameters to balance signal and noise: each alerting threshold uses low and high watermarks, so usage hovering around a threshold doesn’t trigger repeated alerts, and messages are rate-limited, so you should expect at most one email about storage capacity per 24 hours for each database.
We also know that sometimes you might not see or react to alerts immediately, especially if you’re performing a large batch insert on a relatively small disk at 3am. So, the platform will automatically place the database in a safe “read-only” state once it reaches 99% full (and alert you via both email and the cloud console).
At this point, you can still query your database, but cannot insert any new data. So while the database is still otherwise available for queries, the main goal at this point is to determine next steps: resize storage capacity for continued growth or shrink data usage.
Timescale Cloud allows you to increase your storage capacity, from 10GB to 10TB, without any downtime. And, because the platform decouples compute and storage, you can incrementally increase just the resource that’s needed; if you need another 500GB, there’s no need to pay for another 4 CPUs just because that’s the next VM instance size available.
Just navigate to your cloud console and select the disk size that works for you. You’ll see side-by-side comparisons and cost calculations as you make adjustments – and, once you’re all set, hit apply, and additional capacity will be allocated to your service in a few seconds. And, we mean zero downtime - even ongoing queries will be unaffected during the resizing. Once your service’s storage has been increased, the database is automatically taken out of read-only protection and you can start writing freely again.
To shrink storage consumption, users can turn off read-only mode, then perform any needed actions, e.g., compressing data, deleting rows/tables, or dropping old data via data retention policies. You can turn the entire database back to read-write mode through the cloud console if you want, although this isn’t our recommended approach: if your service has any runaway ingest pipelines and applications that will auto-reconnect, these applications might start immediately re-inserting data once you do so (although the automated overload protections will kick in again shortly after).
Better yet, you can make an individual session read-write, while the database overall remains in read-only mode (this is a built-in TimescaleDB capability, which Timescale Cloud inherits). You can log in to your database, and enable compression, data retention, or delete rows or tables from within only that session.
As a concrete example, connect to your database via psql and run the following to turn off read-only protection for that specific session:
SET default_transaction_read_only TO off;
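If you want to confirm that the session override took effect, you can inspect the setting with standard PostgreSQL commands. This is a small sketch rather than a required step:

```sql
-- After the SET above, this should report "off" for the current session.
-- Other sessions keep the database-wide read-only default.
SHOW default_transaction_read_only;

-- Later, once your cleanup is finished, you can return this session
-- to the server default (read-only again if protection is still active).
RESET default_transaction_read_only;
```

Run the RESET only after you have finished the steps that follow, since it puts the session back under read-only protection.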
Then, from within that same session, you can turn on compression to reduce your storage consumption by 94-97%.
ALTER TABLE purchases SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'sku'
);
SELECT add_compression_policy('purchases', interval '1 day');
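A compression policy runs as a scheduled background job, so space is not necessarily freed the moment you create it. If you need room right away, one option is to compress existing chunks manually. The sketch below assumes TimescaleDB 2.x and reuses the purchases table and the one-day cutoff from the policy above.

```sql
-- Compress every chunk older than one day, skipping chunks
-- that are already compressed.
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('purchases', older_than => interval '1 day') AS c;

-- Compare the hypertable's size before and after compression.
SELECT pg_size_pretty(before_compression_total_bytes) AS before,
       pg_size_pretty(after_compression_total_bytes)  AS after
FROM hypertable_compression_stats('purchases');
```

The savings you see depend on your data and on the segmentby column you chose, so treat the stats query as the way to check the result for your own tables.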
Or you can create a data retention policy to only retain, for example, data for 90 days, which will start working on any old data to free up space.
SELECT add_retention_policy('purchases', interval '90 days');
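As with compression, the retention policy is applied by a background job. To reclaim space immediately, you can drop old chunks by hand. This is a hedged sketch assuming TimescaleDB 2.x, with the same purchases table and 90-day window as above. It previews the affected chunks first, because dropping them deletes that data permanently.

```sql
-- Preview which chunks fall entirely outside the 90-day window.
SELECT show_chunks('purchases', older_than => interval '90 days');

-- Drop those chunks; this permanently removes the data they contain.
SELECT drop_chunks('purchases', older_than => interval '90 days');
```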
And you are done! As soon as the storage consumption drops beneath the appropriate threshold, the platform’s continuous service monitoring will automatically remove the read-only protections so you can start inserting data again.
With automated full-disk management, Timescale Cloud now provides capabilities for monitoring storage consumption, automatically triggering actions when above a certain threshold, and resizing database storage without any downtime.
But our larger goal is to provide full auto-scaling. With today’s launch, the triggered actions are sending email alerts and placing a database into read-only mode, enabling you to resize your instance with a single click. A natural next step is allowing you to opt in to auto-scaling for your service(s), so that triggered actions also include automatically increasing storage when needed.
Of course, control is still critical for cloud platforms, and a big part of our approach. So we’ll continue to notify users as storage fills up or is resized, and plan to allow developers to specify limits to auto-scaling to avoid unexpected costs. And then, if a database service ever hits that preconfigured auto-scale limit, Timescale Cloud’s overload protection will kick in to make sure safe actions can be taken.
We kicked off “Always Be Launching” with our announcement of $40M in new financing. (You can read my co-founder’s post for more details about our new investors and long-term vision for Timescale.)
This is the final post of our #AlwaysBeLaunching month. Throughout twelve announcements this month, our goal was to demonstrate to our customers and the broader industry our commitment to delivering high-quality features and products at a fast pace.
On a personal note, when Ajay and I founded Timescale a few years ago, we were always excited about the possibilities for time-series data, and its need for the right type of database. But what has continuously amazed us is the breadth and variety...and sheer coolness...of its use cases.
From building tiny battery-less sensors that harvest energy from the thin air, to collecting data from outer space from orbital missions, conserving museum artifacts, improving air travel and our busy skies, listening to space weather, improving yields in smart agriculture, empowering retail investors with trading bots, and more. Helping developers measure everything that matters to them, from the mundane to the amazing.
We’re even more excited and passionate about the future of Timescale and time-series data than when we first started. Just wait to see what we have planned in the coming months and beyond!
So as our friendly mascot Eon likes to say, it’s time to make that future into today and “Always Be Launching”.
If you’re new to Timescale, create a free account to get started with a fully-managed Timescale Cloud instance (100% free for 30 days). After creating a new database service, start loading data to use TimescaleDB without worrying about limits – because our automated full disk management is here to protect your service.
And once you are using TimescaleDB, please join the TimescaleDB community and ask any questions you may have about time-series data, databases, and more.
And, for those who share our mission and want to join our fully remote team: we are hiring broadly across many roles.
To the stars! 🐯🚀