Rails migration for TimescaleDB hypertables

clarkbw · February 9, 2022, 8:42pm

I created a rails hypertable migration gist not long ago but figured it could use a bit more explanation here.

First I often see a common misconception that hypertables can be turned on or off as if they are only metadata to a regular postgres table. When in fact a hypertable is an alternate method of storing data and so tables must be converted to hypertables and vice versa.

So here’s the rails migration to and from a hypertable. The table name and time column are variables which can easily be changed to make this work for any table you want to convert. Once run the existing postgres table will be a TimescaleDB hypertable. And if you migrated down your hypertable would be converted back to a regular postgres table.

class ActionsHypertable < ActiveRecord::Migration[7.0]
  def change
    @table_name = 'actions'
    @time_column = 'created_at'
    reversible do |dir|
      dir.up do
        execute <<-SQL
          SELECT create_hypertable('#{@table_name}', '#{@time_column}', if_not_exists => TRUE, migrate_data => TRUE);
        SQL
      end
      dir.down do
        execute <<-SQL
        CREATE TABLE pg_#{@table_name} (LIKE #{@table_name} INCLUDING ALL); -- duplicate table structure
        INSERT INTO pg_#{@table_name} (SELECT * FROM #{@table_name}); -- copy all data
        DROP TABLE #{@table_name}; -- drops hypertable
        ALTER TABLE pg_#{@table_name} RENAME TO #{@table_name}; -- now a regular postgres table again
        SQL
      end
    end
  end
end

Because the migration uses migrate_data it will convert existing tables into a hypertable. Depending on the size of the table data this could take a while to complete. Similarly when it reverts from a hypertable to a regular table it could take time to undo if there is a lot of data to copy.

jonatasdp · February 10, 2022, 5:41pm

Very good, @clarkbw! Thanks for sharing the example.

Another hardcore detail using Rails with Timescale is to make the hypertables metadata available after the migration. In the timescale gem, I hooked SchemaDumper to introduce metadata from hypertables and append to db/schema.rb.
It’s helpful to get data structures loaded in other environments, which is convenient for testing.

Also, I think the point of migrating data inside the same migration can turn the downtime too long. In case a meaningful amount of data is needed to copy, it will hold the server until it’s ready with all the data over.

Probably running it in a background job would be faster to start and use a flag to control when the migration is “done”.

Zero downtime apps will need a series of at least three deployments:

Introduce the hypertable and duplicate data from the old table to the new one. A migration can schedule one job to move the old data “until deployment” to the hypertable. All systems are reading from the old table but duplicating on both storages.
When the background job finishes running the migration, all data is available on both tables, then new PR will move all systems to read from the hypertable.
To be safe and make steps reversible, the third PR makes the clean-up and removes the duplication and old table references. No reads or writes in the old table, only in the hypertable.