Hi TimescaleDB team,
I’ve encountered a problem where joining two tables with the same timestamp column produce different query plan and execution time. I expect that it executes as fast as the first query, but somehow the second query is very slow.
Here is the tables chain.transactions
, chain.receipts
and chain.prices
:
create table chain.transactions (
hash text,
block_number bigint,
block_timestamp timestamptz
-- other columns are removed for simplicity
);
select chain.create_hypertable(
relation => 'chain.transactions',
time_column_name => 'block_timestamp',
chunk_time_interval => INTERVAL '1 day'
);
create unique index transactions_hash_index
on chain.transactions(hash, block_timestamp desc);
create index transactions_block_number_index
on chain.transactions (block_number);
create table chain.receipts (
-- ID
hash text,
block_timestamp timestamptz
-- other columns are removed for simplicity
);
select chain.create_hypertable(
relation => 'chain.receipts',
time_column_name => 'block_timestamp',
chunk_time_interval => INTERVAL '1 day'
);
create unique index receipts_hash_unique_index
on chain.receipts(hash, block_timestamp desc);
create table chain.prices (
token_hash text,
block_timestamp timestamptz,
price double precision
);
create unique index prices_unique_index
on chain.prices(
token_hash,
block_timestamp desc
);
Joining chain.transactions
and chain.receipts
is very fast as expected
explain analyze verbose
select
transaction.*,
receipt.*
from chain.transactions transaction
left join chain.receipts receipt on receipt.hash=transaction.hash and receipt.block_timestamp=transaction.block_timestamp
where receipt.status is not null
order by transaction.block_number desc, transaction.transaction_index asc
limit 100
result:
Planning Time: 2.211 ms
Execution Time: 40.437 ms
Here is the complete query plan: transactions join with receipts | explain.dalibo.com
Meanwhile when I’m trying to join chain.transactions
and chain.prices
is super slow:
set statement_timeout to '1h';
explain analyze verbose
select
transaction.*,
price.*
from chain.transactions transaction
left join chain.prices price on price.block_timestamp=transaction.block_timestamp and price.token_hash='ETH'
order by transaction.block_number desc, transaction.transaction_index asc
limit 10 -- Set limit to 10 because 100 just too slow
result:
Planning Time: 4.352 ms
Execution Time: 179720.454 ms
Here is the complete query plan: transactions join with prices | explain.dalibo.com
I have fixed the second query with CTE and it’s very fast as expected:
-- explain analyze verbose
with transactions as (
select *
from chain.transactions
order by block_number desc, transaction_index asc
limit 100
)
select
transaction.*,
price.*
from transactions transaction
left join chain.prices price on
price.block_timestamp=transaction.block_timestamp
and price.token_hash = 'ETH'
The thing that bother me is that before CTE, two queries join on the same column with the similar index and structure, but still somehow the second query is very slow.
I’m intrigued to know why the second query is very slow? The prices
and receipts
both have the same unique index structure.
Thank you so much!
Edit:
The table size receipts
is greater than prices
, receipts and transactions have more than 20M rows so I don’t think size is the problem here.