Excited to Join the Timescale Community. Hello and a Few Questions!

Jean_Baro · November 24, 2024, 2:06pm

Hello Timescale Community,

I’m thrilled to make my first post here and become part of this amazing community. I’ve been following Timescale’s progress since its early days—back when I was an even bigger PostgreSQL fan and eagerly absorbed everything related to it. As a solutions architect who is passionate about all things database-related, I’m excited to now have the opportunity to work with Timescale (we’re currently benchmarking it as our main time-series option).

I have a few questions and assumptions I’d like to confirm, and I thought this would be a great way to introduce myself:

Using a Single Database for Time-Series and Relational Data: Is it possible with Timescale Cloud to use a single database for both time-series and relational data? For example, can we store relational data in standard PostgreSQL tables and easily join them with time-series hypertables?
Support for PostgreSQL Extensions: Are PostgreSQL extensions supported in Timescale Cloud? Specifically, can we use extensions that are commonly available in managed PostgreSQL services like AWS RDS?
Ensuring Unique Event Capture (Deduplication): What’s the best approach to ensure we capture only the first unique event (e.g., a signal from a device) and avoid duplicates?
Bitemporal Data and Partitioning Strategy (Timestamps in UTC): I plan to store two timestamps for each event—both in UTC: one for when the event happened on the device and another for when it was inserted into Timescale (a bitemporal approach). How would this impact the partitioning strategy?
Setting Up a Secondary Region as Standby: What’s the recommended strategy to set up a secondary region (AWS/Timescale Cloud) as a standby instance, in case the primary region experiences an outage? Ideally, I’d like to minimize costs by paying only for the storage in the secondary region and automatically, within seconds, spinning up the compute resources on demand (triggered by load), similar to a serverless approach, so I don’t need to incur compute costs while the standby remains idle.
Design Considerations for Multi-Region and Bitemporal Data: Are there any additional considerations for designing a time-series database with two regions (primary and secondary) while also supporting bitemporal data?
Multi-Tenancy Modeling, Operations, and Backup Strategies: What is the recommended approach for designing and operating a multi-tenancy model in TimescaleDB? Specifically, are there best practices for managing data segregation, resource allocation, and performance optimization, as well as efficient, tenant-specific backup and recovery strategies that are cost-effective and scalable in a Timescale Cloud environment?

I’m so grateful that this community exists and look forward to learning from all of you. I’m sure I’ll have many more questions in the coming months as we dive deeper into Timescale.

Thank you for your time!

jonatasdp · November 25, 2024, 3:03pm

Welcome to the Timescale community! Let’s go through your questions:

Single Database Usage: Yes, Timescale Cloud fully supports using a single database for both time-series and relational data. You can create standard PostgreSQL tables alongside hypertables and join them efficiently. This is actually one of TimescaleDB’s key advantages over pure time-series databases.
PostgreSQL Extensions: Timescale Cloud supports many common PostgreSQL extensions, although some AWS RDS-specific extensions might not be available. Check the list of available extensions in Timescale’s documentation and let us know if you need a specific one. You can also request an extension directly from the Timescale Cloud Console.

Deduplication Strategy: For unique event capture, consider:
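- Add a unique constraint on the relevant columns, for example (device_id, event_timestamp). On a hypertable, a unique constraint must include the partitioning column, so this works when event_timestamp is the partitioning column. Also check this blog for more performance tips.
- Use an ON CONFLICT DO NOTHING clause in your INSERT statements so that later duplicates are skipped and the first event is kept.
- For batch processing, deduplicate the batch itself first, for example with a CTE that uses DISTINCT, before insertion.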
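As a rough illustration of the first two points, here is a minimal, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere (it assumes a bundled SQLite of 3.24 or newer, which is what added this ON CONFLICT syntax). The table name device_events, the payload column, and the helper insert_events are hypothetical, and on Timescale you would run the equivalent statements against a hypertable instead.

```python
import sqlite3

# SQLite stands in for PostgreSQL here so the sketch is runnable anywhere.
# Table, column, and function names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE device_events (
        device_id       TEXT NOT NULL,
        event_timestamp TEXT NOT NULL,  -- UTC, ISO 8601
        payload         TEXT,
        UNIQUE (device_id, event_timestamp)
    )
    """
)


def insert_events(conn, rows):
    """Insert rows, skipping any (device_id, event_timestamp) already stored."""
    conn.executemany(
        """
        INSERT INTO device_events (device_id, event_timestamp, payload)
        VALUES (?, ?, ?)
        ON CONFLICT (device_id, event_timestamp) DO NOTHING
        """,
        rows,
    )
    conn.commit()


batch = [
    ("dev-1", "2024-11-24T14:06:00Z", "first"),
    ("dev-1", "2024-11-24T14:06:00Z", "duplicate"),  # same key, skipped
    ("dev-2", "2024-11-24T14:06:00Z", "first"),
]
insert_events(conn, batch)
insert_events(conn, batch)  # replaying the whole batch adds nothing

stored = conn.execute(
    "SELECT device_id, event_timestamp, payload "
    "FROM device_events ORDER BY device_id"
).fetchall()
```

Because the constraint rejects the second row for dev-1, only the first payload for each (device_id, event_timestamp) pair is kept, and retrying a batch after a network failure is safe.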
Bitemporal Data Structure: Your approach with dual UTC timestamps is solid, and both columns will compress well. For partitioning:
- Use the event timestamp for primary chunk partitioning
- Create an index on the insertion timestamp
- Consider smaller chunk intervals if you’ll frequently query by insertion time. Check how to size the chunk properly here.
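To see why the event timestamp and the insertion timestamp behave differently under time partitioning, here is a small Python sketch. It is a deliberately simplified model of fixed-width time buckets, not TimescaleDB's actual chunk-boundary logic, and the function name chunk_start and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts, chunk_interval):
    """Return the start of the fixed-width time bucket that contains ts.

    Simplified model: buckets are aligned to the Unix epoch.
    """
    buckets = (ts - EPOCH) // chunk_interval  # whole intervals since epoch
    return EPOCH + buckets * chunk_interval


one_day = timedelta(days=1)

# A device event that happened late on Nov 22 but was only uploaded on Nov 24.
event_time = datetime(2024, 11, 22, 23, 50, tzinfo=timezone.utc)
inserted_at = datetime(2024, 11, 24, 14, 6, tzinfo=timezone.utc)

# Partitioning by event time places the row in the Nov 22 bucket,
# even though it was written two days later.
event_bucket = chunk_start(event_time, one_day)
insert_bucket = chunk_start(inserted_at, one_day)
```

In this model a late-arriving row is written into an older bucket, so rows inserted during the same hour can be spread across several chunks. That is the reason a query filtered only by insertion time benefits from its own index.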
Secondary Regions and Multi-Region Design: I hope this doc helps: Timescale Documentation | High availability
Multi-Tenancy Implementation:
- Consider separate hypertables per tenant for large deployments. You can also use add_dimension to partition chunks by tenant within a single hypertable.

Best practices:
- Monitor chunk size and compression ratios
- Implement appropriate retention policies
- Use prepared statements for better performance
- Consider materialized views for common queries
- Consider downsampling methods for faster queries

@Jean_Baro I hope you enjoy our community!

Happy coding!