How to choose the partition number in the add_dimension function?
add_dimension('column_name', by_hash('column_name', partition_number))
What is the rule of thumb for choosing partition_number?
Hi @pgloader, the optimal number depends on the cardinality of the column you partition on.
Say you add a dimension on client_id. If you want roughly one partition per client, the number of partitions should be larger than the number of clients.
If you want to cluster several clients into each partition, use a smaller number.
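To get a feel for the trade-off, here is a rough Python sketch that buckets 1,000 made-up client IDs into different partition counts. It is only an illustration: `partition_for` and `clients_per_partition` are hypothetical helpers, and the SHA-256 based hash is a stand-in, not the hash function TimescaleDB uses internally.

```python
import hashlib
from collections import Counter


def partition_for(client_id: int, number_partitions: int) -> int:
    """Map a client ID to a partition index with a stable stand-in hash.

    Assumption: SHA-256 is used only to get a deterministic, well-spread
    hash for this demo. It is not TimescaleDB's partitioning hash.
    """
    digest = hashlib.sha256(str(client_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % number_partitions


def clients_per_partition(client_ids, number_partitions: int) -> Counter:
    """Count how many clients land in each partition."""
    return Counter(partition_for(c, number_partitions) for c in client_ids)


if __name__ == "__main__":
    client_ids = range(1, 1001)  # 1,000 hypothetical clients
    for n in (4, 16, 2000):
        counts = clients_per_partition(client_ids, n)
        # Partitions actually used, and the largest number of clients in one
        print(f"partitions={n} used={len(counts)} max_clients={max(counts.values())}")
```

A small number groups many clients into each partition. Note that even a number larger than the client count does not guarantee exactly one client per partition, because two IDs can hash to the same bucket.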
Can you share more about your scenario and what dimensions are you planning to use?