Question
Is it possible to have a multinode setup using the Helm charts, where the access node is on one virtual machine and the data nodes are on other virtual machines?
To Reproduce
With the existing commands, all pods are created on the same node/virtual machine.
Expected behavior
On installation, the access node should be deployed on one virtual machine/node and the data nodes on other virtual machines.
I checked in with our engineering team, and here’s their advice.
The default values for the multinode recipe have a podAntiAffinity for pods of the same release that should, in theory, avoid that situation. It is set to preferred though, not required, so the scheduler can ignore it if there are not enough eligible nodes in the cluster.
preferred as a default makes sense, to be as flexible as possible. We'd recommend ensuring the cluster has enough capacity and then changing that default to requiredDuringSchedulingIgnoredDuringExecution, which turns the anti-affinity into a hard scheduling requirement.
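A minimal sketch of what that override could look like, passed through the chart's affinityTemplate value; the release label and its value here are assumptions and must match the labels the chart actually sets on its pods:

```yaml
# values-override.yaml -- a sketch, not the chart's verbatim default.
# The labelSelector must match the labels on the chart's pods (check
# with `kubectl get pods --show-labels`); "my-release" is a
# hypothetical Helm release name.
affinityTemplate: |
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - topologyKey: "kubernetes.io/hostname"   # at most one matching pod per node
      labelSelector:
        matchLabels:
          release: my-release
```

With required in place, pods that cannot be placed on distinct nodes stay Pending instead of landing on the same node, which makes capacity problems visible immediately.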
Additional thoughts: that affinityTemplate could also be used to ensure data nodes are in different AZs, if the AZ is available as a node label. Or, it could be used the other way around, to ensure they are in the same AZ to reduce paid cross-zone traffic, if cost is more important than availability.
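For example, assuming the nodes carry the standard zone label topology.kubernetes.io/zone (failure-domain.beta.kubernetes.io/zone on older clusters), a sketch of spreading data nodes across zones:

```yaml
# Spread pods of the release across availability zones (a sketch; the
# release label is a hypothetical stand-in for the chart's real labels).
affinityTemplate: |
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - topologyKey: "topology.kubernetes.io/zone"
      labelSelector:
        matchLabels:
          release: my-release
```

Switching podAntiAffinity to podAffinity with the same topologyKey inverts the behaviour and co-locates the pods in a single zone, which is the cost-saving variant mentioned above.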
I too am struggling with this. I have installed several times on a k3s cluster with 3 worker nodes, and the installation works, but each time I see all pods being created on the same node, which is clearly undesirable. I would expect each pod to be created on a different node in the cluster, as it makes no sense to me to have a distributed database that is only running on one node.
I tried the suggestion from Lorraine, changing the podAntiAffinity to requiredDuringSchedulingIgnoredDuringExecution, but it made no difference. Maybe my change is wrong, I'm not sure.
One other thing: the post-installation instructions for obtaining the passwords for multinode are incorrect. Given a db name of "dbserver", the command it gives for retrieving the postgres password does not work.
I don't think we have examples of this in our projects, but here are some suggestions from the team that you could explore:
Scheduling a pod onto a specific node is possible in Kubernetes using node labels together with taints and tolerations: the taint keeps other workloads off the node, while a nodeSelector or node affinity pins the pod to it. If you want to prevent workloads from being scheduled onto the same node for HA purposes, topology spread constraints are another option.
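A sketch combining both ideas on a plain pod spec (topology spread constraints are stable from Kubernetes 1.19); every name here, including the node, labels, and image, is a hypothetical placeholder:

```yaml
# Reserve a node for the access node, then pin the pod to it:
#   kubectl taint nodes vm-1 dedicated=access-node:NoSchedule
#   kubectl label nodes vm-1 role=access-node
apiVersion: v1
kind: Pod
metadata:
  name: access-node-example
  labels:
    app: timescaledb
spec:
  nodeSelector:
    role: access-node              # pin to the labelled node
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "access-node"
    effect: "NoSchedule"           # allow scheduling onto the tainted node
  # For HA, spread pods sharing the app label across distinct hostnames:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule   # hard rule; ScheduleAnyway = best effort
    labelSelector:
      matchLabels:
        app: timescaledb
  containers:
  - name: timescaledb
    image: timescale/timescaledb:latest-pg12   # placeholder image
```

In a Helm-managed deployment these fields would go into the chart's pod template rather than a standalone pod, but the scheduling semantics are the same.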