Redundancy for the plain disk template
This document describes how N+1 redundancy is achieved for instances using the plain disk template.
Current state and shortcomings
Ganeti has long considered N+1 redundancy for DRBD, making sure that
enough memory is reserved on the secondary nodes to host the instances,
should one node fail. Recently, htools have been extended to also take
N+1 redundancy for shared storage into account.
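For illustration, the following is a much simplified sketch of the kind of per-node memory check behind DRBD N+1 redundancy. The data types and the conservative rule of summing over all instances whose secondary is on the node are assumptions made for exposition, not the actual htools model:

  -- Minimal node and instance views; only the fields needed for the
  -- memory check are modelled here.
  data Node = Node
    { nodeName :: String
    , freeMem  :: Int          -- unused memory on the node, in MiB
    } deriving Show

  data Instance = Instance
    { instMem       :: Int     -- memory the instance needs, in MiB
    , secondaryNode :: String  -- name of the node holding its DRBD secondary
    } deriving Show

  -- A node passes the check if all instances that could fail over to it
  -- still fit into its currently free memory.
  n1MemSafe :: [Instance] -> Node -> Bool
  n1MemSafe insts node =
    sum [ instMem i | i <- insts, secondaryNode i == nodeName node ]
      <= freeMem node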
For plain instances, there is no direct notion of redundancy: if the node the instance is running on dies, the instance is lost. However, if the instance can be reinstalled (e.g., because it provides a stateless service), it does make sense to ask whether the remaining nodes have enough free capacity for the instances to be recreated. This form of capacity planning is currently not addressed by Ganeti.
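To make this notion concrete, the following hedged sketch treats the cluster as N+1 redundant for plain instances if, for every node, the plain instances running on it could be recreated on the remaining nodes. The data types and the greedy, biggest-disk-first placement are illustrative assumptions and deliberately ignore details such as CPU, spindles, and node groups:

  import Control.Monad (foldM)
  import Data.List (sortBy)
  import Data.Maybe (isJust)
  import Data.Ord (Down (..), comparing)

  data Node = Node
    { freeMem  :: Int   -- free memory on the node, in MiB
    , freeDisk :: Int   -- free local (plain) disk on the node, in MiB
    } deriving Show

  data Instance = Instance
    { instMem  :: Int
    , instDisk :: Int
    } deriving Show

  -- Recreate one instance on the first remaining node with enough
  -- capacity, returning the updated node list on success.
  place :: Instance -> [Node] -> Maybe [Node]
  place _ [] = Nothing
  place i (n:ns)
    | freeMem n >= instMem i && freeDisk n >= instDisk i =
        Just (n { freeMem  = freeMem n  - instMem i
                , freeDisk = freeDisk n - instDisk i } : ns)
    | otherwise = (n :) <$> place i ns

  -- The cluster is N+1 redundant for plain instances if, for every node,
  -- the instances running on it could be recreated on the other nodes.
  plainN1Redundant :: [(Node, [Instance])] -> Bool
  plainN1Redundant cluster = all survivesLossOf [0 .. length cluster - 1]
    where
      survivesLossOf k =
        let lost    = snd (cluster !! k)
            rest    = [ n | (j, (n, _)) <- zip [0 ..] cluster, j /= k ]
            ordered = sortBy (comparing (Down . instDisk)) lost
        in isJust (foldM (flip place) rest ordered)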
Proposed changes
The basic considerations follow those of N+1 redundancy for shared storage. Also, the changes to the tools follow the same pattern.
Modifications to existing tools
The changes to the existing tools are essentially the same as
for N+1 redundancy for shared storage, with the above definition of
N+1 redundancy substituted for the one used for shared storage.
In particular, gnt-cluster verify
will not be changed, and hbal
will use N+1 redundancy as a final filter step to disallow moves
that lead from a redundant to a non-redundant situation.
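As a minimal sketch of that filter step, assuming hypothetical helpers isRedundant (the N+1 check) and applyMove (simulating a candidate move) supplied by the caller, neither of which is an actual htools identifier:

  -- A move is rejected only if it would take the cluster from a
  -- redundant to a non-redundant state; on an already non-redundant
  -- cluster all moves remain allowed.
  moveAllowed :: (cluster -> Bool)            -- the N+1 redundancy check
              -> (cluster -> move -> cluster) -- simulate applying the move
              -> cluster -> move -> Bool
  moveAllowed isRedundant applyMove cluster move =
    not (isRedundant cluster) || isRedundant (applyMove cluster move)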