Allocation for Partitioned Ganeti
- Created: 2015-Jan-22
- Status: Implemented
- Ganeti-Version: 2.15.0
Current state and shortcomings
The introduction of Partitioned Ganeti made it possible to dedicate resources, in particular storage, exclusively to an instance. The advantage is that such instances have guaranteed latency that is not affected by other instances. Typically, those instances are created once and never moved. Also, large chunks of a node (a full, half, or quarter node) are typically handed out to individual partitioned instances.
Ganeti’s allocation strategy is to keep the cluster as balanced as possible. In particular, as long as empty nodes are available, new instances, regardless of their size, will be placed there. Therefore, if a few small instances are placed on the cluster first, each of them can end up on a different node, and it will no longer be possible to place a big instance on the cluster even though the total usage of the cluster is low.
Proposed changes
We propose to change the allocation strategy of hail for node groups that have the exclusive_storage flag set, as detailed below; nothing will be changed for non-exclusive node groups. The new strategy will try to keep the cluster as available for new instances as possible.
Dedicated Allocation Metric
The instance policy is a set of intervals within which the resources of an instance must lie. Typical policies for dedicated clusters use disjoint intervals that are ordered the same way in every dimension, so an interval with larger disk bounds also has larger bounds for every other resource. In this case, the order of the intervals is obvious. To make the order well-defined in every case, we specify that the intervals are sorted by the lower bound of the disk size. This is motivated by the fact that disk is the most critical resource in Partitioned Ganeti.
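The ordering rule can be illustrated with a minimal Python sketch. The PolicyInterval type, its field names, and all sizes are hypothetical and chosen only for illustration; they do not reflect how hail represents instance policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyInterval:
    """Hypothetical model of one instance-policy interval (sizes in MiB)."""
    name: str
    min_disk: int
    max_disk: int
    min_mem: int
    max_mem: int


def order_intervals(intervals):
    """Sort intervals by the lower bound of the disk size, largest first."""
    return sorted(intervals, key=lambda iv: iv.min_disk, reverse=True)


# Illustrative values only; they are not taken from any real cluster.
policy = [
    PolicyInterval("quarter", 200_000, 250_000, 8_000, 16_000),
    PolicyInterval("full", 900_000, 1_000_000, 48_000, 64_000),
    PolicyInterval("half", 400_000, 500_000, 20_000, 32_000),
]

ordered_names = [iv.name for iv in order_intervals(policy)]
print(ordered_names)  # ['full', 'half', 'quarter']
```

Only min_disk takes part in the sort key, so the order stays well-defined even if the memory bounds of two intervals were not ordered the same way as their disk bounds.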
For a node, the allocation vector lists, for each instance policy interval in decreasing order, the number of instances minimally compliant with that interval that can still be placed on that node. For the drbd template, it is assumed that all newly placed instances have new secondaries.
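As a rough sketch of this definition, the following Python snippet computes an allocation vector under a strong simplifying assumption: only one resource (disk) is considered, and it is measured in quarters of a node. The function name and the unit are illustrative choices, and drbd secondaries are not modelled.

```python
def allocation_vector(free_disk, min_sizes):
    """Count, per interval (largest first), how many minimally compliant
    instances still fit into the free disk of a node."""
    return tuple(free_disk // size for size in sorted(min_sizes, reverse=True))


# Assumed unit: quarters of a node, so full = 4, half = 2, quarter = 1.
SIZES = [4, 2, 1]

print(allocation_vector(4, SIZES))  # empty node: (1, 2, 4)
print(allocation_vector(3, SIZES))  # quarter-filled node: (0, 1, 3)
print(allocation_vector(1, SIZES))  # three-quarter-filled node: (0, 0, 1)
```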
The lost-allocations vector for an instance on a node is the difference of the allocation vectors for that node before and after placing that instance on that node. Lost-allocations vectors are ordered lexicographically, i.e., a loss of an allocation of a larger instance size dominates losses of allocations of smaller instance sizes.
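The difference and its ordering can be sketched as follows, again assuming a single disk dimension measured in quarters of a node; the helper names are hypothetical. Python tuples already compare lexicographically, which matches the ordering described above.

```python
def allocation_vector(free_disk, min_sizes):
    """Allocations still possible per interval, largest interval first."""
    return tuple(free_disk // size for size in sorted(min_sizes, reverse=True))


def lost_allocations(free_disk, instance_size, min_sizes):
    """Component-wise difference of the allocation vectors before and after
    placing an instance of the given size."""
    before = allocation_vector(free_disk, min_sizes)
    after = allocation_vector(free_disk - instance_size, min_sizes)
    return tuple(b - a for b, a in zip(before, after))


SIZES = [4, 2, 1]  # assumed unit: quarters of a node

on_empty = lost_allocations(4, 1, SIZES)  # quarter instance on an empty node
on_half = lost_allocations(2, 1, SIZES)   # quarter instance on a half-filled node

print(on_empty, on_half)   # (1, 1, 1) (0, 1, 1)
print(on_half < on_empty)  # True: losing a full-node slot dominates
```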
If allocating in a node group with exclusive_storage set to true, hail will try to minimise the pair of the lost-allocations vector and the remaining disk space on the node after the placement, ordered lexicographically.
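A minimal sketch of this selection rule, assuming the lost-allocations vector and the remaining disk space have already been computed for each candidate node. The node names, the tuple layout, and the pick_node helper are hypothetical and do not mirror hail's internals.

```python
def pick_node(candidates):
    """Return the name of the candidate that minimises the pair
    (lost-allocations vector, remaining disk), compared lexicographically.

    Each candidate is a tuple (name, lost_vector, remaining_disk).
    """
    best = min(candidates, key=lambda c: (c[1], c[2]))
    return best[0]


# Remaining disk is given in quarters of a node (an assumed unit).
candidates = [
    ("node-a", (0, 0, 1), 2),
    ("node-b", (0, 0, 1), 0),
    ("node-c", (0, 1, 1), 1),
]

print(pick_node(candidates))  # node-b
```

Here node-a and node-b lose the same allocations, so the remaining disk space acts as the tie-breaker and the node that ends up completely full is chosen.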
Example
Consider the already mentioned scenario where only full, half, and quarter nodes are given to instances. Here, for the placement of a quarter-node–sized instance we would prefer a three-quarter-filled node (lost allocations: 0, 0, 1 and no leftovers) over a quarter-filled node (lost allocations: 0, 0, 1 and half a node left over) over a half-filled node (lost allocations: 0, 1, 1) over an empty node (lost allocations: 1, 1, 1). A half-node–sized instance, however, would prefer a half-filled node (lost allocations: 0, 1, 2 and no leftovers) over a quarter-filled node (lost allocations: 0, 1, 2 and a quarter node left over) over an empty node (lost allocations: 1, 1, 2).
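The preference orders in this example can be reproduced with a short Python sketch. It assumes a single disk dimension measured in quarters of a node; the function names and the node labels are illustrative.

```python
def allocation_vector(free_disk, min_sizes):
    """Allocations still possible per interval, largest interval first."""
    return tuple(free_disk // size for size in sorted(min_sizes, reverse=True))


def lost_allocations(free_disk, instance_size, min_sizes):
    """Allocation vector before the placement minus the one after it."""
    before = allocation_vector(free_disk, min_sizes)
    after = allocation_vector(free_disk - instance_size, min_sizes)
    return tuple(b - a for b, a in zip(before, after))


def rank_nodes(free_by_node, instance_size, min_sizes):
    """Order feasible nodes from most to least preferred."""
    scored = []
    for name, free in free_by_node.items():
        if free < instance_size:
            continue  # the instance does not fit on this node
        lost = lost_allocations(free, instance_size, min_sizes)
        scored.append(((lost, free - instance_size), name))
    return [name for _, name in sorted(scored)]


SIZES = [4, 2, 1]  # full, half, quarter, in quarters of a node
NODES = {  # free disk per node, in quarters of a node
    "empty": 4,
    "quarter-filled": 3,
    "half-filled": 2,
    "three-quarter-filled": 1,
}

print(rank_nodes(NODES, 1, SIZES))
# ['three-quarter-filled', 'quarter-filled', 'half-filled', 'empty']
print(rank_nodes(NODES, 2, SIZES))
# ['half-filled', 'quarter-filled', 'empty']
```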
Note that the presence of additional policy intervals affects the preferences of instances of other sizes as well. This is by design, as additional available instance sizes make additional remaining node sizes attractive. If, in the given example, we also allowed three-quarter-node–sized instances, it would now be better for a quarter-node–sized instance to be placed on a half-filled node (lost allocations: 0, 0, 1, 1) than on a quarter-filled node (lost allocations: 0, 1, 0, 1).
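The change in preference can be checked with a small sketch that evaluates the same two nodes under both policies. As before, it assumes a single disk dimension measured in quarters of a node, and the helper names are hypothetical.

```python
def allocation_vector(free_disk, min_sizes):
    """Allocations still possible per interval, largest interval first."""
    return tuple(free_disk // size for size in sorted(min_sizes, reverse=True))


def placement_key(free_disk, instance_size, min_sizes):
    """Pair (lost-allocations vector, remaining disk) for one placement."""
    before = allocation_vector(free_disk, min_sizes)
    after = allocation_vector(free_disk - instance_size, min_sizes)
    lost = tuple(b - a for b, a in zip(before, after))
    return (lost, free_disk - instance_size)


# Place a quarter-node-sized instance (size 1) under two policies.
for sizes in ([4, 2, 1], [4, 3, 2, 1]):
    quarter_filled = placement_key(3, 1, sizes)  # node with 3 quarters free
    half_filled = placement_key(2, 1, sizes)     # node with 2 quarters free
    winner = "quarter-filled" if quarter_filled < half_filled else "half-filled"
    print(sizes, quarter_filled[0], half_filled[0], winner)
# [4, 2, 1] (0, 0, 1) (0, 1, 1) quarter-filled
# [4, 3, 2, 1] (0, 1, 0, 1) (0, 0, 1, 1) half-filled
```

With the extra interval, filling the quarter-filled node destroys its only three-quarter slot, which outweighs the half-node slot lost on the half-filled node.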