schedulercache

package
v0.0.0-...-8a55389
Warning

This package is not in the latest version of its module.

Published: Dec 10, 2022 License: Apache-2.0 Imports: 11 Imported by: 0

Documentation

Functions

func CreateNodeNameToInfoMap

func CreateNodeNameToInfoMap(pods []*v1.Pod, nodes []*v1.Node) map[string]*NodeInfo

CreateNodeNameToInfoMap obtains a list of pods and pivots that list into a map where the keys are node names and the values are the aggregated information for that node.
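The pivot described above can be pictured with a minimal, self-contained sketch. The `Pod` and `NodeInfo` types and the `pivotByNode` helper below are hypothetical stand-ins, not this package's API; the sketch also assumes that nodes without any pods still receive an empty entry.

```go
package main

import "fmt"

// Pod and NodeInfo are simplified, hypothetical stand-ins for *v1.Pod and
// this package's NodeInfo. They carry only what the sketch needs.
type Pod struct {
	Name     string
	NodeName string
	MilliCPU int64
}

type NodeInfo struct {
	Pods              []*Pod
	RequestedMilliCPU int64
}

// pivotByNode groups pods by assigned node name and sums their CPU requests.
// Assumption: every name in nodeNames gets an entry, even with no pods.
func pivotByNode(pods []*Pod, nodeNames []string) map[string]*NodeInfo {
	infos := make(map[string]*NodeInfo)
	for _, p := range pods {
		info, ok := infos[p.NodeName]
		if !ok {
			info = &NodeInfo{}
			infos[p.NodeName] = info
		}
		info.Pods = append(info.Pods, p)
		info.RequestedMilliCPU += p.MilliCPU
	}
	for _, name := range nodeNames {
		if _, ok := infos[name]; !ok {
			infos[name] = &NodeInfo{}
		}
	}
	return infos
}

func main() {
	pods := []*Pod{
		{Name: "a", NodeName: "node-1", MilliCPU: 100},
		{Name: "b", NodeName: "node-1", MilliCPU: 250},
		{Name: "c", NodeName: "node-2", MilliCPU: 500},
	}
	names := []string{"node-1", "node-2", "node-3"}
	infos := pivotByNode(pods, names)
	for _, name := range names {
		fmt.Println(name, len(infos[name].Pods), infos[name].RequestedMilliCPU)
	}
}
```

Keying the result by node name lets a scheduler look up the aggregate for a candidate node in constant time instead of rescanning the pod list.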

Types

type Cache

type Cache interface {
	// AssumePod assumes a pod scheduled and aggregates the pod's information into its node.
	// The implementation also decides the policy for expiring an assumed pod before it is confirmed (by receiving an Add event).
	// After expiration, its information would be subtracted.
	AssumePod(pod *v1.Pod) error

	// FinishBinding signals that the cache entry for an assumed pod can be expired.
	FinishBinding(pod *v1.Pod) error

	// ForgetPod removes an assumed pod from cache.
	ForgetPod(pod *v1.Pod) error

	// AddPod either confirms a pod if it's assumed, or adds it back if it's expired.
	// If added back, the pod's information would be added again.
	AddPod(pod *v1.Pod) error

	// UpdatePod removes oldPod's information and adds newPod's information.
	UpdatePod(oldPod, newPod *v1.Pod) error

	// RemovePod removes a pod. The pod's information would be subtracted from assigned node.
	RemovePod(pod *v1.Pod) error

	// AddNode adds overall information about node.
	AddNode(node *v1.Node) error

	// UpdateNode updates overall information about node.
	UpdateNode(oldNode, newNode *v1.Node) error

	// RemoveNode removes overall information about node.
	RemoveNode(node *v1.Node) error

	// UpdateNodeNameToInfoMap updates the passed infoMap to the current contents of Cache.
	// The node info contains aggregated information of pods scheduled (including assumed to be)
	// on this node.
	UpdateNodeNameToInfoMap(infoMap map[string]*NodeInfo) error

	// List lists all cached pods (including assumed ones).
	List(labels.Selector) ([]*v1.Pod, error)
}

Cache collects pods' information and provides node-level aggregated information. It is intended to let the generic scheduler do efficient lookups. Cache's operations are pod-centric: it does incremental updates based on pod events. Pod events are sent over the network, and we do not have guaranteed delivery of all events. We use a Reflector to list and watch from the remote source; the Reflector might be slow and do a relist, which would lead to missed events.

State Machine of a pod's events in scheduler's cache:

   +-------------------------------------------+  +----+
   |                            Add            |  |    |
   |                                           |  |    | Update
   +      Assume                Add            v  v    |
Initial +--------> Assumed +------------+---> Added <--+
   ^                +   +               |       +
   |                |   |               |       |
   |                |   |           Add |       | Remove
   |                |   |               |       |
   |                |   |               +       |
   +----------------+   +-----------> Expired   +----> Deleted
         Forget             Expire

Note that an assumed pod can expire: if we have not received an Add event for it for a while, something may have gone wrong, and we should not keep the pod in the cache any longer.

Note that "Initial", "Expired", and "Deleted" pods do not actually exist in cache. Based on existing use cases, we are making the following assumptions:

  • No pod would be assumed twice.
  • A pod could be added without going through the scheduler. In this case, we will see an Add event but no Assume event.
  • If a pod wasn't added, it wouldn't be removed or updated.
  • Both "Expired" and "Deleted" are valid end states. In case of problems such as a network issue, a pod might have changed its state (e.g. added and deleted) without a notification being delivered to the cache.
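The state machine and the assumptions above can be written down as a transition function. This is a minimal sketch for illustration only: the `State`, `Event`, and `next` names are hypothetical and do not exist in this package, which tracks pod state internally.

```go
package main

import (
	"errors"
	"fmt"
)

// State and Event are hypothetical names that mirror the diagram labels.
type State int

const (
	Initial State = iota
	Assumed
	Added
	Expired
	Deleted
)

type Event int

const (
	Assume Event = iota
	Add
	Update
	Remove
	Forget
	Expire
)

var errInvalid = errors.New("invalid transition")

// next returns the state reached from s on event e, following the diagram.
// Any pair not drawn in the diagram is rejected, e.g. assuming a pod twice
// or removing a pod that was never added.
func next(s State, e Event) (State, error) {
	switch {
	case s == Initial && e == Assume:
		return Assumed, nil
	case s == Initial && e == Add: // pod added without going through the scheduler
		return Added, nil
	case s == Assumed && e == Add: // assumed pod confirmed
		return Added, nil
	case s == Assumed && e == Forget:
		return Initial, nil
	case s == Assumed && e == Expire:
		return Expired, nil
	case s == Expired && e == Add: // expired pod added back
		return Added, nil
	case s == Added && e == Update:
		return Added, nil
	case s == Added && e == Remove:
		return Deleted, nil
	}
	return s, errInvalid
}

func main() {
	s := Initial
	for _, e := range []Event{Assume, Expire, Add, Update, Remove} {
		n, err := next(s, e)
		if err != nil {
			fmt.Println("rejected:", err)
			return
		}
		s = n
	}
	fmt.Println("reached Deleted:", s == Deleted)
}
```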

func New

func New(ttl time.Duration, stop <-chan struct{}) Cache

New returns a Cache implementation. It automatically starts a goroutine that manages the expiration of assumed pods. "ttl" is how long an assumed pod is kept before it expires. "stop" is the channel that, when closed, stops the background goroutine.

type NodeInfo

type NodeInfo struct {
	// contains filtered or unexported fields
}

NodeInfo is node level aggregated information.

func NewNodeInfo

func NewNodeInfo(pods ...*v1.Pod) *NodeInfo

NewNodeInfo returns a ready-to-use empty NodeInfo object. If any pods are passed as arguments, their information is aggregated into the returned object.

func (*NodeInfo) AllocatableResource

func (n *NodeInfo) AllocatableResource() Resource

AllocatableResource returns allocatable resources on a given node.

func (*NodeInfo) AllowedPodNumber

func (n *NodeInfo) AllowedPodNumber() int

func (*NodeInfo) Clone

func (n *NodeInfo) Clone() *NodeInfo

func (*NodeInfo) DiskPressureCondition

func (n *NodeInfo) DiskPressureCondition() v1.ConditionStatus

func (*NodeInfo) MemoryPressureCondition

func (n *NodeInfo) MemoryPressureCondition() v1.ConditionStatus

func (*NodeInfo) Node

func (n *NodeInfo) Node() *v1.Node

Node returns overall information about this node.

func (*NodeInfo) NonZeroRequest

func (n *NodeInfo) NonZeroRequest() Resource

NonZeroRequest returns aggregated nonzero resource request of pods on this node.

func (*NodeInfo) Pods

func (n *NodeInfo) Pods() []*v1.Pod

Pods returns all pods scheduled (including those assumed to be scheduled) on this node.

func (*NodeInfo) PodsWithAffinity

func (n *NodeInfo) PodsWithAffinity() []*v1.Pod

PodsWithAffinity returns all pods with (anti)affinity constraints on this node.

func (*NodeInfo) RemoveNode

func (n *NodeInfo) RemoveNode(node *v1.Node) error

RemoveNode removes the overall information about the node.

func (*NodeInfo) RequestedResource

func (n *NodeInfo) RequestedResource() Resource

RequestedResource returns aggregated resource request of pods on this node.

func (*NodeInfo) SetNode

func (n *NodeInfo) SetNode(node *v1.Node) error

SetNode sets the overall node information.

func (*NodeInfo) String

func (n *NodeInfo) String() string

String returns a human-readable representation of this NodeInfo.

func (*NodeInfo) Taints

func (n *NodeInfo) Taints() ([]v1.Taint, error)

func (*NodeInfo) UsedPorts

func (n *NodeInfo) UsedPorts() map[int]bool

type Resource

type Resource struct {
	MilliCPU       int64
	Memory         int64
	NvidiaGPU      int64
	StorageScratch int64
	StorageOverlay int64
	// We store allowedPodNumber (which is Node.Status.Allocatable.Pods().Value())
	// explicitly as int, to avoid conversions and improve performance.
	AllowedPodNumber  int
	ExtendedResources map[v1.ResourceName]int64
}

Resource is a collection of compute resources.

func NewResource

func NewResource(rl v1.ResourceList) *Resource

NewResource creates a Resource from a ResourceList.

func (*Resource) Add

func (r *Resource) Add(rl v1.ResourceList)

Add adds ResourceList into Resource.

func (*Resource) AddExtended

func (r *Resource) AddExtended(name v1.ResourceName, quantity int64)

func (*Resource) Clone

func (r *Resource) Clone() *Resource

func (*Resource) ResourceList

func (r *Resource) ResourceList() v1.ResourceList

func (*Resource) SetExtended

func (r *Resource) SetExtended(name v1.ResourceName, quantity int64)
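How Add, AddExtended, and SetExtended might fit together can be sketched with simplified types. Everything below is a stand-in: `ResourceList` is reduced to a map of plain integers, the `Resource` struct keeps only three of the fields listed above, and the routing of the "cpu" and "memory" names to dedicated fields is an assumption of the sketch rather than documented behavior.

```go
package main

import "fmt"

// ResourceList is a hypothetical stand-in for v1.ResourceList: resource
// name to quantity, with CPU already expressed in millicores.
type ResourceList map[string]int64

// Resource is a reduced stand-in for the struct documented above.
type Resource struct {
	MilliCPU          int64
	Memory            int64
	ExtendedResources map[string]int64
}

// Add accumulates rl into r. Assumption: "cpu" and "memory" map to the
// dedicated fields and every other name is treated as an extended resource.
func (r *Resource) Add(rl ResourceList) {
	if r == nil {
		return
	}
	for name, quantity := range rl {
		switch name {
		case "cpu":
			r.MilliCPU += quantity
		case "memory":
			r.Memory += quantity
		default:
			r.AddExtended(name, quantity)
		}
	}
}

// AddExtended adds quantity to the current value of the named resource.
// Reading a missing key (or a nil map) yields zero in Go.
func (r *Resource) AddExtended(name string, quantity int64) {
	r.SetExtended(name, r.ExtendedResources[name]+quantity)
}

// SetExtended overwrites the named resource, allocating the map lazily.
func (r *Resource) SetExtended(name string, quantity int64) {
	if r.ExtendedResources == nil {
		r.ExtendedResources = make(map[string]int64)
	}
	r.ExtendedResources[name] = quantity
}

func main() {
	r := &Resource{}
	r.Add(ResourceList{"cpu": 500, "memory": 1 << 30, "example.com/widget": 2})
	r.Add(ResourceList{"cpu": 250, "example.com/widget": 1})
	fmt.Println(r.MilliCPU, r.Memory, r.ExtendedResources["example.com/widget"])
}
```

Allocating the extended-resource map lazily keeps the zero value of `Resource` usable without a constructor.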
