fdbased

package
v0.0.0-...-957f62e Latest
Warning

This package is not in the latest version of its module.

Published: Nov 10, 2023 License: Apache-2.0, MIT Imports: 12 Imported by: 0

Documentation

Overview

Package fdbased provides the implementation of data-link layer endpoints backed by boundary-preserving file descriptors (e.g., TUN devices, seqpacket/datagram sockets).

FD based endpoints can be used in the networking stack by calling New() to create a new endpoint, and then passing it as an argument to Stack.CreateNIC().

FD based endpoints can use more than one file descriptor to read incoming packets. If more than one FD is specified and the underlying FDs are AF_PACKET sockets, the endpoint enables FANOUT mode on the sockets so that the host kernel consistently hashes packets to the same socket. This ensures that packets belonging to the same TCP stream are not reordered.

Similarly, if more than one FD is specified and the underlying FDs are not AF_PACKET sockets, it is the caller's responsibility to ensure that all inbound packets are consistently 5-tuple hashed to one of the descriptors to prevent TCP reordering.
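The consistent 5-tuple hashing that the caller is responsible for can be sketched as follows. This is only an illustration: the fiveTuple type and the pickFD helper are hypothetical names, not part of fdbased or netstack, and FNV-1a is an arbitrary choice of hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// fiveTuple is a hypothetical flow key; it is not a netstack type.
type fiveTuple struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
	proto            uint8
}

// pickFD maps a flow to one of n descriptors. The same tuple always
// yields the same index, so packets of one flow stay on one descriptor.
func pickFD(t fiveTuple, n int) int {
	h := fnv.New32a()
	fmt.Fprintf(h, "%s|%s|%d|%d|%d", t.srcIP, t.dstIP, t.srcPort, t.dstPort, t.proto)
	return int(h.Sum32() % uint32(n))
}

func main() {
	flow := fiveTuple{"10.0.0.1", "10.0.0.2", 40000, 443, 6}
	fmt.Println(pickFD(flow, 4) == pickFD(flow, 4)) // prints "true"
}
```

Any deterministic function of the five fields would serve; what matters is that every packet of a given flow lands on the same descriptor.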

Since netstack does not currently compute 5-tuple hashes for outgoing packets, only the first FD is used to write outbound packets. Once 5-tuple hashes are available for all outbound packets, all underlying FDs will be used for writing.


Constants

const BatchSize = 47

BatchSize is the number of packets to write in each syscall. It is 47 because, when GvisorGSO is in use, a single 64 KiB (65,536-byte) TCP segment can be split into 46 segments of 1420 bytes plus one 216-byte segment.
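The arithmetic behind the value 47 can be checked directly, assuming the large segment is 65,536 bytes (46 × 1420 + 216). The segmentCount helper below is a hypothetical name used only for this check.

```go
package main

import "fmt"

// segmentCount returns how many segments of at most mss bytes are needed
// to carry total bytes (ceiling division).
func segmentCount(total, mss int) int {
	return (total + mss - 1) / mss
}

func main() {
	const total, mss = 65536, 1420
	full, rest := total/mss, total%mss
	fmt.Println(full, rest, segmentCount(total, mss)) // prints "46 216 47"
}
```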

const (
	// MaxMsgsPerRecv is the maximum number of packets we want to retrieve
	// in a single RecvMMsg call.
	MaxMsgsPerRecv = 8
)

Variables

var BufConfig = []int{128, 256, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768}

BufConfig defines the shape of the buffer used to read packets from the NIC.
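To get a feel for this shape, the sketch below counts how many of the listed buffers a packet of a given size would occupy. It assumes the buffers are filled in order, as a scatter read would do; how the dispatchers actually consume BufConfig is not stated here, and buffersNeeded is a hypothetical helper.

```go
package main

import "fmt"

// bufConfig mirrors the BufConfig values listed above.
var bufConfig = []int{128, 256, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768}

// buffersNeeded returns how many leading buffers are required to hold a
// packet of n bytes, or -1 if n exceeds the total capacity.
func buffersNeeded(n int) int {
	total := 0
	for i, size := range bufConfig {
		total += size
		if total >= n {
			return i + 1
		}
	}
	return -1
}

func main() {
	fmt.Println(buffersNeeded(1500), buffersNeeded(65536)) // prints "5 10"
}
```

The sizes sum to 65,664 bytes, so a 1500-byte frame touches only the first five small buffers while a 64 KiB packet needs all ten.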

Functions

func New

func New(opts *Options) (stack.LinkEndpoint, error)

New creates a new fd-based endpoint.

It makes the fd non-blocking but does not take ownership of it; the fd must remain open for the lifetime of the returned endpoint (until after the endpoint has stopped being used and Wait returns).

Types

type InjectableEndpoint

type InjectableEndpoint struct {
	// contains filtered or unexported fields
}

InjectableEndpoint is an injectable fd-based endpoint. The endpoint writes to the FD, but does not read from it. All reads come from injected packets.

func NewInjectable

func NewInjectable(fd int, mtu uint32, capabilities stack.LinkEndpointCapabilities) (*InjectableEndpoint, error)

NewInjectable creates a new fd-based InjectableEndpoint.

func (*InjectableEndpoint) ARPHardwareType

func (e *InjectableEndpoint) ARPHardwareType() header.ARPHardwareType

ARPHardwareType implements stack.LinkEndpoint.ARPHardwareType.

func (*InjectableEndpoint) AddHeader

func (e *InjectableEndpoint) AddHeader(pkt stack.PacketBufferPtr)

AddHeader implements stack.LinkEndpoint.AddHeader.

func (*InjectableEndpoint) Attach

func (e *InjectableEndpoint) Attach(dispatcher stack.NetworkDispatcher)

Attach saves the stack network-layer dispatcher for use later when packets are injected.

func (*InjectableEndpoint) Capabilities

func (e *InjectableEndpoint) Capabilities() stack.LinkEndpointCapabilities

Capabilities implements stack.LinkEndpoint.Capabilities.

func (*InjectableEndpoint) GSOMaxSize

func (e *InjectableEndpoint) GSOMaxSize() uint32

GSOMaxSize implements stack.GSOEndpoint.

func (*InjectableEndpoint) InjectInbound

func (e *InjectableEndpoint) InjectInbound(protocol tcpip.NetworkProtocolNumber, pkt stack.PacketBufferPtr)

InjectInbound injects an inbound packet. If the endpoint is not attached, the packet is not delivered.
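The attach-then-inject behaviour can be modelled in a few lines. This is a simplified sketch, not the package's implementation: the dispatcher interface, counter, and endpoint types are hypothetical stand-ins, and packets are plain byte slices rather than stack.PacketBufferPtr values.

```go
package main

import "fmt"

// dispatcher is a hypothetical stand-in for stack.NetworkDispatcher.
type dispatcher interface {
	deliver(pkt []byte)
}

// counter records how many packets it has received.
type counter struct{ n int }

func (c *counter) deliver(pkt []byte) { c.n++ }

// endpoint models only the attach/inject behaviour described above.
type endpoint struct {
	d dispatcher
}

func (e *endpoint) attach(d dispatcher) { e.d = d }
func (e *endpoint) isAttached() bool    { return e.d != nil }

// injectInbound hands pkt to the dispatcher, or drops it when none is attached.
func (e *endpoint) injectInbound(pkt []byte) {
	if !e.isAttached() {
		return
	}
	e.d.deliver(pkt)
}

func main() {
	c := &counter{}
	e := &endpoint{}
	e.injectInbound([]byte{1}) // dropped: no dispatcher attached yet
	e.attach(c)
	e.injectInbound([]byte{2}) // delivered
	fmt.Println(c.n)           // prints "1"
}
```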

func (*InjectableEndpoint) InjectOutbound

func (e *InjectableEndpoint) InjectOutbound(dest tcpip.Address, packet *buffer.View) tcpip.Error

InjectOutbound implements stack.InjectableEndpoint.InjectOutbound.

func (*InjectableEndpoint) IsAttached

func (e *InjectableEndpoint) IsAttached() bool

IsAttached implements stack.LinkEndpoint.IsAttached.

func (*InjectableEndpoint) LinkAddress

func (e *InjectableEndpoint) LinkAddress() tcpip.LinkAddress

LinkAddress returns the link address of this endpoint.

func (*InjectableEndpoint) MTU

func (e *InjectableEndpoint) MTU() uint32

MTU implements stack.LinkEndpoint.MTU. It returns the value initialized during construction.

func (*InjectableEndpoint) MaxHeaderLength

func (e *InjectableEndpoint) MaxHeaderLength() uint16

MaxHeaderLength returns the maximum size of the link-layer header.

func (*InjectableEndpoint) ParseHeader

func (e *InjectableEndpoint) ParseHeader(pkt stack.PacketBufferPtr) bool

ParseHeader implements stack.LinkEndpoint.ParseHeader.

func (*InjectableEndpoint) SupportedGSO

func (e *InjectableEndpoint) SupportedGSO() stack.SupportedGSO

SupportedGSO implements stack.GSOEndpoint.

func (*InjectableEndpoint) Wait

func (e *InjectableEndpoint) Wait()

Wait implements stack.LinkEndpoint.Wait. It waits for the endpoint to stop reading from its FD.

func (*InjectableEndpoint) WritePackets

func (e *InjectableEndpoint) WritePackets(pkts stack.PacketBufferList) (int, tcpip.Error)

WritePackets writes outbound packets to the underlying file descriptors. If one is not currently writable, the packet is dropped.

Because this is a batch API, each packet in pkts should have the following fields populated:

  • pkt.EgressRoute
  • pkt.GSOOptions
  • pkt.NetworkProtocolNumber

type Options

type Options struct {
	// FDs is a set of FDs used to read/write packets.
	FDs []int

	// MTU is the mtu to use for this endpoint.
	MTU uint32

	// EthernetHeader if true, indicates that the endpoint should read/write
	// ethernet frames instead of IP packets.
	EthernetHeader bool

	// ClosedFunc is a function to be called when an endpoint's peer (if
	// any) closes its end of the communication pipe.
	ClosedFunc func(tcpip.Error)

	// Address is the link address for this endpoint. Only used if
	// EthernetHeader is true.
	Address tcpip.LinkAddress

	// SaveRestore if true, indicates that this NIC capability set should
	// include CapabilitySaveRestore.
	SaveRestore bool

	// DisconnectOk if true, indicates that this NIC capability set should
	// include CapabilityDisconnectOk.
	DisconnectOk bool

	// GSOMaxSize is the maximum GSO packet size. It is zero if GSO is
	// disabled.
	GSOMaxSize uint32

	// GvisorGSOEnabled indicates whether Gvisor GSO is enabled or not.
	GvisorGSOEnabled bool

	// PacketDispatchMode specifies the type of inbound dispatcher to be
	// used for this endpoint.
	PacketDispatchMode PacketDispatchMode

	// TXChecksumOffload if true, indicates that this endpoint's capability
	// set should include CapabilityTXChecksumOffload.
	TXChecksumOffload bool

	// RXChecksumOffload if true, indicates that this endpoint's capability
	// set should include CapabilityRXChecksumOffload.
	RXChecksumOffload bool

	// If MaxSyscallHeaderBytes is non-zero, it is the maximum number of bytes
	// of struct iovec, msghdr, and mmsghdr that may be passed by each host
	// system call.
	MaxSyscallHeaderBytes int

	// AFXDPFD is used with the experimental AF_XDP mode.
	// TODO(b/240191988): Use multiple sockets.
	// TODO(b/240191988): How do we handle the MTU issue?
	AFXDPFD *int

	// InterfaceIndex is the interface index of the underlying device.
	InterfaceIndex int
}

Options specifies the details of the fd-based endpoint to be created.

type PacketDispatchMode

type PacketDispatchMode int

PacketDispatchMode enumerates the supported methods of receiving and dispatching packets from the underlying FD.

const (
	// Readv is the default dispatch mode and is the least performant of the
	// dispatch options but the one that is supported by all underlying FD
	// types.
	Readv PacketDispatchMode = iota
	// RecvMMsg enables use of recvmmsg() syscall instead of readv() to
	// read inbound packets. This reduces # of syscalls needed to process
	// packets.
	//
	// NOTE: recvmmsg() is only supported for sockets, so if the underlying
	// FD is not a socket then the code will still fall back to the readv()
	// path.
	RecvMMsg
	// PacketMMap enables use of PACKET_RX_RING to receive packets from the
	// NIC. PacketMMap requires that the underlying FD be an AF_PACKET. The
	// primary use-case for this is runsc which uses an AF_PACKET FD to
	// receive packets from the veth device.
	PacketMMap
)

func (PacketDispatchMode) String

func (p PacketDispatchMode) String() string
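The page gives no description for String. A typical fmt.Stringer for an iota-based enum like this one might look like the sketch below. The type and constants are redeclared locally so the example is self-contained, and the exact strings returned by the real method, including the fallback text, are an assumption.

```go
package main

import "fmt"

// PacketDispatchMode is a local copy of the enum, declared here only so
// the sketch compiles on its own.
type PacketDispatchMode int

const (
	Readv PacketDispatchMode = iota
	RecvMMsg
	PacketMMap
)

// String returns a readable name for the mode. The returned strings are
// assumed for illustration, not taken from the package.
func (p PacketDispatchMode) String() string {
	switch p {
	case Readv:
		return "Readv"
	case RecvMMsg:
		return "RecvMMsg"
	case PacketMMap:
		return "PacketMMap"
	default:
		return fmt.Sprintf("unknown packet dispatch mode (%d)", int(p))
	}
}

func main() {
	fmt.Println(Readv, RecvMMsg, PacketMMap) // prints "Readv RecvMMsg PacketMMap"
}
```

Implementing String lets the mode print by name in logs and with the %v and %s verbs.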
