kubelink

A Network Bridge for Service Networks of Multiple Kubernetes Clusters

This project implements a kubernetes controller manager that supports VPN tunnels connecting the service networks of a mesh of kubernetes clusters.

The only precondition is that these service networks are disjoint. The connections to other clusters are configured by a new custom resource; every connection is defined by a dedicated instance of this resource. To establish a connection between two clusters, both sides must provide an appropriate link resource.

Concept

Every kubernetes cluster that should participate in such a virtual service network needs its own IP address in a network cidr defined for the mesh. Like the service cidrs this network might be a private one, for example 192.168.0.0/24, which offers the possibility to connect 253 clusters. It must be disjoint from all node networks, pod networks and service networks of all involved clusters, but the node and pod networks might be identical in all clusters.

To connect the service networks of multiple clusters, their service ip ranges must be disjoint, for example 100.64.0.0/20, 100.64.16.0/20 and so on.

Every cluster participating in such a mesh requires an endpoint with an externally accessible IP address and an assigned DNS name. This is typically achieved by defining a kubernetes service of type LoadBalancer. The DNS name can be provisioned automatically by using the DNS controllers provided by the external-dns-management project.
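
A minimal sketch of such a service, assuming the annotations of the external-dns-management project (name, namespace and domain are illustrative; the ports match the broker defaults shown in the command line reference below):

apiVersion: v1
kind: Service
metadata:
  name: kubelink-broker
  namespace: kube-system
  annotations:
    dns.gardener.cloud/class: garden
    dns.gardener.cloud/dnsnames: kubelink.kubelink1.my-domain.example
spec:
  type: LoadBalancer
  selector:
    app: kubelink-broker
  ports:
    - port: 80          # advertised port (default of --advertised-port)
      targetPort: 8088  # broker port (default of --broker-port)
      protocol: TCP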

For the connection among the clusters two modes are available:

  • Wiring using TLS-secured TCP connections maintained by the kubelink-broker. The required server certificate is taken from a secret. It is best maintained by an ACME-protocol based certificate provider. For this the project cert-management can be used, which offers a kubernetes controller manager working together with the above mentioned DNS eco-system.
  • Kernel-based VPN solution offered by wireguard. Here again the kubelink-broker is used to maintain a wireguard device, which then handles the network traffic on its own.

The examples just provide such a setup.

Besides this connection management controller there is an additional controller (kubelink-router) that manages the required routes on the cluster nodes. The node where the broker is running is used as the gateway.

TLS Based Connection

On the gateway the broker maintains a tun device with the cidr of the cluster mesh and the IP address assigned to the dedicated cluster. This address is also used to provide an S-NAT for outgoing traffic to the other clusters. Depending on the destination address the broker dials to the dedicated broker service of the target cluster, using its own server certificate as TLS client certificate. This way TLS is used for key exchange and no additional mechanism is required. The common name of the client certificate (always the FQDN of the external address of the broker for this cluster) is used to authenticate an incoming connection; this is guaranteed by the certificate authority that has issued the server certificate.

The certificate, private key and CA certificate are taken from a standard kubernetes TLS secret. Here it is maintained by the certificate service, but it can also be maintained manually.
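
Such a secret might look like the following sketch (name illustrative; the data values are the base64-encoded PEM files):

apiVersion: v1
kind: Secret
metadata:
  name: kubelink        # referenced by the --secret option of the broker
  namespace: kube-system
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded server certificate>
  tls.key: <base64-encoded private key>
  ca.crt: <base64-encoded CA certificate>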

Wireguard

On the gateway the broker maintains a wireguard device with the cidr of the cluster mesh and the IP address assigned to the dedicated cluster. This address is also used to provide an S-NAT for outgoing traffic to the other clusters.

The broker does not handle the inter-cluster network traffic itself; it is now completely handled by the wireguard device. The broker only configures the wireguard device and its peers according to the desired cluster links, which must provide a public wireguard key for the dedicated endpoint.
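
For illustration only, the device setup maintained by the broker roughly corresponds to the following wireguard CLI calls (the broker performs this programmatically; all values are placeholders or taken from the example section below):

ip link add wg0 type wireguard
ip address add 192.168.0.11/24 dev wg0          # cluster address in the mesh network
wg set wg0 private-key /etc/wireguard/wg.key listen-port 8777
wg set wg0 peer <public key of remote cluster> \
  endpoint <remote endpoint FQDN>:8777 \
  allowed-ips 100.64.16.0/20,192.168.0.12/32    # remote service cidr and cluster address
ip link set wg0 up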

The connections between two clusters can be configured dynamically just by adding an instance of the KubeLink custom resource.

apiVersion: kubelink.mandelsoft.org/v1alpha1
kind: KubeLink
metadata:
  name: <remote cluster name>
spec:
  cidr: <service cidr of remote cluster>
  clusterAddress: <cidr notation of IP and netmask of the connected cluster in the cluster mesh network>
  endpoint: <the FQDN of the mesh endpoint of this cluster>
  publicKey: <public key, for wireguard, only>

That's it. For the TLS-based user space solution, in the best case no certificate, key or anything else is required when using the DNS and server certificate provisioning controllers proposed above.

For the wireguard solution only the public keys of the foreign sides are required (and the local private key, which is maintained in a secret).
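
The key pair can be generated with the standard wireguard tooling:

$ wg genkey | tee wg.key | wg pubkey > wg.pub

The private key (wg.key) goes into the broker secret (field WireGuardPrivateKey, see the wireguard section below), the public key (wg.pub) into the publicKey field of the link objects in the foreign clusters.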

Constraints

For example on AWS, routing of foreign traffic on the node network is not possible by default (source/destination check). In such cases an IPIP tunnel has to be used on the node network. This can be handled by the router daemon set.

If calico is used there is typically an already configured tunl0 device which is just reused. But the setup phase of calico is not able to handle foreign tun devices (used by the broker) correctly and will discover the wrong node IP, network and interface. So far the only way to circumvent this problem is to use the interface detection method. Hereby the primary interface of the nodes has to be configured (for example interface=eth0). Unfortunately this interface name depends on the used operating system/image. For heterogeneous clusters with nodes using different operating systems this will not work (unless the set of interfaces is known in advance for configuring the calico daemon set). Newer calico versions support the detection method cidr, which allows detecting the correct interface by the node subnet CIDR and is operating system agnostic.

If Gardener is used to maintain the involved kubernetes clusters, the required calico config can be described directly in the shoot manifest. The networking section has to be adapted as follows (change to the node cidr of your environment):

    networking:
      type: calico
      providerConfig:
        apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1
        kind: NetworkConfig
        backend: bird
        ipam:
          type: host-local
          cidr: usePodCIDR
        ipv4:
          autoDetectionMethod: cidr=<your node cidr>

Configure calico to use the new detection method cidr. Older versions of calico do not support this detection method; there you have to use the method interface, which unfortunately is operating system dependent. The default method does NOT reliably work in all environments together with the additional network device.
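
Outside of Gardener the same setting is typically applied via the IP_AUTODETECTION_METHOD environment variable of the calico-node daemon set, for example (the node cidr is a placeholder):

# excerpt of the calico-node container spec
env:
  - name: IP_AUTODETECTION_METHOD
    value: cidr=10.250.0.0/16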

Implementation

The two controllers are bundled into one controller manager (kubelink) using the controllermanager-library. It is provided in a single image (see Dockerfile). The example manifests choose the appropriate controller(s) by a dedicated command line option.
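
The broker and router deployments therefore differ mainly in this option. A sketch of the relevant container arguments, assuming the controller names broker and router suggested by the option prefixes in the command line reference:

# kubelink-broker deployment (excerpt)
args:
  - --controllers=broker
# kubelink-router daemon set (excerpt)
args:
  - --controllers=router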

Example

The folder examples contains the required manifests for two interconnected clusters.

In your own scenario you have to adapt some values accordingly. The assumptions made in the example files are:

Cluster mesh network cidr: 192.168.0.0/24

The sample clusters here were kindly provided by the Gardener kubernetes fleet management environment. It supports the certificate and DNS management out of the box, so the manifests can be used without installing additional components.

cluster name                  kubelink1                                                    kubelink2
node cidr                     10.250.0.0/16                                                10.250.0.0/16
pod cidr                      100.96.0.0/11                                                100.96.0.0/11
service cidr (disjoint)       100.64.0.0/20                                                100.64.16.0/20
cluster address and netmask   192.168.0.11/24                                              192.168.0.12/24
FQDN                          kubelink.kubelink1.ringdev.shoot.dev.k8s-hana.ondemand.com   kubelink.kubelink2.ringdev.shoot.dev.k8s-hana.ondemand.com

You can see that the node and pod networks used in the involved clusters are identical.

On every cluster the kubelink service is deployed to the kube-system namespace, but any other namespace is possible.

Here the rbac roles and the service account must be deployed. The crd is deployed automatically by the controller manager but can also be deployed manually (ATTENTION: no auto update is possible if it is deployed manually).

The certificate request is optional and specific to the dedicated cluster. Two folders, examples/kubelink1 and examples/kubelink2, contain the appropriate example files.

The certificate requests work together with the DNS annotation of the kubernetes services used by the kubelink-broker deployments. Finally the kubelink-router daemon sets have to be deployed into the clusters.

Depending on your network policies it might be required to enable access from the kube dns server to the additional coredns deployment used by kubelink.

Now the instrumentation is done and the mesh can be defined by applying the appropriate kubelink resources. For every connection a pair of such instances has to be deployed into the involved clusters, as shown below.
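
With the example values from the table above, the pair for the link between kubelink1 and kubelink2 looks like this (TLS mode; for wireguard the publicKey field has to be added):

# deployed into cluster kubelink1, describing kubelink2
apiVersion: kubelink.mandelsoft.org/v1alpha1
kind: KubeLink
metadata:
  name: kubelink2
spec:
  cidr: 100.64.16.0/20
  clusterAddress: 192.168.0.12/24
  endpoint: kubelink.kubelink2.ringdev.shoot.dev.k8s-hana.ondemand.com
---
# deployed into cluster kubelink2, describing kubelink1
apiVersion: kubelink.mandelsoft.org/v1alpha1
kind: KubeLink
metadata:
  name: kubelink1
spec:
  cidr: 100.64.0.0/20
  clusterAddress: 192.168.0.11/24
  endpoint: kubelink.kubelink1.ringdev.shoot.dev.k8s-hana.ondemand.com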

Wireguard

For wireguard:

  • select the option --mode=wireguard for the broker pod.
  • the secret now must specify the private key for the local wireguard device (field WireGuardPrivateKey).
  • the broker service must be changed to the UDP protocol.
  • for AWS a network loadbalancer must be enforced by adding the annotation service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

The default port assumed for the link objects then is 8777. In the link objects the field publicKey must additionally be provided.
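
A sketch of the adapted broker service and secret (names illustrative):

apiVersion: v1
kind: Service
metadata:
  name: kubelink-broker
  namespace: kube-system
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"  # AWS only
spec:
  type: LoadBalancer
  selector:
    app: kubelink-broker
  ports:
    - port: 8777
      protocol: UDP
---
apiVersion: v1
kind: Secret
metadata:
  name: kubelink
  namespace: kube-system
stringData:
  WireGuardPrivateKey: <content of wg.key>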

And now?

Now you can deploy a service into one of the clusters, for example kubelink2. The echo service from the examples just deploys a tiny http server echoing every request. It offers neither a load balancer nor an ingress, so it is a completely cluster-local service. Looking at the service object you get a ClusterIP for this service, for example 100.64.16.20.

Now you can create a busybox pod

$ kubectl run -it --image busybox busybox --restart=Never

and call (replace by actual IP address)

$ wget -O - 100.64.16.20

which reaches the private echo service in the remote cluster.

DNS Propagation for Services

The broker supports the propagation of service DNS names. This is done by an own coredns deployment, which can be configured automatically by the broker. There are two different modes, which are selected by the option --dns-propagation (default is none):

  • kubernetes: According to the established links and the available API server access information, the coredns DNS server is configured with a separate kubernetes plugin for every active foreign cluster and therefore needs access to the foreign API servers. This connectivity is again provided via the cluster mesh, using the service ip of the foreign cluster's API server (service kubernetes). This mode requires explicit cross-cluster traffic for the kubernetes plugin of coredns to access the foreign API servers.

  • dns: This new mode directly uses the dns service of the foreign clusters. The coredns DNS server is configured with a separate rewrite and forward plugin for every active foreign cluster. It does not need any API server access or credentials, only the address of the foreign dns service (typically IP 10 in the service address range) and its cluster domain (typically cluster.local).

The coredns deployment is specific for a dedicated cluster, because it contains a dedicated service IP of the cluster (the cluster DNS service uses IP 10, and the kubelink DNS service is intended to use IP 11 of the cluster's service IP range).

This deployment provides an own (coredns) DNS server serving the kubelink. domain (default for option --mesh-domain). Every cluster of the mesh that supports the propagation of service DNS entries is mapped to an own sub domain, according to its cluster name (name of the KubeLink object). Here the typical service structure is exposed (<service>.<namespace>.svc....).

This DNS server can be embedded into the local cluster DNS by reconfiguring the cluster DNS service. For clusters using coredns this can easily be done by configuring an own server block in the Corefile forwarding the kubelink. domain to the kubelink coredns service (for this the fixed cluster IP from above is used). This can be done, for example, by deploying a coredns-custom config map.
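
A sketch of such a coredns-custom config map, assuming the common *.server key convention and the kubelink coredns service IP of the example cluster kubelink1 (IP 11 of its service range, 100.64.0.11):

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  kubelink.server: |
    kubelink. {
        forward . 100.64.0.11
    }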

Depending on your network policies, dedicated access from the cluster's coredns server to the kubelink dns server must be enabled.
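
A minimal sketch of such a network policy (the pod labels are illustrative and depend on your deployments):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-to-kubelink-coredns
  namespace: kube-system
spec:
  podSelector:
    matchLabels:
      app: kubelink-coredns      # pods of the kubelink coredns deployment
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              k8s-app: kube-dns  # pods of the cluster DNS service
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53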

There are several ways this DNS support can be used:

  • Explicit Configuration of the DNS access information of the foreign clusters at the KubeLink objects.

    • For the kubernetes mode the spec field apiAccess has to be set to a valid secret reference. The secret must have the data fields:
      • token and
      • certificate-authority-data taken from a service account of the foreign cluster with the appropriate permissions (see kubeconfig plugin)
    • For the dns mode the DNS access information has to be maintained in the spec field dns:
      • dnsIP: the (service) IP address of the foreign DNS service
      • baseDomain: the cluster domain used in the foreign cluster.
  • Automatic Advertisement of the DNS access info. Here the broker requires the option --dns-advertisement or --service-account=<name>. The settings of the foreign KubeLink objects are then maintained automatically by the broker according to the information advertised by the foreign clusters.

    Please note: This mode is NOT supported by the wireguard mode, because here the broker cannot exchange data with other clusters using the connection handshake.

    With this option the Outbound Advertisement is enabled. The local information is advertised to all foreign members of the mesh, which update their KubeLink objects accordingly.

    The Inbound Update is always active. Whenever a foreign broker advertises its info, the corresponding local KubeLink object is updated.

If the foreign API server is used by the kubernetes plugin of coredns, its advertised service account (or the manually maintained credentials) is used to access the service objects of the foreign cluster and must have the appropriate permissions.
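
For explicit configuration the corresponding spec fields of a KubeLink object look like the following sketch (use only one of the two variants; the structure of the apiAccess secret reference is an assumption, all values are illustrative):

apiVersion: kubelink.mandelsoft.org/v1alpha1
kind: KubeLink
metadata:
  name: kubelink2
spec:
  # regular link fields (cidr, clusterAddress, endpoint) as shown above
  # kubernetes mode: reference to a secret with token and certificate-authority-data
  apiAccess:
    name: kubelink2-api-access
    namespace: kube-system
  # dns mode: address and domain of the foreign DNS service
  dns:
    dnsIP: 100.64.16.10       # typically IP 10 of the foreign service range
    baseDomain: cluster.local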

In all cases the option --dns-propagation must be set to enable the DNS feature. By default, only the foreign clusters are included in the mesh's top-level domain. To provide a uniform DNS hierarchy in all clusters of the mesh, including the local cluster, the option --cluster-name can be set, which provides a dedicated sub domain in the kubelink. top-level domain for the local cluster.

For accessing the cluster DNS service defaults are used:

  • cluster.local for the cluster domain
  • IP 10 in the cluster's service IP range for the address of the cluster DNS service. For the advertisement these defaults can be overridden by command line options.

If your cluster uses coredns for the local cluster DNS service, which supports the coredns-custom config map, the option --coredns-configure can be used to automatically connect the mesh DNS with your cluster DNS.

A typical option set for the broker to enable the full service looks like this (for the example cluster kubelink1):

            - --dns-advertisement
            - --service-account=coredns
            - --dns-propagation=kubernetes
            - --cluster-name=kubelink1 # change-me
            - --coredns-configure

With the option --coredns-deployment (default kubelink-coredns) it is possible to override the name of the coredns deployment used to handle the mesh DNS domain. Whenever the configuration changes, the deployment is restarted.

Basically this coredns deployment is intentionally not maintained by the broker, because this would require extensive permissions (to configure service accounts and RBAC policies).

Command Line Reference

Kubelink manages network links among kubernetes clusters

Usage:
  kubelink [flags]

Flags:
      --advertised-port int                         Advertised broker port for auto-connect
      --auto-connect                                Automatically register cluster for authenticated incoming requests
      --bind-address-http string                    HTTP server bind address
      --broker-port int                             Port for broker
      --broker.advertised-port int                  Advertised broker port for auto-connect of controller broker (default 80)
      --broker.auto-connect                         Automatically register cluster for authenticated incoming requests of controller broker
      --broker.broker-port int                      Port for broker of controller broker (default 8088)
      --broker.cacertfile string                    TLS ca certificate file of controller broker
      --broker.certfile string                      TLS certificate file of controller broker
      --broker.cluster-domain string                Cluster Domain of Cluster DNS Service (for DNS Info Propagation) of controller broker (default "cluster.local")
      --broker.cluster-name string                  Name of local cluster in cluster mesh of controller broker
      --broker.coredns-configure                    Enable automatic configuration of cluster DNS (coredns) of controller broker
      --broker.coredns-deployment string            Name of coredns deployment used by kubelink of controller broker (default "kubelink-coredns")
      --broker.coredns-secret string                Name of dns secret used by kubelink of controller broker (default "kubelink-coredns")
      --broker.coredns-service-ip string            Service IP of coredns deployment used by kubelink of controller broker
      --broker.default.pool.size int                Worker pool size for pool default of controller broker (default 1)
      --broker.dns-advertisement                    Enable automatic advertisement of DNS access info of controller broker
      --broker.dns-name string                      DNS Name for managed certificate of controller broker
      --broker.dns-propagation string               Mode for accessing foreign DNS information (none, dns or kubernetes) of controller broker (default "none")
      --broker.dns-service-ip string                IP of Cluster DNS Service (for DNS Info Propagation) of controller broker
      --broker.ifce-name string                     Name of the tun interface of controller broker
      --broker.ipip string                          ip-ip tunnel mode (none, shared, configure) of controller broker (default "IPIP_NONE")
      --broker.keyfile string                       TLS certificate key file of controller broker
      --broker.link-address string                  CIDR of cluster in cluster network of controller broker
      --broker.mesh-domain string                   Base domain for cluster mesh services of controller broker (default "kubelink")
      --broker.mode string                          VPN mode (bridge, wireguard, none)  of controller broker
      --broker.node-cidr string                     CIDR of node network of cluster of controller broker
      --broker.pool.resync-period duration          Period for resynchronization of controller broker
      --broker.pool.size int                        Worker pool size of controller broker
      --broker.secret string                        TLS/wireguard secret of controller broker
      --broker.secret-manage-mode string            Manage mode for TLS secret of controller broker (default "none")
      --broker.secrets.pool.size int                Worker pool size for pool secrets of controller broker (default 1)
      --broker.served-links string                  Comma separated list of links to serve of controller broker (default "all")
      --broker.service string                       Service name for managed certificate of controller broker
      --broker.service-account string               Service Account for API Access propagation of controller broker
      --broker.service-cidr string                  CIDR of local service network of controller broker
      --broker.tasks.pool.size int                  Worker pool size for pool tasks of controller broker (default 1)
      --broker.update.pool.resync-period duration   Period for resynchronization for pool update of controller broker (default 20s)
      --broker.update.pool.size int                 Worker pool size for pool update of controller broker (default 1)
      --cacertfile string                           TLS ca certificate file
      --certfile string                             TLS certificate file
      --cluster-domain string                       Cluster Domain of Cluster DNS Service (for DNS Info Propagation)
      --cluster-name string                         Name of local cluster in cluster mesh
      --config string                               config file
  -c, --controllers string                          comma separated list of controllers to start (<name>,<group>,all) (default "all")
      --coredns-configure                           Enable automatic configuration of cluster DNS (coredns)
      --coredns-deployment string                   Name of coredns deployment used by kubelink
      --coredns-secret string                       Name of dns secret used by kubelink
      --coredns-service-ip string                   Service IP of coredns deployment used by kubelink
      --cpuprofile string                           set file for cpu profiling
      --default.pool.size int                       Worker pool size for pool default
      --disable-namespace-restriction               disable access restriction for namespace local access only
      --dns-advertisement                           Enable automatic advertisement of DNS access info
      --dns-name string                             DNS Name for managed certificate
      --dns-propagation string                      Mode for accessing foreign DNS information (none, dns or kubernetes)
      --dns-service-ip string                       IP of Cluster DNS Service (for DNS Info Propagation)
      --grace-period duration                       inactivity grace period for detecting end of cleanup for shutdown
  -h, --help                                        help for kubelink
      --ifce-name string                            Name of the tun interface
      --ipip string                                 ip-ip tunnel mode (none, shared, configure)
      --keyfile string                              TLS certificate key file
      --kubeconfig string                           default cluster access
      --kubeconfig.disable-deploy-crds              disable deployment of required crds for cluster default
      --kubeconfig.id string                        id for cluster default
      --lease-name string                           name for lease object
      --link-address string                         CIDR of cluster in cluster network
  -D, --log-level string                            logrus log level
      --maintainer string                           maintainer key for crds (defaulted by manager name)
      --mesh-domain string                          Base domain for cluster mesh services
      --mode string                                 VPN mode (bridge, wireguard, none)
      --name string                                 name used for controller manager
      --namespace string                            namespace for lease (default "kube-system")
  -n, --namespace-local-access-only                 enable access restriction for namespace local access only (deprecated)
      --node-cidr string                            CIDR of node network of cluster
      --omit-lease                                  omit lease for development
      --plugin-file string                          directory containing go plugins
      --pod-cidr string                             CIDR of pod network of cluster
      --pool.resync-period duration                 Period for resynchronization
      --pool.size int                               Worker pool size
      --router.default.pool.size int                Worker pool size for pool default of controller router (default 1)
      --router.ipip string                          ip-ip tunnel mode (none, shared, configure) of controller router (default "IPIP_NONE")
      --router.node-cidr string                     CIDR of node network of cluster of controller router
      --router.pod-cidr string                      CIDR of pod network of cluster of controller router
      --router.pool.resync-period duration          Period for resynchronization of controller router
      --router.pool.size int                        Worker pool size of controller router
      --router.update.pool.resync-period duration   Period for resynchronization for pool update of controller router (default 20s)
      --router.update.pool.size int                 Worker pool size for pool update of controller router (default 1)
      --secret string                               TLS/wireguard secret
      --secret-manage-mode string                   Manage mode for TLS secret
      --secrets.pool.size int                       Worker pool size for pool secrets
      --served-links string                         Comma separated list of links to serve
      --server-port-http int                        HTTP server port (serving /healthz, /metrics, ...)
      --service string                              Service name for managed certificate
      --service-account string                      Service Account for API Access propagation
      --service-cidr string                         CIDR of local service network
      --tasks.pool.size int                         Worker pool size for pool tasks
      --update.pool.resync-period duration          Period for resynchronization for pool update
      --update.pool.size int                        Worker pool size for pool update
      --version                                     version for kubelink

Directories

Path Synopsis
cmd
ipt
pkg
apis/kubelink/v1alpha1
    Package v1alpha1 is the v1alpha1 version of the API.
client/kubelink/clientset/internalversion
    This package has the automatically generated clientset.
client/kubelink/clientset/internalversion/fake
    This package has the automatically generated fake clientset.
client/kubelink/clientset/internalversion/scheme
    This package contains the scheme of the automatically generated clientset.
client/kubelink/clientset/internalversion/typed/kubelink/internalversion
    This package has the automatically generated typed clients.
client/kubelink/clientset/internalversion/typed/kubelink/internalversion/fake
    Package fake has the automatically generated clients.
client/kubelink/clientset/versioned
    This package has the automatically generated clientset.
client/kubelink/clientset/versioned/fake
    This package has the automatically generated fake clientset.
client/kubelink/clientset/versioned/scheme
    This package contains the scheme of the automatically generated clientset.
client/kubelink/clientset/versioned/typed/kubelink/v1alpha1
    This package has the automatically generated typed clients.
client/kubelink/clientset/versioned/typed/kubelink/v1alpha1/fake
    Package fake has the automatically generated clients.
taptun
    Package taptun provides an interface to the user level network TAP / TUN device.
tcp
