Installing PubSub+ Cloud in Amazon Elastic Kubernetes Service (EKS)
Amazon Elastic Kubernetes Service (Amazon EKS) gives you (the customer) the flexibility to start, run, and scale Kubernetes applications in the AWS cloud or on-premises. Amazon EKS helps you provide highly available and secure clusters and automates key tasks such as patching, node provisioning, and updates. For more information about EKS, see the Amazon EKS documentation.
This deployment guide is intended for customers installing PubSub+ Cloud in a Customer-Controlled Region. For a list of deployment options, see PubSub+ Cloud Deployment Ownership Models.
There are a number of environment-specific steps that you must perform to install PubSub+ Cloud.
Before you perform the environment-specific steps described below, ensure that you review and fulfill the general requirements listed in Common Kubernetes Prerequisites.
Solace does not support event broker service integration with service meshes. Service meshes include Istio, Cilium, Linkerd, Consul, and others. If deploying to a cluster with a service mesh, you must:
- Exclude the target-namespace used by PubSub+ Cloud services from the service mesh.
- Set up connectivity to event broker services in the cluster using LoadBalancer or NodePort. See Exposing Event Broker Services to External Traffic for more information.
- Create a Kubernetes cluster. For customer-owned deployments, you are responsible for the setup, maintenance, and operation of the Kubernetes cluster. The following information can help you to understand the requirements of the Kubernetes cluster that you create:
- The Kubernetes cluster must fulfill the Amazon EKS Prerequisites
- Review and choose from various considerations to deploy your EKS cluster.
- One way to create and understand the requirements for the Kubernetes EKS cluster is to use the examples (Terraform module and deployment scripts) available from Solace.
Solace provide
You can download the reference Terraform projects from
Be aware that all sample scripts, Terraform modules, and examples are provided as-is. You can modify the files as required and are responsible for maintaining the modified files for your Kubernetes cluster.
Amazon EKS Prerequisites
The following are the technical prerequisites for an Amazon EKS deployment to deploy event broker services:
- VPC Security Group
- Before you begin, you must open an AWS support ticket and request an increase of the Rules per VPC Security Group limit to 200 for the region where you intend to deploy your EKS cluster. This increase is required for event broker services to support the default protocols, which are as follows:
- SSH
- Web messaging
- SEMP
- AMQP TLS
- MQTT Web TLS
- MQTT TLS
- REST
- SMF/TLS
- Load Balancer
- Permissions
- An AWS account with the following permissions. These permissions are required only by the individual who performs the deployment using a Terraform module:
- All the permissions that are required to create and manage the EKS cluster (eksClusterRole).
- Permission to create IAM roles and IAM policies in the EKS cluster. These permissions are required by the Terraform module. The example module creates the IAM roles and policies that are used by the EKS cluster, and requires the following permissions to create and manage resources in the EKS cluster:
- IAM Role
- Gives permissions to the following:
- EKS cluster control plane
- EKS cluster worker nodes
- EKS Load balancer controller
- EKS auto-scaler
- IAM Policy
- Creates a set of permissions that are required by the following in the deployment:
- EKS load balancer controller (for more information, see the Amazon load balancer controller installation document)
- Auto-scaler (for more information about creating this policy, see Cluster Autoscaler in the Amazon documentation)
- All EC2 resources
- The Kubernetes and Terraform modules require this permission to access tags and the metadata of resources. The autoscaler and Load Balancer provisioning look at which security group each instance has, and modify the security groups to add rules for the load balancer services.
- VPC
- The Terraform module requires this permission to create the VPC. The Kubernetes module also requires this permission to scan VPC and subnets to retrieve networking parameters.
- EBS
- Permission is required for Kubernetes dynamic PVCs, which require access to EBS to dynamically create volumes and attach them to the correct host. The host also requires EC2 access as described above.
- Security Groups
- The Kubernetes module requires this permission to create security groups. The load balancer services also require rules to be added to the security groups.
- Routing tables
- The Terraform module requires this permission to create routing tables.
- Internet Gateways
- The Terraform module requires this permission to create an Internet gateway.
- Elastic IPs
- The Terraform module requires this permission to attach Elastic IP addresses (EIPs) to the NAT gateway.
- NAT Gateways
- The Terraform module requires this permission to create the NAT gateway and attach EIPs to the NAT gateway.
- OIDC Provider
- The Terraform module requires permission to create an OIDC provider that will be used to authenticate Kubernetes modules against IAM roles. The Auto-scaler and AWS Load Balancer Controller are two of the Kubernetes modules that use OIDC to authenticate the IAM role.
- Elastic Load Balancers
- Permission is required for the Kubernetes module to create ELBs for the Load Balancer services. The Load Balancer services require rules that are added in the Security Groups mentioned above. Additional permissions are required by the AWS Load Balancer Controller and must be updated to match those specified at https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.2.0/docs/install/iam_policy.json.
- Networking
- You must create an Elastic IP address (EIP) for each NAT Gateway that you intend to use with the following considerations:
- The EIPs for the NAT gateways must be created upfront.
- Solace recommends that you create two EIPs. A minimum of one EIP allocation ID is required.
Considerations for Deploying PubSub+ Cloud on an Amazon EKS Cluster
You (the customer) should consider the following limitations and recommendations regarding your Amazon EKS cluster for a private data center deployment of PubSub+ Cloud:
- A minimum size of /24 is required for an EKS cluster that's dedicated to event broker services. A larger size is required if the cluster contains other services. For more information, see IP Range.
- Solace recommends that you have two NAT Gateways and two EIPs for redundancy. You can choose to use one NAT Gateway with one EIP with the consideration that all event broker service features [e.g., VPN bridges, Dynamic Message Routing (DMR), Disaster Recovery (DR), REST Destination Points (RDP)] that rely on external connections will fail to function if the zone of that NAT Gateway fails.
- Solace recommends that you have at least two Bastion hosts, and that these hosts are spread out evenly over the three Availability Zones. This recommendation allows you to remotely access the deployment should a zone fail. Remote access to your deployment when a zone fails is not a requirement; you can choose to use the minimum of one Bastion host instead.
Considerations for Deployments in China
There are additional considerations for deployments to Kubernetes clusters in China. For more information, see Deployments in China.
EKS Cluster Specifications
Before you (the customer) install the Mission Control Agent, you must configure the EKS cluster with the technical specifications listed in the following sections:
- Node Groups
- Instance Type Requirements
- Storage Class
- Networking
- NAT Gateway
- Load Balancer
- IP Range
- Autoscaling
For more detailed information about using Amazon EKS, see the User Guide on the Amazon EKS documentation site.
Node Groups
The Kubernetes cluster autoscaler must use two node groups. The node groups can be configured to start from zero instances, which means that the cluster can have a full set of node groups for each scaling tier without requiring that instances run in each.
The node groups must be configured to start at 0. Hints should be provided to the autoscaler as to which labels and taints are set on the worker nodes by using tags on the auto-scaling group (ASG). There is no mechanism available to tag the ASG when you create the node groups, but this can be accomplished as described in Managed Nodes Scale to Zero and Cluster Autoscaler does not start new nodes when Taints and NodeSelector are used in EKS.
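The ASG tags that carry these hints can be derived from the labels and taints you plan to set on the worker nodes. The following is a minimal sketch in Python that builds the tag set. It assumes the `k8s.io/cluster-autoscaler/node-template/...` tag keys used by the Kubernetes Cluster Autoscaler on AWS; the label and taint names (`serviceClass`, `prod1k`) are hypothetical placeholders, not values defined by Solace.

```python
from typing import Dict, Tuple

# Tag key prefixes read by the Cluster Autoscaler on AWS when a node group
# is scaled to zero and no live node exists to inspect.
LABEL_PREFIX = "k8s.io/cluster-autoscaler/node-template/label/"
TAINT_PREFIX = "k8s.io/cluster-autoscaler/node-template/taint/"


def node_template_tags(
    labels: Dict[str, str],
    taints: Dict[str, Tuple[str, str]],
) -> Dict[str, str]:
    """Return ASG tags describing the labels and taints of future nodes.

    `taints` maps a taint key to a (value, effect) pair, for example
    {"serviceClass": ("prod1k", "NoExecute")}.
    """
    tags = {}
    for key, value in labels.items():
        tags[LABEL_PREFIX + key] = value
    for key, (value, effect) in taints.items():
        # The autoscaler expects taint tags in the form "<value>:<effect>".
        tags[TAINT_PREFIX + key] = f"{value}:{effect}"
    return tags


# Hypothetical label and taint names, for illustration only.
example_tags = node_template_tags(
    labels={"serviceClass": "prod1k"},
    taints={"serviceClass": ("prod1k", "NoExecute")},
)
for tag_key, tag_value in sorted(example_tags.items()):
    print(f"{tag_key}={tag_value}")
```

You would then apply the resulting key/value pairs to the ASG with your preferred tooling (for example, Terraform or the AWS CLI).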
Instance Type Requirements
Because of the additional resources required to run Kubernetes, the instance types that are required for some of the scaling tiers are larger than their instance-based counterparts. The following are the instance type requirements for an EKS cluster. For details about the core and RAM requirements for each scaling tier, see General Resource Requirements for Kubernetes and Default Port Configuration.
Scaling Tier | Instance Type Required |
---|---|
Monitor | T3.medium |
 | R5.large |
Enterprise 250 | R5.large |
Enterprise 1K | R5.large |
Enterprise 5K | R5.xlarge |
Enterprise 10K | R5.xlarge |
Enterprise 50K | R5.2xlarge |
Enterprise 100K | R5.2xlarge |
Storage Class
The EKS storage class (type) can use either GP2 or GP3. Consider the following when choosing which storage class to use:
- With GP2, the performance improves as the size of the disk increases. In cases where your disk size is greater than 1 TB, using GP2 is a better option than GP3.
- With GP3, the default performance is 3k IOPS regardless of disk size. GP3 is recommended for disk sizes that are less than 1 TB. For cases where the disk size is greater than 1 TB, GP2 is the better choice unless you configure the GP3 storage class with a provisioned IOPS that's higher than 3k. Note that if you increase the IOPS on the storage class, then all event broker services get the same IOPS. For example, if you increase to 6k IOPS, then all event broker services get 6k IOPS disks, and you may incur the cost of the extra 3k IOPS for each disk as a result.
- For event broker services 10.6.1 and earlier, the disk requirement is twice the Message Spool size specified when you create an event broker service. For example, if you configure an event broker service to use a Message Spool of 500 GiB, you require a 1 TB disk size.
- For event broker services version 10.7.1 and later, the disk size requirement is 30% greater than the message spool size for the event broker service class. For example, the Enterprise 250 class has a message spool size of 50 GiB, requiring a 65 GiB disk size.
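The two sizing rules above can be sketched as a small Python helper. This is an illustration only: the function name is hypothetical, and because the text does not state a rule for versions between 10.6.1 and 10.7.1, the sketch raises an error for those versions rather than guessing.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as "10.7.1" into (10, 7, 1)."""
    return tuple(int(part) for part in version.split("."))


def required_disk_gib(spool_gib: int, broker_version: str) -> int:
    """Return the disk size (GiB) needed for a given Message Spool size.

    - 10.6.1 and earlier: twice the Message Spool size.
    - 10.7.1 and later: 30% more than the Message Spool size, rounded up.
    """
    version = parse_version(broker_version)
    if version <= (10, 6, 1):
        return spool_gib * 2
    if version >= (10, 7, 1):
        # Integer arithmetic avoids floating-point rounding: ceil(spool * 1.3).
        return (spool_gib * 13 + 9) // 10
    raise ValueError(
        f"No sizing rule is documented for version {broker_version}"
    )


print(required_disk_gib(500, "10.6.1"))  # 1000
print(required_disk_gib(50, "10.7.1"))   # 65
```

The two calls reproduce the examples in the list: a 500 GiB Message Spool on 10.6.1 needs 1000 GiB (about 1 TB) of disk, and the 50 GiB Enterprise 250 spool on 10.7.1 needs 65 GiB.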
To deploy PubSub+ Cloud, you must configure the StorageClass in EKS to use the WaitForFirstConsumer binding mode (volumeBindingMode). To support scale-up, the StorageClass must contain the allowVolumeExpansion property, and have it set to "true". You should always use XFS as the filesystem type (fsType).
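As a rough illustration of these settings, the following Python sketch builds a StorageClass manifest and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. It assumes the AWS EBS CSI driver (`ebs.csi.aws.com`) as the provisioner and uses that driver's `csi.storage.k8s.io/fstype` parameter for the filesystem type; the class name `example-gp3` is a hypothetical placeholder. The storage class examples in the Solace reference Terraform project remain the authoritative source.

```python
import json


def build_storage_class(name: str = "example-gp3", volume_type: str = "gp3") -> dict:
    """Return a StorageClass manifest with the settings described above."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},  # hypothetical name
        # Assumption: the cluster uses the AWS EBS CSI driver.
        "provisioner": "ebs.csi.aws.com",
        "parameters": {
            "type": volume_type,                  # gp2 or gp3
            "csi.storage.k8s.io/fstype": "xfs",   # XFS filesystem
        },
        # Delay volume binding until a pod is scheduled, so the volume
        # is created in the same Availability Zone as the pod.
        "volumeBindingMode": "WaitForFirstConsumer",
        # Required to support scale-up of event broker services.
        "allowVolumeExpansion": True,
    }


print(json.dumps(build_storage_class(), indent=2))
```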
After creating the cluster, create the storage class using the reference storage class YAML examples in the reference EKS Terraform project on GitHub.
Networking
To spread the event brokers of an event broker service over the Availability Zones (AZs), pod anti-affinity should be used. When the region has multiple Availability Zones, the topologyKey key should be set to topology.kubernetes.io/zone, whereas when AZs are not available, set it to kubernetes.io/hostname.
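The following Python sketch shows what such an anti-affinity rule can look like as a pod spec fragment. It uses the well-known Kubernetes node labels `topology.kubernetes.io/zone` and `kubernetes.io/hostname` as topology keys. The use of a required (rather than preferred) rule and the `app: example-broker` selector are assumptions made for illustration; they are not taken from the PubSub+ Cloud configuration.

```python
import json


def broker_anti_affinity(multi_az: bool, match_labels: dict) -> dict:
    """Return an `affinity` fragment that keeps matching pods apart.

    With multiple Availability Zones, pods are spread across zones;
    otherwise they are spread across individual worker nodes.
    """
    topology_key = (
        "topology.kubernetes.io/zone" if multi_az else "kubernetes.io/hostname"
    )
    return {
        "podAntiAffinity": {
            # Assumption: a hard requirement; a preferred rule is also possible.
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": match_labels},
                    "topologyKey": topology_key,
                }
            ]
        }
    }


# Hypothetical pod label, for illustration only.
print(json.dumps(broker_anti_affinity(True, {"app": "example-broker"}), indent=2))
```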
NAT Gateway
You require one Elastic IP (EIP) for each NAT gateway for your cluster. Solace recommends that you have two Elastic IPs (and two NAT gateways) for a production system.
You can have up to three EIPs and NAT gateways, which allows you to have multi-AZ NAT redundancy. This requires that you have three EIPs. These NAT EIPs must be created upfront. If you use a Terraform module, ensure that it uses these EIPs.
Load Balancer
PubSub+ Cloud requires the deployment of the AWS Load Balancer Controller to create Network Load Balancers (NLB) to front event broker services. You can find instructions for deploying the AWS Load Balancer Controller at https://kubernetes-sigs.github.io/aws-load-balancer-controller.
Solace configures the NLBs used in PubSub+ Cloud with IP targets and cross-zone enabled. This results in the fastest possible failover times.
Solace uses the following service annotations to configure the NLB:

    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing # this one is removed for internal (private) services
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: '2'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: '2'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: '5550'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: http
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: /health-check/guaranteed-active
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-timeout: '6'
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: '10'

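If you need to reproduce this annotation set in your own tooling, for example to compare it against a deployed Service, the following Python sketch builds it and drops the scheme annotation for internal (private) services, as noted above. The function name is hypothetical; the annotation keys and values are copied from the list above.

```python
PREFIX = "service.beta.kubernetes.io/aws-load-balancer-"


def nlb_annotations(internal: bool) -> dict:
    """Return the NLB service annotations listed above.

    For internal (private) services the scheme annotation is omitted.
    """
    annotations = {
        PREFIX + "cross-zone-load-balancing-enabled": "true",
        PREFIX + "type": "external",
        PREFIX + "nlb-target-type": "ip",
        PREFIX + "target-group-attributes": "preserve_client_ip.enabled=true",
        PREFIX + "healthcheck-healthy-threshold": "2",
        PREFIX + "healthcheck-unhealthy-threshold": "2",
        PREFIX + "healthcheck-port": "5550",
        PREFIX + "healthcheck-protocol": "http",
        PREFIX + "healthcheck-path": "/health-check/guaranteed-active",
        PREFIX + "healthcheck-timeout": "6",
        PREFIX + "healthcheck-interval": "10",
    }
    if not internal:
        annotations[PREFIX + "scheme"] = "internet-facing"
    return annotations


for key, value in sorted(nlb_annotations(internal=False).items()):
    print(f"{key}: {value}")
```

All values are strings because Kubernetes annotation values must be strings, which is why the numeric thresholds are quoted in the list above.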
If you configured your EKS cluster using Solace's custom AWS Load Balancer configurations and require information about the custom AWS Load Balancer, see Custom AWS Load Balancer for PubSub+ Cloud.
IP Range
A minimum size of /24 is required for an EKS cluster that's dedicated to event broker services. A larger size is required if the cluster contains other services.
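To get a feel for how many addresses a given range actually provides, the following Python sketch counts the usable IPs in a CIDR block. It relies on the fact that AWS reserves five addresses in every VPC subnet, so splitting a range into several subnets (for example, one per Availability Zone) reduces the usable total. The CIDR `10.0.0.0/24` and the split into /26 subnets are hypothetical examples, and this sketch is not a substitute for the CIDR calculator described later in this section.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Return the number of assignable IPs in a single subnet."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def usable_after_split(cidr: str, new_prefix: int) -> int:
    """Return the assignable IPs when `cidr` is split into equal subnets."""
    network = ipaddress.ip_network(cidr)
    return sum(
        usable_addresses(str(subnet))
        for subnet in network.subnets(new_prefix=new_prefix)
    )


print(usable_addresses("10.0.0.0/24"))        # 251
print(usable_after_split("10.0.0.0/24", 26))  # 236
```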
Amazon's VPC Container Network Interface (CNI) allocates IPs directly from the VPC’s subnets. The number of IPs that are allocated is directly proportional to the number of worker nodes, which is also proportional to the number of event broker services.
The calculations below are based on custom settings for WARM_IP_TARGET and WARM_ENI_TARGET:

    kubectl set env ds aws-node -n kube-system WARM_IP_TARGET=1
    kubectl set env ds aws-node -n kube-system WARM_ENI_TARGET=0

Details about these settings are available in the Amazon Kubernetes VPC CNI documentation on GitHub.
CIDR Calculator
You can use the downloadable, Excel-based CIDR calculator provided by Solace to calculate your CIDR requirements.
Limitations
Currently, the limit for event broker services in a cluster using the standard aws-load-balancer-controller is 11 due to security group rule limits. You can avoid this limitation by using a modified aws-load-balancer-controller provided by Solace.
Autoscaling
Your cluster requires autoscaling to provide the appropriate level of available resources for your event broker services as their demands change. Solace recommends using the Kubernetes Cluster Autoscaler, which you can find in the Kubernetes GitHub repository at: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler.
See the Autoscaling documentation on the Amazon EKS documentation site for information about implementing a Cluster Autoscaler.