Installing PubSub+ Cloud in Alibaba Cloud Container Service for Kubernetes (ACK)

Alibaba Cloud Container Service for Kubernetes (ACK) integrates virtualization, storage, networking, and security capabilities. ACK allows you to deploy applications in high-performance and scalable containers and provides full lifecycle management of enterprise-class containerized applications. For more information about ACK, see the Alibaba Container Service for Kubernetes documentation.

This deployment guide is intended for customers installing PubSub+ Cloud in a Customer-Controlled Region. For a list of deployment options, see PubSub+ Cloud Deployment Ownership Models.

There are a number of environment-specific steps that you must perform to install PubSub+ Cloud.

Before you perform the environment-specific steps described below, ensure that you review and fulfill the general requirements listed in Common Kubernetes Prerequisites.

Solace does not support event broker service integration with service meshes. Service meshes include Istio, Cilium, Linkerd, Consul, and others. If deploying to a cluster with a service mesh, you must:

  • exclude the target-namespace used by PubSub+ Cloud services from the service mesh (see the example after this list).
  • set up connectivity to event broker services in the cluster using LoadBalancer or NodePort. See Exposing Event Broker Services to External Traffic for more information.
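
For example, if your service mesh is Istio, one common way to exclude a namespace from automatic sidecar injection is to label it with istio-injection=disabled. This is a sketch only; the namespace name is a placeholder for the target-namespace you use for PubSub+ Cloud services, and the exact mechanism depends on how your mesh is configured:

    kubectl label namespace <target-namespace> istio-injection=disabled --overwrite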

The process to deploy event broker services in an ACK private data center is completed in collaboration with a Solace representative. The steps that you (the customer) perform are as follows:

  1. Create a Kubernetes cluster as described in the ACK Cluster Specifications. For customer-owned deployments, you are responsible for the setup, maintenance, and operation of the Kubernetes cluster.

ACK Cluster Specifications

Before you (the customer) install the Mission Control Agent, you must configure the ACK cluster with the technical specifications listed in the sections that follow.

For more information about using Alibaba ACK, see the User Guide for Kubernetes Clusters on the Alibaba Cloud site.

When created with the following specifications, the ACK cluster has multiple node pools, and is designed to be auto-scaled when new event broker services are created. Each node pool provides the exact resources required by each plan to help optimize the cluster's utilization.

(Figure: Example of an ACK cluster as outlined in the preceding paragraph.)

Alibaba ACK Prerequisites

The following are the technical prerequisites for deploying event broker services to Alibaba ACK:

Permissions
An Alibaba account with the following permissions is required to create an ACK cluster:
  • Resource Access Management (RAM) and role-based access control (RBAC) must be activated and configured appropriately in the RAM console; see Authorization overview in the Alibaba Cloud documentation.
  • Auto Scaling must be activated in the Auto Scaling console.

Considerations for Deployments in China Regions

Additional considerations are required if the private region you deploy to is within China.

  • Instead of the GCP registry, you must use the Azure China registry. The Azure China registry uses a different secret that is required when you later deploy the Mission Control Agent. For example, run the following command, where <username> and <password> are provided by Solace:

    kubectl create secret docker-registry cn-reg-secret --namespace kube-system \ 
    --docker-server=solacecloud.azurecr.cn --docker-username=<username> --docker-password=<password>
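
    You can verify that the secret was created in the expected namespace:

    kubectl get secret cn-reg-secret --namespace kube-system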

Planning Your Network Requirements

When creating your ACK cluster, you must specify CIDR blocks for the virtual private cloud (VPC), vSwitches, pods, and Services. It is important to ensure that the CIDR blocks of the pods and Services do not overlap with the CIDR block of the VPC. Solace therefore recommends that you plan the IP address ranges before you create an ACK cluster. The table below outlines the configuration requirements for each CIDR block.

VPC CIDR

You can choose to use an existing VPC or create a new VPC to host your cluster. If you create a new VPC, you must define an IPv4 CIDR block for it, and also for the first vSwitch, which is created at the same time as the VPC.

The primary IPv4 CIDR Block for the VPC can use one of the following:

  • 10.0.0.0/8
  • 172.16.0.0/12
  • 192.168.0.0/16
  • Custom CIDR block (except 100.64.0.0/10, 224.0.0.0/4, 127.0.0.0/8, 169.254.0.0/16, and their subnets)

The first vSwitch also requires a CIDR Block within the following limits:

  • The CIDR block of a vSwitch must be a proper subset of the CIDR block of the VPC to which the vSwitch belongs. For example, if the CIDR block of a VPC is 192.168.0.0/16, the CIDR block of a vSwitch in the VPC can range from 192.168.0.0/17 to 192.168.0.0/29.

  • The first IP address and last three IP addresses of a vSwitch CIDR block are reserved. For example, if a vSwitch CIDR block is 192.168.1.0/24, the IP addresses 192.168.1.0, 192.168.1.253, 192.168.1.254, and 192.168.1.255 are reserved.

  • If a vSwitch is required to communicate with vSwitches in other VPCs or with data centers, make sure that the CIDR block of the vSwitch does not overlap with the destination CIDR blocks.

vSwitch CIDR

The IP addresses of ECS (Elastic Compute Service) instances are assigned from the vSwitches, which allows nodes in the cluster to communicate with each other. The CIDR blocks that you specify when creating vSwitches in the VPC must be subsets of the VPC CIDR block; that is, the vSwitch CIDR blocks must fall within or be the same as the VPC CIDR block. When you set this parameter, take note of the following:

  • IP addresses from the CIDR block of a vSwitch are allocated to the ECS instances that are attached to the vSwitch.
  • You can create multiple vSwitches in a VPC. However, the CIDR blocks of these vSwitches cannot overlap with each other.

Pod CIDR Block

The IP addresses of pods are allocated from the pod CIDR block, which allows pods to communicate with each other. When you configure this parameter, take note of the following items:

  • The CIDR block of the pods cannot overlap with the CIDR blocks of the vSwitches.
  • The CIDR block of the pods cannot overlap with the CIDR block specified by a Service.

For example, if the VPC CIDR block is 172.16.0.0/12, the CIDR block of pods cannot be 172.16.0.0/16 or 172.17.0.0/16, because these CIDR blocks are subsets of 172.16.0.0/12.

Service CIDR

The Service CIDR block defines the IP addresses used by Kubernetes Services. A Service is an abstraction in Kubernetes. When you set this parameter, take note of the following items:

  • The IP address of a Service is effective only within the Kubernetes cluster.
  • The CIDR block of Services cannot overlap with the CIDR blocks of vSwitches.
  • The CIDR block of Services cannot overlap with the pod CIDR block.

In brief, because Solace requires only one VPC and the CIDR block for the VPC is specified when it is created, you only need to ensure that the CIDR blocks of the pods and Services do not overlap with the CIDR block of the VPC.

VPC and vSwitch Requirements

Once you have planned the CIDR blocks, you can create the VPC and vSwitches.

The cluster requires a single VPC. When you create it, select its region, give it any name you choose, enter a description, and select the resource group to which it will belong.

During this step, you must specify an IPv4 CIDR block for the VPC, following the guidance in Planning Your Network Requirements above.

At this stage you also add the vSwitches. Solace recommends that you create at least three vSwitches for every VPC. Each vSwitch must be deployed in a different zone to implement cross-zone disaster recovery. For example, the vSwitches could be configured as follows:

  • vSwitch-1 in Zone A

  • vSwitch-2 in Zone B

  • vSwitch-3 in Zone C

When you create the vSwitches, you must specify their private IP address ranges in CIDR notation; see Planning Your Network Requirements for details.
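
If you prefer to script this step rather than use the console, the following is a minimal sketch using the Alibaba Cloud CLI (aliyun). The region, zone IDs, names, and CIDR blocks are placeholder values and must be replaced with the ranges you planned above:

# Create the VPC (CIDR block is an example; use the range you planned)
aliyun vpc CreateVpc --RegionId cn-hangzhou --VpcName pubsub-vpc --CidrBlock 192.168.0.0/16

# Create one vSwitch per zone; <vpc-id> comes from the CreateVpc response
aliyun vpc CreateVSwitch --RegionId cn-hangzhou --VpcId <vpc-id> --ZoneId cn-hangzhou-a --VSwitchName vSwitch-1 --CidrBlock 192.168.0.0/24
aliyun vpc CreateVSwitch --RegionId cn-hangzhou --VpcId <vpc-id> --ZoneId cn-hangzhou-b --VSwitchName vSwitch-2 --CidrBlock 192.168.1.0/24
aliyun vpc CreateVSwitch --RegionId cn-hangzhou --VpcId <vpc-id> --ZoneId cn-hangzhou-c --VSwitchName vSwitch-3 --CidrBlock 192.168.2.0/24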

NAT Gateway Requirements

The cluster requires two NAT gateways, each set up in a different zone. This is required so that the Mission Control Agent can communicate with the PubSub+ Home Cloud and our monitoring solution can ship metrics and logs.

Configure the NAT gateways so that vSwitch-1 and vSwitch-2 do not share the same gateway.

vSwitch-3 can be configured to share one of the NAT gateways with either vSwitch-1 or vSwitch-2. This ensures that one zone is always capable of maintaining outgoing internet connections if the other zone goes down.
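
As with the VPC, the NAT gateways can be created from the console or scripted. The following is a rough sketch using the Alibaba Cloud CLI, assuming enhanced NAT gateways placed in two different zones via their vSwitches; all IDs and names are placeholders, you still need to associate an EIP and SNAT entries with each gateway before it provides outbound internet access, and the parameter names should be checked against the current VPC API reference:

# Create one enhanced NAT gateway in each of two zones (placeholder IDs)
aliyun vpc CreateNatGateway --RegionId cn-hangzhou --VpcId <vpc-id> --VSwitchId <vswitch-1-id> --NatType Enhanced --Name nat-zone-a
aliyun vpc CreateNatGateway --RegionId cn-hangzhou --VpcId <vpc-id> --VSwitchId <vswitch-2-id> --NatType Enhanced --Name nat-zone-b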

Creating the ACK Cluster

Once the VPC, vSwitches, and NAT gateways are configured, the cluster can be created from the ACK console. When creating the cluster, configure the parameters with the settings listed below. For any parameter whose setting is not specified, you can keep the default value.

  • Container runtime: Docker 19.03.5
  • Kubernetes version: Select 1.16 or above
  • vSwitch zones: Select the three vSwitches you created when creating the VPC
  • CNI: Flannel
  • Configure SNAT: Enabled
  • Service Account Token Volume Projection: Enabled
  • Keypair: Create a new keypair or use an existing one
  • Operating System: CentOS-based image
  • Remaining settings: Default values
  • Worker Nodes: 2 x ecs.n2.large (4 cores, 16 GB)

Node Pool Configuration

Every node pool you configure must have the following settings:

  • Auto scaling

  • Data disk of 40 GB.

  • CentOS

The following table lists the additional resource and setting requirements for each node pool. Solace recommends that you configure the node pools at the same time as the cluster.

NodePool Requirements
prod1k-a
  • At least 4 cores and 16 GB of RAM
  • Must be associated with vSwitch-1 in Zone A
  • Must have the following labels:
    • nodeType: messaging
    • serviceClass: prod1k
  • Must have the following taints:
    • nodeType=messaging:NoExecute
    • serviceClass=prod1k:NoExecute
prod1k-b
  • At least 4 cores and 16 GB of RAM
  • Must be associated with vSwitch-2 in Zone B
  • Must have the following labels:
    • nodeType: messaging
    • serviceClass: prod1k
  • Must have the following taints:
    • nodeType=messaging:NoExecute
    • serviceClass=prod1k:NoExecute
prod10k-a
  • At least 4 cores and 32 GB of RAM
  • Must be associated with vSwitch-1 in Zone A
  • Must have the following labels:
    • nodeType: messaging
    • serviceClass: prod10k
  • Must have the following taints:
    • nodeType=messaging:NoExecute
    • serviceClass=prod10k:NoExecute
prod10k-b
  • At least 4 cores and 32 GB of RAM
  • Must be associated with vSwitch-2 in Zone B
  • Must have the following labels:
    • nodeType: messaging
    • serviceClass: prod10k
  • Must have the following taints:
    • nodeType=messaging:NoExecute
    • serviceClass=prod10k:NoExecute
prod100k-a
  • At least 8 Cores and 64 GB of RAM
  • Must be associated with vSwitch-1 in Zone A
  • Must have the following labels:
    • nodeType: messaging
    • serviceClass: prod100k
  • Must have the following taints:
    • nodeType=messaging:NoExecute
    • serviceClass=prod100k:NoExecute
prod100k-b
  • At least 8 Cores and 64 GB of RAM
  • Must be associated with vSwitch-2 in Zone B
  • Must have the following labels:
    • nodeType: messaging
    • serviceClass: prod100k
  • Must have the following taints:
    • nodeType=messaging:NoExecute
    • serviceClass=prod100k:NoExecute
monitor
  • At least 4 Cores and 8 GB of RAM
  • Must be associated with vSwitch-3 in Zone C
  • Must have the following labels:
    • nodeType: monitoring
  • Must have the following taints:
    • nodeType=monitoring:NoExecute
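
In ACK, these labels and taints are normally set on the node pool itself so that nodes created by auto scaling inherit them. If you ever need to apply them to an existing node manually, the standard kubectl commands look like the following sketch for the prod1k pools, where <node-name> is a placeholder; the other pools follow the same pattern with their own nodeType and serviceClass values:

# Label and taint a node belonging to a prod1k node pool (example only)
kubectl label node <node-name> nodeType=messaging serviceClass=prod1k
kubectl taint node <node-name> nodeType=messaging:NoExecute serviceClass=prod1k:NoExecute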

Instance Types Requirements

Alibaba Cloud recommends that worker nodes have at least 4 cores. Solace also requires all instances to be Intel-compatible and to meet the following requirements:

prod1k-a, prod1k-b (scaling tier: prod1k)
  • Plans included: Developer, Enterprise 250, and Enterprise 1K
  • Worker node cores: 4
  • Worker node RAM (GiB): 16
  • Example instance types: ecs.g6.xlarge, ecs.g5.xlarge

prod10k-a, prod10k-b (scaling tier: prod10k)
  • Plans included: Enterprise 5K and Enterprise 10K
  • Worker node cores: 4
  • Worker node RAM (GiB): 32
  • Example instance types: ecs.r6.xlarge, ecs.r5.xlarge

prod100k-a, prod100k-b (scaling tier: prod100k)
  • Plans included: Enterprise 50K and Enterprise 100K
  • Worker node cores: 8
  • Worker node RAM (GiB): 64
  • Example instance types: ecs.r6.2xlarge, ecs.r5.2xlarge

monitor (scaling tier: monitor)
  • Plans included: All Enterprise (HA) plans
  • Worker node cores: 2
  • Worker node RAM (GiB): 8
  • Example instance types: ecs.n4.xlarge, ecs.n1.large, ecs.c6.xlarge

Autoscaling

Your cluster requires autoscaling to provide the appropriate level of available resources for your event broker services as their demands change. Solace recommends using the Kubernetes Cluster Autoscaler, which you can find in the Kubernetes GitHub repository: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler.

To implement the Cluster Autoscaler, refer to the Auto scaling of nodes documentation in the Container Service for Kubernetes (ACK) User Guide for Kubernetes Clusters on the Alibaba Cloud documentation site.

Storage Class Configuration for Autoscaling

An auto-scalable cluster requires three scaling groups. Each scaling group must be configured to use only one of the vSwitches so that each scaling group covers a specific availability zone. This is necessary because autoscaling cannot trigger scale-ups for a single group that spans multiple availability zones when availability zone nodeSelectors are used.

Autoscaling for ACK clusters requires a primary and a backup StorageClass, each tied to a different zone. This allows the autoscaler's predicates to detect which zone needs to be scaled up. These StorageClasses must be created before you deploy the Mission Control Agent.

To support scale-up, each StorageClass must include the allowVolumeExpansion property set to true.

When you deploy the Mission Control Agent, the Helm parameters primaryStorageClass and backupStorageClass must be set to the names of the two storage classes you created. For example, in the StorageClass.yaml file below, the names of the storage classes are alicloud-disk-ssd-primary and alicloud-disk-ssd-backup.

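# Primary StorageClass, pinned to the zone of vSwitch-1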
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-ssd-primary
parameters:
  type: cloud_ssd
  csi.storage.k8s.io/fstype: xfs
provisioner: diskplugin.csi.alibabacloud.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowedTopologies:
- matchLabelExpressions:
  - key: topology.diskplugin.csi.alibabacloud.com/zone
    values:
    - <Zone of vSwitch1>
---
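# Backup StorageClass, pinned to the zone of vSwitch-2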
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-ssd-backup
parameters:
  type: cloud_ssd
provisioner: diskplugin.csi.alibabacloud.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowedTopologies:
- matchLabelExpressions:
  - key: topology.diskplugin.csi.alibabacloud.com/zone
    values:
    - <Zone of vSwitch2>
---
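# Topology-aware StorageClass; volume binding is delayed until a pod is scheduled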
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: alicloud-disk-ssd-topology
parameters:
  type: cloud_ssd
provisioner: diskplugin.csi.alibabacloud.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

To apply the storage classes, run the following command:

kubectl apply -f StorageClass.yaml
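
Later, when you deploy the Mission Control Agent with Helm, pass these names through the primaryStorageClass and backupStorageClass parameters. This is a sketch only; the release name, chart reference, and namespace below are placeholders for the values that Solace provides for your deployment:

helm install mission-control-agent <solace-chart> --namespace <target-namespace> \
  --set primaryStorageClass=alicloud-disk-ssd-primary \
  --set backupStorageClass=alicloud-disk-ssd-backup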

LoadBalancer Configuration

Each PubSub+ event broker service that is created uses one Server Load Balancer (SLB). By default, a shared SLB with a public IP address (EIP) is created for each service.

The Mission Control Agent accepts a list of arbitrary annotations that it passes to the LoadBalancer services it creates for event broker services. This allows you to define various configuration settings on the SLBs it creates.

For example, to make an SLB private (no public EIP), specify the service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: intranet annotation, along with another annotation that configures what the SLB connects to (vSwitch or VPC).
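
For illustration, the following sketch shows how such annotations might appear on a LoadBalancer Service. In practice the Mission Control Agent applies them to the Services it creates, so you supply them through its annotation configuration rather than writing this manifest yourself. The vswitch-id annotation, the names, and the port shown are assumptions made for the sake of the example:

apiVersion: v1
kind: Service
metadata:
  name: example-broker-svc        # illustrative name only
  annotations:
    # Make the SLB internal (no public EIP)
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: "intranet"
    # Attach the internal SLB to a specific vSwitch (placeholder ID)
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-vswitch-id: "<vswitch-id>"
spec:
  type: LoadBalancer
  selector:
    app: example-broker           # illustrative selector
  ports:
  - name: smf
    port: 55555
    targetPort: 55555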