This blog post provides a quick reference to the design considerations for architecting and implementing VMware Cloud on AWS.
The post is divided into two sections. The first section lists the key design considerations in a tabular format, and the second provides more detail on the features and considerations.
List of key design considerations / features
The Subscription model is similar to an AWS Regional Reserved Instance.
Subscriptions are fixed on the following attributes:
VMware Cloud Org
Host instance type
Customers should understand the shared responsibility model.
VMware is responsible for the lifecycle of vSphere, vSAN, and NSX.
Customers are responsible for the configuration of the VMs, networks, and applications they manage.
Cloud Services Console
VMware Cloud Services Console provides UI to manage
When deploying an SDDC, customers must provide the number of hosts and a management IP range.
A VPC is created in a VMware-managed AWS account using the SDDC management IP address range provided during the deployment process, with several additional subnets also created within that VPC at the same time.
Any services used in the customer account that owns the connected VPC are paid for by the customer.
i3 (36 cores, 512 GiB RAM, and 10.37 TiB raw storage). Suitable for:
i3en (96 cores, 768 GiB RAM, and 45.84 TiB raw storage). Suitable for:
Cluster conversion from i3 to i3en can be done by contacting VMware.
Once a cluster grows to 3 or more hosts, it cannot be shrunk to 1 or 2 hosts.
A stretched cluster does not protect against region-level failures.
Hosts are added or removed in pairs, one in each AZ.
The maximum number of hosts per cluster is the same as for a standard cluster (16).
The minimum number of hosts supported for a stretched cluster is 2.
Once a stretched cluster grows to 6 or more hosts, it cannot be shrunk to 2 or 4 hosts.
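The sizing rules for standard and stretched clusters listed above can be condensed into a small check. This is only a sketch for planning purposes, not anything VMware ships: the function name `can_resize` is made up for illustration, and it assumes that a cluster deployed directly at the "grown" size is treated the same as one that was scaled up to it.

```python
MAX_HOSTS = 16  # maximum hosts per cluster, standard or stretched (from the list above)


def can_resize(current: int, target: int, stretched: bool = False) -> bool:
    """Return True if a cluster with `current` hosts may be resized to `target` hosts.

    Hypothetical helper; it encodes only the rules stated in this post.
    """
    if stretched:
        minimum, grown_floor = 2, 6
        # Stretched clusters change size in pairs, one host per AZ.
        if target % 2 != 0:
            return False
    else:
        minimum, grown_floor = 1, 3

    if not minimum <= target <= MAX_HOSTS:
        return False

    # Once a cluster has reached the grown size, it cannot drop below it again.
    if current >= grown_floor and target < grown_floor:
        return False
    return True


if __name__ == "__main__":
    print(can_resize(2, 3))                  # standard cluster growing from 2 to 3 hosts
    print(can_resize(3, 2))                  # standard cluster trying to shrink back
    print(can_resize(6, 4, stretched=True))  # stretched cluster trying to shrink below 6
```

A check like this is handy in capacity-planning scripts, where a scale-in that looks fine on paper would otherwise only be rejected at the console.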
Base / Primary Cluster (Cluster-1)
The base cluster, the first cluster deployed in an SDDC, hosts the management appliances and NSX Edges.
Datastores created in a base cluster:
Resource pools created in a base cluster:
Logical networks created in SDDC are automatically shared across all clusters.
They contain only a WorkloadDatastore.
Specify just the number of CPU cores you need per host.
Reduces costs for running applications licensed per-core.
Changing the number of cores does not affect the price of the host.
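To see why a custom core count matters for per-core licensed software, here is a rough back-of-the-envelope calculation. The per-core price and the reduced core count of 16 are hypothetical placeholders, not figures from VMware or any software vendor; only the 36-core i3 figure comes from this post.

```python
def license_cost(hosts: int, cores_per_host: int, price_per_core: float) -> float:
    """Total cost of a per-core license across all hosts in a cluster."""
    return hosts * cores_per_host * price_per_core


if __name__ == "__main__":
    hosts = 4
    price_per_core = 100.0  # hypothetical price, for illustration only

    full = license_cost(hosts, 36, price_per_core)     # all 36 cores of an i3 host enabled
    reduced = license_cost(hosts, 16, price_per_core)  # hypothetical custom core count

    print(f"full core count:    {full:.2f}")
    print(f"reduced core count: {reduced:.2f}")
    print(f"licensing saved:    {full - reduced:.2f}")
```

The host price stays the same in both cases, so the whole difference is in the software licensing bill.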
Number of clusters
Large clusters vs Small clusters:
Elastic DRS (eDRS) automatically adds or removes hosts from a cluster in response to CPU, RAM, and storage utilization.
The eDRS algorithm runs every 5 minutes.
Configured at cluster level (in vCenter).
The Elastic DRS baseline policy is always running and cannot be disabled.
Elastic DRS Thresholds
All workloads should be placed within Compute-ResourcePool.
It is possible to create workloads outside of Compute-ResourcePool. However, this is strongly discouraged because it can cause resource contention between workloads that sit within resource pools and those that do not.
Management appliance sizes
Supported sizes are medium and large.
By default, a new SDDC is created with medium-sized NSX Edge and vCenter Server appliances.
Large-sized appliances are recommended for deployments with more than 30 hosts or 3000 VMs.
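The sizing recommendation above boils down to a one-line rule. The helper below is a hypothetical illustration of that rule only; it does not account for other reasons to pick large appliances.

```python
def recommended_appliance_size(hosts: int, vms: int) -> str:
    """Return the appliance size suggested by the 30-host / 3000-VM guideline."""
    return "large" if hosts > 30 or vms > 3000 else "medium"


if __name__ == "__main__":
    print(recommended_appliance_size(hosts=12, vms=800))
    print(recommended_appliance_size(hosts=12, vms=3500))
```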
Directly attached storage using vSAN
AWS Storage Services
Managed Services Provider Storage
APN Storage partners
vSAN logical datastores
vSAN has been modified to present two logical datastores from the same underlying physical storage.
The separation exists purely as a means of enforcing permissions on the storage used by the management appliances.
The free space, used space, and total capacity numbers will be identical between the two logical Datastores, because both reflect the same underlying pool of capacity.
vSAN slack space
vSAN requires slack space for operations such as deduplication, object rebalancing, and recovery from hardware outages.
Since November 2021, the required slack space is 20%; previously it was 30%.
To enforce the slack space requirement, eDRS automatically scales out the SDDC if storage consumption crosses 79%.
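The slack space and eDRS figures above translate into simple arithmetic. The sketch below assumes that "storage consumption" means used capacity divided by total datastore capacity; the function names and the sample capacity numbers are hypothetical, and real usable capacity also depends on the storage policy, deduplication, and compression.

```python
SLACK_FRACTION = 0.20          # slack space required since November 2021 (from the text)
EDRS_STORAGE_THRESHOLD = 0.79  # consumption level above which eDRS adds a host (from the text)


def capacity_after_slack(total_tib: float) -> float:
    """Capacity left for workloads once the slack space is set aside."""
    return total_tib * (1 - SLACK_FRACTION)


def edrs_will_scale_out(used_tib: float, total_tib: float) -> bool:
    """True if consumption has crossed the eDRS storage threshold."""
    return used_tib / total_tib > EDRS_STORAGE_THRESHOLD


if __name__ == "__main__":
    total = 100.0  # hypothetical datastore capacity in TiB
    print(f"plan for at most {capacity_after_slack(total):.1f} TiB of {total:.1f} TiB")
    for used in (70.0, 80.0):
        print(f"{used:.1f} TiB used -> scale-out expected: {edrs_will_scale_out(used, total)}")
```

Planning against the post-slack figure rather than the total keeps a cluster from triggering an unplanned, billable scale-out.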
vSAN storage policies
SDDC – DX Connectivity
An SDDC can connect to AWS Direct Connect (DX). Both private VIFs and public VIFs are supported.
An SDDC cannot be connected directly to a Direct Connect gateway (DXGW). Use an SDDC group to connect to the DXGW through the VMware Managed Transit Gateway (VTGW).
An SDDC group can only have SDDC members in the same region.
Provides the following connectivity
Doesn’t allow communication between one native VPC and another native VPC.
One Direct Connect VIF for on-prem connectivity to all SDDCs in the same group.
Additional bandwidth (for N-S traffic)
In the default configuration, the SDDC network has a single T-0 router (VMXNET3 NIC).
Additional T-0 routers can be created by creating traffic groups.
Requirements for traffic groups
– The SDDC must be connected to a VTGW by creating an SDDC group.
– SDDCs should have large-size management appliances and at least four hosts.
– VMXNET3: 10 Gbps
– TGW attachment: 50 Gbps
An SDDC is configured with a default route that points to its upstream internet gateway (IGW).
Policy-based VPN
When new routes are added, the routing tables must be updated on both ends of the network.
Appropriate choice if
– The on-prem VPN does not support BGP.
– There are only a few networks on either end of the VPN.
Layer 2 extension
There are two ways to extend a network from on-prem:
– Layer-2 VPN tunnel using an NSX Autonomous Edge appliance
– HCX network extension
BGP route filtering on VMC side
Subnet for ENIs in Connected VPC
– Recommended subnet size
DNS Forwarders in VMC
How does conditional forwarding work?
When a DNS query is received, the DNS forwarder compares the domain name in the query with the domain names in the FQDN DNS zones.
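The matching step can be pictured with a short sketch. It is not NSX code: the `Zone` type, the `pick_servers` function, the zone names, and the IP addresses are all made up for illustration. It also assumes that a query which matches no FQDN zone falls through to a default zone, and that the most specific (longest) matching domain wins when several match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A DNS zone as seen by the forwarder (hypothetical shape)."""
    name: str
    domains: tuple[str, ...]
    servers: tuple[str, ...]


def pick_servers(query: str, fqdn_zones: list[Zone], default_zone: Zone) -> tuple[str, ...]:
    """Return the upstream DNS servers that should receive `query`."""
    q = query.rstrip(".").lower()
    best, best_len = None, -1
    for zone in fqdn_zones:
        for domain in zone.domains:
            d = domain.rstrip(".").lower()
            # Match the domain itself or any name beneath it, never a bare suffix.
            if (q == d or q.endswith("." + d)) and len(d) > best_len:
                best, best_len = zone, len(d)
    return (best or default_zone).servers


if __name__ == "__main__":
    corp = Zone("corp-zone", ("corp.example.com",), ("10.0.0.10",))
    default = Zone("default-zone", (), ("192.0.2.53",))
    print(pick_servers("app.corp.example.com", [corp], default))
    print(pick_servers("www.example.org", [corp], default))
```

The explicit `"." + d` check keeps a name such as `notcorp.example.com` from being treated as part of `corp.example.com`.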
Each zone contains the following details
VMC DNS Gotcha
The NSX-T Tier-1 gateways (MGW and CGW) only act as forwarders, relaying the queries from VMs to the actual DNS servers specified.
– On-prem DNS servers
– Local DNS servers
– Customer-managed DNS on EC2
– AWS-managed DNS using AWS Directory Service
– Route 53 Resolver inbound endpoint
Don’t use the Amazon-provided DNS server (a.k.a. the .2 resolver). It has a limit of 1,024 packets per second per network interface, and in the case of VMC that limit is shared by all the VMs running within the SDDC.
Route 53 Resolver
Route 53 Resolver DNS Firewall
The following traffic is filtered by the firewall.
DNS forwarders in Windows DNS servers
– Use public IPs provided by VMC
To use the IPs provided by VMC, request them in the VMC console and configure NAT.
For BYOIP, NLBs have to be used.
– In single-SDDC environments, bring the IPs into the AWS account that owns the connected VPC and assign them to NLBs that have VMC VMs as targets.
– In multi-region environments, create NLBs in custom VPCs and attach the VPCs to the VTGW.
Reference architecture here
Upgrades and Maintenance
VMware sends a notification email 7 days before a regular update and 2 days before an emergency update.
During upgrades, the following operations fail:
– Provisioning new VMs
– Hot or cold migrations
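For change-calendar planning, the notice periods above can be turned into dates. This is a small sketch with a hypothetical helper name and a made-up update date; it assumes the notice is sent exactly 7 or 2 calendar days ahead.

```python
from datetime import date, timedelta

# Notice periods in days, taken from the text above.
NOTICE_DAYS = {"regular": 7, "emergency": 2}


def notification_date(update_date: date, kind: str) -> date:
    """Return the day the notification email is expected for an update."""
    return update_date - timedelta(days=NOTICE_DAYS[kind])


if __name__ == "__main__":
    planned = date(2024, 3, 15)  # hypothetical update date
    print(notification_date(planned, "regular"))
    print(notification_date(planned, "emergency"))
```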
HCX is provided free to all SDDCs.
Log Insight trial version is enabled by default.
NSX Advanced Firewall
Monitoring SDDC traffic
VMC supports IPFIX and port mirroring.
Hybrid Linked Mode
Every cloud SDDC has a dedicated vCenter instance that can, if required, be linked to an on-premises vCenter instance using the Hybrid Linked Mode (HLM) feature.
Additional roles in vCenter
Integration with native AWS services
SQL Server licensing
VMware AWS account
Whenever a new cloud services Organization (Org) is created, VMware creates a sub-account within its master AWS account, which acts as the parent for all AWS resources used by that Org.
SDDC Underlay VPC
Cross-account ENIs are used to connect every host within the base cluster of the SDDC to a subnet within the cross-linked VPC.
The cross-link provides the SDDC with a network forwarding path to services maintained within the customer-owned AWS account.
The Availability Zone (AZ) of the cross-link subnet will be used to determine the AZ placement of the hosts.
vSAN encryption is enabled by default on each cluster deployed in your SDDC, and can’t be turned off.
vSAN uses AWS KMS to generate a Customer Master Key (CMK). The CMK is used to generate a key encryption key (KEK).
The KEK is used to encrypt the data encryption keys (DEKs) generated for each disk.
Uplinks of tier-0 edge router
Windows and SQL License
Additional details for the above points
Shared Responsibility Model
VMware Cloud Organization
SDDC to DX over public VIF
SDDC to DX over private VIF
- VMware Office Hours – YouTube
- VMware AWS operations guide – here
- VMC TechZone SDDC design – here
- Custom Cluster – VMware blog
- Cluster types and sizes – vmctechzone
- Elastic DRS – vmctechzone
- Cluster design – VMware
- SQL Server licensing – VMware blog
- Very important links – vmc-aws-design-guide , vmc-aws-arch-guide
- VMware Cloud: Solution Design
- VMware Cloud on AWS: SDDC Design
- vSphere Sizing