VMware vROps – Design Notes

Design vROPS

  1. VROPS Cluster Types

    Link: https://blogs.vmware.com/tam/2015/12/vrealize-operations-manager-architecture.html
  2. Analytic Nodes ( Best Practices )
    1. Deploy in the same vSphere cluster; if that is not possible, at least in the same geographical location.
    2. Same Storage type.
    3. Apply SDRS Anti-Affinity rules
    4. CPU intensive (check the vCPU:pCPU ratio)
    5. Network:
      1. Latency between analytic nodes: under 5 milliseconds; L2 adjacency recommended
      2. Bandwidth: >= 1 Gbps; 10 Gbps recommended
      3. Latency to remote collectors: under 200 milliseconds
  3. Authentication: PSC
  4. HA
    1. Cluster Management: HA (Master Replica).
    2. Cluster Nodes:
      1. vSphere HA
      2. HA Within VROPS
      3. Link: http://www.vbulosity.com/2016/01/vmware-vrops-to-ha-or-not-to-ha.html
    3. Collector Groups
  5. Number of vROps deployments across sites:
    1. http://vxpresss.blogspot.com/2014/11/part-4-vrealize-operations-manager.html
    2. http://vxpresss.blogspot.com/2015/04/part-14-can-i-deploy-deploy-vrops.html
  6. VROPS DR Architecture design: Not Recommended
    1. https://communities.vmware.com/thread/516234
  7. Scaling VROPs
    1. Scale up
      1. Supported node sizes for vCPU and vRAM
      2. Data nodes deployed in the cluster must be the same node size.
      3. https://www.definit.co.uk/2016/04/vrops-what-if-i-want-to-scale-up/
    2. Scale out: remote collectors vs. scale-out nodes. Use cases for scale-out nodes:
      1. Empowering more users with access to their metrics
      2. A significant increase in the number of collected objects and metrics
      3. http://www.vbulosity.com/2016/01/vmware-vrops-when-and-how-to-scale-your.html
    3. Remote Collectors (on both local and remote sites)
      1. http://vxpresss.blogspot.com/2016/10/did-you-know-2-leveraging-vrops-remote.html
      2. http://www.vbulosity.com/2016/01/vmware-vrops-when-and-how-to-scale-your.html
  8. Backup and Restore
    1. All nodes are backed up and restored at the same time. You cannot back up and restore individual nodes.
    2. Backup Pre-Requisites
      1. Disable Quiescing.
      2. Verify that all nodes are powered on and are accessible while the backup is taking place.
    3. Backup Guidelines
      1. Use a resolvable host name and a static IP address for all nodes.
      2. Back up the entire virtual machine. You must back up all VMDK files that are part of the virtual appliance.
      3. Do not stop the cluster while performing the backup.
      4. Do not perform a backup while dynamic threshold (DT) calculations are running, because this might lead to performance issues or loss of nodes.
    4. Restore Guidelines
      1. Power off the virtual machines in the multi-node cluster that you want to restore.
      2. Before restoring to a different host, power off virtual machines at the original location, and then bring up the environment on the new host to avoid hostname or IP conflict. Verify that the datastore on the new host has sufficient capacity for the new cluster.
      3. Verify that all VMDK files have been assigned to the same datastores.
      4. Note: When you restore vRealize Operations Manager systems by using any tool, be aware that you will need to reset the root password after the restore completes.
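
The backup prerequisites and guidelines above lend themselves to a simple pre-flight check before a backup job starts. The following is a minimal sketch in Python, not a vROps API: the `NodeState` record, the `backup_blockers` function, and the `cluster_online` and `dt_running` flags are hypothetical names, and it assumes the node and cluster state has already been gathered by whatever tooling you use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeState:
    """Hypothetical snapshot of one vROps node, filled in by your own tooling."""
    name: str
    powered_on: bool
    reachable: bool
    quiescing_enabled: bool


def backup_blockers(nodes: List[NodeState], cluster_online: bool,
                    dt_running: bool) -> List[str]:
    """Return the reasons a cluster-wide backup should not start yet."""
    problems = []
    # Guideline: do not stop the cluster while performing the backup.
    if not cluster_online:
        problems.append("cluster is stopped; keep it running during the backup")
    # Guideline: avoid backups while dynamic threshold calculations run.
    if dt_running:
        problems.append("dynamic threshold (DT) calculation is running")
    for node in nodes:
        # Prerequisite: all nodes powered on and accessible.
        if not node.powered_on:
            problems.append(f"{node.name}: powered off")
        elif not node.reachable:
            problems.append(f"{node.name}: not accessible")
        # Prerequisite: quiescing disabled.
        if node.quiescing_enabled:
            problems.append(f"{node.name}: quiescing is still enabled")
    return problems


if __name__ == "__main__":
    cluster = [
        NodeState("master", True, True, False),
        NodeState("replica", True, False, False),
    ]
    for problem in backup_blockers(cluster, cluster_online=True, dt_running=False):
        print(problem)
```

An empty result means none of the listed conditions blocks the backup; because all nodes are backed up together, a single blocker on any node should hold back the whole job.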


  1. Policies
    1. What are policies? With policies, you control what data vRealize Operations Manager collects and reports on for specific objects in your environment.
  2. Badges
    1. Images
    2. Workload: How hard an object is working (CPU, memory, disk I/O, network I/O, etc.)
    3. Anomalies: How an object is behaving currently compared to how it has behaved in the past.
    4. Faults: Configuration issues or events on the object (loss of network or HBA, HA failover event, etc.)
    5. Time Remaining: Time before the object reaches maximum capacity
    6. Capacity Remaining: Number of additional VMs the object can accommodate
    7. Stress: Long-term high workload (the Workload badge under Health shows instantaneous workload instead)
    8. Reclaimable Waste: Resources you can get back from your virtual infrastructure
    9. Density: The amount of resources that you can provision before contention or conflict for a resource occurs between objects
    10. Link: https://blogs.vmware.com/management/2014/04/david-davis-on-vcenter-operations-post-8-understanding-vcenter-operations-badges.html
  3. Alerts & Symptoms
    1. Images


  1. Upgrade VROPs to 6.6.1
    1. The upgrade path depends on your current vROps version
      1. From 6.0/6.1: Upgrade to 6.3.1 first, then from 6.3.1 to 6.6.1
      2. From 6.2-6.6: Upgrade directly to 6.6.1
    2. Upgrade steps – High level
      1. Download
        1. OS Upgrade (.pak)
        2. Product Upgrade (.pak)
      2. Take snapshots of the vROps VMs (optional)
      3. Take the cluster offline from the /admin UI
      4. Upgrade OS
        1. After the upgrade, the cluster comes back online
      5. Upgrade Product
      6. Link: https://communities.vmware.com/docs/DOC-35610
    3. What happens to default content
      1. Reset / Not Reset: http://vxpresss.blogspot.com/2016/02/reset-out-of-box-content-during-vrops.html
      2. Best Practice
        1. Never customize default content; clone alerts, symptoms, recommendations, and policies, then customize the clones.
  2. Remote Collectors
    1. Add Collector: https://stevehegarty.wordpress.com/2016/12/15/configure-vr-ops-remote-collectors/
    2. Collector Group : http://pubs.vmware.com/vmware-validated-design-40/index.jsp#com.vmware.vvd.sddc-deploya.doc/GUID-503B14A9-9CA2-4834-8D25-04E5C84086BF.html
    3. Assign instance to collector/collector group
  3. Management Packs
    1. Install management packs
      1. http://pubs.vmware.com/vmware-validated-design-40/index.jsp#com.vmware.vvd.sddc-deploya.doc/GUID-A9F77AB9-01E7-4BCA-A417-EA2F5A388C66.html
    2. Upgrade management packs (https://vrops-master/ -> Administration -> Solutions)
      1. Same process as upgrading vROps, but the URL is different.
  4. vSphere Predictive DRS
    1. vSphere Cluster -> Enable Predictive DRS
    2. VROPS -> vCenter Adapter vCenter Instance -> Provide Data to vSphere predictive DRS
  5. VRA – Health Badges in VRA : Tenant -> Metric Provider Configuration -> VRA/VROPs -> Select VROPS
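
The version-dependent upgrade path listed under "Upgrade VROPs to 6.6.1" above can be written down as a small lookup. This is a rough sketch in Python that encodes only the two rules from these notes (6.0/6.1 go through 6.3.1; 6.2 through 6.6 go straight to 6.6.1). The function name `upgrade_path` is hypothetical, and treating anything at or above 6.6.1 as "nothing to do" is an assumption, as is rejecting versions older than 6.0.

```python
from typing import List


def upgrade_path(current: str) -> List[str]:
    """Return the ordered list of versions to install to reach 6.6.1."""
    parts = tuple(int(p) for p in current.split("."))
    major_minor = parts[:2]

    # Assumption: already at the target (or newer), so no upgrade step is needed.
    if parts >= (6, 6, 1):
        return []
    # From 6.0/6.1: go to 6.3.1 first, then to 6.6.1.
    if major_minor in {(6, 0), (6, 1)}:
        return ["6.3.1", "6.6.1"]
    # From 6.2 through 6.6: upgrade directly to 6.6.1.
    if (6, 2) <= major_minor <= (6, 6):
        return ["6.6.1"]
    # Anything older is outside what these notes cover.
    raise ValueError(f"no upgrade path recorded for version {current}")


if __name__ == "__main__":
    for version in ("6.0", "6.3.1", "6.6"):
        print(version, "->", " -> ".join(upgrade_path(version)))
```

Each version in the returned list corresponds to one full pass through the high-level steps (OS .pak, then product .pak), so a 6.0 or 6.1 cluster goes through the procedure twice.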

Interesting Use Cases

  1. Auto Scaling for Private Cloud: VRA, VRO, VROPS, NSX, Webhook Shims
  2. ServiceNow Integration: vRealize + Webhooks + ServiceNow
  3. Alert Remediation using VRO Workflow: https://blogs.vmware.com/management/2015/04/extending-vrealize-operations-actions-vrealize-orchestrator-solution-workflow-package.html

