vSphere 5 Cloud Infrastructure Launch Reflection

It's been almost two weeks since the webcast announcing vSphere 5, and I feel (as do others in the VMware community) that the major product enhancements are being overshadowed by all the concerns around the new licensing model that VMware has come up with.

VMware saw that, with Intel and AMD chipsets going to 10 or 12 cores (and more), the current licensing model would not scale correctly for private and public clouds, keep cost centers in check, or prevent "virtual server sprawl". We've all seen it in our environments: those quality assurance, development and staging servers that seem to spin off to become "isolated" platforms for other areas of the business, only to sit out there unused for days, months or even years. This is not to say that they aren't needed or valuable; it's just that it isn't practical to keep them powered on and chewing up valuable resources in the cluster(s).

To review: you are licensed on the memory that is allocated to your virtual machines, not the total amount of physical memory in your environment. Many engineers design N+1 or N+2 clustering into their environments to account for node failures and maintain uptime. The memory that sits idle in these types of scenarios is not counted as allocated and is therefore not included in your vSphere 5 licensing cost.

Licensing will still be per processor, as the foundation for how the vRAM entitlements are provisioned. In other words, when you buy a license, you are given the "right" to a pool of vRAM that can then be shoveled out to virtual machines.
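To make the pooling concrete, here is a hypothetical back-of-the-envelope calculation. The per-license vRAM entitlement below is an assumption for illustration only; the real number depends on which vSphere 5 edition you buy, so check your own entitlement.

```shell
# Hypothetical vRAM pool for a 4-host, dual-socket cluster.
# ENTITLEMENT_GB is an assumed per-CPU-license figure, NOT an official number.
HOSTS=4
SOCKETS_PER_HOST=2
ENTITLEMENT_GB=48

LICENSES=$((HOSTS * SOCKETS_PER_HOST))   # one license per populated socket
POOL_GB=$((LICENSES * ENTITLEMENT_GB))   # pooled vRAM entitlement across the cluster

echo "CPU licenses: ${LICENSES}"
echo "Pooled vRAM:  ${POOL_GB} GB"
```

As long as the vRAM allocated to powered-on VMs stays under the pooled figure, no additional licenses are needed.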

Anyway, the point of this post is to review some of the key features introduced with this new release. The main point VMware is trying to hit home is that customers can now look to the virtual infrastructure to host 100% of their workloads. The fact that VMs can now support 1TB of memory and 32 vCPUs, and can achieve 1 MILLION IOPS, makes for some very compelling reasons to put your tier 1 workloads in VMware.

Another area of focus is the policy management enhancements, which provide a new level of operational service in line with vCloud Director. The main bullet points here are:

  • Self-provisioning with "Auto Deploy", which will be accompanied by a patch process that fully automates server maintenance.
  • Storage that is "Profile Driven", meaning that a service level is assigned to each deployment, which falls into the category that best suits its needs.
  • Distributed Resource Scheduling for storage! -or- Storage DRS for short. VMware has taken DRS one step further and applied an analysis process to the storage layer of the virtual machines to determine where the application would be best served in the environment. This also falls in line with the goals of vCloud Director.

ESXi Syslog Redirection and Log File Locations

The reduced footprint of an ESXi install makes it easier to manage from a file system perspective, but there are a number of best practices when it comes to managing this file structure. One is the location of the syslog files, and another is simply knowing where to find some of the critical log files. There seem to be multiple sources for this information, and I thought it would make a great blog post to get it all on one page.

With the move from ESX to ESXi in all future releases of vSphere, there are a few daemons that you should redirect on your cluster nodes, especially if you are going to use SD or USB media to boot ESXi. This will prevent ESXi from constantly writing to these solid state devices, thus increasing the longevity of the drives.

One of the chattiest daemons on ESXi is the syslogd service, which logs VMkernel messages as well as other system-level components. There are a number of ways to redirect the logging to another location, and VMware does recommend that you ship these logs off the host.

1. Use the vSphere Client, connecting either directly to the node or through vCenter, and go to the Configuration tab > Advanced Settings to set the Syslog parameters (the remote host and port).

2. Use PowerCLI or, better yet, the vCLI in the vMA virtual appliance (see my blog post on vMA for install tips) via a Perl script called "vicfg-syslog" that just so happens to be installed in it.

This can be done by running the following command:

#vicfg-syslog.pl --server <ESXi Host> --username root --password <password> --setserver x.x.x.x --setport 514

To verify the redirection is there:

#vicfg-syslog.pl --server <ESXi Host> --username root --password <password> --show

Gotcha: As of VMware vCenter Server 4.1, host profiles are not suitable for redirecting syslog paths (or scratch locations, for that matter), since Host Profiles filter out advanced settings that could be host specific, such as Syslog.Local.DatastorePath, ScratchConfig.ConfiguredScratchLocation and ScratchConfig.CurrentScratchLocation. I will cover Host Profiles in a separate post, and this information will definitely be in there!

A few things to remember about syslog redirection and log files:

  • The syslog daemon will only redirect the logs to one remote address.
  • You cannot use the esxcfg-advcfg command to redirect the syslog.
  • You can remove the redirection on the host by re-issuing the vicfg-syslog.pl command and setting the remote server to a null string.
  • The sysboot.log file was first introduced in ESXi 4.0.
  • Log rotation or removal happens only during a restart of the node.
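As an example of the null-string removal mentioned above, you can re-issue vicfg-syslog with an empty server value (the host name below is a placeholder):

```shell
# Clear the remote syslog target by setting it back to a null string
vicfg-syslog.pl --server esx01.domain.com --username root --password <password> --setserver ""

# Confirm the redirection is gone
vicfg-syslog.pl --server esx01.domain.com --username root --password <password> --show
```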

Scratch partition for log files

As of ESXi 4.0 Update 1, the file system has a scratch partition (recommended, but not required) for log files. This location is used for storing system-level temporary logs, the system swap and other diagnostic data. The benefit of having this partition is the ability to retain core dumps (PSODs) and other useful system logs to help diagnose problems (many of these being memory issues).

A few things to remember about the scratch partition:

  • If the partition is gone or was deleted, you can simply recreate the directory.
  • It may be provisioned on VMFS or even FAT16 partitions.
  • If there is no scratch partition, ESXi will store this data in a RAMDISK, which has significant space limitations.
  • A scratch location is automatically created during installation.
  • Do NOT share a scratch partition between ESXi hosts.
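Since Host Profiles won't carry the scratch setting (see the gotcha above), one way to point scratch at persistent storage is with the vicfg-advcfg script from the vMA. This is a sketch; the host name and datastore path are placeholders, and the host needs a reboot for the change to take effect:

```shell
# Point the scratch location at a per-host folder on shared VMFS storage.
# Use a UNIQUE folder per host - scratch must never be shared between hosts.
vicfg-advcfg.pl --server esx01.domain.com --username root --password <password> \
  --set /vmfs/volumes/datastore1/.locker-esx01 ScratchConfig.ConfiguredScratchLocation

# Verify the setting, then reboot the host
vicfg-advcfg.pl --server esx01.domain.com --username root --password <password> \
  --get ScratchConfig.ConfiguredScratchLocation
```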

Log File Locations:

Logging Type                          Location
VMkernel / vmkwarning / hostd         /var/log/messages
System Boot Log                       /var/log/sysboot.log (ESXi 4.x)
vCenter Agent (vpxa)                  /var/log/vmware/vpx/vpxa.log
Automatic Availability Manager (aam)  /var/log/vmware/aam/vmware_<host>-xxx.log
Host Management Service (hostd)       /var/log/vmware/hostd.log


I've also heard many people talk about turning the vMA into a log host, and Simon Long has a great write-up about it. But without the ability to parse the massive amount of data that gets shipped there, it is really hard to locate node- and time-frame-specific data. I would suggest a product such as Splunk, which has a really nice interface for locating data in a timescale format.



vCloud Director (vCD) Architecture

With all the really cool and intricate discussions around this product, I thought that now would be a good time to take a step back and look at how it is actually implemented. Terminology is abbreviated to save space, so you will see acronyms such as vCD (which stands for VMware vCloud Director), etc.

Architecture Overview

Think of the product in two distinct layers. At the core are the vSphere cluster nodes and vCenter, coupled with vShield Manager. This is the "foundation" (if you will) that provides services to each Cloud Director service host (commonly referred to as a "cell"). I say each because you can only allocate one cell for each vCenter Server. The second layer is comprised of the VMware Cloud Director server hosts that make up the individual "cells". Each cell operates off a central vCD database (which, as of this posting, needs to be on Oracle) that resides at this layer. Together, the cells form the VMware Cloud Director cluster.

Diagram 1


vCD Database

Each and every cell in a vCD cluster shares information through the database, and each one needs a minimum of 75 database connections per cell, with an additional 50 connections for Oracle itself.

Database sizing formula: 75 * (number of cells) + 50
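As a quick worked example of the formula (the cell count here is hypothetical):

```shell
# Oracle connection sizing for a hypothetical 4-cell vCD cluster:
# 75 connections per cell plus 50 for Oracle itself
CELLS=4
CONNECTIONS=$((75 * CELLS + 50))
echo "Minimum Oracle connections: ${CONNECTIONS}"
```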

vCD Database Guidelines:

  • Do not use the Oracle system account as the Cloud Director database user account.
  • Oracle must be at 10g Std. or Ent. Ed. Rel. 2 (10.2.0.x) -or- 11g Std. or Ent. Ed. (11.1.0.x)
  • A database server configured with 16GB of memory, 100GB storage, and 4 CPUs should be adequate for most Cloud Director clusters.
  • Verify that the database service starts automatically when the database server is rebooted.

Cloud Director Software

As stated above, each host must have the Cloud Director software installed on it to manage the cell. The only supported platform to date is Red Hat Enterprise Linux 5 (Update 4 or 5), and it must be 64-bit. This is usually a VM in the cluster with 2GB of memory and multiple vCPUs assigned to it. Most standard builds of RHEL will have adequate space for the installation and log files. DNS is another critical component, and forward and reverse FQDN lookups must work for the host. Issue these on your RHEL box:

#nslookup <cloudhost>

#nslookup <cloudhost>.domain.com

And to test the reverse lookup:

#nslookup x.x.x.x (where x.x.x.x = the host's IP address)

Additionally, each host must have 2 IP addresses and 2 SSL certificates assigned to it, and each host needs to mount the shared transfer server storage at $VCLOUD_HOME/data/transfer (this volume must be writable by root).
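A sketch of mounting the shared transfer storage on a cell follows. The NFS server, export name and the /opt/vmware/cloud-director path for $VCLOUD_HOME are assumptions for illustration; adjust them for your environment:

```shell
# Mount shared transfer storage on a vCD cell (server and paths are placeholders)
mkdir -p /opt/vmware/cloud-director/data/transfer
mount -t nfs nfshost.domain.com:/exports/vcd_transfer /opt/vmware/cloud-director/data/transfer

# Confirm root can write to the volume, as required
touch /opt/vmware/cloud-director/data/transfer/.rw_test && \
  rm /opt/vmware/cloud-director/data/transfer/.rw_test
```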

Logging into the vCD console is done at https://cloudhost.domain.com/cloud (where cloudhost = your server name and domain.com = your domain name).

vCD Firewall Ports

As with any management portal, protect it from the internet with a firewall of some sort. To allow external management from outside your organization, you only need to allow port 443 (HTTPS) through. For connections internally, or within a vCD cluster, the following ports are used:


Incoming connections to the vCD cells:

Port    Type        Description
111     TCP & UDP   NFS portmapper for transfer service
920     TCP & UDP   NFS rpc.statd for transfer service
61611   TCP         ActiveMQ
61616   TCP         ActiveMQ


Outgoing connections from the vCD cells:

Port       Type        Description
111        TCP & UDP   NFS portmapper
443        TCP         vCenter and ESXi connections
514        UDP         Syslog
902 & 903  TCP         vCenter and ESXi connections
920        TCP & UDP   NFS rpc.statd for transfer service
1521       TCP         Oracle database
61611      TCP         ActiveMQ
61616      TCP         ActiveMQ
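If you are running iptables on the RHEL cells, a minimal sketch of opening these ports might look like the following. The source subnet is an assumption standing in for your management network; tighten it (and add matching rules on any intermediate firewalls) to fit your design:

```shell
# Allow intra-cluster vCD traffic (replace 10.0.0.0/24 with your management subnet)
iptables -A INPUT -s 10.0.0.0/24 -p tcp -m multiport --dports 111,920,61611,61616 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p udp -m multiport --dports 111,920 -j ACCEPT

# Allow HTTPS to the portal itself
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
```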


Web Administration Browsers

  • Microsoft Internet Explorer (with the exception of IE7 on Win7 32bit or 64bit)
  • The Cloud Director Web Console requires Adobe Flash Player version 10.1 or later
  • Cloud Director requires SSL – versions include SSL 3.0 and TLS 1.0. (more on SSL in the vCD SSL section)

vShield Manager for vCD

  • Each Cloud Director cluster requires access to a vShield Manager host, which in turn provides network services to the Cloud (refer to diagram 1)
  • You must have a unique instance of vShield Manager for each vCenter Server you add to Cloud Director
  • vCenter and ESX/ESXi must be at least version 4.0 U2 (Build 264050 for vCenter and 261974 for ESX/ESXi) or higher.
  • vShield Manager must be at 4.1 (Build 287872)

Deployment Steps for vShield Manager:

  1. Download the OVF template
  2. Deploy the OVF template into your cluster (remember, each vCenter Server needs its own vShield Manager!)
  3. Power up the appliance and log in (User: admin & Pass: default)
  4. At the prompt, type “enable” (i.e. manager# enable) – the setup process will begin.
  5. Enter the IP, Subnet and Default Gateway for vShield Manager.
  6. Reboot!

Noteworthy: There is no need to synchronize vShield Manager with vCenter or register the vShield Manager as a vSphere Client plug-in when using vShield Manager with Cloud Director.


vCD SSL

Cloud Director requires the use of SSL to secure communications between clients and servers. You must create two certificates for each member of the cluster and import them into the host keystores. You need to run this procedure for each host that you intend to use in your Cloud Director cluster!

  • The Cloud Director installer places a copy of keytool in /opt/vmware/cloud-director/jre/bin/keytool
  • You can use signed certificates (by a trusted certification authority) or self-signed certificates (most private cloud implementations)
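For a self-signed setup, the bundled keytool can generate both certificates. The keystore name, store password and alias names below are illustrative examples, not requirements, so verify them against the vCD install documentation:

```shell
# Create a JCEKS keystore holding self-signed certs for the two cell services
# (keystore path, password and aliases are example values)
KEYTOOL=/opt/vmware/cloud-director/jre/bin/keytool

$KEYTOOL -keystore certificates.ks -storetype JCEKS -storepass passwd \
  -genkey -keyalg RSA -alias http
$KEYTOOL -keystore certificates.ks -storetype JCEKS -storepass passwd \
  -genkey -keyalg RSA -alias consoleproxy

# List the keystore contents to verify both entries exist
$KEYTOOL -keystore certificates.ks -storetype JCEKS -storepass passwd -list
```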

vCD Network Configurations

The network configuration for vCD is comprised of undifferentiated pools, which are used in turn to create vApp networks and the various types of organization networks. At the core are vSphere's network resources: VLANs, port groups and isolated network segments. vCD takes these network resource pools and creates the routed NAT configurations, internal organization segments and all of the vApp networks. Each organization (within Cloud Director) can have only one network pool, but multiple organizations can share the same pool.

There are basically three types of organizational networks:

  1. Direct Connect (External Organization Network)
  2. NAT or Routed (External Organization Network)
  3. Internal Organization Network

General Guidelines of vCD Deployments

  • The database must be configured to use the AL16UTF16 character set.
  • The Cloud Director software is installed on each server host, which is then connected to the shared database.
  • A network pool "resource" must be created prior to the build-out of organization or vApp networks; without one, only the direct connect option to the provider network will be available.
  • Each host should have access to a Microsoft Sysprep deployment package.
  • Network time is critical! Make sure that all Cloud Director hosts are synchronized, since the maximum allowable drift is 2 SECONDS!
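Given that 2-second drift limit, one straightforward way to keep the RHEL 5 cells in sync is plain ntpd (the NTP server name below is a placeholder for your own time source):

```shell
# One-time sync against your time source, then keep ntpd running across reboots
ntpdate ntp.domain.com
service ntpd start
chkconfig ntpd on

# Spot-check peer offsets afterward
ntpq -p
```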