Alan ( 2 years ago )
Thanks! I can't say I see too many Proxmox deployments. I'll have to check it out. The reason I chose ESXi for my own lab is its wide deployment in the enterprise. Now, I am not saying that this is the only hypervisor out there, but it does have a great market share. Deploying ESXi not only gave me a nice lab but also experience with a handful of VMware products.
Erika ( 2 years ago )
Hi GS

Thanks for the great question, and one (as you might expect) with potentially several answers depending on the implementation, i.e. whether you are using standard vSwitches / vDS, Nexus 1000V or VM-FEX. Let's take the most common implementation I tend to do, which is vSphere using standard vSwitches.

OK, that's narrowed us down, but it is still a lengthy topic, so I'll concentrate on the Cisco UCS specific aspects and not so much on the standard VMware config, I/O control etc., which is equally relevant whatever platform is used and which I'm sure you are familiar with.

So the first question I tend to address with customers is how they want their hosts' networking to look. What I mean by that is, the client may well have a networking standard for their ESXi hosts or want to use their standard host templates, which is fine. But Cisco UCS does have some nice features which could greatly simplify the hosts' networking. These are features that you may well already be aware of, like hardware fabric failover, where you can present a single vNIC to the OS / hypervisor and that vNIC is backed by hardware fabric failover, i.e. if there is any break in the traffic path on the primary fabric that the vNIC is mapped to, then UCS Manager will immediately switch the data path to the other fabric, without the OS ever seeing the vNIC go down. This, as you may have guessed, could potentially halve the number of network interfaces in your hosts (i.e.
you could potentially leave out all the uplink interfaces which are there purely for redundancy), and you can salt and pepper the remaining single vNICs to be mapped primarily to Fabric A and Fabric B to provide load balancing across both fabrics.

The potential situation to be aware of here, though, is this: if a VM which has its active traffic flow via an uplink mapped to Fabric A is communicating with a VM whose traffic flow is mapped via Fabric B, then that flow has to be forwarded beyond the Fabric Interconnects to the upstream LAN switches to be switched at Layer 2 between fabrics, even if both VMs are on the same VLAN.

So what I tend to do is use a mixture of both: single vNICs backed by hardware fabric failover, and dual teamed vNICs for vSwitch uplinks which I would like to load balance across both fabrics.

But let's assume the customer wants to retain their physical host networking standard, so vSphere admins have a consistent view and config for all hosts whatever platform they are hosted on.

So a typical ESXi host would look something like:

2 x teamed vNICs for the Management vSwitch
    eth 0 mapped to fabric A
    eth 1 mapped to fabric B
1 x vNIC for VMware user port groups uplinking to a dVS
    eth 2 mapped to fabric A
1 x vNIC with Fabric Failover enabled for vMotion
    eth 3 mapped to fabric B

Of course you can add other vNICs if you have more networking requirements or require more than a simple port-group (802.1q tag) separation, i.e. add in an iSCSI vSwitch, Backup vSwitch etc. So the setup would look something like this.

The reason I go with a single fabric failover vNIC for vMotion is the potential issue pointed out above: if I had 2 vNIC uplinks to my vMotion vSwitch and were using them in an Active/Active team for redundancy and load balancing, I would map one to fabric A and one to fabric B, and that could mean that vMotion traffic takes a very suboptimal route across the network, i.e. having to go via the upstream switches.
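The fabric-mapping reasoning so far can be sketched in a few lines of Python. This is only an illustrative model, not anything UCS Manager or vSphere exposes: the `VNic` class, the `switching_point` and `active_fabric` helpers and the `eth0`..`eth3` names are made up for the example, and it assumes end-host mode with both endpoints inside the same UCS domain and on the same VLAN.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VNic:
    """Hypothetical model of a UCS vNIC as seen in a service profile."""
    name: str
    fabric: str                    # "A" or "B": fabric the vNIC is primarily mapped to
    fabric_failover: bool = False  # hardware fabric failover enabled?


# The example host layout from the list above (names are illustrative).
HOST_VNICS = [
    VNic("eth0", "A"),                        # Management, teamed
    VNic("eth1", "B"),                        # Management, teamed
    VNic("eth2", "A"),                        # VM port groups (dVS uplink)
    VNic("eth3", "B", fabric_failover=True),  # vMotion
]


def switching_point(src: VNic, dst: VNic) -> str:
    """Where a same-VLAN Layer 2 flow between two vNICs gets switched.

    Assumption: same fabric means the Fabric Interconnect switches it locally;
    different fabrics means the flow has to go via the upstream LAN.
    """
    return "fabric-interconnect" if src.fabric == dst.fabric else "upstream-lan"


def active_fabric(vnic: VNic, failed_fabric: Optional[str] = None) -> Optional[str]:
    """Fabric carrying the vNIC's traffic, or None if the OS sees the vNIC down."""
    if failed_fabric != vnic.fabric:
        return vnic.fabric
    if vnic.fabric_failover:
        return "B" if vnic.fabric == "A" else "A"
    return None  # no fabric failover: OS-level teaming has to cope


if __name__ == "__main__":
    eth0, eth1, eth2, eth3 = HOST_VNICS
    print(switching_point(eth0, eth2))             # both on fabric A
    print(switching_point(eth0, eth1))             # A to B crosses fabrics
    print(active_fabric(eth3, failed_fabric="B"))  # vMotion vNIC fails over
    print(active_fabric(eth1, failed_fabric="B"))  # plain vNIC goes down
```

The point of the model is simply that the fabric a vNIC is pinned to decides whether East/West traffic stays inside the Fabric Interconnect.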
So by using only 1 vNIC and mapping it to a single fabric, all my East/West vMotion traffic will be locally switched within the Fabric Interconnect and will not have to be sent to the upstream LAN at all. And in the event of a failure within the primary fabric path, UCS would switch this traffic, transparently to the ESXi host, to the other fabric, which would again locally switch all vMotion traffic.

Also important to note: when teaming the vNICs within vSphere, use Port-ID as the hash. This is to prevent hosts flapping between fabrics in the eyes of the upstream LAN switches.

OK, once the above is set up you do have the option of mapping UCS QoS policies to each of the above vNICs within UCS Manager (by default all traffic is placed in a best-effort policy). As a standard I generally set a 1 Gbps reservation for the vMotion vNICs and leave the others as default, bearing in mind that these are multiple 10 Gbps links and the QoS would only kick in in the event of congestion.

NB: FCoE traffic is inherently prioritised within the 802.1Qbb Priority-based Flow Control standard, a subcomponent of the Data Center Bridging (DCB) standard, which Cisco UCS inherently uses between the Mezz card on the blade and the Fabric Interconnect.

OK, so with regards to northbound load balancing: as you may know, when you create the vNIC within the Mezz card, what you are actually creating is a vEth port within the Fabric Interconnect, as the Mezz card (Cisco VIC) is an adapter Fabric Extender.

So when you create your teamed pair of vNICs within vSphere, that will only get your load-balanced traffic to the Fabric Interconnects. Now, assuming you are running your Fabric Interconnects in the default end-host mode (where the FIs appear to the upstream LAN as a big server), the FIs obviously need load-balancing uplinks into the LAN. Now for redundancy you will likely have a pair of LAN switches, hopefully capable of running a multi-chassis Ethernet service like Nexus vPC or Catalyst VSS.
If that's the case, you just size your uplinks to what you want, dual connect your FIs to the upstream switch pair and channel them at both ends (standard LACP), as shown below.

The end-to-end result is that load balancing is done safely and optimally, and East/West traffic is maintained within the UCS infrastructure as much as possible.

Hope that answers your question. If not, fire back at me; after all, us Gurus need to stick together.

Regards
Colin
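As a footnote to the Port-ID teaming advice in the reply above, here is a rough Python sketch of why that policy keeps a host from flapping between fabrics. It is a simplified model under stated assumptions, not the actual vSphere implementation: uplink selection is modelled as the virtual port ID modulo the number of uplinks, and the per-flow policy it is compared against is a made-up stand-in for any hash that varies with the destination. All function names are hypothetical.

```python
from typing import List, Set

UPLINK_FABRICS = ["A", "B"]  # one teamed vNIC uplink per fabric


def uplink_by_port_id(port_id: int, uplinks: List[str]) -> str:
    """Simplified model of Port-ID teaming: one fixed uplink per virtual port."""
    return uplinks[port_id % len(uplinks)]


def uplink_by_flow(port_id: int, dst_id: int, uplinks: List[str]) -> str:
    """Hypothetical per-flow hash: the uplink depends on the destination too."""
    return uplinks[(port_id + dst_id) % len(uplinks)]


def fabrics_used(port_id: int, destinations: List[int], per_flow: bool) -> Set[str]:
    """Set of fabrics on which this port's source MAC would show up."""
    if per_flow:
        return {uplink_by_flow(port_id, d, UPLINK_FABRICS) for d in destinations}
    return {uplink_by_port_id(port_id, UPLINK_FABRICS) for _ in destinations}


if __name__ == "__main__":
    dests = [1, 2, 3, 4]
    # Port-ID policy: the MAC behind virtual port 7 stays on a single fabric.
    print(fabrics_used(7, dests, per_flow=False))
    # Per-flow policy: the same MAC can appear on both fabrics, which is what
    # the upstream LAN switches would see as flapping.
    print(fabrics_used(7, dests, per_flow=True))
```

With a fixed uplink per virtual port, each source MAC is only ever learned via one fabric, which is the behaviour the Port-ID recommendation relies on.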