Start vCenter with Connectivity from ESX when DVS is Not Working

When working with virtual environments like VMware, one of the crucial components for management is vCenter, which acts as the brain of the virtual data center. Everything works wonderfully until, for some reason, we lose network connectivity in vCenter, and this is where things can get really complicated, especially if you are using Distributed Virtual Switches (DVS). In this article, I will guide you step by step on how to recover your vCenter and bring it back to life, even when all seems lost.

What Happens When vCenter Loses Network Connectivity?

You are most likely facing this problem if the following conditions and symptoms apply:

  • The management network only exists on a DVS.
  • There are no ephemeral ports configured on the cluster.
  • vCenter loses network connectivity after an unplanned or planned outage.
  • You cannot reconnect vCenter to a DVS port group on the same hosts or on different ones.
  • You cannot open the vSphere client of vCenter to make changes to the network because the network connection of vCenter is down.

When you try to modify the network configuration on any ESXi host, or if you want to change the network adapters for an ESXi host connected to a DVS with non-ephemeral ports, you will encounter the following error:
“Adding or reconfiguring network adapters connected to non-ephemeral virtual distributed port groups is not supported.”

Causes of the Problem

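vCenter is the control plane of a Distributed Switch: ports in non-ephemeral (static binding) port groups can only be assigned through vCenter. If the vCenter virtual machine is itself connected to such a port group and loses its network connection, it can no longer reach the ESXi hosts, so nothing is left that can assign it a distributed port.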
VMware recommends configuring ephemeral ports for the management network in your environment to prevent this problem from occurring again.

Impact and Risks

You should have at least 2 vmnics used for the management network because in one of the steps we will remove a vmnic from the DVS management port group to be able to use it for the temporary Standard Switch.
WARNING: If the vmnics are in an LACP configuration, you will need to break it on the physical switch to avoid downtime. Refer to the VMware KB on working with LACP configurations for the steps.

If you do not have 2 vmnics on the ESXi host, it is recommended that you follow these steps through the DCUI Shell. Otherwise, you will lose access to SSH when you run the vmnic removal command and will not be able to continue with the process.
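
Before touching any uplink, it can help to confirm how many vmnics the host actually has on the DVS. The sketch below counts the "Client: vmnic" entries in the same listing that Step 1 relies on. The sample text is illustrative (the vmnic0 entry on port 11 is made up) and count_dvs_uplinks is a hypothetical helper name. On a real host you would pipe the output of esxcli network vswitch dvs vmware list into it instead. Note that it counts uplinks on every DVS of the host, not only the one carrying management traffic.

```shell
# Illustrative listing in the format shown in this article.
# The vmnic0 entry is invented for the example.
sample_listing='Client: vmnic0
DVPortgroup ID: dvportgroup-5008
In Use: true
Port ID: 11
Client: vmnic1
DVPortgroup ID: dvportgroup-5008
In Use: true
Port ID: 12'

# count_dvs_uplinks (hypothetical helper): reads the listing on stdin
# and prints how many physical uplinks (vmnics) appear as DVS clients.
count_dvs_uplinks() {
  grep -c 'Client: vmnic'
}

uplink_count=$(printf '%s\n' "$sample_listing" | count_dvs_uplinks)

if [ "$uplink_count" -ge 2 ]; then
  echo "Found $uplink_count uplinks: one can be moved while the other carries management traffic."
else
  echo "Found $uplink_count uplink(s): work from the DCUI shell, not SSH."
fi
```

With a single uplink the script only prints the warning; it does not change anything on the host.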

Step by Step Solution

Step 1: Remove a vmnic located in the DVS connected to the Management Network
Identify the DVS port ID to which the vmnic you want to remove is connected (replace vmnic# with the actual vmnic name):

# esxcli network vswitch dvs vmware list | egrep "Client: vmnic#" -A3

The output will be similar to:

# esxcli network vswitch dvs vmware list | egrep "Client: vmnic1" -A3
Client: vmnic1
DVPortgroup ID: dvportgroup-5008
In Use: true
Port ID: 12

Remove the vmnic:

# esxcfg-vswitch -Q vmnic# -V PortID DVSName

Example using vmnic1, Port ID 12, and DVS Name ProdSwitchDVS:

# esxcfg-vswitch -Q vmnic1 -V 12 ProdSwitchDVS
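
The two commands in this step can be chained so that the Port ID does not have to be copied by hand. This is a minimal sketch, assuming the listing has the format shown in the example output above. port_id_for is a hypothetical helper, and the script only prints the removal command for review instead of running it.

```shell
# Sample listing copied from the example output above.
# On a real host, replace it with: esxcli network vswitch dvs vmware list
sample_listing='Client: vmnic1
DVPortgroup ID: dvportgroup-5008
In Use: true
Port ID: 12'

# port_id_for (hypothetical helper): $1 = vmnic name, listing on stdin.
# Prints the Port ID that follows the matching "Client:" line.
port_id_for() {
  awk -v nic="$1" '
    $0 ~ ("Client: " nic "[ \t]*$") { found = 1; next }
    found && /Port ID:/             { print $NF; exit }
  '
}

VMNIC="vmnic1"
DVS_NAME="ProdSwitchDVS"   # DVS name used in the example above
PORT_ID=$(printf '%s\n' "$sample_listing" | port_id_for "$VMNIC")

if [ -z "$PORT_ID" ]; then
  echo "No DVS port found for $VMNIC" >&2
else
  # Build and print the command for review rather than executing it.
  REMOVE_CMD="esxcfg-vswitch -Q $VMNIC -V $PORT_ID $DVS_NAME"
  echo "$REMOVE_CMD"
fi
```

Printing the command first gives you a chance to double-check the vmnic and port before cutting the uplink, which matters if you are connected over SSH.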

Step 2: Create a Standard Switch, a Portgroup, and Add the vmnic to the Standard Switch

Create a Standard switch:

# esxcli network vswitch standard add --vswitch-name=vSwitchName

Create a Portgroup:

# esxcli network vswitch standard portgroup add --portgroup-name=PortgroupName --vswitch-name=vSwitchName

Add a vmnic to the Standard Switch:

# esxcli network vswitch standard uplink add --uplink-name=vmnic# --vswitch-name=vSwitchName
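
To make the placeholders concrete, here is a sketch of the three commands with sample values filled in. The names vSwitchRecovery and vCenterRecovery are hypothetical, and the run wrapper with its DRY_RUN variable is an assumption of this example, not part of esxcli. By default the script only prints the commands, so it is safe to try anywhere.

```shell
# Hypothetical names for illustration; substitute your own.
VSWITCH="vSwitchRecovery"
PORTGROUP="vCenterRecovery"
VMNIC="vmnic1"            # the uplink freed from the DVS in Step 1

# DRY_RUN=1 (default) only prints the commands.
# Set DRY_RUN=0 on the ESXi host to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Same order as above: switch first, then port group, then uplink.
PLAN=$(
  run esxcli network vswitch standard add --vswitch-name="$VSWITCH"
  run esxcli network vswitch standard portgroup add --portgroup-name="$PORTGROUP" --vswitch-name="$VSWITCH"
  run esxcli network vswitch standard uplink add --uplink-name="$VMNIC" --vswitch-name="$VSWITCH"
)
echo "$PLAN"
```

The order matters: the port group and the uplink both refer to the switch by name, so the switch has to exist before either of them is added.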

Step 3: Recover Network Connectivity of the vCenter Virtual Machine

First, we will connect the vCenter virtual machine to the newly created Portgroup of the Standard Switch. This will help to recover access to vCenter’s network, allowing the ESXi hosts to reconnect to the vCenter server, and you will be able to manage your infrastructure again.

  • Log in to the web client (Host Client) of the ESXi host where the vCenter virtual machine runs, using administrator credentials.
  • Go to “Virtual Machines”.
  • Select the vCenter virtual machine.
  • Click “Actions” > “Edit Settings”.
  • Connect Network Adapter 1 to the newly created Portgroup of the Standard Switch.
  • Click Save.

At this point, you should have recovered the network connectivity of vCenter and you should now be able to connect to its vSphere client. If you still can’t, make sure that the Portgroup of the Standard Switch has the correct VLAN and MTU configuration.
Once you have verified that everything is fine in your vCenter inventory, return the vmnic and the vCenter virtual machine to the DVS so that the configuration matches what you had before the outage.
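
If the VLAN or MTU of the temporary Standard Switch turns out to be wrong, both can be set from the ESXi shell. The sketch below builds the corresponding esxcli commands and prints them for review. The switch and port group names, VLAN 100, and MTU 1500 are hypothetical values chosen for illustration; use the values of your own management network.

```shell
# Hypothetical values for illustration.
VSWITCH="vSwitchRecovery"
PORTGROUP="vCenterRecovery"
VLAN_ID=100
MTU=1500

# Basic sanity check: valid VLAN IDs range from 0 to 4095.
if [ "$VLAN_ID" -lt 0 ] || [ "$VLAN_ID" -gt 4095 ]; then
  echo "Invalid VLAN ID: $VLAN_ID" >&2
else
  # The VLAN is a property of the port group, the MTU a property of the switch.
  SET_VLAN="esxcli network vswitch standard portgroup set --portgroup-name=$PORTGROUP --vlan-id=$VLAN_ID"
  SET_MTU="esxcli network vswitch standard set --vswitch-name=$VSWITCH --mtu=$MTU"
  # Listing the port groups afterwards shows the VLAN that is in effect.
  SHOW_PG="esxcli network vswitch standard portgroup list"

  echo "$SET_VLAN"
  echo "$SET_MTU"
  echo "$SHOW_PG"
fi
```

The VLAN must match what the physical switch port behind the vmnic expects, otherwise vCenter will stay unreachable even though the virtual wiring is correct.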

Step 4: Migrate the vmnic Back to the DVS

Now, let’s return the vmnic to the DVS by following these steps:

  • If you have not logged in to the vCenter vSphere client, log in with administrator credentials.
  • Go to the “Networking” tab.
  • Right-click on the DVS and select “Add and Manage Hosts”.
  • Select “Manage the host networking” and click Next.
  • Click on “Attached hosts…”.
  • Select the ESXi host with the vmk and vmnic that you want to add back to the DVS and click OK.
  • Click Next.
  • In the “Management Networks” list, select the vmk and click “Assign” to assign it to the desired management port group. Click Next.
  • In the “Physical Adapters” list, select the vmnic and click “Assign” to assign it to the DVS. Click Next.
  • Click Next and then Finish.

Step 5: Migrate vCenter Back to the DVS

  • Go to “Virtual Machines”.
  • Select the vCenter virtual machine.
  • Click “Actions” > “Edit Settings”.
  • Change Network Adapter 1 back to the original DVS port group.
  • Click Save.

Step 6: Remove the Temporary Standard Switch and Portgroup

Remove the Portgroup from the Standard Switch:

# esxcli network vswitch standard portgroup remove --portgroup-name=PortgroupName --vswitch-name=vSwitchName

Remove the vmnic from the Standard Switch:

# esxcli network vswitch standard uplink remove --uplink-name=vmnic# --vswitch-name=vSwitchName

Remove the Standard Switch:

# esxcli network vswitch standard remove --vswitch-name=vSwitchName

Conclusion

This process, though quite complex and meticulous, can save the day when vCenter is down due to a network issue. Having a proper backup and understanding the VMware infrastructure and its dependencies are key to a successful recovery. It is highly recommended to have a plan for such scenarios and to make sure that the team managing the infrastructure is familiar with these procedures, so that recovery is quick and smooth when it is needed.
