vCenter Maintenance - Issue Non-ephemeral Portgroup

Hi Friends,

We had an issue the other day due to a misconfiguration. We normally keep our vCenters on ephemeral port groups so that we can still troubleshoot them if anything goes wrong.

We have a new vCenter now, and as per our standards the project team put it on a port group that was named "ephemeral". It was quite late when we found that it was not actually set up as an ephemeral port group, but as a static one :(. So the root cause was a configuration issue, but I wanted to share how we solved it, hoping it might help someone in other situations too.

The story goes like this. We planned to test an outage situation by bringing down vCenter, to see whether our vRA pre-build check could identify it. One of our lazy engineers went to the Web Client of the ESXi server hosting vCenter and decided to turn the VM's NIC off for the test. He received the error below while turning it back on:


Description :- Reconfigure this virtual machine
Virtual machine :- Non-PROD_vCenter
State -: Failed - Invalid configuration for device '3'.


He was stuck there and it was escalated to us. When I tried to re-select the port group from the dropdown, it clearly said that the port group is not an ephemeral port group:

"Addition or reconfiguration of network adapters attached to non-ephemeral distributed virtual port groups <PG-Name> is not supported."



Now we knew what the issue was. As you may know, an ESXi host can assign ports on an ephemeral port group by itself, but not on a static one. Ports on a static port group can only be managed by vCenter, and our vCenter was the VM that was offline.

OK, so to fix the issue and get vCenter back online, my plan was as follows:

1) Create a vSphere Standard Switch (VSS) on the ESXi host
2) Remove one of the two MGMT uplinks of the ESXi host from the vSphere Distributed Switch (VDS)
3) Connect that NIC as the uplink of the VSS
4) Create a new port group "vspheremgt" on the VSS with the same VLAN ID
5) Move vCenter to the new VSS port group

So a new VSS was created. The next step is to detach a NIC that is already connected to the VDS, and this can't be done via the Web Client. You need to SSH in to the ESXi host.

Then issue the command below to list all distributed switches and their port IDs:


esxcli network vswitch dvs vmware list

As you can see in the example here, 


VDS_MGT_001
   Name: VDS_MGT_001
   VDS ID: 50 16 aa 43 dd 18 d0 f5-01 0c 24 29 f8 1e 4s fj
   Class: etherswitch
   Num Ports: 11776
   Used Ports: 14
   Configured Ports: 512
   MTU: 9000
   CDP Status: both
   Beacon Timeout: -1
   Uplinks: vmnic4, vmnic2
   VMware Branded: true
   DVPort:
         Client: vmnic2
         DVPortgroup ID: dvportgroup-39
         In Use: true
         Port ID: 40

         Client: vmnic4
         DVPortgroup ID: dvportgroup-39
         In Use: true
         Port ID: 41

         Client: vmk0
         DVPortgroup ID: dvportgroup-44
         In Use: true
         Port ID: 15

         Client: vmk3
         DVPortgroup ID: dvportgroup-45
         In Use: true
         Port ID: 23

         Client: TestServerTest.eth1
         DVPortgroup ID: dvportgroup-447
         In Use: true
         Port ID: 49

         Client: PowerPathAppliance.eth0
         DVPortgroup ID: dvportgroup-529
         In Use: false
         Port ID: h-2

         Client: POCTest2.eth0
         DVPortgroup ID: dvportgroup-47
         In Use: true
         Port ID: 33

VDS_GUEST_001
   Name: VDS_GUEST_001
   VDS ID: 50 06 81 23 7c bf 6f 31-a7 g3 h9 8d dc a4 b8 6b
   Class: etherswitch
   Num Ports: 11776
   Used Ports: 10
   Configured Ports: 512
   MTU: 9000
   CDP Status: both
   Beacon Timeout: -1
   Uplinks: vmnic5, vmnic3
   VMware Branded: true
   DVPort:
         Client: vmnic3
         DVPortgroup ID: dvportgroup-163
         In Use: true
         Port ID: 40

         Client: vmnic5
         DVPortgroup ID: dvportgroup-163
         In Use: true
         Port ID: 41

         Client:
         DVPortgroup ID: dvportgroup-163
         In Use: false
         Port ID: 42

         Client:
         DVPortgroup ID: dvportgroup-163
         In Use: false
         Port ID: 43

         Client:
         DVPortgroup ID: dvportgroup-163
         In Use: false
         Port ID: 44

The output tells me:

1) There are two VDS on this host: VDS_MGT_001 and VDS_GUEST_001
2) VDS_MGT_001 has two uplinks: vmnic4, vmnic2
3) VDS_GUEST_001 has two uplinks: vmnic5, vmnic3

So we are looking at the MGMT VDS. The information you need to collect from its section of the output is listed below:

1) VDS Switch Name -: VDS_MGT_001
2) Which is the uplink that we need to detach from VDS?  -: I decided to go with vmnic4
3) What is the port ID of vmnic4? -: 41
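
If you would rather not read the port ID off the screen, the lookup can be scripted. Below is a minimal sketch, assuming the host prints each field on its own indented line with the switch name unindented above it, and that awk is available in the ESXi shell. The function name find_dvport_id is my own and is not part of any VMware tool.

```shell
#!/bin/sh
# find_dvport_id SWITCH CLIENT
# Reads the output of "esxcli network vswitch dvs vmware list" on stdin
# and prints the Port ID of CLIENT (e.g. vmnic4) on the given VDS.
# Hypothetical helper; assumes one "Field: value" pair per line.
find_dvport_id() {
  awk -v sw="$1" -v nic="$2" '
    /^[^ \t]/         { cur = $1 }       # unindented line: start of a new VDS section
    /^[ \t]*Client:/  { client = $2 }    # remember the client of the current port entry
    /^[ \t]*Port ID:/ { if (cur == sw && client == nic) print $3 }
  '
}

# Example on the host (not executed here):
#   esxcli network vswitch dvs vmware list | find_dvport_id VDS_MGT_001 vmnic4
```

The switch name has to be part of the match because port IDs are only unique per switch: in the output above, both VDS have a port 40 and a port 41.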

With this information collected, I ran the command below to detach vmnic4 from the VDS:


esxcfg-vswitch -Q vmnic4 -V 41 VDS_MGT_001

Once that's done, go back to the ESXi Web Client, set up vmnic4 as the uplink of the new VSS, and create a port group with the VLAN ID required for your machine. Once you change the VM's network setting to the new port group, the VM will be back on the network :)
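
I did the VSS part in the Web Client, but the same steps can also be done from the SSH session you already have open. The sketch below only prints the esxcli commands so you can review them before running anything. The function name vss_recovery_cmds, the switch name vSwitch_Recovery and the VLAN ID 100 in the usage line are placeholders of mine, not values from our environment. It assumes the uplink has already been detached from the VDS.

```shell
#!/bin/sh
# vss_recovery_cmds VSWITCH UPLINK PORTGROUP VLAN_ID
# Prints (does not run) the esxcli commands that create a standard switch,
# attach the freed uplink, and add a port group tagged with the VLAN ID.
# Hypothetical helper for illustration only.
vss_recovery_cmds() {
  vss="$1"; nic="$2"; pg="$3"; vlan="$4"

  # Refuse to print anything unless the VLAN ID is a number in 0-4095.
  case "$vlan" in
    ''|*[!0-9]*) echo "VLAN ID must be a number" >&2; return 1 ;;
  esac
  [ "$vlan" -le 4095 ] || { echo "VLAN ID must be 0-4095" >&2; return 1; }

  echo "esxcli network vswitch standard add --vswitch-name=$vss"
  echo "esxcli network vswitch standard uplink add --uplink-name=$nic --vswitch-name=$vss"
  echo "esxcli network vswitch standard portgroup add --portgroup-name=$pg --vswitch-name=$vss"
  echo "esxcli network vswitch standard portgroup set --portgroup-name=$pg --vlan-id=$vlan"
}

# Example (prints four commands to review and paste):
#   vss_recovery_cmds vSwitch_Recovery vmnic4 vspheremgt 100
```

Moving the vCenter VM's NIC to the new port group is still done from the host's Web Client, as described above.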










