How to quit ESXi SSH and leave background tasks running

On Linux, when a console session is closed, most background jobs (suspended with ^Z and resumed with bg %n) stop running, because the parent (the ssh session's shell) sends a SIGHUP to all of its children when it closes properly. Some programs catch or ignore SIGHUP and keep running, in which case they are re-parented to init. The disown command removes a background job from the list of jobs the shell sends SIGHUP to.

ESXi has no disown command. However, there is a way to close a shell immediately without it issuing the SIGHUPs:

exec </dev/null >/dev/null 2>/dev/null

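Given a command, exec replaces the current shell with that command. Given only redirections, as here, it applies them to the current shell instead, so stdin, stdout and stderr all point at /dev/null and are no longer tied to the terminal. The shell then reads end-of-file on stdin and exits.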
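A minimal sketch of the whole sequence, assuming a hypothetical long_task function standing in for the real job and a log file under /tmp. The job gets its own stdio before the shell is detached, so nothing it writes depends on the terminal. The final exec line is left as a comment because running it ends the session.

```shell
# Hypothetical long-running job; replace with the real command.
long_task() {
    sleep 1
    echo "done"
}

# Assumed log location, made unique with the shell's PID.
log=/tmp/long_task.$$.log

# Start the job in the background with its own stdin, stdout and stderr,
# so it no longer reads from or writes to the SSH session's terminal.
long_task </dev/null >"$log" 2>&1 &
job_pid=$!
echo "started background job $job_pid, logging to $log"

# In the interactive session, finish with:
#   exec </dev/null >/dev/null 2>/dev/null
```

After logging back in, the log file shows whether the job carried on.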

nJoy 😉

Automatically passing ssh password in scripts especially to ESX where passwordless ssh is hard

First you need to install sshpass.

  • Ubuntu/Debian: apt-get install sshpass
  • Fedora/CentOS: yum install sshpass
  • Arch: pacman -S sshpass

Example:

sshpass -p "YOUR_PASSWORD" ssh -o StrictHostKeyChecking=no YOUR_USERNAME@SOME_SITE.COM

Custom port example:

sshpass -p "YOUR_PASSWORD" ssh -o StrictHostKeyChecking=no -p 2400 YOUR_USERNAME@SOME_SITE.COM
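A password given with -p is visible in the process list while the command runs. sshpass can also read it from the SSHPASS environment variable with -e. The sketch below wraps that in a helper; the names remote_run and DRY_RUN are made up for illustration, and DRY_RUN=1 only prints the command so the wrapper can be tried without a real host.

```shell
# Hypothetical helper: run a command on a remote host through sshpass.
# The password is taken from the SSHPASS environment variable (sshpass -e).
# With DRY_RUN=1 the command is printed instead of executed.
remote_run() {
    host=$1
    port=$2
    shift 2
    set -- sshpass -e ssh -o StrictHostKeyChecking=no -p "$port" "$host" "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Placeholder credentials and host, as in the examples above.
export SSHPASS='YOUR_PASSWORD'
DRY_RUN=1
remote_run YOUR_USERNAME@SOME_SITE.COM 2400 uname -a
```

Unset DRY_RUN to run the command for real.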

From: https://stackoverflow.com/questions/12202587/automatically-enter-ssh-password-with-script

 

This works better for me though for sshfs:

echo "$mypassword" | sshfs -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no user@host: mountpoint -o workaround=rename -o password_stdin

 

 

nJoy 😉

 

Resizing a disk once the volume has grown on host in a VM (Ubuntu)

On many platforms the guest does not automatically see the new size once more disk space has been made available on the host.

This is particularly the case in Ubuntu 16.04.

Rescan the device so the kernel picks up the new size:

echo 1 > /sys/class/scsi_device/2\:0\:0\:0/device/rescan

Then standard procedures to grow the fs apply.
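When the SCSI address is not known in advance, the same rescan can be applied to every device. This is a sketch only: the function name rescan_scsi is made up, and the optional sysfs-root argument exists purely so the loop can be tried against a scratch directory instead of the real /sys.

```shell
# Rescan every SCSI device rather than hard-coding 2:0:0:0.
# $1 is the sysfs root (defaults to /sys); writing there needs root.
rescan_scsi() {
    sysfs=${1:-/sys}
    for f in "$sysfs"/class/scsi_device/*/device/rescan; do
        # Skip the unexpanded pattern when no device matches.
        [ -e "$f" ] || continue
        echo 1 > "$f" && echo "rescanned ${f%/device/rescan}"
    done
}

# As root on the guest:
#   rescan_scsi
```

On a typical ext4-on-partition layout the follow-up would be something like growpart /dev/sda 1 and then resize2fs /dev/sda1. That assumes the cloud-guest-utils package is installed and that those device names match your system.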

nJoy 😉

 

 

Longevity and stability of sshfs

Just worth noting.

I have had many arguments with other sysadmins and bosses over the stability and validity of sshfs as a shortcut for bridging file transfers for backups or ad hoc moves.

 

The following is the result of an rsync from one sshfs mount to another (between ESXi servers I only had ssh access to) over a slow link. It surpassed my expectations by far:

 

sent 1342341123955 bytes  received 91 bytes  3495447.89 bytes/sec

total size is 1342177283599  speedup is 1.00

 

Yes, a 1.3TB VM image taken with GhettoVCB was moved over the course of about 4.4 days with no loss or corruption.
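As a quick sanity check, the duration can be recovered from the rsync summary by dividing bytes sent by the reported rate. awk does the division because the shell has no floating-point arithmetic; the variable names are only for illustration.

```shell
# Figures copied from the rsync summary above.
bytes_sent=1342341123955
bytes_per_sec=3495447.89

# Elapsed time in days and average throughput in MB/s (10^6 bytes).
days=$(awk -v b="$bytes_sent" -v r="$bytes_per_sec" 'BEGIN { printf "%.2f", b / r / 86400 }')
mb_per_sec=$(awk -v r="$bytes_per_sec" 'BEGIN { printf "%.2f", r / 1000000 }')

echo "elapsed: about $days days at about $mb_per_sec MB/s"
```

This gives roughly 4.44 days at about 3.5 MB/s, which matches the figure quoted.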

 

Amazing…

Just worth saying: in Linux, where there is a will there is a way. In Windows the same move kept dying and never recovering. We tried a simple file copy using the VMware browser download, Veeam (which gave up on the move for some weird licensing reason), robocopy, you name it.

Solution :

 

Mount both servers on a VM over ssh using sshfs, and rsync across. The process was slow for the following reasons:

  1. The lack of compression.
  2. Encryption on both tunnels (ssh).
  3. VMware expanding the image as it is pulled from the VMFS (as it does).
  4. Network bandwidth is slow and shared with literally hundreds of machines.
  5. Some other reasons, such as disk latency (no cache on the controller).
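The bridge described above could look roughly like the sketch below. The host names, datastore paths and mount points are placeholders, not the ones actually used, and the rsync options are one reasonable choice rather than the exact command that produced the numbers. The run helper only prints each command while DRY_RUN=1, so the sequence can be reviewed first.

```shell
# Print commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount source and destination over sshfs on the bridge VM, copy, unmount.
# $1 and $2 are sshfs targets in user@host:path form (placeholders below).
bridge_copy() {
    src=$1
    dst=$2
    run mkdir -p /mnt/src /mnt/dst
    run sshfs "$src" /mnt/src
    run sshfs "$dst" /mnt/dst
    # --partial keeps partially transferred files after an interruption.
    run rsync -av --partial --progress /mnt/src/ /mnt/dst/
    run fusermount -u /mnt/src
    run fusermount -u /mnt/dst
}

DRY_RUN=1
bridge_copy root@esx-a:/vmfs/volumes/datastore1/backup root@esx-b:/vmfs/volumes/datastore1/backup
```

Running the rsync inside screen or tmux on the bridge VM means a dropped login does not interrupt a multi-day copy.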

 

I am just saying that sometimes even relatively inelegant solutions can give surprising results in Linux environments.

Ode to stability and the fortune of having such amazing free tools to work with.

 

🙂

 

 

 

 

Accessing ESX management interface (DCUI) from ssh

Access the ESXi Direct Console User Interface (DCUI) over SSH

When you are not in a position to go to the data center to access the ESX text-mode interface, use the dcui command.

First you need to enable and start Remote Tech Support (SSH). This is done for the ESXi Host in Configuration -> Software -> Security Profile


Use an SSH client (such as PuTTY) to connect to the ESXi host.

Once logged in simply run dcui

~ # dcui

Look familiar? Want to change the color to look like the console – check out this post.

To exit DCUI and return to the prompt use CTRL-C

KB article : here

 

nJoy 🙂

Enable VMware time sync from command line

Virtual machines and NTP do not go well together. Machines paused for extended periods tend to lose NTP sync because the time difference grows too large.

 

Also, the following from the VMware docs might clarify things:

  • Do not configure the virtual machine to synchronize to its own (virtual) hardware clock, not even as a fallback with a high stratum number. Some sample ntpd.conf files contain a section specifying the local clock as a potential time server, often marked with the comment “undisciplined local clock.” Delete any such server specification from your ntpd.conf file.
  • Include the option tinker panic 0 at the top of your ntp.conf file. By default, the NTP daemon sometimes panics and exits if the underlying clock appears to be behaving erratically. This option causes the daemon to keep running instead of panicking.
  • Follow standard best practices for NTP: Choose a set of servers to synchronize to that have accurate time and adequate redundancy. If you have many virtual or physical client machines to synchronize, set up some internal servers for them to use, so that all your clients are not directly accessing an external low-stratum NTP server and overloading it with requests
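The first two recommendations can be applied to an existing ntp.conf with a few lines of shell. This is a sketch: the function name fix_ntp_conf is made up, /etc/ntp.conf is an assumed default path, and it relies on the local clock driver appearing under its usual address 127.127.1.0. It removes every line that mentions that address, comments included, and keeps a .bak copy.

```shell
# Put "tinker panic 0" at the top of ntp.conf and drop the local clock
# (127.127.1.0) server/fudge lines. $1 is the config file to edit.
fix_ntp_conf() {
    conf=${1:-/etc/ntp.conf}
    # Keep a backup and filter from it into the original file.
    cp "$conf" "$conf.bak" || return 1
    {
        echo "tinker panic 0"
        # Remove any earlier tinker panic line and the local clock entries.
        grep -v -e '^tinker panic' -e '127\.127\.1\.0' "$conf.bak"
    } > "$conf"
    return 0
}

# As root on the guest:
#   fix_ntp_conf /etc/ntp.conf
```

The NTP daemon needs a restart afterwards for the change to take effect.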

And:

In ESX, the ESX NTP daemon runs in the service console. Because the service console is partially virtualized, with the VMkernel in direct control of the hardware, NTP running on the service console provides less precise time than in configurations where it runs directly on a host operating system. Therefore, if you are using native synchronization software in your virtual machines, it is somewhat preferable to synchronize them over the network from an NTP server that is running directly on its host kernel, not to the NTP server in the service console. In ESXi, there is no service console and the NTP daemon runs directly on the VMkernel, so it works well as a NTP server for virtual machines.

Quoted from: http://www.vmware.com/files/pdf/techpaper/Timekeeping-In-VirtualMachines.pdf

The easy way (and, I think, the best solution):

  1. Set up the NTP client on the ESX host.
  2. Install VMware Tools on the guests (recommended anyway).
  3. In Linux (because that’s what we care about) run vmware-toolbox-cmd timesync enable

To force a sync, run the hwclock command.

nJoy 🙂 !

 

 

ESXi 5.1 : Fixing ‘Failed to deploy OVF package: The task was canceled by a user.’

Where I work, we love using OVA templates to speed up our deployment of virtual machines. I recently upgraded one of my servers to ESXi 5.1 (which also required an update to vSphere). ESXi 5.1 provides support for Windows 8 and Server 2012, which is incredibly useful. However, whilst building OVA templates for these operating systems, I stumbled across an issue.

I ran through the ‘New Virtual Machine’ wizard, selecting Windows 8 (or Server 2012), leaving all settings default. Installed my operating system, and made the required customisations, shutdown the machine and exported an OVA template through vSphere – excellent, how easy!

However, whilst trying to re-deploy the OVA to the ESXi 5.1 host, through the ‘Deploy OVA template’ wizard, it failed immediately after completing the wizard (right before it shows the deployment progress bar). Now, I have a particular hate for misleading error messages, and this one seems to fall right into that category:

Failed to deploy OVF package: The task was canceled by a user.

How misleading. I, or any other user, certainly didn’t cancel the task. So what happened? I took a look through the (horrendous) hostd.log on the ESXi box and found absolutely nothing of any value.

Frustrated by the inability to redeploy a template I spent so long preparing, I broke open the OVA template and took a look inside. There were three files with different extensions:

  • .ovf – OVF descriptor, written in XML, which describes the hardware requirements
  • .mf – contains SHA1 checksums of the .ovf and .vmdk
  • .vmdk – the virtual hard disk for the virtual machine.

I immediately discarded the .mf (renaming it to .mfx will do the trick). If you modify the .ovf and don’t update the .mf, it’ll complain that the checksum is invalid. Removing this file seems to prevent vSphere from checking the checksums, which is useful, seeing as we want to poke around the .ovf. After fiddling around inside the .ovf, I stumbled across the following line…

<rasd:ResourceSubType>vmware.cdrom.iso</rasd:ResourceSubType>

Changing the above line, to read…

<rasd:ResourceSubType>vmware.cdrom.atapi</rasd:ResourceSubType>

…appears to have fixed my deployment issues. Perhaps changing the ‘CD Drive Device type’ in the virtual machine’s settings would’ve fixed it. But by that point, I had already exported the OVA and deleted the source virtual machine.

Hopefully someone will stumble across this one day, and it’ll save them a few hours!
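For anyone repeating this on several templates, the manual steps can be scripted. The sketch below assumes the OVA is a plain tar archive (which is how it was opened here) and that the descriptor inside ends in .ovf. The function name patch_ova and the file names in the usage line are made up for illustration.

```shell
# Unpack an OVA into a work directory, switch the CD-ROM subtype from
# vmware.cdrom.iso to vmware.cdrom.atapi in the descriptor, and rename
# the manifest (.mf -> .mfx) so the stale checksums are not verified.
patch_ova() {
    ova=$1
    workdir=$2
    mkdir -p "$workdir" || return 1
    tar -xf "$ova" -C "$workdir" || return 1
    for ovf in "$workdir"/*.ovf; do
        [ -e "$ovf" ] || continue
        sed 's/vmware\.cdrom\.iso/vmware.cdrom.atapi/' "$ovf" > "$ovf.tmp" &&
            mv "$ovf.tmp" "$ovf"
    done
    for mf in "$workdir"/*.mf; do
        [ -e "$mf" ] || continue
        mv "$mf" "${mf}x"
    done
}

# Usage (hypothetical file names):
#   patch_ova win2012-template.ova ./win2012-template-patched
```

The patched .ovf in the work directory can then be selected in the deploy wizard in place of the original .ova.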