One problem I ran into after deploying a few thin clients to users: how do they reboot their virtual machines? Usually they don't need to, but after a week or two of just logging off, things can get a little stale. And we instruct our (normal) users to restart their computer every night, so if we push an update out, then their computer is in the perfect state.
Some connection brokers actually interface directly with VirtualCenter, and can power machines on and off as users need them. We don't really have a connection broker per se. We have a few HP thin clients that we purchased, and we are transforming older PCs into thin clients (using the RDP client in Windows XP). We're also testing 2X's ThinClientServer and ThinClientOS, but they're not really ready for prime time.
So my idea is to have a script that would run nightly. It would check all of our virtual desktops, and reboot any of them that are not logged on. It will be a batch script, so I ran some tests to see what programs show up in TASKLIST during different stages. After looking over the results, I think I can use the existence of explorer.exe in TASKLIST to decide if a user is logged in or not.
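To make the explorer.exe idea concrete, here is a minimal Python sketch of the same test applied to captured TASKLIST text. The function name and the two sample captures are made up for illustration; the real script does this check with TASKLIST and FIND, not Python.

```python
def user_is_logged_on(tasklist_output):
    """Return True if any row of TASKLIST output is for explorer.exe."""
    for line in tasklist_output.splitlines():
        # The image name is the first whitespace-separated column.
        columns = line.split()
        if columns and columns[0].lower() == "explorer.exe":
            return True
    return False


# Hypothetical captures, trimmed to a few rows each.
LOGGED_ON = """\
Image Name                   PID Session Name     Session#    Mem Usage
========================= ====== ================ ======== ============
System Idle Process            0 Console                 0         28 K
winlogon.exe                 624 Console                 0      3,512 K
explorer.exe                1508 Console                 0     18,204 K
"""

LOGGED_OFF = """\
Image Name                   PID Session Name     Session#    Mem Usage
========================= ====== ================ ======== ============
System Idle Process            0 Console                 0         28 K
winlogon.exe                 624 Console                 0      3,512 K
"""

print(user_is_logged_on(LOGGED_ON))
print(user_is_logged_on(LOGGED_OFF))
```

Note that winlogon.exe shows up in both captures, which is why it can't be used to tell the two states apart.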
Now I need a list of virtual desktops. I looked through the SQL tables for VirtualCenter. One table has a list of all virtual machines in the inventory: VPX_VM. There are a couple of columns we can use in here:
- DNS_NAME: we can pass this on to the shutdown command to reboot the machine
- POWER_STATE: no sense in trying to reboot a machine that is off
- ANNOTATION: this notes field could be used to specify which VMs are eligible for a reboot (I'm not going to use it, but it might be useful for someone else)
- GUEST_OS: maybe we only want to reboot machines matching "winXPProGuest"
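I need a SQL query that returns a list of virtual desktops. I don't want any NULLs from DNS_NAME, machines that are powered off (so I only keep rows where POWER_STATE <> 0), or machines that don't match winXPProGuest. The query doesn't exclude NULLs by itself; the FIND filter I add to the command line later takes care of those. Here's what I came up with: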
select DNS_NAME from VirtualCenter.dbo.VPX_VM where POWER_STATE <> 0 and GUEST_OS = 'winXPProGuest'
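If you want to see how that WHERE clause behaves without touching the real database, here is a rough sketch that runs it against an in-memory SQLite table. Everything about the table is an assumption for the demo: only the three columns named above are modelled, the rows are invented, and the database prefix is dropped because SQLite doesn't use one.

```python
import sqlite3

# Stand-in for VPX_VM with only the columns the query touches (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table VPX_VM (DNS_NAME text, POWER_STATE integer, GUEST_OS text)"
)
conn.executemany(
    "insert into VPX_VM values (?, ?, ?)",
    [
        ("JOHNDvm.subdomain.domain.com", 1, "winXPProGuest"),
        ("JANEDvm.subdomain.domain.com", 0, "winXPProGuest"),  # powered off
        ("FILESRV.subdomain.domain.com", 1, "otherGuest"),     # not XP
        (None, 1, "winXPProGuest"),                            # no DNS name yet
    ],
)

BASE_QUERY = (
    "select DNS_NAME from VPX_VM "
    "where POWER_STATE <> 0 and GUEST_OS = 'winXPProGuest'"
)


def eligible_names(connection, skip_nulls=False):
    """Run the query, optionally excluding rows with no DNS name."""
    query = BASE_QUERY + (" and DNS_NAME is not null" if skip_nulls else "")
    return [row[0] for row in connection.execute(query)]


print(eligible_names(conn))                   # the NULL row still comes back
print(eligible_names(conn, skip_nulls=True))  # only real host names
```

The first call shows that a row with a NULL name survives the original WHERE clause, so something downstream has to drop it. Adding "and DNS_NAME is not null" to the query is one option.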
I'm going to run the script once a day on our VirtualCenter server. I typically keep a directory called C:\Scripts for these sorts of things, so that I don't lose track of them.
SQL 2005 has a command-line utility called SQLCMD.EXE. It's pretty picky about dependencies, so I installed the workstation/client components from the SQL disc to the VirtualCenter server, and after that it worked fine. After the install, I copied the SQLCMD files (SQLCMD.EXE, batchparser90.dll, and SQLCMD.rll) from our SQL 2005 Server installation to this directory. They can be found under \90\Tools\Binn\.
SQLCMD and TASKLIST can each take a specified username and password. For SQLCMD, any user with read access to the VirtualCenter DB will be fine. For TASKLIST, I'll use the local administrator on each of the machines. SHUTDOWN doesn't have the ability to specify an account, so use whatever is good for your situation. I'll add the computer account for my VirtualCenter server (example: VCSRV$) to the Administrators group on each of our virtual desktops, and run the scheduled task as Local System.
Here's the command to get the list of machines using SQLCMD. I added a FIND pipe to remove any unnecessary clutter from the output. Using most of the FQDN in FIND will return just our VMs.
sqlcmd -S SQLSERVER -U Username -P Password -Q "select DNS_NAME from VirtualCenter.dbo.VPX_VM where POWER_STATE <> 0 and GUEST_OS = 'winXPProGuest'" -W | find /i ".subdomain.domain.com"
This returns a list of machines, sort of like this:
JOHNDvm.subdomain.domain.com
JANEDvm.subdomain.domain.com
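As a rough illustration of what the FIND pipe is doing, here is a small Python sketch that applies the same case-insensitive substring filter to some raw SQLCMD output. The sample text (header, dashes, a NULL row, and the row count) is an approximation written for this example, not a real capture.

```python
def filter_hosts(sqlcmd_output, domain_suffix):
    """Keep only lines containing the suffix, ignoring case (like FIND /i)."""
    needle = domain_suffix.lower()
    return [
        line.strip()
        for line in sqlcmd_output.splitlines()
        if needle in line.lower()
    ]


# Approximate shape of unfiltered SQLCMD output (illustrative only).
RAW_OUTPUT = """\
DNS_NAME
--------
JOHNDvm.subdomain.domain.com
JANEDvm.subdomain.domain.com
NULL

(3 rows affected)
"""

for host in filter_hosts(RAW_OUTPUT, ".subdomain.domain.com"):
    print(host)
```

The header, the NULL row, and the row count never contain the domain suffix, so they drop out and only host names are left.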
So for each result, I want to first check that no user is logged on (explorer.exe is not running), and then issue a shutdown command. The batch script will be formed generically like so:
For each line of the SQL results, call :ParseResult
Go to the end of the file
:ParseResult
Is this machine running explorer.exe?
If it is not, then reboot it.
Wait a couple of minutes so that we don't overload the ESX server
Exit this sub so I can check the next one.
So here's the actual code. I'm nesting the SQLCMD inside of the FOR statement, which might seem confusing, but it saves writing the results to a temporary file first.
for /F %%a IN ('sqlcmd -S SQLSERVER -U Username -P Password -Q "select DNS_NAME from VirtualCenter.dbo.VPX_VM where POWER_STATE <> 0 and GUEST_OS = 'winXPProGuest'" -W ^| find /i ".subdomain.domain.com"') do call :ParseResult %%a
::Notice the ^ before the pipe; this is required inside of a FOR statement
goto :EOF
::Go to the end of the file when the FOR statement finishes
:ParseResult
::This is called by the FOR statement and gets passed the DNS name for each VM
set ThisHost=%1
::I like to make it an actual variable before doing anything with it
tasklist /S %ThisHost% /U %ThisHost%\Administrator /P Password /FI "IMAGENAME eq explorer.exe" | find /i "explorer.exe" >nul
::Checks whether explorer.exe is running on this host. The find is there to set the errorlevel.
if %errorlevel%==0 exit /b
::If it finds explorer.exe, forget about it.
shutdown -f -r -t 300 -m \\%ThisHost% -c "Message to users"
::Shutdown the machine: force, restart, wait 5 minutes, the target, and a message
sleep 120
::Wait two minutes (120 seconds) before going on to the next one.
exit /b
::Continues to the next line inside of the FOR statement.
Obviously, you'll need to change usernames, passwords, and server names to match your environment.
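If you'd like to dry-run the control flow before pointing anything at real desktops, here is a minimal Python sketch of the same loop. It is not a replacement for the batch file: the function name and the stand-in callables are hypothetical, and nothing here actually contacts a machine.

```python
import time


def reboot_idle_desktops(hosts, is_logged_on, reboot,
                         pause_seconds=120, sleep=time.sleep):
    """Reboot each host nobody is logged on to; return the hosts rebooted."""
    rebooted = []
    for host in hosts:
        if is_logged_on(host):
            continue  # same as "exit /b" when explorer.exe is found
        reboot(host)
        rebooted.append(host)
        sleep(pause_seconds)  # stagger the reboots, like "sleep 120"
    return rebooted


# Dry run with stand-ins: JANED is still logged on, JOHND is not.
busy = {"JANEDvm.subdomain.domain.com"}
result = reboot_idle_desktops(
    ["JOHNDvm.subdomain.domain.com", "JANEDvm.subdomain.domain.com"],
    is_logged_on=lambda host: host in busy,
    reboot=lambda host: print("would reboot", host),
    sleep=lambda seconds: None,  # skip the real wait during a dry run
)
print(result)
```

Passing the check, the reboot, and the sleep in as functions keeps the loop testable. In a real version they would wrap the TASKLIST, SHUTDOWN, and SLEEP calls.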
You might ask yourself why I specify a filter for explorer.exe even though I'm going to do a FIND right after that. Well, I've noticed some issues with piping TASKLIST straight into FIND; the results seem inconsistent. After switching to this method, it works fine, so that's enough reason for me.
Also, you'll need the SLEEP command to run the batch script above. It isn't built into Windows; it ships with the Windows Server 2003 Resource Kit Tools. If you only have 3 or 4 virtual desktops, staggering them isn't really an issue. But what if your ESX servers and VirtualCenter have 50 or 100 machines rebooting at the same time? It's good to stagger them a little, and you should adjust the time depending on your environment.
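To get a feel for what the stagger costs, here is a quick back-of-the-envelope sketch in Python. It assumes the worst case, where every machine is idle and gets rebooted, and both helper names are made up for this example.

```python
def total_stagger_minutes(machine_count, pause_seconds=120):
    """Worst-case run time in minutes if every machine gets rebooted."""
    return machine_count * pause_seconds / 60


def pause_for_window(machine_count, window_minutes):
    """Largest whole-second pause that fits all reboots in the window."""
    if machine_count <= 0:
        return 0
    return (window_minutes * 60) // machine_count


for count in (4, 50, 100):
    print(count, "machines:", total_stagger_minutes(count), "minutes")

# Pause needed to get 100 machines through a two-hour maintenance window.
print(pause_for_window(100, 120), "seconds between reboots")
```

With a 120-second pause, 100 idle machines take 200 minutes to work through, so a larger environment probably wants a shorter pause or a longer window.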
Some things that could be better about all of this:
- POWER_STATE might not be 0, but the machine could still be in another state that wouldn't let us reboot it. I can't readily find a schema for this field/table, but it's not a big deal.
- Maybe we want VMs to reboot once a week. Well, if I schedule it for once a week, and someone stays logged on over the weekend, now we're at two weeks. So if the script still ran nightly, but skipped machines that haven't been up for a week yet, that would be cool. SYSTEMINFO | FIND /i "System Up Time:" would do this. But again, not a big deal.
- Maybe it should turn on VMs that are off every morning before users get in to work. This would require API scripting (or a real connection broker), so maybe something for the future.
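The uptime idea from the list above could look something like this rough Python sketch. It assumes the "System Up Time:" line has the form "N Days, N Hours, N Minutes, N Seconds", which is how I remember Windows XP printing it; check the output on your own machines before relying on the pattern. The function names and sample lines are hypothetical.

```python
import re

# Assumed format: "System Up Time:   9 Days, 4 Hours, 12 Minutes, 30 Seconds"
UPTIME_PATTERN = re.compile(
    r"System Up Time:\s*(\d+) Days?, (\d+) Hours?, (\d+) Minutes?, (\d+) Seconds?",
    re.IGNORECASE,
)


def uptime_days(systeminfo_output):
    """Return uptime in days as a float, or None if the line isn't found."""
    match = UPTIME_PATTERN.search(systeminfo_output)
    if match is None:
        return None
    days, hours, minutes, seconds = (int(group) for group in match.groups())
    return days + hours / 24 + minutes / 1440 + seconds / 86400


def due_for_reboot(systeminfo_output, min_days=7):
    """True only when the uptime is known and at least min_days."""
    days = uptime_days(systeminfo_output)
    return days is not None and days >= min_days


print(due_for_reboot("System Up Time:            9 Days, 4 Hours, 12 Minutes, 30 Seconds"))
print(due_for_reboot("System Up Time:            2 Days, 1 Hours, 5 Minutes, 0 Seconds"))
```

If the line can't be parsed, the sketch errs on the side of not rebooting.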
All that's left is to make a scheduled task for the script (and pipe the output to a log file if you're interested). It does a very simple specific task, and I'm pretty happy with the result. I'd love to hear feedback if anyone else tries this.