Automating TFS Build Server deployment with SCVMM and PowerShell

Richard and I have been busy this week. It started with a conversation about automating the installation of new build servers. Richard was looking at writing PowerShell to install and configure the TFS build agent, along with all the various SDKs that we use across all our projects. Our current build servers have all been built by hand, and each has a different set of SDKs to build specific project types. Richard’s aim is a single, homogeneous build server configuration so we can scale out for capacity much more quickly than before.

Enter, stage left, SCVMM. For my part, I’ve been looking at what can be done with VM templates and, more importantly, service templates. It seemed to me that creating a Build Server service in SCVMM from a standard template would allow us to quickly and easily add servers to, and remove them from, the group.

There isn’t much written about the application/script side of SCVMM server templates, so I thought I’d write up my part.

Note: I’m not a System Center specialist. We use Config Manager, Virtual Machine Manager and Data Protection Manager at Black Marble for our own services rather than being a System Center partner.

Dividing up the problem space

Our final template uses a single PowerShell script to perform the configuration and installation work. Yes, I could have created steps in the service template to install each of the items Richard’s script deployed, but we decided against that. The reasoning is relatively simple: It’s much easier to modify the PowerShell script to add, remove or change the stuff that gets deployed. It’s hard to do that with SCVMM, as far as I can tell.

However, during testing I discovered that adding Windows roles and features through the template was faster than letting the various installers Richard called in his script trigger the feature additions.

The division of labour, then, became the following:

SCVMM Tasks:

  • VM Template is created for the target OS. The VM template is configured to automatically join the new machine to our domain and place the machine in the correct OU. It also sets the language correctly. More on that later.
  • Service Template is created for a Build Servers service. It’s a single-tier service that has a minimum of one machine and a maximum of twenty (that maximum is a bit of an arbitrary value on my part). The service template adds the roles and features to the machine definition and runs two script application blocks:
    • The first simply runs xcopy to pull the contents of a folder on a share to the local PC (see the sketch after this list). I do this because running a PowerShell script from a network share doesn’t work – in an interactive session you are prompted before the script executes because it comes from an untrusted location, and I haven’t worked out how to suppress the prompt yet.
    • The second application executes powershell.exe and feeds in the full path to Richard’s PowerShell script, newly copied onto the local disk.
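As a rough sketch, that first command looks something like the line below. The share and destination paths are illustrative, and the exact xcopy switches are my assumption – any recursive, overwriting copy will do:

cmd.exe /c xcopy "\\fileserver\BuildScripts" "C:\BuildScripts" /E /I /Y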

Step 1: VM Template

There’s a wealth of information about creating VM templates on the internet, so I’m not going to cover this in depth. I did, however, want to pull out a couple of things I discovered along the way.

I installed my base VM with UK English regional settings. When SCVMM converted that into a template via sysprep, the resulting machine came up in US English, which is really annoying. Had I been paying attention, I would have noticed that we already had an unattend.xml file to correct this, which I could have referenced in the VM template settings. However, I found a much more interesting way to address the issue (which of course led me down another rabbit hole).

A bit of research led me to a very interesting post by Gunter Danzeisen. In it he shows how to use PowerShell to modify the UnattendSettings property of the VM template within System Center. This is at the same time both irritating and enlightening.

It’s irritating because I am truly fed up with ‘hidden’ functionality in products that causes me pain. The VM template clearly allows me to specify an unattend.xml file, so why have an internal one as an object property? Moreover, why not simply document its existence and let me modify that property – why do I need two different methods, which leaves me constantly wondering which one takes priority?

It’s enlightening because I can modify that property really easily – it’s simply a collection of name/value pairs that map onto the unattend.xml settings.

There is a bit of a snag with this approach, however, which I’ll come onto in a little while.

Anyway, back to the plot. I followed Gunter’s advice and used PowerShell to set the language values of the internal unattend. I then decided to use the same approach to see if I could add other settings – specifically the destination OU for the server when added to AD.

The PowerShell for the region settings is below:

$template = Get-SCVMTemplate | where {$_.Name -eq "My VM Template"}
$settings = $template.UnattendSettings
$settings.add("oobeSystem/Microsoft-Windows-International-Core/UserLocale","en-GB")
$settings.add("oobeSystem/Microsoft-Windows-International-Core/SystemLocale","en-GB")
$settings.add("oobeSystem/Microsoft-Windows-International-Core/UILanguage","en-GB")
$settings.add("oobeSystem/Microsoft-Windows-International-Core/InputLocale","0809:00000809")
Set-SCVMTemplate -VMTemplate $template -UnattendSettings $settings

For reference, removing a setting is easy – simply reference the name of the setting when calling the remove method:

$settings.remove("oobeSystem/Microsoft-Windows-International-Core/UserLocale"); 

I then set the destination OU with the following setting:

$settings.add("specialize/Microsoft-Windows-UnattendedJoin/Identification/MachineObjectOU","OU=FileServers,DC=mydomain,DC=local"); 

The Snag

There is a problem with this approach. If you use an unattend.xml file, you can override that setting when you add the VM template to your service template. However, whilst I could find the UnattendSettings property of the VM template when referenced by the service template, I couldn’t modify it.

If we access the Service Template object with:

$svctemplate = Get-SCServiceTemplate | where {$_.name -eq "TFS Build Service"}

We get an object that contains one or more ComputerTierTemplates (depending on how many tiers you gave your service). Each of those has a VMTemplate object that holds the information from our original VM template, and therefore has our UnattendSettings.

$svctemplate.ComputerTierTemplates[0].VMTemplate.UnattendSettings

So, we can grab those settings and modify them. Great. The trouble is, I haven’t found a way to update the stored configuration. Set-SCServiceTemplate doesn’t let me stuff the settings back in the same way as Set-SCVMTemplate does, and you can’t use the latter with a reference to the VMTemplate child of our service template.

Now, I decided that I would create a copy of my original VM template just for Build Servers, so I could set a different target OU for the servers. In hindsight, I’m not sure whether this is better than overlaying unattend.xml files, and I haven’t experimented with how the unattend.xml might interact with the unattendsettings yet either. If you try, please let me know how you get on.

Step 2: Service Template

Once I’d got my VM Template sorted, the next step was to create a service. There’s a pretty nice design surface for these that allows you to pick a ‘starter’ template with the right number of tiers, although it’s dead easy to add a new tier.

I started with the Single Machine template, which gave me a single tier. You then need to drag a VM template from the list of available templates onto the tier. The screenshot below shows my single-tier service. The VM template has a single NIC and is configured to connect to my Black Marble network.

[Screenshot: the single-tier Build Servers service on the Service Template design surface]

The light blue border on the large box (the service tier) indicates it’s selected. That will show the tier properties at the bottom of the design window. In here I have set a minimum and maximum number of servers that can be deployed in the tier.

[Screenshot: tier properties showing the minimum and maximum server counts]

Notice also the availability set option – if I needed to ensure that VMs in this service were spread across multiple hosts for resilience I could tick this option. I don’t care where build servers get deployed (they go onto our Lab VM hosts and are effectively ‘disposable’) so I have left this alone.

Open the properties of the tier (right-click, or choose View All Properties in the property pane) and a dialog opens with machine properties. In here I have configured the roles and features for the build server (I deliberately haven’t set these in the VM template so I can have fewer, more general VM templates).

[Screenshot: the tier’s machine properties dialog showing roles and features]

Also in here are the Application Configuration settings that cause the VM to run Richard’s PowerShell. The first is a simple one that references cmd.exe to run xcopy. All the settings on this are default.

[Screenshot: the xcopy application settings]

The second app runs powershell.exe and passes in a file parameter. This was a source of much frustration – I wanted to use the –ExecutionPolicy parameter to ensure the script ran successfully, but if I added it (as the first parameter – –File has to be the last one) the whole command failed. As it happens I set the execution policy in Group Policy for all the build servers, but I like the belt and braces approach.
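For reference, the working invocation looks roughly like the line below. The script path is illustrative, as is the Bypass policy value – the commented line shows the parameter ordering that failed for me:

# powershell.exe -ExecutionPolicy Bypass -File C:\BuildScripts\Install-BuildServer.ps1   (this ordering failed)
powershell.exe -File C:\BuildScripts\Install-BuildServer.ps1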

The biggest point here is the timeout setting. Richard’s script can take an hour or so to run, so the timeout is a BIG number. The first few times I deployed, the script task failed because of this, although in reality the script itself was still running happily and completed fine.

[Screenshot: the PowerShell application settings with the increased timeout]

I have changed the advanced settings for this script, though. To make debugging a little easier (the VM is a black box whilst deploying, so it’s tricky to see what’s going on) I have directed standard output and standard error to a file. I’ve also turned off the options to automatically ‘detect’ failures by watching output, errors and exit codes. Richard’s script can be run repeatedly with no ill effect, so I’ve left the restart option set to restart the script. That means that if the deployment job fails and I restart it from SCVMM, the script will be run again.

[Screenshot: the script’s advanced settings, redirecting output and disabling failure detection]

Once the service template was created, I deployed the service with a single server. We then added a second server by scaling out the tier.

[Screenshot: scaling out the tier]

We can do this via the SCVMM console, or using the virtualmachinemanager PowerShell module:

$serviceInstance = Get-SCService -Name "TFS Build Servers"
$computerTier = Get-SCComputerTier -Service $serviceInstance | where { $_.Name -eq "TFS Build Server" }
New-SCVirtualMachine -ComputerTier $computerTier -Name "Build03" -Description "" -ReturnImmediately -ComputerName "Build03" -StartAction "NeverAutoTurnOnVM" -StopAction "SaveVM"

Lessons Learned

It’s been an interesting process overall. I think we’ve made the right choice in using a single, easily modifiable PowerShell script to do the heavy lifting here. Yes, I could have created application profiles in SCVMM for each of the items Richard installed, but it would have been harder to make changes (and things like the Azure SDK are updated faster than I can blink!).

I’m still considering whether adding the domain location to the UnattendSettings in the VM template object was a good choice. I’m happy that the language settings should go in there – we never change those. I need to experiment with how adding an unattend.xml file affects the settings in the template object.

Service templates are a great way to go. When you think about it, most of our IT systems tend to be service-focused. Using service templates for things like SharePoint or CRM is a no-brainer, but so is using them for things like web servers, where we have a number of relatively heterogeneous VMs that host internal web services or sites. Services in SCVMM collect those VMs into manageable groups that can be easily spread across multiple hosts for resilience if required. It’s also much easier to find VMs in services than in a very long list of hosts!

In terms of futures, I’m interested in where Desired State Configuration will take us. Crafting the necessary elements for this project would be extremely complicated with DSC right now, but when all the ducks are in order it should make life much, much easier, and DSC is certainly on my learning list.

Our TFS Lab Management Infrastructure

Richard and I spend a good deal of time talking about Lab Manager and our environments. I’ve written here before about our migration to the latest versions of the various components of Lab, and both Richard and I have delivered sessions at user groups and conferences.

Richard was in Belgium last week for Techorama, after which he was asked about the specifics of our setup. Between us, we came up with a diagram of our Lab Environment and Richard recently posted that to his blog. Hopefully some of you will find it useful.

Fixing Lab Manager environments with brute force

As you’ve probably seen, our Lab Manager/SCVMM 2008 R2 upgrade to SCVMM 2012 SP1 was not the smoothest in the world. The end result was a clean Lab Manager and SCVMM install, but a raft of virtual machines that had previously been part of environments.

In tidying up, Richard and I learned a few things about picking apart VMs that were once part of an environment such that a new environment could be built from the wreckage.

There are two approaches to getting what you need: firstly, you could simply compose the existing virtual machines into a new environment without storing them in, and deploying them from, SCVMM. Secondly, you could pull the VMs back into SCVMM so that you could build a new environment.

Don’t forget to fix the networks

If you want to use the running VMs you will need to make sure that you have recreated any private networks generated by Lab Manager. These are all helpfully listed in the XML configuration file of the VMs. They are normally named Lab_<GUID>_NI, so they are easy to find in the file. On the Hyper-V host, using Hyper-V Manager, you will need to create a new private virtual network with the name you just found. You should then attach the synthetic network adapter of your VMs (not the legacy network adapter) to this private network. If you have a DC, and you told Lab Manager it was a DC, then you are likely to need to hook its legacy adapter to the private network as well.
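If you’d rather script that, something like the following sketch should work, assuming the Hyper-V PowerShell module on the host. The GUID and VM name are illustrative – use the network name you found in the XML:

# Recreate the private network with the name from the VM's XML config
New-VMSwitch -Name "Lab_00000000-0000-0000-0000-000000000000_NI" -SwitchType Private
# Attach the synthetic (non-legacy) adapter to the private network
Get-VMNetworkAdapter -VMName "LabVM01" | Where-Object { -not $_.IsLegacy } |
    Connect-VMNetworkAdapter -SwitchName "Lab_00000000-0000-0000-0000-000000000000_NI"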

Scenario 1: Pull existing machines into an environment

The big problem you are likely to find here is that whilst you have imported the VMs onto your Hyper-V server and SCVMM can see the machines just fine, Lab Manager refuses to show them to you.

The reason for this is that Lab Manager believes the VMs are currently part of an environment, just not one it currently knows about. It therefore hides the VMs from you. It turns out that this is pretty straightforward to fix. In the notes field of the running VM’s settings you will see a block of XML. That is read by Lab Manager to identify the VMs in environments. Simply delete that XML and the machine will show up in Lab Manager as being available to compose into an environment.
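You can inspect and clear the notes field from PowerShell too, if you’d rather not click through the GUI (the VM name is illustrative):

# Show the Lab Manager XML lurking in the notes field
(Get-VM -Name "LabVM01").Notes
# Clear it so the machine shows up in Lab Manager
Set-VM -Name "LabVM01" -Notes ""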

Scenario 2: Get the VMs back into SCVMM to build a new environment and deploy it

This is a trickier situation and one which needs to follow the steps I talked about in my previous post about building VMs for Lab Manager.

The problem here is not just the XML, but that Lab Manager has probably mangled the hardware settings of the VM as well. You will need to tidy each VM before storing it in SCVMM ready for Lab Manager (a scripted sketch follows the list):

  • Remove the XML from the notes field.
  • Remove the legacy network adapter.
  • Configure the network adapter within Windows to use an IP address and DNS handed to it by DHCP.
  • Delete any snapshots.
  • Make sure you cleanly shut down the VM – don’t save it!
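Most of that tidying can be scripted with the Hyper-V module. This is a sketch only – the VM name is illustrative, and the DHCP change still needs doing inside Windows:

$vmName = "LabVM01"
Set-VM -Name $vmName -Notes ""                       # remove the Lab Manager XML
Get-VMNetworkAdapter -VMName $vmName |
    Where-Object { $_.IsLegacy } | Remove-VMNetworkAdapter    # drop the legacy adapter
Get-VMSnapshot -VMName $vmName | Remove-VMSnapshot   # delete all snapshots
Stop-VM -Name $vmName                                # clean shutdown - don't save!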

If you follow those steps you can store the VMs back in SCVMM, then build a new environment from the stored VMs. If this still gives you trouble, export the VMs from Hyper-V, reimport them as a copy to get a new unique ID, and then push those into SCVMM.

So far this has worked just fine for us, with Richard working his magic in Lab Manager whilst I fix up VMs in Hyper-V and SCVMM.

Things to remember when building virtual machines for a Lab Manager environment

As you will have read on both my blog and Richard’s, we have recently upgraded our Lab environment, and it wasn’t the smoothest of processes.

However, as always it has been a learning experience, and this post is all about building sets of VMs that can be sucked into Lab and turned into a Lab environment that can be pushed out multiple times.

Note: This article is all about virtual machines running on Windows Server 2012 that may have been built on Windows 8, and that are managed by SCVMM 2012 SP1 and Lab Manager/TFS 2012 CU1. Whilst the things I have found in terms of prepping VMs for Lab Manager are likely to be common to older versions, your mileage may vary.

Approaches to building environments

There are a number of approaches to building multi-machine environments that developers can effectively self-serve as required:

  • The ALM Rangers have a VM Factory project on Codeplex which aims to deliver scripted build-from-scratch on demand.
  • SCVMM has templates for machines that are part-built and stored after running sysprep. Orchestrator can then be used to deploy templates and run scripts to wire them together.
  • Lab Manager allows you to take running VMs and group them together into an environment. It stores all the VMs in SCVMM and, when requested, generates new VMs by copying the ones from the library.

Trouble at ‘mill

There are also a number of problems in this space that must balance the needs of IT pros with the needs of developers:

  • Developers are an impatient bunch. They will request the environment at the last minute and need it deployed as quickly as possible. This doesn’t necessarily work well with complete bare-metal scripted approaches.
  • Developers would also prefer some consistency – if they have to remember one set of credentials it’s probably too much. Use different accounts, passwords and machine names for all your environments and it can get tricky.
  • Developers love to use the Lab Manager and Test Manager tooling. This delivers great integration with the Team Project in Team Foundation Server.
  • IT Pros need to deal with issues caused by multiple machines with the same identities sharing a network. This is especially true of domain controllers.
  • IT pros would like to keep the number of snapshots (SCVMM checkpoints) to a minimum, especially when memory images are in play as well.
  • IT pros would prefer the environments used by the developers to match the way things are installed in the real world. This is less critical for the actual development environment but really important when it comes to testing. This tends to lead to requirements for additional DNS entries and multiple user accounts. This is especially true if you are building SharePoint farms properly.

How IT pros would do it…

Let’s use one of our environments as an example. We have a four-server set:

  1. The Domain Controller is acting as DNS and also runs SQL Server. It doesn’t have to do the latter, but we were trying to avoid an additional machine. Reporting Services and Analysis Services are installed, and Reporting Services is listening on a host header with a DNS CNAME entry for it.
  2. An IIS server allows for deployment of custom web apps.
  3. A CRM 2011 server is using the SQL instance on the DC for its database and Reporting Services functions. The CRM system itself is published on another host header.
  4. A SharePoint 2010 server is using the SQL instance as well. It has separate web applications for intranet and mysites and each is published on a separate host header.

If we were building this without Lab Manager then we would give the machines two NICs: one on our network and the other on a private network. On the DC we unbind the nasty Windows protocols from our network. Remote Desktop is enabled on all machines for the devs to access them.

Lab Manager complicates matters however. It is clever enough to understand that we might need to keep DC traffic away from our network and has a mechanism to deliver this, called Network Isolation. How it actually goes about that is somewhat problematic, however.

Basically, Lab Manager wants to control all the networking in the new environment. To do that it adds new network adapters to the VMs and it uses those new adapters to connect to the main network. It expects a single adapter to be in the original VM, which it connects to a new private network that it creates.

Did I mention that IT pros hate GUIDs? Lab Manager loves them. Whilst I can appreciate that it’s the best way to generate unique names for networks and VMs it’s a complete pain to manage.

Anyway, it’s really, really easy to confuse Lab Manager. Sadly, if the IT pro builds what they consider to be a sensible rig, that will confuse Lab Manager right away. The answer is that we need to build our environment the right way and then trim it in readiness for the Lab Manager bit.

Building carefully

I would build my environment on my Windows 8 box. I create a private network and use that as a backbone for the environment. I assign fixed IP addresses to each server on that network. Each server uses the DC as its DNS. That way I can ensure everything works during the build. I also add a second NIC to each box that is connected to my main network, and I carefully set the protocols that are bound to that NIC. Both of those network adapters are what Lab Manager calls ‘synthetic’ – they are the native virtualised adapters Hyper-V uses, not the emulated legacy adapter.
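On the Windows 8 box that wiring looks something like the sketch below, using the Hyper-V module. The switch and VM names are illustrative:

# Private backbone switch for the environment
New-VMSwitch -Name "LabBackbone" -SwitchType Private
# Connect the VM's first adapter to the backbone...
Connect-VMNetworkAdapter -VMName "DC01" -Name "Network Adapter" -SwitchName "LabBackbone"
# ...and add a second synthetic adapter on the main network
Add-VMNetworkAdapter -VMName "DC01" -SwitchName "External"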

I carefully make sure that all the DNS entries required for host headers are created as CNAMEs that point to the host record for the server I need. This is important because all the IP addresses will change when Lab Manager takes over.
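On the DC that’s one line per entry with the DnsServer module (zone and names are illustrative):

Add-DnsServerResourceRecordCName -ZoneName "mydomain.local" -Name "intranet" -HostNameAlias "sp2010.mydomain.local"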

I may make snapshots as I build so I can move back in time if something goes wrong.

When built, I will probably store my working rig so I can come back to it later. I will then change the rig, effectively breaking it, in order to work with Lab Manager.

The Lab Manager readiness checklist

  • Lab Manager will fail if there is more than a single network adapter. It must be a synthetic adapter, not a legacy one. The adapter should be set to use DHCP for all its configuration – address and DNS.
  • Install, but do not configure, the Visual Studio Test Agent before you shut the machines down. We’ve seen Lab fail to install this many times, but if it’s already there it normally gets configured just fine.
  • Delete all the snapshots for the virtual machine. Whilst Lab Manager can cope with snapshots, both it and SCVMM get confused when machines are imported with different configurations in the snapshots from the final configuration. It will stop Lab Manager in its tracks.
  • Make sure there is nothing in the notes field of the VM settings. Both Lab Manager and SCVMM shove crap in there to track the VM. If anybody from either team is listening, this is really annoying and gets in the way of putting notes about the rigs in there. Lab Manager shoves XML in there to describe the environment.
  • Make sure there are no saved states. Your machines need to be shut down properly when you finish, before importing into SCVMM. The machines need to boot clean or they will get very confused and Lab Manager may struggle to make the hardware changes.
  • Make sure you export the machines – don’t just copy the folder structure, even though it’s much easier to do.
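That export step is easily scripted with the Hyper-V module (the VM name and path are illustrative):

Export-VM -Name "LabVM01" -Path "D:\LabExports"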

Next, get it into SCVMM

There is a good reason to export the VMs. It turns out that SCVMM latches on to the unique identifier of the VM (logical, if you think about it). The snag with this is that you can end up with VMs ‘hiding’. If I copy my set of four VMs to an SCVMM library share I can’t have a copy running as well. Unless you do everything through SCVMM (and for many, many reasons I’m just not going to!) you can end up with confusion. This gets really irritating when you have multiple library shares because if you have copies of a VM in more than one library, one will not appear in the lists in SCVMM. There are good reasons why I might want to store those multiple copies.

Back to the plot. SCVMM won’t let us import a VM. We can construct a new one from a VHD, but I have yet to find a way to import a VM (why on earth not? If I’ve missed something, please tell me!). So we need to import our VMs onto a server managed by SCVMM. We have a small box for just this purpose – it’s not managed by Lab Manager, but it is managed by our SCVMM, so I can pull machines from it into the library.

Import the VMs onto your host using Hyper-V Manager. Make sure you create sensible folder structures and names for them all. Once they are imported, make sure you close Hyper-V Manager. I have seen SCVMM fail to delete VM folders correctly because Hyper-V Manager seems to hold the VHD open for some reason.

In SCVMM, refresh the host you’ve just imported the VMs to. You should see them in the VM list. I tend to refresh the VMs too, but that’s just me. Start the VMs and let SCVMM gather all the information from them, such as host names. I usually leave them for a few minutes, then shut them down cleanly from the SCVMM console.
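Those refreshes can also be done with the virtualmachinemanager module (host and VM names are illustrative):

Get-SCVMHost -ComputerName "labstaging01" | Read-SCVMHost
Get-SCVirtualMachine -Name "LabVM01" | Read-SCVirtualMachine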

Now we know SCVMM is happy with them, we can store the VMs in the SCVMM library that Lab Manager uses. You should see them wink out of existence on the VM host once the store is complete.

Create the Lab environment

At this point the IT guys can hand over to the people managing labs. In our case that’s Richard. He can now compose a new environment within Lab Manager and pull the VMs I have just stored into his lab. He tells the lab that it needs to run with network isolation and identifies the DC.

What Lab Manager will then do is deploy a new VM through SCVMM using the ones I built as a source. It will then modify the hardware configuration of the VMs, adding a legacy network adapter. It also configures the MAC address of the existing synthetic adapter to be static.

A new private virtual network is created on the target VM host. It’s really hard to manage these through SCVMM, so if Lab ever leaves them hanging around I delete them using Hyper-V Manager. The synthetic adapters in the VMs are connected to the private network, while the legacy adapters are connected to the main network.

Exactly why they do it this way I’m not sure. Other than needing legacy adapters for PXE boot (which this isn’t doing), I can’t see why legacy adapters are used at all. I am assuming the Visual Studio team selected them for a good reason, probably around issuing commands to the VMs, but I don’t know what it is.

When the environment is started, Lab will assign static IP addresses to the NICs attached to the private network. All ours seem to be 192.168.23.x addresses. It will also set the DNS address to be that which has been assigned to the DC in the lab. The legacy adapters will be set to DHCP for all settings. The end result is a DC that is only connected to the private network and all other machines connected to both private and main networks.

Once the environment is up, Lab Manager should configure the test agent and you’re off. The new lab environment can then be stored in such a way as to allow multiple copies to be deployed as required by the devs.