But it works on my PC!

The random thoughts of Richard Fennell on technology and software development

Error TF400129: Verifying that the team project collection has space for new system fields when upgrading TFS to 2012.2

Whilst testing an upgrade of TFS 2010 to TFS 2012.2 I was getting a number of verification errors in the TFS configuration upgrade wizard. They were all TF400129 errors, such as

TF400129: Verifying that the team project collection has space for new system fields

but others also mentioned models and schema.

A quick search threw up this thread on the subject, but on checking the DB tables I could see my problem was altogether more basic. The thread talked of TPCs in incorrect states. In my case I had been provided with an empty DB, so TFS could not find any tables at all. So I suppose the error message was a bit too specific; it should have been a ‘DB is empty!!!!’ error. Once I got a valid backup restored for the TPC in question all was OK.

A bit more digging showed that I could also see an error if I issued the command

tfsconfig remapdbs /sqlinstances:TFS1 /databaseName:TFS1;Tfs_Configuration

This too reported that it could not find a DB it was expecting.

So the tip is: make sure you really have restored the DBs you think you have.

What machine name is being used when you compose an environment from running VMs in Lab Management?

This is a follow up to my older post on a similar subject 

When composing a new Lab Environment from running VMs the PC you are running MTM on needs to be able to connect to the running VMs. It does this using IP so at the most basic level you need to be able to resolve the name of the VM to an IP address.

If your VM is connected to the same LAN as your PC, but not in the same domain, the chances are that DNS name resolution will not work. I find the best option is to put a temporary entry in your local hosts file, keeping it for just as long as the creation process takes.

But what should this entry be? Should it be the name of the VM as it appears in the MTM new environment wizard?

Turns out the answer is no, it needs to be the name as it appears in the SC-VMM console.


So the hosts file should contain the correct entries for the FQDN (watch out for typos here, a mistyped IP address only adds to the confusion) e.g.

10.10.10.100 wyfrswin7.wyfrs.local
10.10.10.45 shamrockbay.wyfrs.local
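Since a mistyped address is easy to miss, the entries can be sanity-checked before they go into the hosts file. The following is a minimal Python sketch, not part of the original process: the function name `parse_hosts_lines` is made up for illustration, and it only checks that each address is well formed, not that it belongs to the right VM.

```python
import ipaddress

def parse_hosts_lines(text):
    """Return (ip, hostname) pairs; raise ValueError on a malformed line."""
    entries = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) < 2:
            raise ValueError(f"no host name on line: {raw!r}")
        # ip_address() raises ValueError for a mistyped address such as 10.10.10.300
        ip = ipaddress.ip_address(parts[0])
        for name in parts[1:]:
            entries.append((str(ip), name.lower()))
    return entries

# The two entries from the post
SAMPLE = """\
10.10.10.100 wyfrswin7.wyfrs.local
10.10.10.45 shamrockbay.wyfrs.local
"""

if __name__ == "__main__":
    for ip, name in parse_hosts_lines(SAMPLE):
        print(f"{name} -> {ip}")
```

This only catches malformed addresses. Wireshark is still the way to confirm that the machine answering is the one you expect.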

Once all this is set then just follow the process in my older post to enable the connection so the new environment wizard can verify OK.

Remember the firewall on the VMs may also be an issue. Just for the period of the environment creation I often disable this.

Also, Wireshark is your friend: it will show whether the machine you think is responding is the one you really want.

Lab Management with SCVMM 2012 and /labenvironmentplacementpolicy:aggressive

I did a post a year or so ago about setting up TFS Labs and mentioned the command

C:\Program Files\Microsoft Team Foundation Server 2010\Tools>tfsconfig lab /hostgroup /collectionName:myTpc /labenvironmentplacementpolicy:aggressive /edit /Name:"My hosts group"

This can be used to tell TFS Lab Management to place VMs using any memory that is assigned to stopped environments. This allowed a degree of over-commitment of resources.

As I discovered today, this command only works for SCVMM 2010 based systems. If you try it you just get a message saying it is not supported on SCVMM 2012. There appears to be no equivalent for 2012.

However, you can use features such as dynamic memory within SCVMM 2012, so all is not lost.

TF900548 when using my Typemock 2012 TFS custom build activity

Using the Typemock TFS 2012 Build activity I created, I had started seeing the error

TF900548: An error occurred publishing the Visual Studio test results. Details: 'The following id must have a positive value: testRunId.'

I thought it might be down to having patched our build boxes to TFS 2012 Update 1; maybe it needed to be rebuilt due to some dependency? However, on trying the build activity on my development TFS server I found it ran fine.

I made sure I had the same custom assemblies, Typemock autorun folder and build definition on both systems. I did, so it was not that.

Next I tried running the build but targeting an agent not on the same VM as the build controller. This worked, so it seemed I had a build controller issue. So I ran Windows Update to make sure the OS was patched up to date; it applied a few patches and rebooted. After that all was OK and my tests ran again.

It does seem that for many build issues the standard ‘switch it off and back on again’ does the job.

TFS TPC Databases and SQL 2012 availability groups

It is worth noting that when you create a new TPC in TFS 2012, and the TFS configuration DB and other TPC DBs are in SQL 2012 availability groups, the new TPC DB is not placed in this or any other availability group. You have to add it manually and, historically, remove it when servicing TFS. The need to remove it for servicing changes with TFS 2012.2, which allows servicing of high availability DBs.

Recovering network isolated lab management environments if you have to recreate your SC-VMM server’s DB

Whilst upgrading our Lab Management system we lost the SC-VMM DB. This meant we needed to recreate environments that were already running on Hyper-V hosts but were unknown to TFS. If they are not network isolated this is straightforward: just recompose the environment (after clearing out the XML in the VM description fields). However, if they are network isolated and running, then you have to play around a bit.

This is the simplest method I have found thus far. I am interested to hear if you have a better way.

  • In SC-VMM (or via PowerShell) find all the VMs in your environment. They are going to have names in the form Lab_[GUID]. If you look at the properties of the VMs in the description field you can see the XML that defines the Lab they belong to.


If you are not sure which VMs you need, you can of course cross reference the internal machine names with the AD within the network isolated environment. Remember this environment is running, so you can log in to it.

  • Via SC-VMM shut down each VM
  • Via SC-VMM store the VM in the library
  • Wait a while…….
  • When all the VMs have been stored, navigate to them in SC-VMM. For each one in turn open the properties and
    • CHECK THE DESCRIPTION XML TO MAKE SURE YOU HAVE THE RIGHT VM AND KNOW THEIR ROLE
    • Change the name to something sensible (not essential if you like GUIDs in environment member names, but I think it helps) e.g. change Lab_[guid] to ‘My Test DC’
    • Delete all the XML in the Description field
    • In the hardware configuration, delete the ‘legacy network’ and connect the ‘Network adaptor’ to your main network – this will all be recreated when you create the new lab


Note that a DC will not have any connections to your main network as it is network isolated. For the purpose of this migration it DOES need to be reconnected. Again this will be stored by the tooling when you create the new environment.

  • When all have been updated in SC-VMM, open MTM and import the stored VMs into the team project
  • You can now create a new environment using these stored VMs. It should deploy out OK, but I have found you might need to restart it before all the test agents connect correctly
  • And that should be it, the environment is known to TFS Lab Management and is running network isolated

You might want to delete the stored VMs once you have the environment running, but this will be down to your policies. They are not needed, as you can store the environment as a whole to archive or duplicate it with network isolation.
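The two checks in the steps above (is this a Lab_[GUID] VM, and does its description still hold XML?) can be sketched in Python for use against an exported list of VM names and descriptions. This is only an illustration under assumptions: the function names are hypothetical, the GUID is assumed to be in a standard textual form, the description is only tested for being parseable XML (its schema is not assumed), and the sample data is invented.

```python
import uuid
import xml.etree.ElementTree as ET

def is_lab_vm_name(name):
    """True if the name looks like Lab_[GUID] (assumed standard GUID text)."""
    prefix = "Lab_"
    if not name.startswith(prefix):
        return False
    try:
        uuid.UUID(name[len(prefix):])
    except ValueError:
        return False
    return True

def description_has_xml(description):
    """True if the description field parses as an XML document."""
    text = description.strip()
    if not text:
        return False
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Invented sample: (VM name, description) pairs as they might be exported
SAMPLE_VMS = [
    ("Lab_0f8fad5b-d9cb-469f-a165-70867728950e", "<Lab><Role>DC</Role></Lab>"),
    ("My Test DC", ""),
]

if __name__ == "__main__":
    for name, description in SAMPLE_VMS:
        print(name, is_lab_vm_name(name), description_has_xml(description))
```

A VM that has been renamed and had its description cleared, like the second sample entry, fails both checks, which is the state the steps above leave each VM in before it is imported into MTM.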

Upgrading our TFS 2012 Lab Management to use SC-VMM 2012 SP1

Background

We have been successfully using our TFS Lab Management system for a while. However, we noticed that performance slowed when deploying environments. This was down to I/O congestion between our servers and the SAN that provided their main VM storage, because the library store and the Hyper-V host servers all shared the same SAN.


We also needed to start creating environments using Windows 8 and Server 2012. This is not possible using System Center Virtual Machine Manager (SCVMM) 2008 or Hyper-V hosted on anything earlier than Windows Server 2012.

So it was time to do an upgrade of both the operating system, and our underlying hardware. The decision was made to dump the SAN and provide each Hyper-V host with its own SAS based RAID 5 disk. The main Lab SCVMM library server would also use local storage. All servers would be moved to Server 2012 and SCVMM would advance to 2012 with Service Pack 1 (required to manage Server 2012).


You might ask why we did not do this sooner, especially the move to support Windows 8 and Server 2012 VMs? The answer is that until TFS 2012 quarterly update 1 there was no support for System Center 2012, which itself had to wait until the SP1 release in January. So the start of the year was the first time we could do the update. We had planned to do it very early in the year, but waiting for hardware and scheduling of our time let it drag on.

Hardware

The hardware upgrade was less straightforward than originally anticipated, as we needed to change the host controller card to support the required RAID 5 array of disks. Once that was done on each of the Hyper-V hosts, installing the OS and configuring Hyper-V was quick and easy.

Meanwhile the library server was moved to a different box to allow for the sliding block puzzle of storage move and OS upgrades.

The Plan

After much thought we decided our upgrade path would be

  1. Stop all environments
  2. Rebuild each Hyper-V server with Server 2012 on their mirrored boot disks, adding the new SAS hardware and removing the SAN disks.
  3. Build the new SCVMM 2012 SP1 server on Server 2012. The DB server would also move from an in-place SQL 2008 R2 instance to be hosted on our resilient SQL 2012 with Always On (also now supported in SCVMM 2012 SP1). This would mean an upgrade with existing database, as far as SCVMM is concerned.
  4. Add the rebuilt Hyper-V hosts to SCVMM
  5. Get the deployed VMs off SAN and onto the new SAS disks, along with a transfer of the library share.
  6. Reconfigure TFS to point to the new SCVMM server

What really happened

SCVMM Upgrade

The major problem we found was that you can’t upgrade the database from SCVMM 2008 R2 SP1 to SCVMM 2012 SP1. It needs to be upgraded to SCVMM 2012 first.

In fact the SCVMM install process then suffered a cascade of issues:

  • SCVMM 2008 R2 needed the WAIK for Vista. SCVMM 2012 required the Win7 WAIK and SCVMM 2012 SP1 wanted the Windows 8 Assessment and Deployment Kit (ADK).
  • Each time SCVMM wanted to upgrade its database, the database had to be in Simple Recovery mode. This is incompatible with Always On, which requires Full Recovery mode. This meant that the DB couldn’t be moved into the Availability group until the final installation was complete.

In the end we built a temporary VM for SCVMM 2012. The original host was left on Server 2008 R2 with the Vista WAIK; the intermediate host had Server 2008 R2 with the Windows 7 WAIK and the final host had Server 2012 with the Windows 8 ADK.

The problem we then found was that the final SCVMM 2012 SP1 system failed to start the Virtual Machine Manager service. The error we saw was:

Problem signature:
P1: vmmservice
P2: 3.1.6011.0
P3: Utils
P4: 3.1.6011.0
P5: M.V.D.SqlRetryCommand.InternalExecuteReader
P6: M.V.DB.CarmineSqlException
P7: b82c

We rolled back and repeated the process a number of times, and even tried a final in-place upgrade of the original host. Each resulted in the same fault. The internet turned up no help – two or three people reported the same fault and each one required a clean SCVMM installation to fix it.

Interestingly, our original DB size was around 700MB. Each time, the upgrade process left a final DB of around 70MB, suggesting something had gone wrong during DB updates. Whether this had anything to do with the presence of Lab we don’t know.

In the end we had no choice but to install SCVMM 2012 SP1 clean, with a new database. Once we did that everything worked just fine from the SCVMM point of view.

TFS

First we repointed TFS at the new SCVMM server

Tfsconfig.exe lab /settings /scvmmservername:my_new_scvmmservername /force

This worked OK, so we then tried to upgrade the schema

tfsconfig lab /upgradeSCVMM /collectionName:*

But this errored. When we tried to change the library and host groups, both via the UI and the command line, we also got errors. The problem was that TFS thought there were libraries and host groups on the now retired old SCVMM server.

The solution was to open Microsoft Test Manager (MTM) and delete the live environments and stored environments, VMs and templates. This had no effect on the new SCVMM server as these entries referred to the now non-existent SCVMM host. It just removed the entries in the TFS databases.

Once this was done we could run the command

tfsconfig lab /upgradeSCVMM /collectionName:*

And we could also reset the library and host groups to point to the new server.

So what do we have?

The upgrade (reinstall?) is now done and what do we have? On the Hyper-V hosts we have running VMs, and due to the effort of our IT team we have wired back the virtual networks previously created for Lab Management for network isolation. It is a mess and needs doing via MTM, but it works for now.

The SCVMM library contains loads of stored environments. However, as they were stored using the GUID based names used in Lab Management, their purpose is not that obvious.

As we had to delete the SCVMM database and TFS entries we have lost that index of GUIDs to sensible names.

The next step?

We need to manually take VMs and get them imported correctly into the new SCVMM library so that they can be deployed.

For running environments that don’t require network isolation we can compose new environments to wrap the VMs. However, if we need network isolation we can’t see any method other than to push each VM up into the library and re-deploy them as a new network isolated environment. More posts to follow, as I am sure we will learn stuff doing this.

Points of note:

  • Lab Manager shoves a load of XML into the notes field of a VM on Hyper-V. If that XML is present, Lab assumes the VM is part of an environment and won’t show it as a VM available for use in a new environment. Deleting the XML makes the VM magically reappear.
  • Allowing developers to make lots of snapshots, particularly with saved states, is a bad idea. When migrating between server versions, the Hyper-V saved states are incompatible so you will lose lots and lots and lots of work if you’re not careful.
  • Having lots of snapshots with configuration variations might also cause confusion. Collapsing snapshots overall is not a bad idea.
  • If you have a VM in the library (we now have multiple libraries to make sure people using Lab can’t accidentally delete key VM images) with the same VM ID as a running machine, SCVMM won’t show you the one that is running. If you think it through this makes sense – SCVMM doesn’t export and import VM folders, it simply hefts them around the network. This is just like copying a VM from one Win8 box to another – the default option is to import the machine with the same ‘unique’ identifier.
    The solution to this one is to import a new copy of the source VM onto the Hyper-V host, choosing the ‘make a copy and generate a new ID’ option. This new VM can then be manipulated with SCVMM and you still have a ‘gold master’ in your library.
  • Enabling Server 2012 data deduplication on your SCVMM library shares is a very good plan. Ours is saving 75% of disk space. The only thing to be wary of is that if you pull a VM from the library onto Hyper-V and then store it back to the library the file will take up the ‘correct’ amount of disk space until the dedupe job runs again. If you’re not careful you can ‘overfill’ your disk this way!
  • SCVMM 2012 and SP1 like clouds and offer all kinds of technology solutions that can build a virtual infrastructure for you with VLANs and all kinds of cleverness. This will confuse the life out of Lab Manager so don’t do it for your Lab Environment. We now have a couple of ‘clouds’ in SCVMM and the Lab one consists of little more than the Hyper-V hosts and associated libraries. There is one virtual switch and one logical switch, both of which exist only to hook up our main network. Anything more complex will confuse Lab.
    Meanwhile, Lab still creates new Hyper-V Virtual Switches. SCVMM will know nothing of these so you need to be aware of that when inspecting VMs in SCVMM.
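The ‘overfill’ risk in the deduplication point above is easy to underestimate, so here is a small Python sketch of the arithmetic. Only the 75% saving comes from our system; the disk size, library size and VM sizes are invented for illustration, and the model assumes newly stored files occupy their full size until the next dedupe job and then achieve the same saving as everything else.

```python
def physical_size_gb(logical_gb, saving_ratio):
    """Disk space used by data once deduplicated at the given saving ratio."""
    return logical_gb * (1.0 - saving_ratio)

def free_after_store_gb(disk_gb, deduped_logical_gb, saving_ratio, new_raw_gb):
    """Free space right after storing new files, before the dedupe job runs."""
    used = physical_size_gb(deduped_logical_gb, saving_ratio) + new_raw_gb
    return disk_gb - used

if __name__ == "__main__":
    disk_gb = 2000.0         # hypothetical library disk
    library_gb = 6000.0      # hypothetical logical size of stored VMs
    saving = 0.75            # the saving we see on our library share
    stored_back_gb = 600.0   # hypothetical: three 200 GB VMs stored back

    before = disk_gb - physical_size_gb(library_gb, saving)
    during = free_after_store_gb(disk_gb, library_gb, saving, stored_back_gb)
    after = disk_gb - physical_size_gb(library_gb + stored_back_gb, saving)

    print(f"free before store:   {before:.0f} GB")
    print(f"free before dedupe:  {during:.0f} GB")  # negative means it will not fit
    print(f"free after dedupe:   {after:.0f} GB")
```

With these made-up numbers the library looks comfortable before and after, but the store operation itself would not fit, which is exactly the trap described above.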

If we were doing it again…

So what have we learnt? The upgrade of the SCVMM database is critical. If this fails you have no option other than to rebuild and manually recreate from the basic VMs.

Even with SC-VMM 2012 SP1 and Lab Management there are still many moving parts you have to consider to get the system working. The change from 2008 to 2012 is like going from Imperial to Metric measurements. It is the same basic idea, it just feels like everything has changed.

Thanks to Rik and Rob for helping with this post