But it works on my PC!

The random thoughts of Richard Fennell on technology and software development

Why can’t I create an environment using a running VM on my Lab Management system?

With TFS Lab Management you can build environments from VMs and VM templates stored in an SCVMM library, or from VMs running on a Hyper-V host within your lab infrastructure. This second form is what used to be called composing an environment in TFS 2010. Recently when I tried to compose an environment I had a problem. After selecting the running VM inside the new environment wizard I got the red star that shows an error in the machine properties.

image

Now I would only expect to see this when creating an environment with a VM template, as a red star usually means the OS profile is not set, e.g. you have missed a product key, or passwords don’t match. However, this was a running VM so there were no settings I could make, and no obvious way to diagnose the problem. After a few emails with the Microsoft Lab Management team we got to the bottom of the problem: it was all down to the Hyper-V host’s network connections. But that is rushing ahead; first let’s see why it was a confusing problem.

First the red herring

We now know the issue was the Hyper-V host network, but at first it looked like I could compose some guest VMs but not others. I wrongly assumed the issue was some bad meta-data or corrupt settings within the VMs. This problem all started after a server crash and so we were fearing corruption, which clouded our thoughts.

The actual reason some VMs could be composed and some could not depended on which Hyper-V host they were running on, not on the VMs themselves.

The diagnostic steps

To get to the root of this issue a few commands and tools were used. Don’t think for a second there was not a lot of random jumping about and trial and error. In this post I am just going to point out what was helpful.

Firstly you need to use the TFSConfig command on your TFS server to find out your network location setting. So run

C:\Program Files\Microsoft Team Foundation Server 11.0\Tools>tfsconfig lab /settings /list
SCVMM Server Name: vmm.blackmarble.co.uk
Network Location: VSLM Network Location
IP Block: 192.168.23.0/24
DNS Suffix: blackmarble.co.uk
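
If you have several hosts to check, it can help to pull the network location out of that output with a script rather than by eye. The following is only a minimal sketch in Python, assuming the output keeps the simple "Key: Value" layout shown above. The function names and the `host_locations` dictionary (host name to the location you read off in SCVMM) are hypothetical, made up for illustration; nothing here talks to TFS or SCVMM.

```python
def parse_lab_settings(output: str) -> dict[str, str]:
    """Turn the 'Key: Value' lines of the tfsconfig output into a dictionary."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        # Skip blank lines and an echoed prompt such as C:\...>tfsconfig
        if sep and not value.startswith("\\"):
            settings[key.strip()] = value.strip()
    return settings


def hosts_missing_location(settings: dict[str, str],
                           host_locations: dict[str, str]) -> list[str]:
    """Return the hosts whose recorded location differs from the lab setting.

    host_locations is a hand-collected mapping (an assumption of this sketch),
    e.g. {"hyperv1": "VSLM Network Location", "hyperv2": ""}.
    """
    wanted = settings.get("Network Location", "")
    return sorted(host for host, location in host_locations.items()
                  if location != wanted)
```

Feeding it the output above and a list of what each host shows in SCVMM gives you the hosts worth a closer look.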

Next you need to see which, if any, of your Hyper-V hosts are connected to this location. You can do this in a few graphical ways in SCVMM (and I am sure via PowerShell too).

If you select a Hyper-V host in SCVMM, right click and select View networking. On a healthy host you see the VSLM network location connected to the external network adaptor the VMs are using.

image

On my failing Hyper-V host the VSLM network was connected to an empty network port

image

You can also see this in SCVMM > host (right click) > properties. If you look on the networking tab for the main virtual network you should see the VSLM network as the location. On the failing Hyper-V host this location was empty.

image

The solution

You would naively think selecting the edit option on the screen shot above would allow you to enter the VSLM Network as the location, but no. Not on that tab. You need to select the hardware tab.

image

You can then select the correct network adaptor and override the discovered network location to point to the VSLM Network Location. Once this was done I could compose environments as I would expect.

I have said it before, but Lab Management has a lot of moving parts, and they all must be set up correctly or nothing works. A small configuration error can seriously ruin your day.

Did I delete the right lab?

It was bound to happen in the end: the wrong environment got deleted on our TFS Lab Management instance. The usual selection of rushing, minor mistakes, misunderstandings and not reading the final dialog properly, and BANG, you get that sinking feeling as you see the wrong set of VMs being deleted. Well this happened yesterday, so was there anything that could be done? Luckily the answer is yes, if you are quick.

Firstly we knew SCVMM operations are slow, so I RDP’d onto the Hyper-V host and quickly copied the folders that contained the VMs scheduled to be deleted. We now had a copy of the VHDs.

On the SCVMM host I cancelled the delete jobs. Turns out this did not really help as the jobs just get rescheduled. In fact it may make matters worse, as the failing of jobs and their restarting seems to confuse SCVMM. It took hours before it was happy again; it kept giving ‘can’t run job as XXX in use’ and losing sight of the Hyper-V hosts (I needed to restart the VMM service in the end).

So I now had a copy of three network isolated VMs, so I

  • Created new VMs on a Hyper-V host using Hyper-V manager with the saved VHDs as their disks. I then made sure they ran and were not corrupted
  • In SCVMM cleared down the saved state so they were stopped (I forgot to do this the first time I went through this process and it meant I could not deploy the stored VMs into an isolated environment, that wasted hours!)
  • In SCVMM put them into the library on a path our Lab Management server knows about (gotcha here is SCVMM deletes the VM after putting it into the library, this is unlike MTM Lab Center which leaves the original in place, always scares me when I forget)
  • In MTM Lab Center import the new VMs from the library
  • Create a new network isolated environment with the VMs
  • Wait……………………….

When it eventually started I had a network isolated environment back in the state it was in when we, in effect, pulled the power out. It all took about 24 hours, but most of this was waiting for copies to and from the library to complete.

So the top tip is try to avoid the problem, this is down to process frankly

  • Use the ‘mark as in use’ feature to say who is using a VM
  • Put a process in place to manage the lab resources. It does not matter how much Hyper-V resource you have, you will run out in the end and be unable to add that extra VM. You need a way to delete/archive out what is not currently needed
  • Read the confirmation dialogs, they are there for a reason

TF900546, can’t run Windows 8 App Store unit tests in a TFS build

Today has been one of purging build system problems. On my TFS 2012 Windows 8 build box I was getting the following error when trying to run Windows 8 App Store unit tests

TF900546: An unexpected error occurred while running the RunTests activity: 'Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.'.

On further investigation, I am not really sure anything was working too well on this box. To give a bit of background

  • I have one build controller build2012
  • with a number of build agents spread across various VMs. I use tags to target the correct agent e.g. SUR40 or WIN8

In the case of Windows 8 builds (where the TFS build agent has to run on a Windows 8 box) the build seemed to run, but tests failed with the TF900546 ‘it’s broken, but I am not saying why’ error. As usual there was nothing in the logs to help.

To try to debug the error I added a build controller to this box, and eventually, just like Martin in his post, noticed, after far too long, that I was getting an error on the build service on the Windows 8 box and the agent was not fully online.

image

The main symptom is the build agent says ready, but shows a red box (stopped). If you hit the details link that appears you get the error dialog. Martin had a 500 error, I was getting a 404. I had seen similar problems before, I really should read (or at least remember) my own blog posts.

I can’t stress enough, if you don’t see a green icon on build controllers and agents you have a problem. It might not be obvious at that point, but it will bite you later!

For me the fix was the URL I was using to connect to the TFS server. I was using HTTPS (SSL); as soon as I switched to HTTP all was OK. In this case this was fine as both the TFS server and build box were in the same rack so SSL was not really needed. I suspect that the solution, if I had wanted SSL, would be as Martin outlined, a config file edit to sort out the bindings.

But remember….

That having a working build system is not enough for Windows 8 App Store unit tests. You also have to manually install the application certificate for the test assembly, as detailed in MSDN, as well as getting the build service running in interactive mode.

Once this was done my application built and the tests ran OK.

More thoughts on addressing TF900546 ‘Unable to load one or more of the requested types’ on TFS2012

A while ago I posted about seeing the TF900546 error when running unit tests in a previously working TFS 2012 build. The full error being:

TF900546: An unexpected error occurred while running the RunTests activity: 'Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.'.

Well late last week this problem came back with a vengeance on a number of builds run on the same build controller/agent(s). Irritatingly I first noticed it after a major refactor of a codebase, so I had plenty of potential root causes as assemblies had been renamed and it was possible they might not be found. However, after a bit of testing there were no obvious candidates as all tests worked fine locally on my development PC, and a new very simple test application showed the same issues. It was definitely an issue on the build system.

I can still find no good way to debug this error. Stack Overflow threads mention Fuslogvw and WinDbg, as well as various copy local settings and the like. Again this all seems too much, as this build was working in the past and just seemed to stop. I tried a couple but got no real information, and the error logs were empty.

In the end I just tried what I did before (as I could think of no better tactic to pin down the true issue). I went into the build controller config, removed the reference to the custom assemblies, saved the settings (causing a controller restart), then put it back (another restart of the controller).

image

After this my tests started working again, with no other changes.

Interestingly, a restart of the VM running the build controller did not fix the problem. However this does somewhat chime with comments in the Stack Overflow thread that causing the AppPool in MVC apps to rebuild completely, ignoring any cached assemblies, seems to fix the issue.

Change in the System.AssignedTo in TFS SOAP alerts with TFS 2012

Ewald’s post explains how to create a WCF web service to act as an end point for TFS Alerts. I have been using the model with TFS 2010 to check for work item changed events, using the work item’s System.AssignedTo field to retrieve the owner of the work item (via the TFS API) so I can send an email, as well as other tasks (I know I could just send the email with a standard alert).

In TFS 2010 this worked fine, if the work item was assigned to me I got back the name richard, which I could use as the to address for the email by appending our domain name.

When I moved this WCF event receiver onto a TFS 2012 (using the TFS 2012 API) I had not expected any problems, but the emails did not arrive. On checking my logging I saw they were being sent to fennell@blackmarble.co.uk. Turns out the issue was that the API call

value = this.workItem.Fields["System.AssignedTo"].Value.ToString();

was returning the display name ‘Richard Fennell’, which was not a valid part of the email address.

The best solution I found, thus far, was to check to see if the value was a display name in the AD using the method I found on Stack Overflow. If I got a user name back I used that; if I got an empty string (because I had been passed something that was not a display name) I just used the initial value, assuming it is a valid address.
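
The fallback described above can be sketched as follows. This is an illustration in Python rather than the C# of the real event receiver, and it only shows the decision logic. The `lookup_username` callable is a hypothetical stand-in for the AD query (it is assumed to return the account name for a display name, or an empty string if there is no match); it is not a real TFS or AD API.

```python
from typing import Callable


def resolve_recipient(assigned_to: str,
                      lookup_username: Callable[[str], str],
                      domain: str) -> str:
    """Build the 'to' address for a work item's System.AssignedTo value.

    lookup_username is a hypothetical AD lookup supplied by the caller:
    display name in, account name out, or "" when nothing matches.
    """
    username = lookup_username(assigned_to)
    # An empty result means we were not given a display name,
    # so assume the original value is already a usable account name.
    local_part = username if username else assigned_to
    return f"{local_part}@{domain}"


# A dictionary standing in for the directory, purely for demonstration.
fake_directory = {"Richard Fennell": "richard"}
print(resolve_recipient("Richard Fennell",
                        lambda name: fake_directory.get(name, ""),
                        "blackmarble.co.uk"))
```

Passing the lookup in as a parameter keeps the sketch independent of how the directory is actually queried, and means the same function copes with both the TFS 2010 style value and the TFS 2012 display name.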

Seems to work, but is there an easier solution?

Cannot run coded ui test on a TFS lab due to lack of rights to the drops folder

Whilst setting up a TFS 2012 Lab Management environment for Coded UI testing we hit the problem that none of the tests were running; in fact we could see no tests were even being passed to the agent in the lab.

image

On the build report I clicked on ‘View Test Results’, which loaded the result in Microsoft Test Manager (MTM)

image

and viewed the test run log, and we saw

image

The issue, it claimed, was that the build controller did not have rights to access the drop folder containing the assembly with the Coded UI tests.

Initially I thought the issue was that the test controller was running as ‘local service’, so I changed it to the domain\tfsbuild account (which obviously has rights to the drops folder as it put the files there) but still got the same error. I was confused.

So I checked the event log on the build controller and found the following

image

The problem was my tfslab account, not the local service or tfsbuild one. So the message shown in the build report was just confusing, mentioning the wrong user. The lab account is the one configured in the test controller (yes, you have to ask how I had missed that when I had been into the same tool to change the user the test controller ran as!)

image

As soon as I granted this tfslab user rights to the drops folder all was OK

Team Foundation Service RTMs

Today at Build 2012 it was announced that https://tfspreview.com has RTM'd as Team Foundation Service on https://tfs.visualstudio.com.

Up until now there has been no pricing information, which had been a barrier to some people I have spoken to as they did not want to start using something without knowing the future cost.

So to the really good news, as of now

  • It is free for up to 5 users
  • If you have an active MSDN subscription it is also free. So a team of any size can use it as long as they all have MSDN

The announcement said to look out for further price options next year.

Check the full details at Soma's blog

More fun with creating TFS 2012 SC-VMM environments

Whilst setting up a new SC-VMM based lab environment I managed to find some new ways to fail, above and beyond the ones I have found before.

We needed to build a new environment for testing a CRM application; this needed to have its own DC, IIS server and a CRM server. The aim was to have this as a network isolated environment, but you have to build the various VMs first.

So we did the following

  • On the Hyper-V hosts managed by our SC-VMM server create 3 new VMs connected to our corporate LAN
  • Install the OS on the three VMs
  • Make one of them a DC for dev.local
  • Join the others to the DC’s domain dev.local (they are not joined to our corporate domain)
  • On the IIS box add the web server role
  • On the CRM box install Microsoft CRM

So we now have a three box domain that does what we want, but it is not network isolated. We could have used the features of SC-VMM to push these VMs into the library and hence import into the Lab Management library. However we chose to make sure we could connect to them first as an environment.

So first I tried to create a standard environment, not using SC-VMM. I had to create a local hosts file on the PC running MTM, but once this was done I could verify the environment, so all was OK. I did not actually create it.

Next I tried to create a SC-VMM based environment and this is where I hit a problem. I was basically trying to do something I have done before with all our pre-lab management test VMs: wrap existing VMs in an environment. When I tried to do this the verification failed, saying I could not connect to any of the VMs. First we made sure file sharing was enabled, firewalls were not blocking etc. All to no effect.

To cut a long story short I had a number of issues, mostly down to the reuse of names

  • The SC-VMM VM names for the VMs (e.g. LabDC) did not match the actual host name (DevDC). I had to rename the VM in SC-VMM to match the name of the host in the operating system (I am still unsure if this is a red herring and not really that important, but I think it is good practice anyway)
  • We had to have a hosts file on the MTM box with the fully qualified names for the three boxes (not just the server name). Note that this hosts entry (or it could be DNS if you want) is only needed until the environment is built

    192.168.200.102 devcrm.dev.local
    192.168.200.103 deviis.dev.local
    192.168.200.104 devdc.dev.local

  • The name DevDC had been used on another VM that was being run on one of our developer’s Windows 8 Hyper-V setups. This was causing problems when MTM tried to resolve the machine name via SMB (NetBIOS; IP resolution was fine). Switching off this other VM fixed this. We only spotted it by using Wireshark on the PC running MTM (note you have to run the Wireshark installer in Win7 compatibility mode to get it to work with Windows 8)
  • When entering the login details for the development domain when creating the new environment in MTM, the user ID had to be entered as administrator@dev.local and not dev\administrator
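
Before reaching for Wireshark, a quick check that the fully qualified names resolve to the addresses you expect from the MTM box can rule out the hosts file as the culprit. The following is a minimal sketch in Python using the three entries listed above; the function name is made up for illustration. Note that it only exercises the normal hosts/DNS lookup via `socket.gethostbyname`, so it would not have caught the NetBIOS name clash on its own.

```python
import socket
from typing import Callable

# The three hosts entries from the list above.
EXPECTED = {
    "devcrm.dev.local": "192.168.200.102",
    "deviis.dev.local": "192.168.200.103",
    "devdc.dev.local": "192.168.200.104",
}


def find_resolution_problems(expected: dict[str, str],
                             resolve: Callable[[str], str] = socket.gethostbyname
                             ) -> dict[str, str]:
    """Return a description of every name that fails or resolves wrongly."""
    problems = {}
    for name, wanted in expected.items():
        try:
            got = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[name] = f"lookup failed: {exc}"
            continue
        if got != wanted:
            problems[name] = f"resolved to {got}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for name, problem in find_resolution_problems(EXPECTED).items():
        print(f"{name}: {problem}")
```

An empty result means the names resolve as listed; anything printed points at a missing or stale hosts entry.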

Once this was all done I could verify my environment and create it. The TFS agent was installed, but did not connect to the test controller. This is exactly as expected, as detailed in my previous post.

I now have a few choices

  • If I don’t want to network isolate it I can install a Test Controller in the domain
  • I can save each of the three VMs into the SC-VMM library via MTM and create an isolated environment.

So I hope this helps you avoid some of the problems I have seen. I just wish that the MTM environment creation step gave out a better log file so I don’t have to second guess it or use Wireshark.