But it works on my PC!

The random thoughts of Richard Fennell on technology and software development

Experiences applying TFS 2012 QU1 and it subsequent hotfix

Brian Harry posted last week about a hotfix for TFS 2012 QU1 (KB2795609). This should not be needed by most people, but as his post points out does fix issues for a few customers. Well we were one of those customers. When upgrading from 2012 RTM to 2012 QU1 we had attempted what with hindsight was an over ambitious hardware migration too. This involved swapping our data tier from a SQL 2012 instance to a new 2012 availability group and merging team project collections from different server as well as applying the QU1. Our migration plan contained some team project collection detach/attach steps hence getting into the area this hotfix addresses.

The end point was we ended up with a QU1 upgraded server, but we could only get users connected if we made them team project administrators, a valid short term solution, but something we needed to fix.

We therefore applied the new KB2795609 patch, but hit a gotcha that you should be aware of

  • We ran the patch EXE on our TFS server that was showing the problem.
  • This ran without error, taking about 5 minutes
  • We tried to connect to the patched TFS server via the web client and VS2012, we could make a connection to TFS but could open any TPCs
  • On checking the TFS admin console we saw the TPC was offline and reporting that the servicing had failed (but this had not been reported back via the patch tool)
  • We reran the servicing job (via the TFS admin console) but it failed in the core step we saw in the logs

[Error] TF400744: An error occurred while executing the following script: TurnOnRCSI.sql. Failed batch starts on the line 1. Statement line: 1. Script line: 1. Error: 5069 ALTER DATABASE statement failed.

  • Our TFS DBs are now stored with a SQL 2012 availability group, during the upgrade to QU1 we had seen problems applying the upgrade unless we removed the DBs from the availability groups. So we removed the tfs_configuration and tfs_[mytpc] from availability groups and re applied the servicing job and all was OK
  • Once the servicing of the TPC was completed it went online as expected.
  • We then put the DBs back into the availability group
  • We could then remove the users from the team project administrators group as their previous rights were working again.

So we now had a patched and working TFS 2012 QU1 server. Lets hope that QU2 is a little smoother and we don’t need the direct help of product group, who I must say have been great in getting this problem addressed. I really like the openness we see in Brian’s blog of both the good and the bad.

Why can’t I create an environment using a running VM on my Lab Management system?

With TFS lab management you can build environments from stored VM and VM templates stored in an SCVMM library or from VMs running on a Hyper-V host within your lab infrastructure. This second form is what used to be called composing an environment in TFS 2010. Recently when I tried to compose an environment I had a problem. After selecting the running VM inside the new environment wizard I got the red star that shows an error in the machine properties

image

Now I would only expect to see this when creating an environment with a VM templates as a red star usually means the OS profile is not set e.g. you have missed a product key, or passwords don’t match. However, this was a running VM so there were no settings I could make, and no obvious way to diagnose the problem. After a few email with Microsoft Lab management team we go to the bottom of the problem, it was all down to the Hyper-V hosts network connections, but that is rushing ahead, first lets see why it was a confusing problem.

First the red herring

We now know the issue was the Hyper-V host network, but at first it looked like I could compose some guest VMs but not others. I wrongly assumed the issue was some bad meta-data or corrupt settings within the VMs. Tthis problem all started after a server crash and so we were fearing corruption, which clouded our thoughts.

The actual reason some VMs could be composed and some could not was dependant on which Hyper-V host they were running on. Not the VMs themselves.

The diagnostic steps

To get to the root of this issue a few commands and tools were used. Don’t think for a second there was not a lot of random jumping about and trial and error. In this post I am just going to point out what was helpful.

Firstly you need to use the TFSConfig command on your TFS server to find out your network location setting. So run

C:\Program Files\Microsoft Team Foundation Server 11.0\Tools>tfsconfig lab /settings /list
SCVMM Server Name: vmm.blackmarble.co.uk
Network Location: VSLM Network Location
IP Block: 192.168.23.0/24
DNS Suffix: blackmarble.co.uk

Next you need to see which, if any, of your Hyper-V hosts are connected to this location. You can do this in a few graphically ways in SCVMM (and I am sure via PowerShell too)

If you select a Hyper-V host in SCVVM, right click and select View networking. On a healthy host you see the VSLM network location connected to external network adaptor the VMs are using

image

On my failing Hyper-V host the VSLM network was connected to an empty network port

image

You can also see this on the SCVMM > host (right click) > properties. If you look on  the networking tab for the main virtual network  you should see the VSLM network as the location. On the failing Hyper-V host this location was empty.

image

The solution

You would naively think selecting the edit option on the screen shot above would allow you to enter the VSLM Network as the location, but no. Not on that tab. You need to select the hardware tab.

image

You can then select the correct network adaptor and override the discovered network location to point to the VSLM Network Location. Once this was done I could compose environments as I would expect.

I have said it before, but Lab Management has a lot of moving parts, and they all must be setup right else nothing works. A small configuration error can seriously ruin your day.

Did I delete the right lab?

It was bound to happen in the end, the wrong environment got deleted on our TFS Lab Management instance. The usual selection of rushing, minor mistakes, misunderstandings and not reading the final dialog properly and BANG you get that sinking feeling as you see the wrong set of VMs being deleted. Well this happened yesterday, so was there anything that can be done? Luckily the answer is yes, if you are quick.

Firstly we knew SCVMM operations are slow, so I RDP’d onto the Hyper-V host  and quickly copied the folders that contained the VMs scheduled to be deleted. We now had a copy of the VHDs.

On the SCVMM host I cancelled the delete jobs. Turns out this did not really help as the jobs just get rescheduled. In fact it may make matters worse as the failing of jobs and their restarting seems to confuse SCVMM, took it hours before it was happy again, kept giving ‘can’t run job as XXX in use’ and losing sight of the Hyper-V hosts (needed to restart the VMM service in the end).

So I now had a copy of three network isolated VM, so I

  • Created new VMs on a Hyper-V host using Hyper-V manager with the saved VHDs as their disks. I then made sure they ran and were not corrupted
  • In SCVMM cleared down the saved state so they were stopped (I forgot to do this the first time I went through this process and it meant I could not deploy the stored VMs into an isolated environment, that wasted hours!)
  • In SCVMM put them into the library on a path our Lab Management server knows about (gotcha here is SCVMM deletes the VM after putting it into the library, this is unlike MTM Lab Center which leaves the original in place, always scares me when I forget)
  • In MTM Lab Center import the new VMs from the library
  • Create a new network isolated environment with the VMs
  • Wait……………………….

When it eventually started I had a network isolated environment back to the state it was when we in effect pulled the power out. All took about 24 hours, but most of this was waiting for copies to and from the library to complete.

So the top tip is try to avoid the problem, this is down to process frankly

  • Use the ‘mark a in use’ feature to say who is using a VM
  • Put a process in place to manage the lab resources. It does not matter how much Hyper-V resource you have you will run out in the end and be unable to add that extra VM. You need a way to delete/archive out what is not currently need
  • Read the confirmation dialogs, they are there for a reason

TF900546, can’t run Windows 8 App Store unit tests in a TFS build

Today has been one of purging build system problems. On my TFS 2012 Windows 8 build box I was was getting the following error when trying to run Windows 8 App Store unit tests

TF900546: An unexpected error occurred while running the RunTests activity: 'Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.'.

On further investigation, I am not really sure anything was working too well on this box. To give a bit of background

  • I have one build controller build2012
  • with a number of build agents spread across various VMs. I use tags to target the correct agent e.g. SUR40 or WIN8

In the case of Windows 8 builds (where the  TFS build agent has to run on a Windows 8 box) the build seemed to run, but tests failed with the TF900546 ‘its broken error, but I am not saying why’ error. As usual there was nothing in the logs to help.

To try to debug the error I added a build controller to this box, and eventually, just like Martin in his post noticed, after far too long, that I was getting a error on the build service on the Windows 8 box and the agent was not fully online.

image

The main symptom is the build agent says ready, but shows a red box (stopped). If you hit the details link that appears you get the error dialog. Martin had a 500 error, I was getting a 404. I had seen similar problems before, I really should read (or at least remember) my own blog posts.

I can’t stress enough, if you don’t see a green icon on build controllers and agent you have a problem, it might not be obvious at that point but it will bite you later!

For me the fix was the URL I was using to connect to the TFS server. i was using HTTPS (SSL), as soon as switched to HTTP all was OK. In this case this was fine as both the TFS server and build box were in the same rack so SSL was not really needed. I suspect that the solution, if I had wanted SSL, would be as Martin outlined, a config file edit to sort out the bindings.

But remember….

That having a working build system is not enough for Windows 8 App Store unit tests. You also have to manually install the application certificate for test assembly as detailed in MSDN as well as getting the build service running in interactive mode.

Once this was done my application build and the tests ran OK

More thoughts on addressing TF900546 ‘Unable to load one or more of the requested types’ on TFS2012

A while ago I posted about seeing the TF900546 error when running unit tests in a previously working TFS 2012 build. The full error being:

TF900546: An unexpected error occurred while running the RunTests activity: 'Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.'.

Well late last week this problem came back with avengeance on a number of builds run on the same build controller/agent(s). Irritatingly I first noticed it after a major refactor of a codebase, so I had plenty of potential root causes as assemblies had been renamed and it was possible they might not be found. However, after a bit of testing there were no obvious candidates as all tests worked fine locally on my development PC, and a new very simple test application showed the same issues. It was defiantly an issue on the build system.

I can still find no good way to debug this error, Stackoverflow mention Fuslogvw and WinDbg, as well as various copy local settings and the like. Again all seems too much as this build was working in the past, just seemed to stop. I tried a couple but got no real information, and the error logs were empty.

In the end I just tried what I did before (as I could think of no better tactic to pin down the true issue). I went into the build controller config, removed the reference to the custom assemblies, saved this settings (causing a controller restart), then put it back (another restart of the controller)

image

After this my test started working again, with no other changes

Interesting a restart of VM running the build controller did not fix the problem. However this does somewhat chime with comments in the StackOverFlow thread that causing the AppPool in MVC apps to rebuild completely, ignoring any cached assemblies, seems to fix the issue.

Change in the System.AssignedTo in TFS SOAP alerts with TFS 2012

Ewald’s post explains how to create a WCF web service to act as an end point for TFS Alerts. I have been using the model with a TFS 2010 to check for work item changed events, using the work item’s System.AssignedTo field to retrieve the owner of the work item (via the TFS API) so I can send an email, as well as other tasks (I know I could just send the email with a standard alert).

In TFS 2010 this worked fine, if the work item was assigned to me I got back the name richard, which I could use as the to address for the email by appending our domain name.

When I moved this WCF event receiver onto a TFS 2012 (using the TFS 2012 API) I had not expected any problems, but the emails did not arrive. On checking my logging I saw they were being sent to fennell@blackmarble.co.uk. Turns out the issue was that the API call

value = this.workItem.Fields[“System.AssignedTo ”].Value.ToString();

was returning the display name ‘Richard Fennell’, which was not a valid part of the email address.

The best solution I found, thus far, was to check to see if had a display name in the AD using the method I found on stackoverflow. If I got a user name back I used that, if I got a empty string (because I have been passed a non display name) I just use the initial value assuming it is a valid address.

Seems to work but this there a easier solution?

Cannot run coded ui test on a TFS lab due to lack of rights to the drops folder

Whilst setting up a TFS 2012 Lab management environment for Coded UI testing we got the problem that none of the tests were running, in fact we could see no tests were even being passed to the agent in the lab

image

On the build report I clicked on  the ‘View Test Results’ which loaded the result in Microsoft Test Manager (MTM)

image

and viewed the test run log, and we saw

image

The issue, it claimed, was that the build controller did not have rights to access the drop folder containing the assembly with the CodedUI. tests.

Initially i thought the issue was the test controller was running a ‘local service’, so I changed it to the domain\tfsbuild account (which obviously has rights to the drops folder as it put the files there) but still got the same error. i was confused.

So I checked the event log on the build controller and found the following

image

The problem was my tfslab account, not the local service or tfsbuild one. So the message shown in the build report was just confusing, mentioning the wrong user. The lab account is the one configured in the test controller (yes you have to asked how had I missed that when I had been into the same tools to change the user the test controller ran as!)

image

As soon as I granted this tfslab user rights to the drops folder all was OK

Team Foundation Service RTMs

Today at Build 2012 it was announced that https://tfspreview.com has RTM'd as Team Foundation Service on https://tfs.visualstudio.com.

Up until now there has been no pricing information, which had been a barrier to some people I have spoken to as they did not want to started using something without knowing the future cost.

So to the really good news, as of now

  • It is free for up to 5 users
  • If you have an active MSDN subscription it is also free. So a team of any size can use it as long as they all have MSDN

The announcement said to look out for further price options next year.

Check the full details at Soma's blog