But it works on my PC!

The random thoughts of Richard Fennell on technology and software development

Running a SaaS service at scale

Brian Harry has done a couple of very interesting posts (post 1 and post 2) on the recent outages of the VSTS service. Whether you use VSTS or not, they make interesting reading for anyone involved in running SaaS-based systems, or anything at scale.

From the posts, the clear takeaway is that you must not underestimate the importance of

  • in-production monitoring
  • having a response plan
  • doing a proper root cause analysis
  • and putting steps in place to stop the problem happening again

Well worth a read

Repost: What I learnt extending my VSTS Release Process to on-premises Lab Management Network Isolated Environments

This is a repost of a guest article first posted on the Microsoft UK Developers Blog: How to extend a VSTS release process to on-premises

Note that since I wrote the original post there have been some changes on VSTS and the release of TFS 2015.2 RC1. These mean there is no longer an option to pull build artifacts from an external TFS server as part of a release, invalidating some of the options this post discusses. I have struck out the outdated sections. The rest of the post is still valid, especially the section on where to update configuration settings. The release of TFS 2015.2 RC1 actually makes many of these options easier, as you don’t have to bridge between on-premises TFS and VSTS; both build and release features are on the same server.


 

Background

Visual Studio Team Services (VSTS) provides a completely new version of Release Management, replacing the version shipped with TFS 2013/2015. This new system is based on the same cross-platform agent model as the new vNext build system shipped with TFS 2015 (and also available on VSTS). At present this new Release Management system is only available on VSTS, but the features timeline suggests we should see it on-premises in the upcoming 2015.2 update.

You might immediately think that, as this feature is only available in VSTS at present, you cannot use this new release management system with on-premises services, but this is not true. The Release Management team have provided an excellent blog post on running an agent connected to your VSTS instance inside your on-premises network to enable hybrid scenarios.

This works well for deploying to domain connected targets, especially if you are using Azure Active Directory Sync to sync your corporate domain and AAD to provide a directory backed VSTS instance. In this case you can use a single corporate domain account to connect to VSTS and to the domain services you wish to deploy to from the on-premises agent.

However, I make extensive use of TFS Lab Management to provide isolated dev/test environments (linked to an on-premises TFS 2015.1 instance). If I want to deploy to these VMs it adds complexity to how I need to manage authentication, as I don’t want to have to place a VSTS build agent in each transiently created dev/test lab: firstly because it is complex, and secondly because there is a cost to having more than one self-provisioned vNext build agent.

It is fair to say that deploying to an on-premises Lab Management environment from a VSTS instance is an edge case, but the same basic process will be needed when the new Release Management features become available on-premises.

Now, I would be the first to say that there is a good case to look at a move away from Lab Management to using Azure Dev Labs which are currently in preview, but Dev Labs needs fuller Azure Resource Manager support before we can replicate the network isolated Lab Management environments I need.

The Example

So at this time I still need to be able to use the new Release Management with my current Lab Management network isolated labs, but this raises some issues of authentication and just what is running where. Let us work through an example: say I want to deploy a SQL DB via a DACPAC and a web site via MSDeploy on the infrastructure shown below.

 

[Diagram: the example infrastructure]

Both the target SQL and Web servers live inside the Lab Management isolated network on the proj.local domain, but have DHCP assigned addresses on the corporate LAN in the form vslm-[guid].corp.com (managed by Lab Management), so I can access them from the build agent with appropriate credentials (a login for the proj.local domain within the network isolated lab).

The first step is to install a VSTS build agent linked to my VSTS instance; once this is done we can start to create our release pipeline. The first stage is to get the artifacts we need to deploy i.e. the output of builds. These could be XAML or vNext builds on the VSTS instance, from the on-premises TFS instance, or a Jenkins build. Remember a single release can deploy any number of artifacts i.e. the output of any number of builds. It is this fact that makes this setup not as strange as it initially appears; we are just using VSTS Release Management to orchestrate a deployment to on-premises systems.

The problem we have is that though our release now has artifacts, we need to run some commands on the VM running the vNext Build Agent to do the actual deployment. VSTS provides a number of deployment tasks to help in this area. Unfortunately, at the time of writing, the list of deployment tasks in VSTS is somewhat Azure-focused, so not that much use to me.

[Screenshot: the VSTS deployment task catalogue]

This will change over time as more tasks get released; you can see what is being developed on the VSO Agent Task GitHub Repo (and of course you could install versions from this repo if you wish).

So for now I need to use my own scripts; as we are on a Windows-based system (not Linux or Mac) this means PowerShell scripts.

The next choice becomes ‘do I run the script on the Build Agent VM or remotely on the target VM’ (within the network isolated environment). The answer is the age-old consultant’s answer: ‘it depends’. In the case of both DACPAC and MSDeploy deployments, there is the option to do remote deployment i.e. run the deployment command on the Build Agent VM and have it remotely connect to the target VMs in the network isolated environment. The problem with this way of working is that I would need to open more ports on the SQL and Web VMs to allow the remote connections, which I did not want to do.

The alternative is to use PowerShell remoting. In this model I trigger the script on the Build Agent VM, but it uses PowerShell remoting to run the command on the target VM. For this I only need to enable remote PowerShell on the target VMs; this is done by running the following command, and following the prompts, on each target VM to set up the required services and open the correct ports in the target VM’s firewall.

winrm quickconfig

This is something we are starting to do as standard to allow remote management via PowerShell on all our VMs.
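
As a quick check that remoting is working before wiring anything into a release, something along these lines can be run from the Build Agent VM (a minimal sketch; the target machine name and the proj.local credentials are placeholders for your own lab values):

# Credentials for an account inside the network isolated lab (proj.local domain)
$cred = Get-Credential -Message "Enter proj.local credentials for the target VM"

# The Lab Management assigned corporate LAN name of the target VM (placeholder)
$targetVM = "vslm-yourguid.corp.com"

# Run a trivial command remotely to prove WinRM connectivity and authentication
Invoke-Command -ComputerName $targetVM -Credential $cred -ScriptBlock {
    Write-Output "Remoting OK on $env:COMPUTERNAME"
}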

So at this point it all seems fairly straightforward: run a couple of remote PowerShell scripts and all is good. But no, there is a problem.

A key feature of Release Management is that you can provide different configurations for different environments e.g. the DB connection string is different for the QA lab as opposed to production. These values are stored securely in Release Management and applied as needed.

[Screenshot: environment configuration variables in Release Management]

The way these variables are presented is as environment variables on the Build Agent VM, hence they can be accessed from PowerShell in the form $env:__DOMAIN__. IT IS IMPORTANT TO REMEMBER that they are not presented on any target VMs in the isolated lab network environment, or to these VMs via PowerShell remoting.

So if we are intending to use remote PowerShell execution for our deployments, we can’t just read these settings as environment variables within the scripts being run remotely; we have to pass the values in as PowerShell command line arguments.

This works OK for the DACPAC deployment as we only need to pass in a few fixed arguments. For example, the PowerShell script arguments for the package name, target server and DB name, using the Release Management variables in their $(variable) form, become:

-DBPackage $(DBPACKAGE) -TargetDBName $(TARGETDBNAME) -TargetServer $(TARGETSERVERNAME)
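
For reference, a minimal sketch of the sort of wrapper script these arguments feed into (the parameter names and the sqlpackage.exe path are illustrative rather than the exact script in my repo; Release Management resolves the $(variable) values before the script is called, so the remote end only ever sees plain strings):

param(
    [string]$DBPackage,     # Path to the DACPAC as copied to the target VM
    [string]$TargetDBName,  # Name of the database to publish to
    [string]$TargetServer   # SQL Server instance to deploy against
)

# Location of sqlpackage.exe on the target VM (adjust for your SQL/SSDT version)
$sqlPackage = "C:\Program Files (x86)\Microsoft SQL Server\120\DAC\bin\sqlpackage.exe"

# Publish the DACPAC to the target database
& $sqlPackage /Action:Publish "/SourceFile:$DBPackage" "/TargetServerName:$TargetServer" "/TargetDatabaseName:$TargetDBName"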

However, for the MSDeploy deployment there is no simple fixed list of parameters. This is because, as well as parameters like package names, we need to modify the setparameters.xml file at deployment time to inject values for our web.config from the release management system.

The solution I have adopted is not to try to pass this potentially long list of arguments into a script to be run remotely; the command line just becomes hard to edit without making errors, and needs to be updated each time we add an extra variable.

The alternative is to update the setparameters.xml file on the Build Agent VM before we attempt to run it remotely. To this end I have written a custom build task to handle the process, which can be found on my GitHub repo. This updates a named setparameters.xml file using token replacement based on environment variables set by Release Management. If you would rather automatically find a number of setparameters.xml files using wildcards (because you are deploying many sites/services) and update them all with a single set of tokens, have a look at Colin Dembovsky’s build task which does just that.
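
The basic idea behind the token replacement is simple enough; a simplified sketch of the approach (not the task’s actual source, and the __TOKEN__ naming convention is just one common choice) looks like this:

param(
    [string]$SetParametersFile   # Path to the setparameters.xml file to update
)

# Read the whole file as a single string
$content = Get-Content -Path $SetParametersFile -Raw

# Replace any __NAME__ token with the matching environment variable set by
# Release Management, e.g. __DOMAIN__ is replaced with the value of $env:DOMAIN
foreach ($match in [regex]::Matches($content, "__(\w+?)__")) {
    $value = [Environment]::GetEnvironmentVariable($match.Groups[1].Value)
    if ($value) {
        $content = $content.Replace($match.Value, $value)
    }
}

Set-Content -Path $SetParametersFile -Value $content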

So given this technique my release steps become:

1. Get the artifacts from the builds to the Build Agent VM.

2. Update the setparameters.xml file using environment variables on the Build Agent VM.

3. Copy the downloaded (and modified) artifacts to all the target machines in the environment.

4. On the SQL VM run the sqlpackage.exe command to deploy the DACPAC using remote PowerShell execution.

5. On the Web VM run the MSDeploy command using remote PowerShell execution.

[Screenshot: the release pipeline steps]

The PowerShell scripts I run in the final two tasks are just simple wrappers around the underlying commands. The key fact is that because they are scripts, remote execution is possible. The targeting of the execution is done by associating each task with a target machine group, and filtering either by name or, in my case, role to target specific VMs.
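
As an example, the web deployment wrapper is little more than a call like this (again a sketch with placeholder parameter names, not the exact script; the real versions live in my GitHub repo):

param(
    [string]$PackagePath,        # Path to the MSDeploy .zip package on the target VM
    [string]$SetParametersPath   # Path to the already-updated setparameters.xml
)

# Location of msdeploy.exe on the target VM (adjust for the installed Web Deploy version)
$msDeploy = "C:\Program Files\IIS\Microsoft Web Deploy V3\msdeploy.exe"

# Sync the package to the local IIS instance using the supplied parameters file
& $msDeploy -verb:sync "-source:package=$PackagePath" -dest:auto "-setParamFile:$SetParametersPath"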

[Screenshot: a deployment task targeted at a machine group, filtered by role]

In my machine group I have defined both my SQL and Web VMs using their names on the corporate LAN, assigning a role to each to make targeting easier. Note that it is here, in the machine group definition, that I provide the credentials required to access the VMs in my network isolated environment i.e. a proj.local set of credentials.

[Screenshot: the machine group definition]

Once I get all these settings in place I am able to build a product on my VSTS build system (or my on-premises TFS instance) and, using this VSTS-connected but on-premises Build Agent, deploy my DB and web site to a Lab Management network isolated test environment.

There is no reason why I cannot add more tasks to this release pipeline to perform more actions such as run tests (remember the network isolated environment already has TFS Test Agents installed, but they are pointing to the on-premises TFS instance) or to deploy to other environments.

Summary

As I said before, this is an edge case, but I hope it shows how flexible the new build and release systems can be for both TFS and VSTS.

Release Manager 2015 stalls at the ‘uploading components’ step and error log shows XML load errors

Whilst setting up a Release Management 2015.1 server we came across a strange problem. The installation appeared to go OK: we were able to install the server and, from the client, create a simple vNext release pipeline and run it. However, the release stalled on the ‘Upload Components’ step.

Looking in the event log of the VM running the Release Management server we could see many, many errors, all complaining about invalid XML, all in the general form

 

Message: Object reference not set to an instance of an object.: \r\n\r\n   at Microsoft.TeamFoundation.Release.Data.Model.SystemSettings.LoadXml(Int32 id)

 

Note: the assembly it was complaining about varied, but all were Release Management Deployer related.

We tried a reinstall on a new server VM, but got the same results.

It turns out the issue was due to the service account that the Release Management server was running as; this was the only thing common between the two server VM instances. We swapped to use ‘Network Service’ and everything leapt into life. All we could assume was that some group policy or similar setting on the service account was placing a restriction on assembly or assembly config file loading.

vNext Build editor filePath control always returns a path even if you did not set a value

You can use the filePath type in a vNext VSTS/TFS task as shown below 

{
    "name": "settingsFile",
    "type": "filePath",
    "label": "Settings File",
    "defaultValue": "",
    "required": false,
    "helpMarkDown": "Path to single settings files to use (as opposed to files in project folders)",
    "groupName": "advanced"
}

to present a file picker dialog in the build editor that allows whoever is editing the build to pick a file or folder in the build’s source repository

[Screenshot: the file picker in the build editor]

While doing some task development recently I found that this control did not behave as I had expected:

  • If a value is explicitly set then the full local path to the selected file or folder (on the build agent) is returned e.g. c:\agent\_work\3\s\yourfolder\yourfile.txt – just as expected
  • If you do not set a value, or set a value then remove it when you edit the build, then you don’t get an empty string as I had expected. You get the path to the BUILD_SOURCESDIRECTORY e.g. c:\agent\_work\3\s – which makes sense when you think about it.

So if, as in my case, you want specific behaviour only when this value is set to something other than the repo root, you need to add some guard code:


# Treat the default value (the repo root) as if no settings file was specified
if ($settingsFile -eq $Env:BUILD_SOURCESDIRECTORY)
{
    $settingsFile = ""
}

Once I did this my task behaved as I needed, only running the code when the user had set an explicit value for the settings file.

A VSTS vNext build task to run StyleCop

I have previously posted on how a PowerShell script can be used to run StyleCop as part of a vNext VSTS/TFS build. Now I have more experience with vNext tasks it seemed a good time to convert this PowerShell script into a true task that can deploy StyleCop and make it far easier to expose the various parameters StyleCop allows.

To this end I have written a new StyleCop task that can be found in my vNext Build Repo. This has been built to use the 4.7.49.0 release of StyleCop (so you don’t need to install StyleCop on the build machine, which means it works well on VSTS).

To use this task:

  1. Clone the repo
  2. Build the tasks using Gulp
  3. Upload the task you require to your VSTS or TFS instance

Once this is done you can add the task to your build. You probably won’t need to set any parameters as long as you have settings.stylecop files to define your StyleCop ruleset in the same folders as your .CSPROJ files (or are happy with the default rulesets).

If you do want to set parameters your options are:

  • TreatStyleCopViolationsErrorsAsWarnings - Treat StyleCop violation errors as warnings; if set to False any StyleCop violations will cause the build to fail (default false).

And on the advanced panel

  • MaximumViolationCount - Maximum violations before analysis stops (default 1000)
  • ShowOutput - Sets the flag so StyleCop scanner outputs progress to the console (default false)
  • CacheResults - Cache analysis results for reuse (default false)
  • ForceFullAnalysis - Force complete re-analysis (default true)
  • AdditionalAddInPath - Path to any custom rule sets folder; the directory cannot be a sub-directory of the current directory at runtime, as this is automatically scanned. This folder must contain your custom DLL plus the StyleCop.dll and StyleCop.CSharp.dll assemblies, else you will get load errors
  • SettingsFile - Path to a single settings file to use for all analysis (as opposed to settings.stylecop files in project folders)
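
For anyone tweaking the task itself, it is worth remembering that these inputs arrive in the task’s PowerShell script as strings. A rough sketch of converting them before use (the parameter names simply mirror the list above; this is not the task’s actual source):

param(
    [string]$TreatStyleCopViolationsErrorsAsWarnings,
    [string]$MaximumViolationCount,
    [string]$ShowOutput
)

# vNext task inputs are always passed as strings, so convert them to the types needed
$treatAsWarnings = [System.Convert]::ToBoolean($TreatStyleCopViolationsErrorsAsWarnings)
$maxViolations   = [int]$MaximumViolationCount
$showOutput      = [System.Convert]::ToBoolean($ShowOutput)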

 

[Screenshot: the StyleCop task parameters]

 

When you run the build with the new task you should expect to see a summary of the StyleCop run on the right

[Screenshot: the StyleCop summary in the build report]

A new vNext task to run StyleCop

Update 6 Feb 2016 - I have made some major changes to this task to expose more parameters, have a look at this post that details the newer version

Today a good way to pull together all your measures of code quality is to run SonarQube within your automated build; in a .NET world this can show changes in quality over time for tools such as FxCop (Code Analysis) and StyleCop. However, sometimes you might just want to run one of these tools alone as part of your automated build. For Code Analysis this is easy: it is built into Visual Studio, just set it as a property on the project. For StyleCop it is a bit more awkward, as StyleCop was not designed to be run from the command line.

To get around this limitation I wrote a command line wrapper that could be used within a build process, see my blog post for details of how this could be used with vNext build.

Well, that was all the best part of a year ago. Now I have more experience with vNext build it seems wrong to use just a PowerShell script when I could create a build task that also deploys StyleCop. I have eventually got around to writing the task, which you can find in my vNextBuild repo.

Once the task is uploaded to your TFS or VSTS instance, the StyleCop task can be added into any build process. The task picks up the file locations from the build environment variables and then hunts for StyleCop settings files (as detailed in my previous post). The only argument that needs to be set is whether the build should fail if there are violations.

 

[Screenshot: the StyleCop task settings]

Once this is all set up the build can be run and the violations will be shown in the build report; whether the build fails or passes is down to how you set the flag for the handling of violations.

[Screenshot: StyleCop violations shown in the build report]


Follow up from my session at the Black Marble Tech Update 2016

There have been some requests for more information about the areas I covered in my presentation at the Black Marble Tech Update 2016 that we held last week.

I could send out slides, but I think it is far more useful to point you at the ‘live’ resources on the Internet. The key reason for this is that the whole of the Visual Studio family is now being released at a ‘cloud cadence’ i.e. new features are appearing rapidly, so anything I write will soon be out of date. Better to look at the live sources where possible.

Hope you find these pointers useful

Fixing cannot load dashboard issues on BlogEngine.NET using sub blog aggregation

As I discovered during my BlogEngine upgrade, there is an effort within the project team to focus the codebase on three possible usage models on any given BlogEngine server instance:

  • Single blog with a user – a personal blog (default)
  • Single blog with many users – a team/company blog
  • Many blogs each with a single user – a set of related blogs that can be aggregated together

I needed the third option; the problem was that over its history our blog has been both of the other two types, so I have multiple user accounts for each blog, and login usernames are repeated between individual blogs on the server.

This is not fundamentally an issue for a server running in the third mode, except on the primary blog that is set up to provide aggregation of all the other blogs. Even here, on a day to day basis, it is not an issue: basic post RSS aggregation is fine. However, when you log in as an administrative user and try to access the dashboard you get the error

Item has already been added. Key in dictionary: 'displayname' Key being added: 'displayname'

The workaround I have used in the past was to temporarily switch off blog aggregation whenever I needed to access the primary blog dashboard – not the best solution.

After a bit of investigation of the codebase I found that this issue is due to the fact we had users called ‘admin’ on the primary and all the child blogs. The fix I used was a bit of SQL to do some user renaming from ‘admin’ to ‘adminblogname’. I needed to rename the username in a few tables.

AS USUAL BEWARE THIS SQL, MAKE SURE YOU HAVE A BACKUP BEFORE YOU USE IT, IT WORKS FOR ME BUT I MIGHT HAVE MISSED SOMETHING YOU NEED


-- Fix the display name stored in the profile settings (appends the blog name after a space)
update p
set p.SettingValue = concat(p.SettingValue, ' ', b.BlogName)
from be_Profiles p
    inner join be_Blogs b on
        b.BlogID = p.BlogId
where
    SettingName = 'displayname' and
    SettingValue = 'admin';

-- Rename the admin user in the profiles table
update p
set p.UserName = concat(p.UserName, b.BlogName)
from be_Profiles p
    inner join be_Blogs b on
        b.BlogID = p.BlogId
where
    username = 'admin';

-- Rename the admin user in the users table
update u
set u.UserName = concat(u.UserName, b.BlogName)
from be_Users u
    inner join be_Blogs b on
        b.BlogID = u.BlogId
where
    username = 'admin';

-- Rename the admin user in the role membership table
update r
set r.UserName = concat(r.UserName, b.BlogName)
from be_UserRoles r
    inner join be_Blogs b on
        b.BlogID = r.BlogId
where
    username = 'admin';

 

This is not a problem specific to admin users, any username duplication will cause the same error. This basic SQL script can be modified to fix any other user accounts you might have username clashes on.

Once this SQL was run I was able to login to the dashboard on the primary blog as expected.

Upgraded to BlogEngine.NET 3.2

I have just completed the upgrade of this blog server to the new 3.2 release of BlogEngine.NET. I did a manual upgrade (as opposed to the automated built-in upgrade) as I needed to make a few changes from the default settings. The process I used followed the upgrade process document:

  1. Downloaded the latest release and unzipped the folder
  2. Ran the SQL upgrade script (in the /setup/sqlserver folder), which adds some new DB constraints
  3. Created an IIS web site using the new release
  4. Copied in the sample web.config from the /setup/sqlserver folder
  5. Copied in my App_DATA folder
  6. Accessed my site

As I had not copied anything from the old custom folder, I had theme issues at this point. However, I had decided to move all the blogs to the newest generation of theme templates, so did a quick fix-up by hand on each one, picking the required theme and making sure any settings, like Twitter accounts, were set (note these are set on a per blog/per theme basis, so changing a theme means you need to re-enter any custom values). I also needed to copy in a few missing logos and any extra widgets the blogs were using from my old custom folder.

Once this was all done I had an upgraded blog server.

Running Coded UI tests on a VM with no remote desktop session open as part of a vNext build

If you want to run Coded UI tests as part of a build you need to make sure the device running the tests has access to the UI; for remote VMs this means having a logged-in session open and the build/test agent running interactively. The problem is what happens when you disconnect the session. Unless you manage it you will get the error

Automation engine is unable to playback the test because it is not able to interact with the desktop. This could happen if the computer running the test is locked or it’s remote session window is minimized

In the past I would use a standard TFS Lab Management Environment to manage this: you just check a box to say the VM/PC is running Coded UI tests and it sorts out the rest. However, with the advent of vNext build and the move away from Lab Manager this seems overly complex.

It is not a perfect solution, but this works:

  1. Make sure the VM auto-logs in and starts your build/test agents in interactive mode (I used the Sysinternals Autologon tool to set this up)
  2. I connect to the session and make sure all is OK, but I then disconnect, redirecting the session
    • To get my session ID, at the command prompt, I use the command query user
    • I then redirect the session with tscon.exe RDP-Tcp#99 /dest:console, where RDP-Tcp#99 is my session ID
  3. Once I was disconnected my Coded UI tests still ran

I am sure there is a slicker way to do this, but it does fix the immediate issue.

Updated:

This bit of PowerShell code could be put in a shortcut on the desktop to do the job; you will want to run the script as administrator.

# Capture the output of 'query user' as a single string
$OutputVariable = (query user) | Out-String

# Extract the rdp-tcp#n session name from that output
$session = $OutputVariable.Substring($OutputVariable.IndexOf("rdp-tcp#")).Split(" ")[0]

# Redirect the session to the console so the desktop remains available to the test agent
& tscon.exe $session /dest:console