First off, it’s been a while (about four years or so) since I last blogged, so there’s going to be a bit of a gap between what I blog about now and what I used to blog about. Why the big gap, and what’s been happening in between, I might talk about in a future post, but for now I thought I’d share a recent war story of a TFS upgrade gone wrong, which had a pretty arcane error.
The client had previously been using an on-premise TFS 2012.4 server and had expressed an interest in upgrading to a newer version, TFS 2018.3 to be precise.
Naturally our first recommendation was to move to Azure DevOps (formerly VSTS, VSO), but this wasn’t a good fit for them at that point in time: there were issues around data sovereignty, and Brexit was generally putting the spooks into everyone in the UK who had been eyeing the European data centres.
Nonetheless, we built some future-proofing into the new TFS server we set up and configured for them, chiefly by using SQL Server 2017 to mitigate any problems with them upgrading to Azure DevOps Server 2019/2020, which is due for release before the end of the year, or just after the turn of the year if the previous release cadence of TFS is any indicator.
We performed a dry-run upgrade installation and then a production run. The client, I suspect, brushed through their dry-run testing a little too quickly and failed to notice an issue which appeared in the production run.
The Code Hub showed no file content.
The error reported was: Script error for “BuiltInExtensions/Scripts/TFS.Extension”
We also saw certain page assets failing to load in the network trace.
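If you want to sift a trace like this yourself, most browsers can export the network tab as a HAR file, which is just JSON. The snippet below is a minimal sketch of pulling the failed requests out of one, grouped by status code. The function name, the host name and the paths in the sample are all made up for illustration, and this is not the tooling we actually used on the day.

```python
import json
from collections import defaultdict


def failed_requests(har_text, min_status=400):
    """Group request URLs from a HAR export by HTTP status, failures only."""
    har = json.loads(har_text)
    failures = defaultdict(list)
    # A HAR file keeps one entry per request under log.entries.
    for entry in har["log"]["entries"]:
        status = entry["response"]["status"]
        if status >= min_status:
            failures[status].append(entry["request"]["url"])
    return dict(failures)


# Hypothetical two-entry trace; the host and paths are invented.
sample = json.dumps({"log": {"entries": [
    {"request": {"url": "http://tfs.example/_static/editor.main.css"},
     "response": {"status": 401}},
    {"request": {"url": "http://tfs.example/_home"},
     "response": {"status": 200}},
]}})

print(failed_requests(sample))
```

Against a real export you would read the file from disk and pass its text in; a cluster of 401s on static assets is the pattern to look for.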
Editor.main.css looked to be quite important given the context of the page we were in. The network trace also showed quite a number of 401s, and many of the page assets were not displaying correctly in TFS (like the Management/Admin cog, the TFS logo in the top right, and the folder and branch icons in the Code Hub source control navigator). We were stumped at first; in the end a support call with Microsoft shed light on the issue. The client had a group policy setting which prevented the membership of a role assignment in Local Policies from being modified.
When the IIS feature is added to Windows Server, the local user group IIS_IUSRS normally gets added to this role. In the client’s case, because of the group policy setting which prevented role assignments from being made, this had not occurred. No error was raised when the feature was enabled, so no one knew anything had gone amiss when setting up the server.
This local user group contains (as I understand it) the application pool identities that are created along with application pools in IIS. TFS’s app pool needs this impersonation policy to load certain page assets as the currently signed-in user. Some group policy changes later, we were able to add the local group to this policy by hand and resolve the issue (combined with an iisreset command). It’s been explained to me that this used to be a more common error back in the days of TFS 2010 and 2012 but is something of a rarity these days, hence we had no luck with any Google-Fu or other inquiries we made into the error.
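For anyone wanting to check a server for the same gap, here is a rough sketch of how it could be scripted. It assumes the impersonation policy in question is the "Impersonate a client after authentication" user right (SeImpersonatePrivilege), which is my reading and not something I have confirmed against the client’s server. It also assumes the rights have been exported to text with something like secedit /export /cfg rights.inf /areas USER_RIGHTS (the file name is arbitrary, and the export is normally UTF-16, so read it with that encoding). The function names and sample data are purely illustrative.

```python
# Well-known SID of the BUILTIN\IIS_IUSRS local group.
IIS_IUSRS_SID = "S-1-5-32-568"
# Assumed to be the right the post calls "this impersonation policy".
IMPERSONATE_RIGHT = "SeImpersonatePrivilege"


def holders_of_right(inf_text, right):
    """Return the accounts listed for one user right in a secedit export."""
    for line in inf_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == right:
            # SIDs are written with a leading '*'; strip it for comparison.
            return [item.strip().lstrip("*")
                    for item in value.split(",") if item.strip()]
    return []


def iis_iusrs_can_impersonate(inf_text):
    """True if IIS_IUSRS appears (by SID or by name) under the right."""
    holders = holders_of_right(inf_text, IMPERSONATE_RIGHT)
    return IIS_IUSRS_SID in holders or "IIS_IUSRS" in holders


# Illustrative export fragments, not taken from the client's server.
healthy = ("[Privilege Rights]\n"
           "SeImpersonatePrivilege = *S-1-5-19,*S-1-5-20,"
           "*S-1-5-32-544,*S-1-5-32-568,*S-1-5-6\n")
broken = ("[Privilege Rights]\n"
          "SeImpersonatePrivilege = *S-1-5-19,*S-1-5-20,"
          "*S-1-5-32-544,*S-1-5-6\n")

print(iis_iusrs_can_impersonate(healthy), iis_iusrs_can_impersonate(broken))
```

Running a check like this straight after enabling IIS would have flagged the problem up front, given that the feature install itself raised no error.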
Interestingly, a week later I was performing another TFS upgrade for a different client. They were going to TFS 2017.3, and the Code Hub, Extension Management, and Dashboards were affected by the same error. Fortunately, recent experience helped us resolve that quickly.