New problem when generating build agents using Packer
The Problem
I have been using Packer to generate our Azure DevOps Build agent VHD images for a while now, but when I came to regenerate them this time I hit a problem.
As I have documented previously, our process is that we update our fork of the Microsoft repository and then merge the newest changes into our long-lived branch that contains our customisations. I then use Packer to generate a new generalised VHD which has all the same features as the Microsoft hosted agents. I then use this VHD to create our new Hyper-V based self-hosted Azure DevOps agent VMs using Lability.
The problem is that, since we last ran this process a couple of months ago, Microsoft have swapped the Packer builder option they use in their definitions.
- They used to use the vhd option in the azure-arm builder. This generated a generalised VHD in Azure Blob Storage.
- They are now using the image option, also found in the azure-arm builder, which generates a VM Image in Azure.
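As a rough illustration of the difference, the sketch below builds the two flavours of azure-arm builder block as PowerShell hashtables and prints them as JSON. The option names are the ones I understand the azure-arm builder to use for each output type. All the values are placeholders, and the real Microsoft definitions contain many more settings, so treat this as an assumption-laden outline rather than a copy of their templates.

```powershell
# Old style: capture a generalised VHD into a storage account.
# All values are hypothetical placeholders.
$vhdBuilder = [ordered]@{
    type                   = 'azure-arm'
    resource_group_name    = 'my-storage-rg'
    storage_account        = 'mystorageaccount'
    capture_container_name = 'images'
    capture_name_prefix    = 'packer'
}

# New style: create a managed VM Image instead of a VHD blob.
# All values are hypothetical placeholders.
$imageBuilder = [ordered]@{
    type                              = 'azure-arm'
    managed_image_name                = 'buildagent'
    managed_image_resource_group_name = 'my-image-rg'
}

# Print both blocks as JSON so they can be compared side by side
$vhdJson   = $vhdBuilder   | ConvertTo-Json
$imageJson = $imageBuilder | ConvertTo-Json
$vhdJson
$imageJson
```

Everything else in the template, in particular the provisioners, is independent of which of these two blocks is used.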
I can see that this is a better option for Microsoft, or anyone using Azure hosted agents, as it means the generated image is ready and waiting to be used to create an Azure hosted agent VM or VM Scale Set.
However, it is not what we need. We need a local VHD that we can use with Hyper-V/Lability.
For us, the key problem is that there does not appear to be a way to directly download the generalised VHD from the VM Image. You either have to:
- Create an Azure hosted VM from the image, then re-SysPrep the VM and then download the now generalised VHD.
- Or export the image into an Azure Compute Gallery, from which you can download the VHD.
Both options add more slow steps, so initially I looked for another option, a way to keep doing what I had done before.
A Partial Solution - doing it the old way
The key point to note is that all the critical Packer definition changes are in the builder block and associated variables. Only the end target for the built image has changed, not what is installed on the image, i.e. the Packer provisioners that install the features/tools are unchanged.
So I thought, what happens if we just revert the builder block of JSON?
So I did the following:
- Copied the {..} variables block from the previous version of the Packer JSON file that generated a VHD to a file
- Copied the {..} builder block (from inside the [..] array) from the previous version of the Packer JSON file to a file
- Updated our fork/branch of the repo to get the updated Packer definitions that build an image
- Ran a script that replaces the variables block and the builder block in the array with the historic ones from our previously saved files
# Load the images\win\windows20xx.json
$target = Get-Content $targetFile -Raw | ConvertFrom-Json
# Load the saved variables and builder JSON blocks
$variables = Get-Content $variablesFile -Raw | ConvertFrom-Json
$builders = Get-Content $buildersFile -Raw | ConvertFrom-Json
# Replace the Microsoft variables block with our values
$target.variables = $variables
# Replace the Microsoft builders block with our values
$target.builders = @($builders) # make sure we force this to be an array
# Write out the modified Windows 20xx JSON file
$target | ConvertTo-Json -Depth 100 | Out-File $targetFile -Encoding ascii
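The two details in that script that are easy to get wrong are the @() wrapper and the -Depth value. A minimal sketch of the same technique, using small in-memory JSON strings as hypothetical stand-ins for the real Packer files, shows the effect:

```powershell
# Hypothetical, heavily trimmed stand-ins for the real Packer JSON files
$targetJson   = '{"variables":{"image":"new"},"builders":[{"type":"azure-arm","managed_image_name":"img"}],"provisioners":[{"type":"powershell","scripts":["a.ps1"]}]}'
$buildersJson = '{"type":"azure-arm","capture_container_name":"images"}'

$target  = $targetJson   | ConvertFrom-Json
$builder = $buildersJson | ConvertFrom-Json

# The saved builder is a single object, so wrap it in @() to keep
# "builders" serialised as a JSON array, which is what Packer expects
$target.builders = @($builder)

# -Depth must be high enough to cover the nested provisioner settings;
# the default depth would flatten deeper levels into strings
$result = $target | ConvertTo-Json -Depth 100 -Compress
$result
```

Round-tripping the result through ConvertFrom-Json is a quick way to check that the file is still valid JSON before starting a multi-hour Packer run.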
I then ran Packer as normal, and for the Windows 2022 definition, after the usual multi-hour wait, got the expected VHD.
However, for a Windows 2019 based image I got an error installing the .NET Framework 4.5 feature. This problem did not occur when we built the Packer definition as an image as opposed to a vhd.
vhd: Provisioning with powershell script: C:\azure-pipelines-virtual-environments\images\win>/scripts/Installers/Install-WindowsFeatures.ps1
vhd: Activating Windows Feature 'NET-Framework-Features'...
vhd: Windows Feature 'NET-Framework-Features' was activated successfully
vhd: Activating Windows Feature 'NET-Framework-45-Features'...
vhd: Install-WindowsFeature : The request to add or remove features on the specified server failed.
vhd: The operation cannot be completed, because the server that you specified requires a restart.
Try as I might by re-ordering steps in the build, I could not get this to work. As soon as I fixed one issue another appeared.
I have no idea why changing the builder should cause such issues, but I needed a solution that did not require so much editing, so I had to look for another option.
A Better Solution - via an Azure Compute Gallery
It was obvious I had to use the new way of using Packer. I quickly discarded the idea of creating an Azure hosted VM from the Packer generated VM image and then re-SysPrep'ing it. This would have been a slow process and would have required a lot of manual steps.
The best of the alternatives I could find was to build a VM image, using the new Packer definition, then clone it to an Azure Compute Gallery, from where I could download it as a generalised VHD.
The complete process is as follows:
[Done once] Create an Azure Compute Gallery instance in your subscription.
Run Packer to generate your generalised VM Image.
In the Azure Portal, view the newly created VM Image and select 'Clone to a VM Image'.
- Select the previously created Azure Compute Gallery
- Provide a version number. We are using one based on the OS of the image, e.g. 2022.0.1
- If it is the first time you are cloning a VM image, create a new 'Target VM Image definition' with a suitable name. For all subsequent clones, just select the existing definition
- Pick the replication rules that meet your needs. I used local replication on premium storage.
The cloning of the image version takes around 30 minutes for the 250GB image.
If you don't want to use the portal, you could use the Azure CLI az sig image-version commands, e.g.
az sig image-version create --gallery-name MyPackerGallery --resource-group myrg --gallery-image-definition BuildAgent2022 --gallery-image-version 2.0.1 --managed-image /subscriptions/<GUID>/resourceGroups/BMPACKER/providers/Microsoft.Compute/images/buildagent --target-regions westeurope=1=premium_lrs --location westeurope
You need the --location parameter, otherwise your subscription's default location is used, which may not be where your VM Image is. The resulting error is somewhat confusing, given you have provided a full resource ID for the image and your source and target are in the same region and resource group:
(InvalidParameter) Gallery image version publishing profile regions 'westeurope' must contain the location of image version 'North Europe'.
Once the replication has completed, consider deleting the VM Image that was created by Packer, as it is no longer needed.
You can then download the generalised VHD using PowerShell.
param (
    $subscription,          # e.g. "My Subscription"
    $rgName,                # e.g. "packer"
    $galleryName,           # e.g. "PackerGallery"
    $galleryDefintionName,  # e.g. "BuildAgent2022"
    $galleryImageVersion,   # e.g. "2022.0.1"
    $targetDir              # e.g. "c:\download"
)
Write-Host "This script uses the Az PowerShell module"
Write-Host " Install-Module -Name Az -Repository"
Write-Host "It also assumes that AZCOPY.EXE is in the current folder`n`n"

Write-Host "Connecting to Azure"
Connect-AzAccount -Subscription $subscription

Write-Host "Selecting Azure subscription '$subscription'"
Select-AzSubscription $subscription
$imgver = Get-AzGalleryImageVersion -ResourceGroupName $rgName -GalleryName $galleryName -GalleryImageDefinitionName $galleryDefintionName -Name $galleryImageVersion

Write-Host "Downloading VHD for $galleryDefintionName $galleryImageVersion"
$imgver
$galleryImageVersionID = $imgver.Id

Write-Host "Creating temporary disk"
$diskName = "tmpOSDisk"
$imageOSDisk = @{Id = $galleryImageVersionID}
$OSDiskConfig = New-AzDiskConfig -Location $imgver.Location -CreateOption "FromImage" -GalleryImageReference $imageOSDisk
$osd = New-AzDisk -ResourceGroupName $rgName -DiskName $diskName -Disk $OSDiskConfig

$downloadPath = Join-Path $targetDir "$galleryImageVersion.vhd"
Write-Host "Granting access to temporary disk"
# We need to up the access duration, else we only get 35GB of the 250GB VHD
$sas = Grant-AzDiskAccess -ResourceGroupName $rgName -DiskName $osd.Name -Access "Read" -DurationInSecond 18000

Write-Host "Downloading VHD to $downloadPath - this will take some time"
.\azcopy cp $sas.AccessSAS $downloadPath
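The script above leaves the temporary disk and its read access in place once the download has finished. A minimal sketch of a clean-up step, assuming the same Az module and the disk name used above, might look like this. Remove-TemporaryDisk is a hypothetical helper name of my own, not part of the Az module.

```powershell
# Hypothetical helper: revoke the read access and delete the temporary
# disk that was created only to download the VHD
function Remove-TemporaryDisk {
    param (
        [Parameter(Mandatory)] [string] $ResourceGroupName,
        [Parameter(Mandatory)] [string] $DiskName
    )

    # Revoke the SAS so the disk is no longer readable via the download URL
    Revoke-AzDiskAccess -ResourceGroupName $ResourceGroupName -DiskName $DiskName | Out-Null

    # Delete the managed disk; -Force suppresses the confirmation prompt
    Remove-AzDisk -ResourceGroupName $ResourceGroupName -DiskName $DiskName -Force | Out-Null
}

# Usage, with the values from the download script:
# Remove-TemporaryDisk -ResourceGroupName $rgName -DiskName "tmpOSDisk"
```

Only run this after azcopy has completed, as revoking access part way through will cut the download short.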
Once the VHD was on our local network, I was able to build my self-hosted agents as normal using Lability on Hyper-V.
Conclusion
In the end, the change of Packer target was not as big a problem as I first thought. The extra clone step adds about 30 minutes to the process, but given the whole process of VHD creation and copying takes the best part of 24 hours, another 30 minutes is neither here nor there.
It will be interesting to see how I can use the Azure Compute Gallery in the future. It could make managing self-hosted agent images for Azure much easier.