Why is my SnipeIT instance suddenly slow?

Background

As I have blogged previously, we run a SnipeIT instance to manage our IT assets, hosted in Azure using Docker.

This has been working well for us for the past year, but recently we have noticed that the system has become very slow to respond.

Looking on the Azure portal, we can see that around the 15th of October the web app’s response times have gone from milliseconds to 10s of seconds

Slow web response times

Investigation

The first question we asked, as you always should, was what had changed?

The answer was nothing obvious.

  • We had not changed the Azure Web App or MySQL SKUs
  • It is true that we had not updated the SnipeIT version since July, but we were on the current major and minor version and only a few patch versions behind.
  • We had not added any significant new assets or users in the last few months.

So it all appeared strange what had changed?

We of course tried restarting the container instance, and cleaning down logfiles. We have seen slow performance in other systems when the logfiles get too large.

We also tried upping the SKU of the Azure Web App, but none of this made a difference.

We then set the SnipeIT log level to debug, this showed the high response times were not due to the web app being slow to respond, but the time taken to get data from the MySQL instance.

Looking again at the metrics in the Azure portal, I noticed that the MySQL instance started showing a higher CPU usage and increased DB connections around the same time as the web app started to slow down.

MySQL CPU

These new CPU and connection levels did not seem excessive, but they were higher than they had been previously. I then noticed in the Azure Portal a warning message, that we had used up all the MySQL Burstable SKU credits.

MySQL warning

Solution

The solution was in fact simple, to move the MySQL instance to a higher SKU, from Standard_B1s to Standard_B2ms, increasing the running cost by only a few pence a month, from £2.17 to £2.25 PCM

This SKU change doubles the resources available to MySQL, but more importantly allows us to build up more burstable CPU credit for the times we need it in a month.

So now I have a working solution, it as all down to the fact that we had been running close to the limit of our MySQL SKU for a while, and it was a matter of time before the issue occurred.

For the original version of this post see Richard Fennell's personal blog at Why is my SnipeIT instance suddenly slow?