Update on my old Dell

I posted in the past about all the problems I had with my overheating Dell 5150, and about the problems getting the Vista betas working on it. Well, an update on both…

  1. After putting new thermal grease on the CPU heatsink it never overheated again.
  2. With the release version of Vista every bit of hardware (bar the modem which I quickly found a driver for) was detected and worked first time and I have seen no problems with the PC since.

So I have a healthy, three year old laptop running Vista with the Aero interface (and the 5150 does have a nice high res. screen) perfectly adequately.

All that said, I have still moved on to one of our new company-standard Acer 8210 laptops with a Core 2 Duo running Vista.

Follow up on our nVidia RAID problems

I had posted on problems with the nVidia RAID on our SunFire servers. Well, I think I now have the root cause of the problems, and it is not the Sun hardware, the nVidia RAID or the Windows 64-bit drivers.

All the problems we had were when we used mirrored pairs of Western Digital 500GB SATA drives that we had bought in a single batch of four drives. Identical drives bought on another day were fine, as were 500GB drives from Maxtor and Hitachi.

After testing these four drives we found three of them kept developing low level unfixable bad blocks, irrespective of the PC, Sun or any other brand, they were used in. It seems when one of these bad blocks was hit:

  • the nVidia RAID caused the mirrored pair to lose their sync and the server hung. When rebooted, the server showed two separate drives and the mirror had to be recreated.
  • if Windows Software Mirroring was used we just lost the mirrored pair, but at least there was no server hang. The mirror would try to recover with mixed success, usually failing at the same point each time but sometimes working, hence all our confusion in finding the root cause of the problem.

Given this experience we are staying with Windows Software Mirroring as at least the server does not hang.

Now in twenty-odd years in this business I have never had three out of four drives fail in a single batch; in fact I don’t think I have ever had a ‘dead on arrival’ hard disk from any of the big name brands.

My guess is these four drives were dropped at some point after they left the factory QA department and before we bought them. The faulty three are off back to WD under warranty; I wonder if the fourth will survive? It is certainly not going into any system that is critical.

When the server just stops

Today I rebuilt a PC with new drives and all seemed OK, but after 15 minutes or so it kept stopping (no nice shutdown), irrespective of what the PC was doing. I even swapped back the old disks all to no avail, hence I was stumped for a while.

The problem turned out to be a somewhat full case: a wire was stopping the system fan turning, so the CPU was going into thermal shutdown, which leaves no event log messages. The motherboard was not complaining either, as the fan was still drawing current.

I should have spotted this one earlier, as it has happened before, so it goes on my blog of stupid things you forget to check for!

Problems removing McAfee ProtectionPilot 1.5.0 Agent

When you try to remove the McAfee ProtectionPilot 1.5.0 agent using the command

frminst /remove=agent

you get an error “can’t stop service mcafeeframework” if you are using VirusScan 8.5.

This is because VirusScan 8.5 has a new option under Access Protection that stops the McAfee services being stopped. So if you want to remove the agent you first have to go into the VirusScan Console and, on the Access Protection properties page, uncheck the Protect McAfee Services option, then disable Access Protection for good measure. Then run the remove command and all should be OK.

Moving Microsoft CRM 3.0

I had to move our CRM install this week to new hardware, and I had expected it to be a nasty job. However it turned out to be straightforward, though it is really hard to find any useful Microsoft Dynamics CRM documentation beyond the basic manuals. Why do the Dynamics team make it so hard to find things out?

Anyway these are the steps to follow:

To move the database (I got this process from a Microsoft Knowledge Base document on setting up SQL clustering for CRM):

  1. Back up the CRM databases in SQL Manager (the yourcompany_METABASE and yourcompany_MSCRM)
  2. Restore them on the new SQL server
  3. On the CRM server load the CRM deployment management tool
  4. Select the Server Manager, your server and right click to get properties.
  5. Use the Configure SQL option to point at the new SQL server
  6. And that should be all!
  7. I recommend then taking the old SQL server CRM DBs off line to make sure there is no confusion.

Now that was not hard, was it? Using this process I moved our DBs from an old SQL 2000 server to a new SQL 2005 64-bit server.
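Steps 1 and 2 of the database move can also be scripted rather than clicked through. As a minimal sketch, assuming the two database names from step 1 and a made-up backup folder, this Python snippet only prints the T-SQL BACKUP and RESTORE statements; it does not connect to any server. If the data and log file paths differ on the new server the RESTORE would also need WITH MOVE clauses, which are left out here.

```python
# Sketch: generate T-SQL for backing up and restoring the two CRM databases.
# The organisation name and backup folder are hypothetical placeholders.

def crm_databases(org_name):
    """Names of the two CRM 3.0 databases for an organisation."""
    return [f"{org_name}_METABASE", f"{org_name}_MSCRM"]

def backup_statement(db, folder):
    """BACKUP statement to run on the old SQL server."""
    return f"BACKUP DATABASE [{db}] TO DISK = N'{folder}\\{db}.bak' WITH INIT;"

def restore_statement(db, folder):
    """RESTORE statement to run on the new SQL server."""
    return f"RESTORE DATABASE [{db}] FROM DISK = N'{folder}\\{db}.bak' WITH RECOVERY;"

if __name__ == "__main__":
    folder = r"D:\Backups"  # hypothetical path, change to suit
    for db in crm_databases("yourcompany"):
        print(backup_statement(db, folder))
        print(restore_statement(db, folder))
```

The printed statements can be pasted into a query window on each server, with the .bak files copied across in between.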

However I could not move the front end to a new 64-bit platform as it only supports ASP.NET 1.1. All the posts I found said that it will not work with ASP.NET 2.0, and on 64-bit it is .NET 2.0 only. So I put the front end onto a virtual 32-bit Windows 2003 server I had for just such legacy systems.

  1. On the new server run the CRM 3.0 installer
  2. Let it install any prerequisites, services etc.
  3. Enter your existing product key
  4. Point it at your existing CRM Db
  5. On the page that reports whether it can install, you will get a warning that the product key is already in the DB and will be ignored; you might get some more warnings depending on your setup
  6. Let the install run to completion
  7. You should now have a new front end server that points to the existing DB with all the data there as you would hope. Also all the license/user management will be as it was before.

Again not too hard, so why is it not documented anywhere I could find?

This setup will do us until the vNext edition comes out with proper .NET 2 support.

Of course these notes have to carry the usual disclaimer: make sure you back up first!

Changing TFS user account passwords

If you change the passwords for the domain\tfsservice and/or domain\tfsreports accounts, you would expect your TFS server to stop working, and it does. To get it back working you have to reset the passwords in:

  • The various TFS AppPools on the frontend server need the new tfsservice password
  • The datasource in ([frontendserver]/reports) on the reporting services needs the new tfsreports password.

After that all should be OK

Swapping from Nvidia RAID to Software mirroring on a SunFire x2100

Our SunFire x2100 application servers, running Windows 2003 R2, are configured with the Nvidia RAID system (built into the motherboard) to mirror the pair of SATA drives in each server. Two of these servers keep losing their mirrors, but a third, and a SunFire x2200, do not. When this happens Windows hangs, which is not good in a server.

We have had a support call open with Sun and swapped bits of hardware, but it keeps happening. During the support calls we discovered that the Nvidia RAID is not true hardware, but a ‘software trick’. So not a great step up from letting Windows do the mirroring itself.

To try to isolate whether we have a hardware problem (motherboard, disk etc.) or a driver problem, we decided to move from the Nvidia RAID to Windows software mirroring. This was not as hard as I had expected, even though you have to load a special driver for Nvidia RAID as it does not use the standard SATA driver built into Windows.

So the process is:

  1. Boot to BIOS setup (F2)
  2. In the integrated peripherals, disable the Nvidia RAID
  3. Boot to Windows (may take a couple of reboots to get to a login prompt, not sure why this is)
  4. When logged in you should now see two drives, copies of each other
  5. In Admin Tools | Computer Management, select Disk Management and convert the two drives from basic to dynamic drives (this needs a reboot or two)
  6. When this is completed go back into Disk Management and delete the partition on the second drive
  7. Select the partition on the primary drive, right click and add the second disk as a mirror

And now you should be done, at least when the resync completes, which took about four hours for our 500GB disks, which are only about 10-20% full.
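The four hours is less surprising if you assume, as I believe is the case for Windows dynamic disk mirrors, that the resync copies the whole volume block by block rather than just the used space. A rough back-of-envelope check in Python, treating 500GB as 500 × 10^9 bytes (an assumption; the usable capacity will be slightly different):

```python
# Rough average throughput implied by a full-volume resync.
disk_bytes = 500 * 10**9      # assumed: drive-vendor decimal gigabytes
resync_seconds = 4 * 60 * 60  # "about four hours" from the text

throughput_mb_s = disk_bytes / resync_seconds / 10**6
print(f"{throughput_mb_s:.1f} MB/s")  # prints 34.7 MB/s
```

An average of around 35 MB/s looks plausible for SATA drives in a server that is also doing other work, so the time probably tracks the size of the disk and not how full it is.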

It goes without saying that you should make sure you have a backup before you start.

Let's see if this fixes our problems.

Microsoft SQL 2005 starts then stops with 3414 error

I recently had one of our Windows 2003 servers lose its disk mirrors and lock up. When it was restarted it had two (virtually identical) drives, C: and E:. It booted off the primary mirror disk (C:) and all seemed OK except SQL.

I also tried booting off the secondary mirror (E:) but this would not boot (this drive it turns out had some bad blocks).

So I went back to the primary disk. The actual problem was that SQL Server started but then stopped after a few seconds, and the Windows error log showed the unhelpful 3414 error. I googled for this, but all that was mentioned was issues with DTC, which did not seem relevant as we do not use distributed transactions. There was nothing else on the web of note.

I had a look at the MSSQL.1\logs directory and this showed problems loading the various databases. So it seems that when the disk de-mirrored it was writing SQL transaction logs, and they ended up corrupted. So in my case a generic 3414 error in the error log meant corrupt transactions that could not be rolled forward or back.

More in hope than expectation, I tried copying the SQL data files and logs back from the faulty secondary drive (E:) and restarting SQL, and this worked: SQL started without a problem! I was lucky the bad blocks were not near the SQL files. This saved me from having to rebuild the server and restore backups, especially as some of the DBs were SharePoint, and a SharePoint SQL restore is rarely fun!
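If you end up trawling the SQL logs directory for the same reason, a small script can pull out the numbered error lines so you can see which databases are affected. This is only a sketch: it assumes the usual "Error: n, Severity: n, State: n" line layout and that the ERRORLOG file is UTF-16 encoded, which I believe is normal for SQL 2005 but is worth checking on your own server. The sample lines are made up for illustration.

```python
import re

# Matches lines such as: "... spid5s      Error: 3414, Severity: 21, State: 1."
ERROR_LINE = re.compile(r"Error: (\d+), Severity: (\d+), State: (\d+)")

def find_errors(lines):
    """Return (line_number, error, severity, state) for each error line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = ERROR_LINE.search(line)
        if match:
            error, severity, state = (int(g) for g in match.groups())
            hits.append((number, error, severity, state))
    return hits

def read_errorlog(path):
    """Read an ERRORLOG file, assuming UTF-16; adjust the encoding if yours differs."""
    with open(path, encoding="utf-16") as handle:
        return handle.read().splitlines()

if __name__ == "__main__":
    # Made-up sample lines in the usual ERRORLOG layout, for illustration only.
    sample = [
        "2007-03-01 09:15:02.10 spid5s      Starting up database 'master'.",
        "2007-03-01 09:15:03.42 spid5s      Error: 3414, Severity: 21, State: 1.",
    ]
    print(find_errors(sample))  # [(2, 3414, 21, 1)]
```

The lines either side of each hit are usually the useful ones, as they tend to name the database that failed to recover.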

When installing Cassini why do I always forget this?

If installing the Cassini Personal Server on a PC you will often get the “Cassini managed web server failed to start listening to port 80. Possible conflict with another web server on the same port.” error.

You of course think this is a firewall, another web server or an anti-virus port blocker problem.

IT IS NOT!

Ok it might be those problems as well but usually it is that you need to run

gacutil /i c:\cassini\cassini.dll

or just drag a copy of the cassini.dll into the GAC (C:\Windows\Assembly)

Shame the installer does not do this.
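Before blaming the GAC it is still worth a quick check that nothing else really is listening on port 80. A minimal sketch in Python (nothing to do with Cassini itself; the function name is my own) that simply tries to bind to the port on the local machine:

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if we can bind to (host, port), i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Covers "address already in use" and also permission errors,
            # e.g. ports below 1024 on Unix-like systems without root.
            return False
        return True

if __name__ == "__main__":
    print("port 80 free:", port_is_free(80))
```

If this reports the port as free and Cassini still gives the same error, the missing GAC registration is the more likely culprit.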

Moving Community Server

Today I moved this blog server from an old server to our nice shiny new ones. This meant splitting it, so the DB went to the dedicated SQL server and the front end to the new web server box. This caused a few problems.

The actual move was fine: just back up and restore the DB and copy over the ASP.NET web contents. I then edited the web.config to point at the new server and had some problems, some expected, some not.

  • First I altered the CustomErrors block to report full errors
  • In the SiteSqlServer setting I altered the server name to point at the new server; until this was done I got, not surprisingly, a server not found error.
  • Once the server was right I got a MyDomain\MyServerName$ could not connect error, so I created this user, giving it the rights listed in the database readme file found in the scripts subdirectory.
  • However this did not work; I then got a CS-generated form titled Critical Error: Data Store Unavailable, that told me to edit the entries I had just edited!

After much digging about I found the answer in the CS forums: you also have to give the user the ASPNET_* rights for the CommunityServer database.
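Adding the user to each role by hand is tedious, so here is a rough sketch that prints the T-SQL instead. It assumes the ASPNET_* rights are database roles whose names start with aspnet_ (as in the standard ASP.NET membership schema) and that the server is SQL 2005, where sys.database_principals is available. The role names in the example are from that standard schema, so check what your own CommunityServer database actually contains first.

```python
# Sketch: build T-SQL to add a database user to every aspnet_* role.
# Login and database names below are the placeholders used in the post.

# Query to list the aspnet_* roles that actually exist in the database.
LIST_ROLES_SQL = (
    "SELECT name FROM sys.database_principals "
    "WHERE type = 'R' AND name LIKE 'aspnet[_]%';"
)

def grant_statements(user, roles):
    """One sp_addrolemember call per role, skipping anything not named aspnet_*."""
    return [
        f"EXEC sp_addrolemember N'{role}', N'{user}';"
        for role in roles
        if role.startswith("aspnet_")
    ]

if __name__ == "__main__":
    user = "MyDomain\\MyServerName$"
    # Example role names only; replace with the output of LIST_ROLES_SQL.
    roles = ["aspnet_Membership_FullAccess", "aspnet_Roles_FullAccess", "db_owner"]
    print("USE [CommunityServer];")
    for statement in grant_statements(user, roles):
        print(statement)
```

Run LIST_ROLES_SQL against the database first, feed the names it returns into grant_statements, and paste the output into a query window.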

Hope this saves someone some time.