SCCM SQL Cluster Problem
Earlier this week I had an issue with backing up SCCM that was because Kerberos was not enabled on the cluster. Got that fixed, but I was noticing other things on my SCCM server that just didn’t seem right. (Instructions for how to enable Kerberos are in the link above.) In particular I noticed that my Site System Status was red. In looking into this I saw where SCCM was referencing the SQL cluster nodes directly…not the SQL cluster. That’s not good. So I took a look at the Site Systems (under Site Settings), and here is what I saw:
What you see here is that the SQL cluster does NOT hold the site database role. That role is held directly by the SQL nodes. What happened was that although Kerberos must be enabled on the cluster for normal SCCM operation, the pre-req checker apparently does not check for this. As a result it allowed the install to go through and ended up installing directly to the nodes instead of to the SQL cluster…because it could not see the cluster since Kerberos was not enabled on it. Anyway…all of that said…it’s a major problem. Site Status is red. Who knows what would happen in the event of a SQL node failover.
So I got to thinking. I’m pretty sure the problem is a result of the database being created on the SQL server before Kerberos was enabled. In theory, I should be able to move the DB elsewhere, then move it back (now that Kerberos is enabled) and everything would be lovely again. Nice theory. But will it work? Enough thinking…let’s find out.
I moved the DB to a SQL named instance on the same server using the instructions found here. [Note…at the time of this writing there is a mistake in the instructions. Between steps 2 and 3 should be a step about actually going into SQL server and detaching and attaching the site DB. I reported it, and Microsoft acknowledged that it is missing and it will be fixed in the next update of the documentation.] After bringing up SCCM on the named instance, I shut it back down and used the same process to move it back to the default instance. Here is what it looks like now:
Note that the site database role is on the SQL cluster now. The two nodes are still in the list, but they have no roles associated with them. Right clicking them does not give an option to delete. According to Wally Mead and Stan White, those two should age out of the system after 30 days. The very nice thing is that my Site Status is now a lovely shade of green.
I reported this issue as a bug in SCCM. Got a great response from Wally Mead. He assigned it to the SCCM SP1 team for possible inclusion in SP1. Very cool!
Good work!
How did you manage to get SCCM to work with a NAMED sql cluster instance? I’ve been unable to get SCCM console to open up after the database’s been moved to a named instance. It seems to work only with default (ie, unnamed) instances. The Console would just errors out.
Same story here, didn’t work. Verified kerberos was working on the sql server, ran through the steps outlined in the tech article with the added step of detaching/reattaching, and it still left my named instance as a site system, and the db on both nodes of the cluster…as well as leaving them as component servers and site systems as well. SCCM separates the men from the boys, that’s for sure. What a PITA.
Yeah, I see the same problem. I’m using SCCM 2007 SP2 R3. I don’t see the SQL cluster name in my Site Systems.
Just node1 and node2 (with database server roles)…but there is no mention of SQL cluster virtual name.
Yeah same problem here with SCCM 2007 SP2, Nodes, but no SQL Cluster name.
Is this post still open for responses?
@Joe – If you click on the node name, under ConfigMgr site database name you will see the named clustered instance.
I’m also curious how the virtual name is able to be added. I can’t find a way to do it without streams of errors everywhere. It must be to support the shonky backup process where the SCCM server object needs to be in the local admins group on the cluster nodes. Why it can’t use the DB access account to dump an export of the tables is beyond me.
We had a Microsoft RAP performed on our SCCM infrastructure and this was an issue they highlighted for remediation that the SQL named instance was not in the list but the nodes were. If this is the case, how do you install if the pre-req checker can’t check kerberos in a two forest resource domain architecture?
Answers on a postcard please!