Tuesday, 4 December 2007

Backups, backups, backups!!

Backups are quite possibly the most important part of a SQL server administration situation, but they're almost always overlooked.
The servers I look after have various different methods for backing up data ranging from local disk backups to UNC NAS storage, and even dedicated SAN LUNS. Pretty soon I'll also start streaming directly to tape for some of the multi-terrabyte backups I'll have on my hands but let's not worry about that.

I've got a few tips for anyone that needs to admin a SQL server:
1. Always make sure you a backup plan that will meet the needs of the business.
2. Use all the backups types available wisely. Transaction log, differential, and full can all be used to create restore plans, but keep in mind recovery time!
3. Always verify your backups! I recommend a script that does at least a checksum check (you are backing up "with checksum" on SQL 2005, right?), and more is better.
4. If possible, use dedicated NICs and VLANs for backup traffic.
For point 4 I setup a static route like this:
ROUTE -p ADD 10.xxx.xxx.xxx MASK 10.yyy.yyy.yyy METRIC 10
-xxx IP denotes the target subnet of your backups (i.e. a cifs share on a NetApp or other NAS... be sure to set the proper mask!)
-yyy IP denotes the adapter that'll be used to connect to the share (dedicated NIC)
-p makes it survive reboot

How to force Storage Foundations disk groups to show up as a resource in Microsoft Cluster Service (MSCS)

I had an issue with an MSCS SQL cluster a while back where the "Volume Manager Disk Group" resource disappeared from the selection drop down! This was not cool as I needed the resource as a dependency of my SQL cluster I was about to setup.

The solution was fairly easy, but most certainly NOT obvious. Here's what you need to do:
Drop to a command prompt and type... (hit enter after each line)
cluster restype "Volume Manager Disk Group" /Delete /TYPE

regsvr32 vxresext.dll

cluster restype "Volume Manager Disk Group" /CREATE /DLL:vxres.dll /TYPE:"Volume Manager Disk Group"

cluster /regext:vxresext.dll

After that's done you should see the resource available in the drop down box. If it matters, this was done on Windows 2003 SP2 R2 x64 Enterprise Edition.

Monday, 3 December 2007

MSX - TSX SQL Server Setup Problems with SQL 2005

If you can't get your TSX server to enlist then check this out:

I was trying to setup a MSX server in my domain to admin the various (and rapidly growing) SQL 2005 deployments I manage. However, the wizard kept failing with an error message along the lines of "I can't talk to that server". Now I know damn well it can so I was determined to find the problem.
After much searching and screwing around with sp_msx_enlist I finally found the problem! It was a setting in the registry called MsxEncryptChannelOptions
It turns out that the servers were attempting to communicate over SSL but there were no certs for them. Changing the default value from 2 to 0 took care of my problems. I suggest that MS gets off their ass and fixes the TERRIBLE error message you get to actually reflect the fact that it's an SSL communication error.
More info about the setting here

P.S. If you see secure channel errors in your system log while screwing with all this you need to get your certs sorted or change the setting on each instance to 0.

I've added more information here about SSL Encryption. I can't honestly recommend that people don't use it with MSX servers as you may have info going across that connection you don't want people snooping.
See my post here.

Can't get VPN to work? Try this!

I was attempting to connect to a friends test domain across the internet via a PPTP VPN he setup for me. The connection was being made quickly, and I got an IP from the connection, but I couldn't ping anything!
It was a little frustrating until I did a route print. Once I did that I noticed that my default gateway was being reset to the VPN connection IP.

To fix this I went to Network Connections and selected the properties of the VPN connection. Then:
-go to the networking tab
-select tcp/ip and then properties
-click the "Advanced" button
-remove the check mark that says "Use default gateway on remote network"
That's it!

When I connected the next time the route print looked much better and the connection worked properly and I was able to access both the remote VPN-connected network and it's resources as well as the internet through my local connection. Easy fix, but not totally obvious.

Adding Dynamic Disk from Storage Foundations (VSFW) and Microsoft Cluster Service (MSCS) in Windows 2003 R2 x64

I recently had to fight with getting some LUNS configured and presented as dynamic disks to a SQL 2005 x64 (SP 2 Build 3159) cluster running on Windows 2004 x64 R2 SP2. It wasn't as easy as I thought it would be.
The first problem is that the quorum disk can NOT be a dynamic disk. So if your storage guys are presenting 100GB LUNS get ready to waste a lot of expensive SAN disk. If you want to run in a config that's "not supported by Microsoft" you can partition the LUN that you'll use for the quorum such that the quorum parition gets only a small percentage of the disk, and then you can use the rest for something else like logs, backups, SQL binaries, etc.

Here's the steps to getting the dynamic disks in the cluster:
1. Configure your dynamic disk group(s) any way that suits your needs but be sure to create them as a cluster disk group or you're going to fail the next part.
2. Add a "Volume Manager Dynamic Disk" resource top your SQL cluster group.
-If you created the disk groups as cluster resources then you should see them in a drop down menu at the end of the configuration for this resource. If you don't or don't even see the Volume Manager resource then you need to do this fix.
NOTE: When you add the resource the disks will go offline and then back online. Be sure you're not doing anything with them when you do this!

You now have a SQL cluster resource with dynamic disks that's ready to have SQL installed, or if it's already installed then you can migrate your db files as needed.

Dynamic disks are the way to go as you can easily grow volumes, extend stripes, add mirrors, etc. I never use basic disks when I connect a SQL server to a SAN.

Random note:
I like to setup at least one LUN for logs (ldf) and at least one more for data files (mdf)... If the server is going to be under enough load I'll make sure tempdb is on it's own LUNs as well. (System Center Operations Manager is an example of that config).

See this post for a potential resolution with VCS disks not showing up.