Anyways, the performance of this sync job is fairly constant until a certain point and then it just blows up and quadruples the run time which affects other processes.
Previously this was solved when I noted the complex system we use to defrag our databases had a bug in it. I fixed the bug, defragmented things and everything went back to normal.
This time however, there is no bug and the tables that play a role didn't show any sort of fragmentation at all. In QA I just went ahead and defragmented everything with the "quick & dirty" rebuild script
EXEC sp_msforeachtable 'ALTER INDEX ALL ON ? REBUILD'
Never, ever use that on a production database :p
A test showed that the sync was indeed working properly again. Great. So all we need to do is ensure that the database has essentially zero fragmentation on all tables and indexed views to get this to work. I'd also take a pet unicorn while we're wishing.
So I did a little more digging around and I noticed that our automated index rebuilding script didn't consider index_id = 0 (HEAP) which meant I wasn't viewing any data for that while reviewing fragmentation levels. I coded up a very quick fragmentation check query
SELECT object_name(object_id)
,index_id
,partition_number
,page_count
,avg_fragmentation_in_percent
,index_type_desc
FROM sys.dm_db_index_physical_stats(DB_ID(N'--dbname--'), NULL, NULL, NULL , 'LIMITED')
ORDER BY page_count DESC, avg_fragmentation_in_percent DESC;
What I found shocked me. There were 10+ large heaps with 99.5%+ fragmentation!!
I immediately started to dig in and see why these tables didn't have clustered indexes and you know what I found? No reason. No reason at all. There were already non-clustered and unique indexes on every table I found. For some reason the developer/vendor just never clued in that over time they would create massive fragmentation on these tables and not give us poor DBAs a chance to fix it. My assumption is that this was developed way back in the SQL 6 or 7 days where I *think* they used linked-heap table structure... don't quote me on that though as I've never touched anything older than 2000 (and oh man do I not miss 2000!)
If you're not sure how I fixed things yet I'll tell you. I converted a UQ index on each table to a clustered index (using a best guess approach) which rebuilt all the others. Fragmentation is under 1% and I'll be able to monitor the fragmentation now to see if I need to do anything else with these tables.
All this work because a vendor didn't bother to properly design their database. Shocking, eh? ;)
No comments:
Post a Comment