February 5, 2014

Sitecore Clean Up Databases Tool Fails

February 5, 2014

Sitecore Clean Up Databases Tool Fails
We recently ran into an issue with a custom agent that we wrote for a client.  The agents job is to automatically sync images in a network location with those in the Sitecore media library.  The agent runs on a set schedule and was working great until the DBAs noticed that our Sitecore databases were consuming massive amounts of disk space.  Upon further examination, both the 'master' and 'web' databases were in excess of 700 GB.

The Problem

The master and web databases were growing at an exponential rate and the times of growth were tied to when our custom agent was processing.  We were quickly running out of disk space on our database servers.  I looked at the size of the images contained in the sync folder and knew that it should only be about 10GB.  It looked like Sitecore was maintaining old blob records.  I knew that Sitecore provided a "Clean Up Databases" (Sitecore 6.6) utility under "Control Panel\Database" to help with exactly this type of problem.

"Clean Up Databases" allows you to select the database(s) you would like to 'clean' and performs the following actions: (thanks to @DanSolovay and http://bit.ly/1fMv2mF)
  1. Removes items that have parents, but the parents are not in the item tree.
  2. Removes invalid language data.
  3. Removes fields for non existing items.
  4. Removes orphaned items.
  5. Removes unused blob records.
  6. Removes fields from orphaned items removed in step 4.
  7. Rebuilds the Descendants table (which stores parent/child relationships).
  8. Clears all caches.
I was interested in was removing the unused blob records that were apparently sticking around.  I ran the utility and let it go.  It ran and ran and ran.  I continued to check on the utility and could see that the size of the blobs table was slowing returning to a 'normal' size.  After 6 hours, I saw that an exception was thrown and that SQL Server then backed-out of the transaction, returning the database to the pre-clean up size.  After opening up the logs, I could see that the error was:

Exception: System.Data.SqlClient.SqlException

Message: The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.

It turns out I was trying to delete so much data that the utility used up all the available locks for SQL Server.

The Solution

I figured that I needed some way to remove the unused blobs in smaller batches so that we could acquire the locks and then release them as data is deleted.  I contacted Sitecore Support to come up with a way to do just that.  With their insight into what was happening behind the scenes, we came up with a script that can be run manually to clean up the blobs table.

DECLARE @UsableBlobs table(
    ID uniqueidentifier
    );

INSERT INTO 
    @UsableBlobs    
select convert(uniqueidentifier,[Value]) as EmpID from [Fields]
where [Value] != '' 
and (FieldId='{40E50ED9-BA07-4702-992E-A912738D32DC}' or FieldId='{DBBE7D99-1388-4357-BB34-AD71EDF18ED3}') 
delete top (XXXXX) from [Blobs] 
where [BlobId] not in (select * from @UsableBlobs)

Using a temp table, the script looks for all valid blobs and stores them.  It then finds all the blobs not in the temp table and deletes them.  Simply replace the "XXXXX" in order to control how many records are deleted at once.  I started with one record to prove the concept and then went to 1000 and eventually was deleting 200k records at once with no issues.

I was able to clean up the databases manually and recover the space for the database servers.  We modified our scheduling of the agent to run less frequently and additionally created a 'clean-up' agent that ensures that we don't run into this issue again.

It should be noted that this script shouldn't replace the "Clean Up Databases" utility, but can be used in cases where things have already gotten out of hand.  When possible, use the built-in Sitecore utilities to manage the databases.

Also note: Please make backups of your data before running any scripts you find on the Internet.

Additional Reading

My Stack Overflow question regarding this issue: http://bit.ly/1dp3109
Dan Solovay's Stack Overflow answer about the "Clean Up Databases" Utility: http://bit.ly/1fMv2mF

3 comments:

  1. I have not run into this issue before, but I am glad I read your post as this could prove to be very useful to know.

    Thank you for putting the time to let everyone know of this solution.

    I am going push for us to do clean up more often to prevent running into this issue.

    ReplyDelete
  2. Glad to help!

    Note that Sitecore says this blob issue was resolved in Sitecore 6.6.0 rev. 130111 (6.6.0 Update-3) (http://sdn.sitecore.net/Products/Sitecore%20V5/Sitecore%20CMS%206/ReleaseNotes/ChangeLog.aspx). My client was using Sitecore 6.6 Update 4, so I'm not sure if there is still a problem or it was more of a problem with the frequency in which we were running the agent.

    Also, with regards to clean up, check out the DMS tuning guide: http://sdn.sitecore.net/upload/sitecore6/65/dms_performance_tuning_guide_sc65-usletter.pdf. It's technically written for DMS tuning, but the parts about creating maintenance plans that check and fix your databases will help keep your instance healthy.

    ReplyDelete
  3. Did you find out what was really the cause of the increasing of the Blob table?
    We have maybe the same problem but only occuring in our Web database.
    We are running Sitecore.NET 7.2 (rev. 140526) so this version should include the problem correction of 6.6 update3 you mentionned on StackOverflow.

    ReplyDelete