Saturday, August 10, 2013

Internal details of SPWorkItemJobDefinition, or several reasons why SPWorkItemJobDefinition doesn’t work

SPWorkItemJobDefinition is a special kind of SPJobDefinition, the base class for custom SharePoint timer jobs. The main difference between work item jobs and regular jobs is that work item jobs accept parameters, which is useful in many scenarios, for example when you need to specify that the data to process is located in a specific site collection and in a specific site inside that site collection.

First of all, you need to define a class that inherits from SPWorkItemJobDefinition:

    using System;
    using Microsoft.SharePoint;
    using Microsoft.SharePoint.Administration;

    public class MyWorkItemJob : SPWorkItemJobDefinition
    {
        public override int BatchFetchLimit
        {
            get
            {
                // how many work items are processed at a time
                return 1;
            }
        }

        // parameterless constructor, needed when the job is deserialized
        public MyWorkItemJob()
        {}

        public MyWorkItemJob(string name, string title, SPWebApplication webApp):
            base(name, webApp, null, SPJobLockType.None)
        {
            this.Title = title;
        }

        // identifies which work items belong to this job
        public override Guid WorkItemType()
        {
            return Constants.WORK_ITEM_TYPE_ID;
        }

        public override void ProcessWorkItems(SPContentDatabase db,
            SPWorkItemCollection workItems, ref bool continueProcessing)
        {
            try
            {
                if (workItems == null || workItems.Count == 0)
                {
                    // log
                    return;
                }

                // process the first item (there should not be more than one,
                // see the BatchFetchLimit property)
                var workItem = workItems[0];
                using (var site = new SPSite(workItem.SiteId))
                {
                    using (var web = site.OpenWeb(workItem.WebId))
                    {
                        ...
                    }
                }
            }
            catch (Exception)
            {
                // log, then rethrow so the failure is visible to the timer service
                throw;
            }
        }
    }

There are several important points here. First, in the constructor we pass SPJobLockType.None, which, according to the documentation, means:

Provides no locks. The timer job runs on every machine in the farm on which the parent service is provisioned, unless the job is associated with a specified server in which case it runs on only that server (and only if the parent service is provisioned on the server).

But for SPWorkItemJobDefinition the lock type has another consequence as well. Let’s see how the SPWorkItemJobDefinition.Execute() method, the starting point of a work item job, is implemented:

   1:  
   2: public override void Execute(Guid targetInstanceId)
   3: {
   4:     SPWebApplication parent = base.Parent as SPWebApplication;
   5:     if (base.LockType == SPJobLockType.ContentDatabase)
   6:     {
   7:         SPContentDatabase db = parent.ContentDatabases[targetInstanceId];
   8:         this.HandleOneContentDatabase(db);
   9:     }
  10:     else if (base.LockType == SPJobLockType.None)
  11:     {
  12:         bool flag;
  13:         do
  14:         {
  15:             flag = false;
  16:             IEnumerator<SPContentDatabase> enumerator =
  17:                 parent.ContentDatabases.GetEnumerator();
  18:             while (enumerator.MoveNext())
  19:             {
  20:                 SPContentDatabase current = enumerator.Current;
  21:                 flag |= this.HandleOneContentDatabase(current);
  22:             }
  23:         }
  24:         while (flag);
  25:     }
  26: }

The logic here is quite simple: if we passed SPJobLockType.ContentDatabase in the constructor, it gets the content database instance by the id passed in the targetInstanceId parameter and calls the HandleOneContentDatabase() method (lines 5-9). There may be a problem with this code: SharePoint may pass an incorrect targetInstanceId. For example, we faced a situation where the targetInstanceId was correct in the development environment but incorrect in production. In this case you have to override the Execute() method and fix it, e.g. like this:

    public override void Execute(Guid targetInstanceId)
    {
        try
        {
            var parent = base.Parent as SPWebApplication;
            if (parent == null)
            {
                // log
                return;
            }
            // ignore the targetInstanceId that was passed in and use
            // the first content database of the web application instead
            var db = parent.ContentDatabases[0];
            base.Execute(db.Id);
        }
        catch (Exception)
        {
            // log
            throw;
        }
    }

This may be one of the reasons why your job doesn’t work. The workaround above handles only the first content database; if you have several, you need to write additional code there. But let's return to our analysis. If SPJobLockType.None was used, the code enumerates all content databases and calls HandleOneContentDatabase() for each of them, repeating the pass for as long as any call returns true. As you probably know, there is also the SPJobLockType.Job lock type, but Execute() has no branch for it, so a work item job created with this lock type never processes anything. This is the second important finding.
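The "additional code" for several content databases could look roughly like the sketch below. It assumes the job was created with SPJobLockType.ContentDatabase, so that base.Execute() takes the branch that looks the database up by id. The class name MyContentDbWorkItemJob is hypothetical, and Constants.WORK_ITEM_TYPE_ID is the same constant used in the job class above.

```csharp
using System;
using Microsoft.SharePoint.Administration;

// Hypothetical job class; assumes the ContentDatabase lock type.
public class MyContentDbWorkItemJob : SPWorkItemJobDefinition
{
    public MyContentDbWorkItemJob() { }

    public MyContentDbWorkItemJob(string name, SPWebApplication webApp)
        : base(name, webApp, null, SPJobLockType.ContentDatabase) { }

    public override Guid WorkItemType()
    {
        return Constants.WORK_ITEM_TYPE_ID;
    }

    public override void Execute(Guid targetInstanceId)
    {
        var parent = base.Parent as SPWebApplication;
        if (parent == null)
        {
            // log: the job is not attached to a web application
            return;
        }

        // Ignore targetInstanceId and hand every content database
        // of the web application to the base implementation.
        foreach (SPContentDatabase db in parent.ContentDatabases)
        {
            base.Execute(db.Id);
        }
    }
}
```

If SharePoint starts the job once per content database, each run would visit all databases, so it is worth checking in your own farm whether that repetition is acceptable. The class also still needs a ProcessWorkItems() override to do any actual work.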

Let’s see how HandleOneContentDatabase is implemented:

   1: private bool HandleOneContentDatabase(SPContentDatabase db)
   2: {
   3:     bool continueProcessing = false;
   4:     SPWorkItemCollection activeWorkItems = db.GetActiveWorkItems(this.WorkItemType());
   5:     activeWorkItems.ProcessingBatchSize = this.BatchFetchLimit;
   6:     activeWorkItems.ProcessingThrottle = 0;
   7:     if (activeWorkItems.Count > 0)
   8:     {
   9:         continueProcessing = true;
  10:         if (base.LockType != SPJobLockType.None)
  11:         {
  12:             this.ProcessWorkItems(db, activeWorkItems);
  13:             return continueProcessing;
  14:         }
  15:         this.ProcessWorkItems(db, activeWorkItems, ref continueProcessing);
  16:     }
  17:     return continueProcessing;
  18: }

This method also checks the lock type. If the ContentDatabase lock is used (more precisely, any lock type other than None), it calls the ProcessWorkItems() overload with the following signature:

    public virtual void ProcessWorkItems(SPContentDatabase db,
        SPWorkItemCollection workItems)
    {
        ...
    }

If the None lock is used, it calls the overload with the following signature:

    public virtual void ProcessWorkItems(SPContentDatabase db,
        SPWorkItemCollection workItems, ref bool continueProcessing)
    {
        ...
    }

This is another reason why your SPWorkItemJobDefinition may not work: if you passed SPJobLockType.None but overrode ProcessWorkItems() with two parameters, your code won’t run, because in this case you need to override ProcessWorkItems() with three parameters (and vice versa for SPJobLockType.ContentDatabase).

Another important point is that the job uses the guid returned by SPWorkItemJobDefinition.WorkItemType() to retrieve items from the database (line 4 in the HandleOneContentDatabase() method). Each custom work item job class should override this method to specify which items are handled by this job type.

Now our custom job is ready, and we need to add a work item to the queue (the dbo.ScheduledWorkItems table in the content database). This can be done with the following code:
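One way to avoid the overload mismatch described above is to override both overloads and forward them to a single method. The following is only a sketch: the class name MyLockAgnosticWorkItemJob and the helper ProcessCore() are hypothetical and not part of the SharePoint API.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

// Hypothetical job class that runs the same logic for both lock types.
public class MyLockAgnosticWorkItemJob : SPWorkItemJobDefinition
{
    public MyLockAgnosticWorkItemJob() { }

    public MyLockAgnosticWorkItemJob(string name, SPWebApplication webApp,
        SPJobLockType lockType)
        : base(name, webApp, null, lockType) { }

    public override Guid WorkItemType()
    {
        return Constants.WORK_ITEM_TYPE_ID;
    }

    // Called when the lock type is not None (see HandleOneContentDatabase()).
    public override void ProcessWorkItems(SPContentDatabase db,
        SPWorkItemCollection workItems)
    {
        ProcessCore(db, workItems);
    }

    // Called when the lock type is None.
    public override void ProcessWorkItems(SPContentDatabase db,
        SPWorkItemCollection workItems, ref bool continueProcessing)
    {
        // continueProcessing is left as passed in; its value is what
        // HandleOneContentDatabase() returns to the loop in Execute().
        ProcessCore(db, workItems);
    }

    // Hypothetical helper that holds the actual processing logic.
    private void ProcessCore(SPContentDatabase db, SPWorkItemCollection workItems)
    {
        if (workItems == null || workItems.Count == 0)
        {
            return;
        }

        foreach (SPWorkItem workItem in workItems)
        {
            // process the item identified by workItem.SiteId,
            // workItem.WebId and workItem.ItemId here
        }
    }
}
```

With this shape, changing the lock type passed to the constructor does not silently disable the processing code.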

   1: SPContext.Current.Site.AddWorkItem(Guid.NewGuid(),
   2:     DateTime.Now.ToUniversalTime(),
   3:     Constants.WORK_ITEM_TYPE_ID,
   4:     SPContext.Current.Site.RootWeb.ID,
   5:     listId,
   6:     listItemIntegerId,
   7:     true,
   8:     itemId,
   9:     Guid.NewGuid(),
  10:     SPContext.Current.Web.CurrentUser.ID,
  11:     null,
  12:     string.Empty,
  13:     Guid.Empty);

The parameters are well described in the following post: Processing items with Work Item Timer Jobs in SharePoint 2010. Here it is important to note that we pass the time in UTC (line 2) and the same work item type that the custom work item job class returns from its WorkItemType() method (line 3). If you check the dbo.ScheduledWorkItems table after this, you should see a new row for the item added by the code above. But this item won’t be processed until the job itself runs. So the next step is to run our custom job:

    var jobDefinition = new MyWorkItemJob("MyWorkItemJob",
        "My work item job", SPContext.Current.Site.WebApplication);
    var schedule = new SPOneTimeSchedule(DateTime.Now);
    jobDefinition.Schedule = schedule;
    jobDefinition.Update();

This example shows how to run a custom SPWorkItemJobDefinition once. If you need to run it on a recurring basis, e.g. hourly or daily, use another schedule class such as SPHourlySchedule or SPDailySchedule.

However, even all of this doesn’t guarantee that your work item job will process anything. Let’s look at the code of the HandleOneContentDatabase() method again. To get the active work items for processing, it calls the SPContentDatabase.GetActiveWorkItems() method, which in turn uses the proc_GetRunnableWorkItems stored procedure in the content database. There is a lot of code in this stored procedure, but we need to look at it to find one more reason why the job may not work:
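For a recurring run, a minimal sketch with an hourly schedule might look like this. The helper class MyWorkItemJobInstaller is hypothetical; MyWorkItemJob is the job class defined at the beginning of the post.

```csharp
using Microsoft.SharePoint.Administration;

// Hypothetical helper, e.g. called from a feature receiver.
public static class MyWorkItemJobInstaller
{
    public static void ScheduleHourly(SPWebApplication webApp)
    {
        var job = new MyWorkItemJob("MyWorkItemJob", "My work item job", webApp);

        // Start once per hour, at some minute between 0 and 59.
        job.Schedule = new SPHourlySchedule
        {
            BeginMinute = 0,
            EndMinute = 59
        };

        job.Update();
    }
}
```

A recurring schedule also means that a work item which was not yet runnable at one execution can be picked up by a later one.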

    CREATE PROCEDURE [dbo].[proc_GetRunnableWorkItems] (
            @ProcessingId          uniqueidentifier,
            @SiteId                uniqueidentifier,
            @WorkItemType          uniqueidentifier,
            @BatchId               uniqueidentifier,
            @MaxFetchSize          int = 1000,
            @ThrottleThreshold     int = 0
            )
    AS
        SET NOCOUNT ON
        IF (dbo.fn_IsOverQuotaOrWriteLocked(@SiteId) >= 1)
        BEGIN
            RETURN 0
        END
        DECLARE @iRet int
        SET @iRet = 0
        DECLARE @oldTranCount int
        SET @oldTranCount = @@TRANCOUNT
        DECLARE @Now datetime
        SET @Now = dbo.fn_RoundDateToNearestSecond(GETUTCDATE())
        DECLARE @InProgressCount int
        DECLARE @ThrottledFetch int
        DECLARE @ReturnWorkItems bit
        SET @ReturnWorkItems = 0
        BEGIN TRAN
        SET @InProgressCount = 0
        SET @ThrottledFetch = 0
        SET @ThrottleThreshold = @ThrottleThreshold + 1
        IF @ThrottleThreshold > 1
        BEGIN
            SET ROWCOUNT @ThrottleThreshold
            SELECT
                @InProgressCount = COUNT(DISTINCT BatchId)
            FROM
                dbo.ScheduledWorkItems WITH (NOLOCK)
            WHERE
                Type = @WorkItemType AND
                DeliveryDate <= @Now AND
                (InternalState & (1 | 16)) = (1 | 16)
        END
        IF @BatchId IS NOT NULL
        BEGIN
            SET @ThrottledFetch = 16
        END
        IF @InProgressCount < @ThrottleThreshold
        BEGIN
            SET ROWCOUNT @MaxFetchSize
            UPDATE
                dbo.ScheduledWorkItems
            SET
                InternalState = InternalState | 1 | @ThrottledFetch,
                ProcessingId = @ProcessingId
            WHERE
                Type = @WorkItemType AND
                DeliveryDate <= @Now AND
                (@SiteId IS NULL OR
                    SiteId = @SiteId) AND
                (@BatchId IS NULL OR
                    BatchId = @BatchId) AND
                (InternalState & ((1 | 2))) = 0
            SET @InProgressCount = @@ROWCOUNT
            SET ROWCOUNT 0
            IF @InProgressCount <> 0
            BEGIN
              EXEC @iRet = proc_AddFailOver @ProcessingId, NULL, NULL, 20, 0
            END
            SET @ReturnWorkItems = 1
        END
    CLEANUP:
            SET ROWCOUNT 0
            IF @iRet <> 0
            BEGIN
                IF @@TRANCOUNT = @oldTranCount + 1
                BEGIN
                    ROLLBACK TRAN
                END
            END
            ELSE
            BEGIN
                COMMIT TRAN
                IF @InProgressCount <> 0
                   AND @InProgressCount <> @MaxFetchSize
                   AND @WorkItemType = 'BDEADF09-C265-11d0-BCED-00A0C90AB50F'
                   AND @BatchId IS NOT NULL AND @SiteId IS NOT NULL
                BEGIN
                    UPDATE
                        dbo.Workflow
                    SET
                        InternalState = InternalState & ~(1024)
                    WHERE
                        SiteId = @SiteId AND
                        Id = @BatchId
                END
                IF @ReturnWorkItems = 1
                BEGIN
                    SELECT ALL
                        DeliveryDate, Type, ProcessMachineId as SubType, Id,
                        SiteId, ParentId, ItemId, BatchId, ItemGuid, WebId,
                        UserId, Created, BinaryPayload, TextPayload, InternalState
                    FROM
                        dbo.ScheduledWorkItems
                    WHERE
                        Type = @WorkItemType AND
                        DeliveryDate <= @Now AND
                        ProcessingId = @ProcessingId
                    ORDER BY
                        Created
                    IF @@ROWCOUNT <> 0
                    BEGIN
                        EXEC @iRet = proc_UpdateFailOver @ProcessingId, NULL, 20
                    END
                END
            END
            RETURN @iRet

Here it makes several queries against the ScheduledWorkItems table, and it is important to note that each of them uses the condition DeliveryDate <= @Now, i.e. it selects only items whose delivery date is not later than the current UTC time on the database server. This is very important: if the clock on the database server is not synchronized with the clock on the WFE server from which we added the work item using the SPSite.AddWorkItem() method (e.g. the database server's clock is behind the WFE's), your job may not process anything, because the proc_GetRunnableWorkItems stored procedure won’t return the items. From the database server's point of view, their delivery date is still in the future. To fix the issue, synchronize the time on your database server (better still, synchronize the time on all servers in the farm; you will have fewer problems in general).

That is all I wanted to say about the internal mechanisms of SPWorkItemJobDefinition and the reasons why it may not work. I hope it was interesting and that it will help in your work.
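To check whether clock skew is the culprit, you can compare the UTC time reported by SQL Server with the UTC time on the WFE. The sketch below is a diagnostic aid under stated assumptions: the class ClockSkewCheck is hypothetical, and connectionString is assumed to be any connection string you supply for the SQL Server instance that hosts the content database. It runs only SELECT GETUTCDATE() and does not touch any SharePoint tables.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical diagnostic helper.
public static class ClockSkewCheck
{
    // Returns (local UTC time) - (database server UTC time).
    // A positive value means the database server's clock is behind this
    // machine, which is the case where new work items look "not yet due".
    public static TimeSpan GetSkew(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("SELECT GETUTCDATE()", connection))
        {
            connection.Open();
            var databaseUtcNow = (DateTime)command.ExecuteScalar();
            return DateTime.UtcNow - databaseUtcNow;
        }
    }
}
```

The result includes the round-trip time of the query, so treat small values (well under a second) as noise rather than as real skew.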
