Tuesday, 3 February 2015

Sitecore pipelines in high volume sites

Introduction


If you use class-level variables in your Sitecore pipelines you might discover inconsistent and unwanted behaviour.


The problem


A Sitecore pipeline is simply a sequence of steps, with each step being a method that is called on a given class.  Let's say your pipeline step has many methods and you want to maintain a piece of state across those methods; you might write code like this:
using Sitecore.Data.Items;

namespace MyNamespace.Pipelines
{
    public class ProcessParent
    {
        // Class-level state: these fields belong to the instance,
        // not to an individual call to Process
        private Item targetItem;
        private string targetField;

        public void Process(MyPipelineArgs args)
        {
            if (args.Item == null)
            {
                args.AbortPipeline();
                return;
            }

            this.targetItem = args.Item.Parent;
            this.targetField = args.Fieldname;

            // A lot of code that either uses this.targetItem or code that
            // calls various methods that use this.targetItem
        }
    }
}

This code will work fine on your local machine.  It will work in test and on UAT.  It will probably even work in production.  Under heavy load, though, you're going to start seeing some strange behaviour.

Explanation


To keep performance high, Sitecore does not create a new instance of each pipeline class every time it is needed.  Instead it creates a single instance of your class and reuses it every time the pipeline is processed.  When traffic is low to moderate you probably won't see any problems.  As traffic gets very high, however, multiple threads can end up executing inside the same instance at the same time, and because there is only one instance, all of those threads share the same class-level data.  So ThreadA sets targetItem to "ABC" when it starts, then halfway through ThreadB comes along and sets targetItem to "XYZ".  When ThreadA is switched back in, it is looking at the wrong data.
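
The ThreadA/ThreadB scenario can be reproduced without Sitecore at all. The console program below is only a rough sketch: SharedStateStep is a made-up stand-in for a pipeline class, not a Sitecore type, and the spin wait merely simulates work happening between the write and the read.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a pipeline class; not a Sitecore type.
public class SharedStateStep
{
    // Class-level state, shared by every thread that calls Process.
    private string target;

    // Returns true when another thread overwrote the field mid-call.
    public bool Process(string input)
    {
        this.target = input;
        Thread.SpinWait(1000); // simulate work between the write and the read
        return this.target != input;
    }
}

public static class Program
{
    public static void Main()
    {
        // One instance reused for every call, as with a cached pipeline class.
        var step = new SharedStateStep();
        int corrupted = 0;

        Parallel.For(0, 100000, i =>
        {
            if (step.Process("item-" + i))
            {
                Interlocked.Increment(ref corrupted);
            }
        });

        // The count varies from run to run; on a multi-core machine it is usually not zero.
        Console.WriteLine("Calls that saw another thread's value: " + corrupted);
    }
}
```

Called from a single thread, Process never reports a mismatch, which is why this kind of bug tends to stay hidden until the load goes up.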

Solution


The solution is pretty simple really: don't use class-level state in your pipelines.  Instead, pass the state through your methods as parameters and return values.  Parameters and local variables live on the calling thread's own stack, so no other thread can overwrite them, which makes your code thread-safe.
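
As a rough sketch of what that could look like for the ProcessParent class above: MyPipelineArgs is the args class from the earlier example, ReadValue is a hypothetical helper standing in for "a lot of code", and a reference to Sitecore.Kernel is assumed.

```csharp
using Sitecore.Data.Items;

namespace MyNamespace.Pipelines
{
    public class ProcessParent
    {
        // No instance fields: nothing here is shared between threads.
        public void Process(MyPipelineArgs args)
        {
            if (args.Item == null)
            {
                args.AbortPipeline();
                return;
            }

            // Locals belong to this call only.
            Item targetItem = args.Item.Parent;
            string targetField = args.Fieldname;

            string value = ReadValue(targetItem, targetField);

            // ... carry on, passing targetItem and targetField
            // to whichever methods need them
        }

        // Hypothetical helper: static, so it cannot touch instance state by accident.
        private static string ReadValue(Item targetItem, string targetField)
        {
            // Parent is null for the root item, so guard before reading the field.
            return targetItem == null ? null : targetItem[targetField];
        }
    }
}
```

Making the helper methods static is optional, but it lets the compiler stop you from quietly reintroducing class-level state later.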
