Sitecore Development / Kim Hornung

Thursday, August 30, 2007

Performance of Sitecore Query

As many Sitecore developers know, using Sitecore Query to retrieve items is a really powerful concept (if you don't know, have a look at

Sitecore Query allows you to write XPath-like expressions, so in a single line you can, for example, retrieve all items based on a specific template, filtered by various field values:

Item[] activityItems = activityFolderItem.Axes.SelectItems("descendant::*[@@templatekey='intranet.activity' and (@title != '' or @corporatetitle != '' or @headline != '')]");

This is really cool!

But what about performance?
It turns out that the previous code snippet is not as efficient as it could be. The flexibility of Sitecore Query does come with a price.

I did some testing on a relatively small data set, and it turned out that:

  • The first time the code was executed, it took anywhere from 300 ms to 1 second! [updated Aug. 31, 2007: the original test was performed with a debugger attached; without it, the timings are more like 70-200 ms]
  • For subsequent executions, it took 3-7 ms
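
The gap between the first run and later runs can be checked with a small timing harness. The following is only a minimal sketch and not the code behind the numbers above: `QueryTimer` and `Measure` are hypothetical names, and the delegate stands in for whichever call is being timed (for example the `SelectItems` call shown earlier). It relies on nothing but `System.Diagnostics.Stopwatch`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of Sitecore): times repeated calls of one operation.
public static class QueryTimer
{
    // Returns the elapsed milliseconds of each run.
    // Index 0 is the first (cold) execution; later entries are warm runs.
    public static double[] Measure(Action operation, int runs)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");

        double[] timings = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            operation();
            watch.Stop();
            timings[i] = watch.Elapsed.TotalMilliseconds;
        }
        return timings;
    }
}
```

Passing a delegate that wraps the query call and asking for, say, ten runs would put the cold timing in element 0 and the cached timings after it. As the update above shows, such measurements should be taken without a debugger attached.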

So how to optimize the code?
The trick is to avoid the predicates. It is (at least with the current implementation of Sitecore Query) cheaper to simply return all descendants and then filter them in regular C# code:

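Item[] descendants = activityFolderItem.Axes.GetDescendants();
List<Item> activityItems = new List<Item>(); // requires System.Collections.Generic
if (descendants != null)
    foreach (Item item in descendants)
        // Same filter as the query: template key plus at least one non-empty title field
        if (item.Template.Key == "intranet.activity" && (item["title"] != "" || item["corporatetitle"] != "" || item["headline"] != ""))
            activityItems.Add(item);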

With this simple change, the code runs much better:

  • The first time the code is executed, it now takes 1-3ms [updated Aug. 31, 2007: this is not true, my test case was buggy. In fact, it takes almost as long as the original code - most of the time is spent fetching items from the database]
  • For subsequent executions, it now takes ½-1½ms

Your measurements might vary, but the difference should be significant for subsequent executions, once the items have been fetched from the database.

Happy coding :o)


  • Nice to know that it is faster to use brute force iteration and not Sitecore Query ;)

    How many items did you try this on?

    Normally, if you need to process all descendants, you want to use an index instead.

    By Anonymous Jens Mikkelsen, At 09:19  

  • Hi Jens,

    Good point! Best practice is to use an index if you need to handle something where you get all descendants.

    So maybe I should have added a "don't do this at home, I'm a trained professional" disclaimer at the top of my post :o)

    My initial tests were on a very small data set of only 50 descendants. Larger data sets increase the time considerably for both the original and the optimized code.

    For 700 items I get the following numbers (when I don't clear the cache first):
    - The original code takes 110-130 ms
    - The optimized code takes 19-30 ms, thus saving around 90 ms per call!

    For 4800 items I get the following numbers (when I don't clear the cache first):
    - The original code takes 380-400 ms
    - The optimized code takes 165-180 ms, thus saving around 210 ms per call!

    Notice: Both expressions get slower the more items the data set (descendants) contains, no matter how many matches are found.

    Using an index (the Lucene search engine), the number of items in the data set doesn't really matter. Only the number of hits (matches) matters:
    - For around 50 matches, including filtering to make sure that we only return activities below the desired node, the Lucene code takes 20 ms.
    - For around 700 matches, including filtering, the Lucene code takes around 250 ms.

    So which algorithm to choose? It depends on the use case. In general, an index gives the best performance for large data sets and also scales well. Iterating over descendants can only be recommended for small data sets.


    By Blogger Kim Hornung, At 13:57  
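
The trade-off described in the comment above (a scan costs more as the data set grows, an index lookup only as the number of candidates grows) can be illustrated with a minimal in-memory sketch. This is not Lucene and not the Sitecore API: `Doc`, `LookupSketch` and its methods are hypothetical names, and a plain `Dictionary` stands in for the index.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a content item; this is not the Sitecore Item class.
public sealed class Doc
{
    public readonly string TemplateKey;
    public readonly string Title;
    public Doc(string templateKey, string title) { TemplateKey = templateKey; Title = title; }
}

public static class LookupSketch
{
    // Brute force: visits every document, so the cost grows with the size of the data set.
    public static List<Doc> Scan(IEnumerable<Doc> all, string templateKey)
    {
        List<Doc> hits = new List<Doc>();
        foreach (Doc d in all)
            if (d.TemplateKey == templateKey && d.Title != "")
                hits.Add(d);
        return hits;
    }

    // Built once up front: maps each template key to the documents that use it.
    public static Dictionary<string, List<Doc>> BuildIndex(IEnumerable<Doc> all)
    {
        Dictionary<string, List<Doc>> index = new Dictionary<string, List<Doc>>();
        foreach (Doc d in all)
        {
            List<Doc> bucket;
            if (!index.TryGetValue(d.TemplateKey, out bucket))
            {
                bucket = new List<Doc>();
                index[d.TemplateKey] = bucket;
            }
            bucket.Add(d);
        }
        return index;
    }

    // Indexed lookup: only touches the bucket for the requested key,
    // so the cost grows with the number of candidates, not with the whole data set.
    public static List<Doc> Lookup(Dictionary<string, List<Doc>> index, string templateKey)
    {
        List<Doc> hits = new List<Doc>();
        List<Doc> bucket;
        if (index.TryGetValue(templateKey, out bucket))
            foreach (Doc d in bucket)
                if (d.Title != "")
                    hits.Add(d);
        return hits;
    }
}
```

Both methods return the same matches. The difference is that `Scan` pays for every document on each call, while `Lookup` pays the full pass only once, when the index is built. A real index also has to be kept up to date when items change, which this sketch leaves out.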
