Filtering search on folder path with Sitecore Contentsearch API

The new Sitecore ContentSearch API lets you search a Lucene index with very little effort. Examples are a bit thin on the ground, but using LINQ you may find yourself writing something like this to search for an item by name, somewhere within a folder structure:

public void BadSearch(string searchTerm)
{
   var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
   using (var context = webIndex.CreateSearchContext())
   {
      // Indexed item paths start with a leading slash.
      var results = context.GetQueryable<SearchResultItem>().Where(i =>
         i.Name == searchTerm &&
         i.Path.StartsWith("/sitecore/content/stuff/")); // don't do this!
   }
}

This may well work. However, you may also run into a Lucene error stating that the query has too many clauses; the default limit is 1024.

The underlying reason is that string.StartsWith(), .Contains() and .EndsWith() are all mapped to the Lucene SpanWildcardQuery type by the Sitecore search API. Within Lucene itself this is then expanded to thousands of Boolean clauses that, in a roundabout way, fulfil the string filter. Depending on your content, this simple search alone can expand to more than 1024 Boolean clauses, which hits the limit defined in .config. Incidentally, this config setting does not work, insofar as the value is not actually read from config, but the default value is the same. A workaround is therefore required if you wish to increase this limit, but that is for a separate post.
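To make the clause arithmetic concrete, here is a toy model in plain C#. It is only a sketch: it does not call Lucene or Sitecore, and the class name, method name and sample paths are hypothetical. It assumes one Boolean clause per distinct indexed term matching the prefix and fails once the count passes the default limit of 1024.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WildcardExpansionSketch
{
    // Same value as the default clause limit mentioned above.
    public const int MaxClauseCount = 1024;

    // Toy stand-in for the wildcard rewrite: one clause per distinct
    // indexed term that starts with the prefix. Not Lucene's actual code.
    public static int CountExpandedClauses(IEnumerable<string> indexedTerms, string prefix)
    {
        int clauses = indexedTerms
            .Distinct()
            .Count(term => term.StartsWith(prefix, StringComparison.Ordinal));

        if (clauses > MaxClauseCount)
        {
            throw new InvalidOperationException(
                "Too many clauses: " + clauses + " (limit " + MaxClauseCount + ")");
        }

        return clauses;
    }

    public static void Main()
    {
        // 2000 hypothetical item paths beneath a single folder.
        var terms = Enumerable.Range(0, 2000)
            .Select(n => "/sitecore/content/stuff/item" + n)
            .ToList();

        // A small index stays well under the limit.
        Console.WriteLine(CountExpandedClauses(terms.Take(10), "/sitecore/content/stuff/"));

        // The full set needs more clauses than the limit allows.
        try
        {
            CountExpandedClauses(terms, "/sitecore/content/stuff/");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The point of the model is that the cost of a prefix filter grows with the amount of content beneath the folder, not with the complexity of the query you wrote.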

If you do raise the clause limit high enough, you'll get rid of the error and get your result, albeit with a performance hit. Be warned that this performance hit could make the use of Lucene pointless. This post is specifically about how to filter your search results to a particular folder path, and if you're filtering on the SearchResultItem.Path string property, you're doing it wrong!

Instead of the string .Path property, cast your eye to the .Paths property: a collection containing the ID of every parent folder for the item. Using Paths.Contains(folderId) achieves the same result as Path.StartsWith("/some/folder/tree/") but does so without any wildcard queries, and it is drastically quicker: in one scenario, in the order of 15ms compared to 250ms.

public void GoodSearch(string searchTerm)
{
   // Resolve the folder once; GetItem returns null if the path does not exist.
   var searchFolder = Factory.GetDatabase("web").GetItem("/sitecore/content/stuff");
   var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
   using (var context = webIndex.CreateSearchContext())
   {
      var results = context.GetQueryable<SearchResultItem>().Where(i =>
         i.Name == searchTerm &&
         i.Paths.Contains(searchFolder.ID)); // much better
   }
}
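
Both methods above only build the query and never read the results. LINQ queries against the index are lazy, so a real caller has to materialise them before the search context is disposed. The following is a minimal sketch of one way to do that, assuming the Sitecore.Kernel and Sitecore.ContentSearch assemblies are referenced. The class name FolderSearch, the method name SearchInFolder and the folderPath parameter are hypothetical and only there for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.Configuration;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

public class FolderSearch
{
    // Hypothetical helper: returns items named searchTerm beneath folderPath.
    public List<SearchResultItem> SearchInFolder(string searchTerm, string folderPath)
    {
        var searchFolder = Factory.GetDatabase("web").GetItem(folderPath);
        if (searchFolder == null)
        {
            // Folder not found: nothing to search beneath.
            return new List<SearchResultItem>();
        }

        // Hoist the ID into a local so the query only captures a simple value.
        var folderId = searchFolder.ID;

        var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
        using (var context = webIndex.CreateSearchContext())
        {
            // ToList() runs the query while the context is still open.
            return context.GetQueryable<SearchResultItem>()
                .Where(i => i.Name == searchTerm && i.Paths.Contains(folderId))
                .ToList();
        }
    }
}
```

A call such as new FolderSearch().SearchInFolder("my item", "/sitecore/content/stuff") would then return the matches as a list, or an empty list if the folder does not exist in the web database.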

 

2 thoughts on “Filtering search on folder path with Sitecore Contentsearch API”

  1. Sasha

    Yes, I ran into this issue before, and it was very frustrating. The solution with i.Paths.Contains(searchFolder.ID) is so great, that they teach it now in Sitecore Web Developer course. Thank you for posting!!
