Filtering search on folder path with Sitecore ContentSearch API

The new Sitecore ContentSearch API allows you to search against a Lucene index with very little effort. Examples are a bit thin on the ground, but using LINQ you can find yourself doing something like this to search for an item by name, somewhere within a folder structure:

public void BadSearch(string searchTerm)
{
   var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
   using (var context = webIndex.CreateSearchContext())
   {
      // Filters on the full path string; StartsWith is turned into a wildcard query.
      var results = context.GetQueryable<SearchResultItem>().Where(i =>
         i.Name == searchTerm &&
         i.Path.StartsWith("/sitecore/content/stuff/")); // don't do this!
   }
}

This may well work. However, you may also run into a Lucene error stating that you are using too many clauses; the default limit is 1024.

The underlying reason is that string.StartsWith(), .Contains() and .EndsWith() are all mapped to the Lucene SpanWildcardQuery type by the Sitecore search API. Within Lucene itself this is then expanded into thousands of Boolean queries that, in a roundabout way, fulfil the string filter. Depending on your content, this can expand to more than 1024 Boolean clauses for this one simple search, which hits the limit defined in .config. Incidentally, this config setting does not work, insofar as the value is not actually read from config, but the default value is the same. A workaround is therefore required if you wish to increase this limit, but that is for a separate post.
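
To make the expansion concrete, here is a minimal, self-contained sketch that imitates a wildcard rewrite: every distinct indexed term matching the prefix becomes one Boolean clause, and the rewrite fails once the count passes the limit. This is a simulation in plain C#, not Lucene's or Sitecore's actual code. ClauseExpansionSketch, ExpandPrefix and the sample paths are made-up names for illustration, and 1024 is the default limit mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClauseExpansionSketch
{
    // Default clause limit mentioned in the post.
    public const int MaxClauseCount = 1024;

    // Imitates a wildcard rewrite: one clause per distinct indexed term
    // that starts with the prefix. Throws when the limit is exceeded.
    public static List<string> ExpandPrefix(IEnumerable<string> indexedTerms, string prefix)
    {
        var clauses = indexedTerms
            .Where(t => t.StartsWith(prefix, StringComparison.Ordinal))
            .Distinct()
            .ToList();

        if (clauses.Count > MaxClauseCount)
        {
            throw new InvalidOperationException(
                $"Too many clauses: {clauses.Count} (limit {MaxClauseCount})");
        }

        return clauses;
    }

    public static void Main()
    {
        // Hypothetical index: 2000 items under one folder, each with its own path term.
        var terms = Enumerable.Range(0, 2000)
            .Select(n => "/sitecore/content/stuff/item" + n)
            .ToList();

        // A narrow prefix matches only a handful of terms and stays under the limit.
        Console.WriteLine(ExpandPrefix(terms, "/sitecore/content/stuff/item199").Count);

        try
        {
            // The folder prefix matches all 2000 terms and exceeds the limit.
            ExpandPrefix(terms, "/sitecore/content/stuff/");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The point of the sketch is that the clause count depends on how much content sits under the folder rather than on how complicated the query looks, so the same search can work on a small site and fail on a large one.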

If you do raise the clause limit high enough, you’ll get rid of the error and get your result, albeit with a performance hit. Be warned that this performance hit could make the use of Lucene pointless. This post is specifically about how to filter your search results to a particular folder path, and if you’re filtering on the SearchResultItem.Path string property you’re doing it wrong!

Instead of the string .Path property, cast your eye to the .Paths property: an array containing the ID of every parent folder of the item. Using Paths.Contains(folderId) achieves the same result as Path.StartsWith("/some/folder/tree/") but does so without any wildcard queries, and is drastically quicker: in one scenario, in the order of 15ms compared to 250ms.

public void GoodSearch(string searchTerm)
{
   // Resolve the folder once; its ID is what gets matched against Paths.
   var searchFolder = Factory.GetDatabase("web").GetItem("/sitecore/content/stuff");
   if (searchFolder == null) return; // folder not found, nothing to search

   var webIndex = ContentSearchManager.GetIndex("sitecore_web_index");
   using (var context = webIndex.CreateSearchContext())
   {
      var results = context.GetQueryable<SearchResultItem>().Where(i =>
         i.Name == searchTerm &&
         i.Paths.Contains(searchFolder.ID)); // much better
   }
}
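
As a rough illustration of why the two filters select the same items, the sketch below models documents in memory with both a path string and a list of parent folder IDs, then applies each filter. It is plain LINQ to Objects, not a Sitecore query. IndexedDoc, PathsSketch and the sample data are hypothetical, and the Paths list here holds only parent folder IDs, following the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an indexed document; not Sitecore's SearchResultItem.
public sealed class IndexedDoc
{
    public string Name { get; set; }
    public string Path { get; set; }        // full path as one string
    public List<Guid> Paths { get; set; }   // IDs of every parent folder
}

public static class PathsSketch
{
    // String filter: compares the start of the path text.
    public static List<string> ByPathPrefix(IEnumerable<IndexedDoc> docs, string prefix)
    {
        return docs.Where(d => d.Path.StartsWith(prefix, StringComparison.Ordinal))
                   .Select(d => d.Name)
                   .ToList();
    }

    // ID filter: an exact match against one value in the Paths list.
    public static List<string> ByAncestorId(IEnumerable<IndexedDoc> docs, Guid folderId)
    {
        return docs.Where(d => d.Paths.Contains(folderId))
                   .Select(d => d.Name)
                   .ToList();
    }

    public static void Main()
    {
        Guid content = Guid.NewGuid(), stuff = Guid.NewGuid(), other = Guid.NewGuid();
        var docs = new List<IndexedDoc>
        {
            new IndexedDoc { Name = "a", Path = "/sitecore/content/stuff/a",
                             Paths = new List<Guid> { content, stuff } },
            new IndexedDoc { Name = "b", Path = "/sitecore/content/stuff/sub/b",
                             Paths = new List<Guid> { content, stuff, Guid.NewGuid() } },
            new IndexedDoc { Name = "c", Path = "/sitecore/content/other/c",
                             Paths = new List<Guid> { content, other } },
        };

        // Both filters select "a" and "b" and leave out "c".
        Console.WriteLine(string.Join(",", ByPathPrefix(docs, "/sitecore/content/stuff/")));
        Console.WriteLine(string.Join(",", ByAncestorId(docs, stuff)));
    }
}
```

In an index, the ID version corresponds to an exact match on a single value, which is why it avoids the wildcard expansion altogether.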

 

8 thoughts on “Filtering search on folder path with Sitecore ContentSearch API”

  1. Sasha

    Yes, I ran into this issue before, and it was very frustrating. The solution with i.Paths.Contains(searchFolder.ID) is so great that they now teach it in the Sitecore Web Developer course. Thank you for posting!!

  2. Zach

    Paul! Thank you so much for this post! I’ve been fighting with this issue for a couple of days now and you provided the solution and knowledge I needed. (I was using Azure Search, BTW.)

  3. Rikke

    Hi

    I’m struggling with this Contains(ID) on the index array. Example here:

    Indexfield queried in Solr:
    "tagscase_sm":["5a8e142612724fd995bf662dcc06f0c3",
    "fa9fdd2f5d684833ac25c7c9994c4b39"],

    SearchResultItem class:

    [System.Runtime.Serialization.DataMemberAttribute]
    [IndexField("tagscase")]
    public virtual Enumerable FooCase{ get; set; }

    var isFooIncluded = PredicateBuilder.True();

    List idsFoo = Somemethod();

    foreach (var i in idsFoo)
    {
        isFooIncluded = isFooIncluded.And(x => x.FooCase.Contains(i));
    }

    using (var context = ContentSearchManager.GetIndex("sitecore_web_index").CreateSearchContext())
    {
        return context.GetQueryable()
            .Where(isFooIncluded)
            .ToList();
    }

    It does not return any items in list even though I know that the search item includes the ids in the array.

    I have tried hardcoding the ID inside Contains, but it still does not work.
    I have tried changing idsFoo to an array instead, with idsFoo.ToArray().
    It still does not work.

    1. Paul Post author

      Your variable and method naming is too generic for me to understand what it is you’re trying to do.

      1. Rikke

        Hi Paul,

        I found the problem.
        I have a custom search result item class which contained a property named Tagscase, where the index field was named tagscase. I knew from other index field names that the suffix of the index field could be disregarded, so I had removed _sm from the name.
        But I found out that it worked after adding the suffix to the index field name. Like this: tagscase_sm.

        Now it returns what was expected from the Contains method on the Enumerable property.
