Sitecore custom search

Author: Ajit Sharma | Categories: Sitecore CMS, CMS

Lucene is an open source search engine used in Sitecore CMS for indexing and searching the contents of a Web site. Sitecore implements a wrapper for the Lucene engine which has its own API. The original API (Lucene.Net) and the Sitecore API (Sitecore.Search) are both accessible to developers who want to extend their indexing and search capabilities. We’ll use Sitecore.Search to create our custom search.

Sitecore has given search a front row seat in Sitecore 7. This new version offers marketers the environment to adapt the search experience without having to go to a Search Administrator or a developer. Marketers can now pick an attribute and make it a faceted element in a search. These facets can span different types of content. Marketers can also tune content relevancy through direct tagging and boosting rules, pushing the most important content to the top of search results. Boosting, which gives more weight to something in search results, can be done by field, field type, item and more.

Sitecore 7 now uses search technology as its information access layer. By doing this, Sitecore has removed the scaling limits of a typical content management infrastructure. Out of the box, Sitecore comes with Lucene as its core search engine, enhanced for this version, and with Solr, which is new to this version of Sitecore, so customers aren’t tied to Lucene. Search has been implemented using the provider model, so you can plug in your search engine of choice to be the information access layer.

Important Note

This document focuses on search in Sitecore 6.6. The Sitecore.Data.Indexing API was deprecated in Sitecore CMS 6.5. The Sitecore.Search API works with Sitecore CMS 6 and 7 but is not recommended for new development in version 7.0. Developers should use the Sitecore.ContentSearch API when using Sitecore Search or Lucene search indexes in Sitecore 7.0.

By extending the Web site search (custom search), we can easily do the following:

  • Create filters to exclude certain items from the search results.
  • Create categories to present search results grouped by location or other criteria.
  • In a bidirectional relationship, display items from the reverse side of the relationship.
  • Create faceted navigation from items in a bidirectional relationship.
  • Create relationships by the name of the referenced item rather than the ID (raw field value).

Extending the indexing and search capabilities consists of:

  • Creating our own crawler/indexer for creating a custom index.
  • Creating our own custom search result class, which will use the custom index to find Sitecore items.
  • Adding search configuration settings for the custom indexer in web.config.

Sitecore.Search API

The Sitecore.Search API acts as a wrapper for Lucene.Net. It provides flexible integration of Sitecore with Lucene and a set of .NET-friendly wrappers around Lucene classes related to search. However, before we start creating our own crawler using the Sitecore.Search API, it is important to understand some key concepts.

Index

An index is a physical file populated by a crawler and stored in the data folder of the Web site. A crawler is a program that constantly reads the site pages (Sitecore items) and creates an index for the search engine. In our case, the search engine is Lucene. In Sitecore, an index can serve a number of different purposes:

  • From a Sitecore configuration perspective, an index is the listing of keywords along with the Sitecore items/pages where they appear. An index is identified by its name and stores data in a physical location in the data folder.
  • From a crawler’s perspective, an index is an interface to Lucene.
  • From an end-user perspective, an index is a searchable list of keywords which link to the actual Sitecore items where they appear.

Lucene Search Index Classes

Lucene uses these classes during the indexing process and in the retrieval of data from indexes regardless of which API is used.

Class | Description
Document | Defines a single searchable entry in the index. A Document is like a row in a database and represents a single searchable Sitecore item. Each Document consists of one or more fields.
Field | A Field is like a column name in a database. Unlike a database table, however, a Document can have multiple fields with the same name (as in XML).

By default, when any item is indexed, the following fields are added to the Lucene document and maintained by Sitecore in the Lucene index:

  • _id : Item ID
  • _path : Item Path
  • _language : Item Language
  • _name : Item Name
  • _group :
  • _url : Item URI/Path
  • _content : Item field values of type Single-Line Text, Rich Text, Multi-Line Text, text, rich text, html, memo, Word Document.

Other classes use these Fields to access the content in an index via queries. We can add more fields to extend/customize the search functionality.

Sitecore.Search Classes

Group | Classes | Brief Description
Configuration | SearchConfiguration | Used in the web.config file.
Field Query Object Model | CombinedQuery, FieldQuery, FullTextQuery, QueryBase, QueryClause, QueryOccurance | Provides an object model to create complex queries.
Search Context | ISearchContext, SearchContext | Simple framework to attach context information to the search, such as user identity or user preferences.
Handling Results | SearchHit, SearchHits, SearchResult, SearchResultCollection, SearchResultCategoryCollection | Wrapper for the Lucene API enabling grouping, paging, and categorization of search results.
Constants and Enumerations | SearchType, BuiltinFields | BuiltinFields contains the names of the fields maintained by Sitecore by default in the system index. Search or filter these fields to refine search results.

Now we can start creating the custom search.

Search Code

 

using Sitecore;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Lucene.Net.Documents;
using Sitecore.Data.Managers;
using Field = Lucene.Net.Documents.Field;
using System;
using Sitecore.Search;
using Lucene.Net.Search;
using System.Collections.Generic;
using System.Web.UI.WebControls;
using Sitecore.Data;
using System.Text;

namespace CustomSearch
{
   /// <summary>
   /// Constants representing names of custom fields in the Lucene Index.
   /// </summary>
   public struct CustomIndexFields
   {
      public const string FirstName = "FirstName";
      public const string MiddleName = "MiddleName";
      public const string LastName = "LastName";
      public const string Practice = "Practice";
      public const string HideInSearch = "HideInSearch";
   }

   ///<summary>
   ///Custom Indexer for adding custom fields in the Lucene Index.
   ///Search or filter these fields to refine search results.
   ///</summary>
   public class CustomIndexer : Sitecore.Search.Crawlers.DatabaseCrawler
   {
      protected override void AddAllFields(Document document, Item item, bool versionSpecific)
      {
         //FirstName
         if (item.Fields["FirstName"] != null)
         {
            document.Add(new Field(CustomIndexFields.FirstName, item.Fields["FirstName"].GetValue(true),
               Field.Store.YES, Field.Index.TOKENIZED));
         }

         // MiddleName
         if (item.Fields["MiddleName"] != null)
         {
            document.Add(new Field(CustomIndexFields.MiddleName, item.Fields["MiddleName"].GetValue(true),
               Field.Store.YES, Field.Index.TOKENIZED));
         }

         // LastName
         if (item.Fields["LastName"] != null)
         {
            document.Add(new Field(CustomIndexFields.LastName, item.Fields["LastName"].GetValue(true),
               Field.Store.YES, Field.Index.TOKENIZED));
         }

         //PracticeRelationships
         //Store the names of related Practices instead of Practice IDs
         if (item.Fields["PracticeRelationships"] != null)
         {
            var practices = ((MultilistField)item.Fields["PracticeRelationships"]).GetItems();
            if (practices != null && practices.Length > 0)
            {
               foreach (var practice in practices)
               {
                  string pName = "";
                  if (practice.Fields["Name"] != null)
                  {
                     pName = practice.Fields["Name"].GetValue(false).Trim();
                     if (!string.IsNullOrEmpty(pName))
                     {
                        document.Add(new Field(CustomIndexFields.Practice, pName, Field.Store.YES, Field.Index.TOKENIZED));
                     }
                  }
               }
            }
         }

         //We can call base.AddAllFields if we want to use Sitecore's default full-text search
         //base.AddAllFields(document, item, versionSpecific);
      }
   }

   ///<summary>
   ///This class handles the custom search.
   ///Create custom queries and search/filter the custom fields to refine search results.
   ///</summary>
   public class CustomSearchResults
   {
      Sitecore.Search.Index index = SearchManager.GetIndex("CustomIndex");

      public List<Item> GetSearchResults(string firstName, string lastName, string practice)
      {
         List<Item> items = null;
         SearchHits KeywordHits;

         try
         {
            using (IndexSearchContext SearchContext = index.CreateSearchContext())
            {
               KeywordHits = DoKeywordSearch(SearchContext, firstName, lastName, practice);
               items = GetItemsList(KeywordHits);
            }
         }
         catch (Exception exception)
         {
            Sitecore.Diagnostics.Log.Error(exception.ToString(), this);
         }
         return items;
      }

      private Sitecore.Search.SearchHits DoKeywordSearch(Sitecore.Search.IndexSearchContext searchContext,
         string firstName, string lastName, string practice)
      {
         SearchHits searchhits = null;
         CombinedQuery query = new CombinedQuery();

         if (String.IsNullOrEmpty(firstName) == false)
         {
            QueryBase firstNameQuery = new FieldQuery(CustomIndexFields.FirstName, firstName);
            query.Add(firstNameQuery, QueryOccurance.Must);
         }

         if (String.IsNullOrEmpty(lastName) == false)
         {
            QueryBase lastNameQuery = new FieldQuery(CustomIndexFields.LastName, lastName);
            query.Add(lastNameQuery, QueryOccurance.Must);
         }

         if (String.IsNullOrEmpty(practice) == false)
         {
            QueryBase practiceQuery = new FieldQuery(CustomIndexFields.Practice, practice);
            query.Add(practiceQuery, QueryOccurance.Must);
         }

         searchhits = searchContext.Search(query, index.GetDocumentCount());
         return searchhits;
      }

      private List<Item> GetItemsList(SearchHits searchHits)
      {
         List<Item> listItems = new List<Item>();

         if (searchHits != null)
         {
            SearchResultCollection searchResults = searchHits.FetchResults(0, searchHits.Length);
            foreach (SearchResult result in searchResults)
            {
               Item item = result.GetObject<Item>();
               if (item != null)
               {
                  listItems.Add(item);
               }
            }
         }
         return listItems;
      }
   }
}

 

How to Create a Custom Index Crawler

We can create a custom crawler by creating a class that inherits from the “Sitecore.Search.Crawlers.DatabaseCrawler” class and then overriding its “AddAllFields” method.

Please refer to the above code. The code basically consists of the following:

public struct CustomIndexFields – Constants representing names of custom fields in the Lucene Index.

public class CustomIndexer: Sitecore.Search.Crawlers.DatabaseCrawler – Custom indexer for adding custom fields to the Lucene index. We can search or filter these fields to refine search results.

The document.Add method is used to add a custom field to the document. It takes a Field object as an argument.

document.Add(new Field(fieldName, fieldValue, storageType, fieldIndexType))

The Field class constructor accepts the following parameters:

  • fieldName is the field name that appears in the Lucene index.
  • fieldValue is the value of the field.
  • storageType is the storage type for the Lucene field. It can have the following values:
    • no
    • yes
    • compress
  • fieldIndexType is the index type for the Lucene field. It can have the following values:
    • no
    • tokenized
    • untokenized
    • nonorms

Please refer to the Lucene documentation to find out what each of these options means.
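
As a rough illustration of how the storage and index options combine, the sketch below builds a Document with three fields. It assumes a Lucene.Net 2.x build, like the one the code in this article targets, where the options listed above are exposed as constants such as Field.Store.YES, Field.Store.NO, Field.Index.TOKENIZED and Field.Index.UN_TOKENIZED. The class name, the “Biography” field and all values are made up for the example.

```csharp
using Lucene.Net.Documents;

public static class FieldOptionsExample
{
    // Builds a sample document; field names and values are illustrative only.
    public static Document BuildSampleDocument()
    {
        var document = new Document();

        // Stored and tokenized: searchable by individual words, and the
        // original value can be read back from a search hit.
        document.Add(new Field("FirstName", "John", Field.Store.YES, Field.Index.TOKENIZED));

        // Stored but untokenized: indexed as a single exact term, which
        // suits flags or IDs that should only match as a whole.
        document.Add(new Field("HideInSearch", "1", Field.Store.YES, Field.Index.UN_TOKENIZED));

        // Tokenized but not stored: searchable, but the original text
        // cannot be retrieved from the index afterwards.
        document.Add(new Field("Biography", "Long descriptive text", Field.Store.NO, Field.Index.TOKENIZED));

        return document;
    }
}
```

Storing a value makes the index larger, so it is usually worth storing only the fields we need to read back from a search hit.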

Creating our own custom search result class

This class will handle the custom search queries. We will create custom queries and search/filter the custom fields to refine search results.

Please refer to the code. The code consists of the following:

public class CustomSearchResults – This class handles the custom search and returns the items that match the search criteria. We can bind the resulting list of items to a repeater and display the search results as required.
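
Binding the results to a repeater could look roughly like the sketch below. It assumes the CustomSearchResults class from the code above; the helper class PeopleSearchBinding and the resultsRepeater parameter are hypothetical names introduced only for this example.

```csharp
using System.Collections.Generic;
using System.Web.UI.WebControls;
using Sitecore.Data.Items;
using CustomSearch;

public static class PeopleSearchBinding
{
    // Runs the custom search and binds the matching items to a repeater.
    // "resultsRepeater" is a hypothetical control supplied by the calling page.
    public static void BindResults(Repeater resultsRepeater,
        string firstName, string lastName, string practice)
    {
        var search = new CustomSearchResults();

        // GetSearchResults returns null when the search throws an exception,
        // so fall back to an empty list before binding.
        List<Item> people = search.GetSearchResults(firstName, lastName, practice)
                            ?? new List<Item>();

        resultsRepeater.DataSource = people;
        resultsRepeater.DataBind();
    }
}
```

A page could call this helper from a search button's click handler, passing in its own repeater and the values of its search text boxes.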

public List<Item> GetSearchResults(string firstName, string lastName, string practice) – This function searches for people in the index. It takes first name, last name and practice as search parameters and returns the list of people items that match the search criteria. For now we combine the criteria with an AND clause, but we can create more complex queries depending on the requirement.
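
As one possible example of a more complex query, the sketch below nests one CombinedQuery inside another so that a single name matches either the first name or the last name, while the practice is still required. It reuses only the Sitecore.Search types from the code above. The PeopleQueryBuilder class and its method are hypothetical, and the sketch assumes that QueryOccurance.Should behaves like an optional (OR) clause, as it does in the underlying Lucene boolean query.

```csharp
using Sitecore.Search;
using CustomSearch;

public static class PeopleQueryBuilder
{
    // Builds: (FirstName = name OR LastName = name) AND Practice = practice.
    public static CombinedQuery BuildNameAndPracticeQuery(string name, string practice)
    {
        // Either name field may match, so both clauses are optional (Should).
        var nameQuery = new CombinedQuery();
        nameQuery.Add(new FieldQuery(CustomIndexFields.FirstName, name), QueryOccurance.Should);
        nameQuery.Add(new FieldQuery(CustomIndexFields.LastName, name), QueryOccurance.Should);

        // The name group and the practice must both match.
        var query = new CombinedQuery();
        query.Add(nameQuery, QueryOccurance.Must);
        query.Add(new FieldQuery(CustomIndexFields.Practice, practice), QueryOccurance.Must);

        return query;
    }
}
```

The resulting query can be passed to searchContext.Search in the same way as the query built in DoKeywordSearch.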

Adding Search Configuration Settings

There are two ways to achieve this. We can either make changes to the web.config directly, or create a separate configuration file and place it in the “Website\App_Config\Include” folder of our site. A separate config file is the better option when we need to patch elements within the /configuration/sitecore element. When processing the web.config file, Sitecore includes configuration files specified by <sc:include> elements as well as those in the /App_Config/Include subdirectory. This allows administrators to define module-specific, feature-specific, solution-specific, instance-specific and other configuration in files separate from the main web.config file, which can simplify configuration, release management, and other aspects of the system.

We can use the “ShowConfig” tool located in the /sitecore/admin/ folder to see the complete merged web.config.

We’ll create a “CustomSearchSettings.config” file in the “/App_Config/Include” folder.

CustomSearchSettings.config

 

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
   <sitecore>
      <search>
         <configuration>
            <indexes>
               <index id="CustomIndex" type="Sitecore.Search.Index, Sitecore.Kernel">
                  <param desc="name">$(id)</param>
                  <param desc="folder">__custom</param>
                  <Analyzer ref="search/analyzer"/>
                  <locations hint="list:AddCrawler">
                     <customSearch type="CustomSearch.CustomIndexer, CustomSearch">
                        <!--Namespace.ClassName, assembly (DLL) name without extension; the assembly name "CustomSearch" is assumed here, so use your own-->
                        <Database>master</Database>
                        <Root>/sitecore/content/Home</Root>
                        <Tags>people</Tags>
                        <Boost>1.0</Boost>
                        <IndexAllFields>false</IndexAllFields>
                        <include hint="list:IncludeTemplate">
                           <includeTemplate>{E2033B38-3237-44CB-BF49-A729C29577E2}</includeTemplate>
                           <!--Person-->
                        </include>
                     </customSearch>
                  </locations>
               </index>
            </indexes>
         </configuration>
      </search>
   </sitecore>
</configuration>

 

The entry above is for Sitecore 6.6. It has to be done differently for earlier versions of Sitecore.

Search Configuration entry in Web.config for the custom indexer under <Search>

To create our own index, open the Web.config file and navigate to configuration -> sitecore -> search -> configuration -> indexes. Add a new child element named “index” with two attributes: “id” and “type”. The “id” attribute value is arbitrary; choose something meaningful to you. The “type” attribute value should be “Sitecore.Search.Index, Sitecore.Kernel”. For example:

 

<indexes hint="list:AddIndex">
   ...
   <index id="myindex" type="Sitecore.Search.Index, Sitecore.Kernel"></index>
</indexes>

 

Next, we’ll need to add two child elements named “param” to tell Sitecore the name and location of our index. Each “param” element carries a “desc” attribute that labels it, and its value goes in the element body. For the “name” param, we’ll use the “id” attribute of the index itself, referenced as $(id). For the “folder” param, we’ll use the same naming convention that Sitecore uses for the system index. By default, the index resides on the file system in the Data/indexes folder of the Sitecore root.

<indexes hint="list:AddIndex">
   ...
   <index id="myindex" type="Sitecore.Search.Index, Sitecore.Kernel">
      <param desc="name">$(id)</param>
      <param desc="folder">__myindex</param>
   </index>
</indexes>

 

Now we’ll add a reference to the default analyzer, which processes the text that goes into the index:

 

<indexes hint="list:AddIndex">
   ...
   <index id="myindex" type="Sitecore.Search.Index, Sitecore.Kernel">
      <param desc="name">$(id)</param>
      <param desc="folder">__myindex</param>
      <Analyzer ref="search/analyzer" />
   </index>
</indexes>

 

Finally, we’ll add the database crawler definition that tells Sitecore exactly what we want to index:

 

<indexes hint="list:AddIndex">
   ...
   <index id="myindex" type="Sitecore.Search.Index, Sitecore.Kernel">
      <param desc="name">$(id)</param>
      <param desc="folder">__myindex</param>
      <Analyzer ref="search/analyzer" />
      <locations hint="list:AddCrawler">
         <customindex type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
            <Database>web</Database>
            <Root>/sitecore/content/home/mywebsite</Root>
            <Tags>keyword</Tags>
            <include hint="list:IncludeTemplate">
               <template>{74591C67-CB69-49E0-A1DE-8051447394A6}</template>
            </include>
         </customindex>
      </locations>
   </index>
</indexes>

 

Let’s examine this in more detail:

The locations node and “hint” attribute are mandatory. The first child element of the “locations” node is named “customindex.” The name is arbitrary, but we may want to use a name that reflects the nature of the content in the index. The “type” attribute and value are mandatory.

The first 3 child elements of the “customindex” node are “Database,” “Root,” and “Tags.” As you might guess, the “Database” element denotes the Sitecore database used to create the index. This element is mandatory. The “Root” element indicates the path to the item from which the crawler should start indexing content. The “Tags” element is used to list keywords associated with the content. The value can be a single word or a comma-delimited list of words.

The “include” element is used to filter the search content by a template. Without this element, the crawler will index all content under the “Root” (or in the entire database, if the “Root” element is missing). By default, the crawler indexes all fields.
