4 Advanced configurations - Solr field types - DAM v4.8.0

A metadata field used in any search may be interpreted and queried in many different ways. The Digizuite DAM Center supports individual configuration for each field used in any search.

In order to decide how a metadata field is interpreted, Solr uses field types. A Solr field type is a definition of how data is interpreted and how it is queried. A simple example is a field type that splits text strings on whitespace.
If a description metadata field has the text "This is a test", it is saved in the Solr core as the four individual words "This", "is", "a", and "test", and searching for "test" returns a hit. For more background on Solr field types, refer to the official Solr documentation on field types.
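
To make the whitespace example concrete, the fragment below sketches how such a value could be sent to Solr in its XML update format. The field names "id" and "description" and the id value are illustrative assumptions, not names taken from the DAM Center. The comments describe what a whitespace-splitting field type would do with the text.

```xml
<!-- Hypothetical document in Solr's XML update format. -->
<add>
  <doc>
    <!-- Assumed unique key field; the real key name depends on the schema. -->
    <field name="id">example-1</field>
    <!-- With a field type that splits on whitespace, this value is indexed
         as four tokens: This | is | a | test -->
    <field name="description">This is a test</field>
  </doc>
</add>
```

A query such as description:test would then match this document, because "test" is one of the indexed tokens.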

How to configure Solr field types

Any metadata field that has "Visible filter" checked (see picture below) can be given a custom field type. This is configured as follows:

  1. Open System tools → Config Manager → Digizuite™ Media Manager → Searches
  2. Choose the search to edit (e.g. Digizuite_system_framework_search, see picture below)
  3. Note: Solr MUST be enabled (i.e. the "Use Solr" checkbox must be checked, and Solr must be installed and configured)
  4. Choose the field to edit and make sure "Visible filter" is checked (see picture below)
  5. Add a custom attribute to the filter. Give it the key "SolrType" and one of the valid values (see list below).
  6. Save the search and repopulate it.
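
As a rough illustration of the effect of step 5, the fragment below shows what a field declaration in the Solr schema could look like once a filter has been given the SolrType value text_ws. This is a sketch only: the field name and the indexed/stored attributes are assumptions, and the declaration that the DAM Center actually generates may differ.

```xml
<!-- Hypothetical result: the "type" attribute carries the SolrType value. -->
<field name="description" type="text_ws" indexed="true" stored="true"/>
```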



Valid Solr field types

  1. text_general - predefined Solr field type with the following definition

    text_general
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType> 
  2. text_custom - custom Solr field type. By default it is the same as text_general, except that it uses the classic tokenizer instead of the standard tokenizer. It has the following definition

    text_custom
    <fieldType name="text_custom" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.ClassicTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.ClassicTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  3. text_ws - predefined Solr field type with the following definition

    text_ws
    <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
  4. text_en_splitting_tight - predefined Solr field type with the following definition

    text_en_splitting_tight
    <fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
        <filter class="solr.EnglishMinimalStemFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
  5. date - predefined Solr field type with the following definition

    date
    <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
  6. int - predefined Solr field type with the following definition

    int
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
  7. string - predefined Solr field type with the following definition

    string
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
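
The date, int, and string types expect values in specific formats. The sketch below shows one document in Solr's XML update format with a value for each of them. All field names and values are hypothetical and chosen only for illustration.

```xml
<add>
  <doc>
    <!-- Assumed unique key field; the real key name depends on the schema. -->
    <field name="id">example-2</field>
    <!-- date: UTC timestamp in the form YYYY-MM-DDThh:mm:ssZ -->
    <field name="created">2015-06-01T12:00:00Z</field>
    <!-- int: a 32-bit signed integer -->
    <field name="width">1920</field>
    <!-- string: indexed as a single untokenized value, so a term query
         matches only the complete value with the same casing -->
    <field name="status">Approved</field>
  </doc>
</add>
```

With these types, a query for status:approved would not match the document above, whereas the same text in a text_general field would match because that type lowercases its tokens.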