4 Advanced configurations - Solr field types
A metadata field used in any search may be interpreted and queried in many different ways. The Digizuite DAM Center supports individual configuration for each field used in any search.
To decide how a metadata field is interpreted, Solr uses field types. A Solr field type defines how data is interpreted and how it is queried. A simple example is a field type that splits text strings on white space.
If a description metadata field has the text "This is a test", it would be saved in the Solr core as the four individual words "This", "is", "a", "test", and searching for "test" would return a hit. For more background on Solr field types, we refer to the following link.
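The splitting behaviour described above can be illustrated with a small Python sketch. It is not Solr code; the function names whitespace_tokenize and matches are made up for this illustration, and the sketch only mimics the idea of splitting on white space and matching whole words.

```python
def whitespace_tokenize(text):
    """Split a string on runs of white space, one token per word."""
    return text.split()


def matches(indexed_text, query_term):
    """Return True if the query term equals one of the indexed tokens."""
    return query_term in whitespace_tokenize(indexed_text)


# The example from the text: four tokens are stored, and "test" is a hit.
print(whitespace_tokenize("This is a test"))
print(matches("This is a test", "test"))
```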
How to configure Solr field types
Any metadata field that has a checkmark in the "Visible filter" (see picture below) can have customized field types. It is configured in the following way:
- Open System tools → Config Manager → Digizuite™ Media Manager → Searches
- Choose the search to edit (e.g. Digizuite_system_framework_search, see picture below)
- Note: Solr MUST be enabled (i.e. "Use Solr" checkmark must be checked and Solr must be installed and configured)
- Choose the field to edit and make sure the "Visible filter" is checked (see picture below)
- Add a custom attribute to the filter. Give it the key "SolrType" and one of the valid values (see list below).
- Save the search and repopulate it.
Valid Solr field types
text_general - predefined Solr field type with the following definition
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
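To give a feel for what the index analyzer of text_general does, here is a rough Python approximation of its three steps (tokenize, remove stop words, lowercase). This is only a sketch: the regular expression is a crude stand-in for solr.StandardTokenizerFactory, the STOP_WORDS set is a hypothetical example and not the actual contents of stopwords.txt, and the function name is invented for this illustration.

```python
import re

# Hypothetical stop words; the real list lives in stopwords.txt on the Solr core.
STOP_WORDS = {"a", "is", "the"}


def text_general_index_tokens(text):
    """Approximate the index-time analysis chain of text_general."""
    # Crude stand-in for StandardTokenizerFactory: keep runs of word characters.
    tokens = re.findall(r"\w+", text)
    # StopFilterFactory with ignoreCase="true": drop stop words regardless of case.
    kept = [t for t in tokens if t.lower() not in STOP_WORDS]
    # LowerCaseFilterFactory: lowercase whatever remains.
    return [t.lower() for t in kept]


print(text_general_index_tokens("This is a Test."))
```

Under these assumptions, a search for "test" would match a field containing "Test." because both the punctuation and the casing are normalized away at index time.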
text_custom - custom Solr field type. By default it is the same as text_general, except that it uses the classic tokenizer (solr.ClassicTokenizerFactory) instead of the standard one. It has the following definition
<fieldType name="text_custom" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
text_ws - predefined Solr field type with the following definition
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
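Because text_ws only splits on white space and applies no further filters, casing and punctuation are kept as part of each token. The following Python sketch illustrates the consequence for single-term searches. The helper names are invented for this example, and the exact-token comparison is a simplification of how Solr matches terms.

```python
def text_ws_tokens(text):
    """Mimic WhitespaceTokenizerFactory: split on white space, change nothing else."""
    return text.split()


def text_ws_match(indexed_text, query_term):
    """Return True if the query term is exactly one of the indexed tokens."""
    return query_term in text_ws_tokens(indexed_text)


indexed = "Annual Report, 2019"
print(text_ws_tokens(indexed))            # the comma stays attached to "Report,"
print(text_ws_match(indexed, "Annual"))   # exact token, so a hit
print(text_ws_match(indexed, "annual"))   # different casing, so no hit
print(text_ws_match(indexed, "Report"))   # indexed token is "Report,", so no hit
```

If case-insensitive matching is wanted, a type such as text_general, which includes a lowercase filter, is likely a better fit.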
text_en_splitting_tight - predefined Solr field type with the following definition
<fieldType name="text_en_splitting_tight" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <filter class="solr.EnglishMinimalStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
date - predefined Solr field type with the following definition
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
int - predefined Solr field type with the following definition
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
string - predefined Solr field type with the following definition
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>