DC 5.10 Cognitive Services
All cognitive services are cloud hosted and only need to be configured in order to work.
This page will describe how to configure Cognitive Services together with Digizuite. The following subjects are covered:
- Setting up Computer Vision from the Azure Portal
- Pricing details
- Finding endpoint and subscription key in the Azure Portal
- Changing the app settings in Digizuite to use the endpoint and subscription key.
Note: This guide shows you how to configure the AI feature on the Azure Portal: /wiki/spaces/PSBOK/pages/702939141
Configuring Azure Portal
- Add the Computer Vision service to your account through the Azure portal as illustrated below:
- Select a pricing tier that fits your service; see the link in the Pricing details section below. After adding the service, it will appear in your dashboard.
Pricing Details
Pricing details can be found here: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/computer-vision/
Please note the following, as stated on the Cognitive Services pricing page:
"For Recognize Text each POST call counts as a transaction. All GET calls to see the results of the async service are counted as transactions but are free of charge. For all other operations, each feature call counts as a transaction, whether called independently or grouped through the Analyze call. Analyze calls are used to make calling the API easier, but each feature used counts as a transaction. For instance, an Analyze call containing Tag, Face, and Adult would count as three transactions."
The supported features are listed below. The number of transactions depends on how many visual features are requested, so if the system is configured with all of them, each asset will count as 7 transactions.
ImageType = 0,
Faces = 1,
Adult = 2,
Categories = 3,
Color = 4,
Tags = 5,
Description = 6
The default configuration for the service only analyzes for Tags, which means 1 transaction per asset out of the box.
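As a rough illustration, the sketch below (Python, assuming the comma-separated VisualFeatures format shown later in the app settings) counts how many transactions a single asset analysis will incur; the helper name is purely illustrative.

```python
# Rough sketch: estimate Computer Vision transactions per asset from the
# configured VisualFeatures value (comma-separated, as in appsettings.json).
SUPPORTED_FEATURES = {"ImageType", "Faces", "Adult", "Categories", "Color", "Tags", "Description"}

def transactions_per_asset(visual_features: str) -> int:
    """Each requested visual feature counts as one billable transaction."""
    requested = {f.strip() for f in visual_features.split(",") if f.strip()}
    unknown = requested - SUPPORTED_FEATURES
    if unknown:
        raise ValueError(f"Unsupported visual features: {sorted(unknown)}")
    return len(requested)

print(transactions_per_asset("Tags"))                    # 1 (the default)
print(transactions_per_asset("Tags,Adult,Description"))  # 3
```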
Endpoint & subscription key
The Digizuite AI Service needs the endpoint and the subscription key. These can be found by selecting the service in Azure as shown below:
Make sure that the resource you take the information from is of the API type "Computer Vision" (see the image above).
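If you want to confirm that the endpoint and subscription key you copied are valid, you can call the Analyze operation directly. A minimal sketch, assuming Python with the requests package and API version v3.2; the image URL and key value are placeholders.

```python
# Minimal sketch: verify the Computer Vision endpoint and subscription key by
# calling the Analyze operation directly (API version v3.2 assumed).
import requests

endpoint = "https://northeurope.api.cognitive.microsoft.com"  # AzureEndpointUrl
subscription_key = "<your subscription key>"                  # subscriptionKey

response = requests.post(
    f"{endpoint}/vision/v3.2/analyze",
    params={"visualFeatures": "Tags"},
    headers={
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    },
    json={"url": "https://example.com/sample.jpg"},
)
response.raise_for_status()
print(response.json()["tags"])  # a 200 response confirms key and endpoint work
```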
Facial recognition
Using this feature involves storing biometric data about the user. As this is considered especially sensitive personal data, enabling the feature in the EU means the GDPR must be taken into consideration and the processing must remain covered by the data processing agreement (DPA).
In order to use the facial recognition feature of the Cognitive Services, you must create a Faces resource in the Azure Portal.
Once the resource is created, go to the Quick start tab of the resource and locate the API key (Key1) and the endpoint:
These are the FacesKey and the FacesEndpointUrl in the app settings, respectively.
The FacesPersonGroup setting is a string that defines the name of the person group used for facial recognition.
The value is not shown anywhere in the UI, but if multiple customers share the same Faces resource, each must have a unique string here.
For customers with their own Face resources, simply use the name of the customer or similar. The string must be without spaces or special characters and must be all lowercase.
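If you want to derive the value from a customer name programmatically, a small sketch (Python, the helper name is hypothetical) that enforces the lowercase / no spaces / no special characters rule could look like this:

```python
# Sketch of a hypothetical helper that turns a customer name into a valid
# FacesPersonGroup value: lowercase, no spaces, no special characters.
import re

def to_person_group(customer_name: str) -> str:
    cleaned = re.sub(r"[^a-z0-9]", "", customer_name.lower())
    if not cleaned:
        raise ValueError("Customer name contains no usable characters")
    return cleaned

print(to_person_group("Digizuite A/S"))  # "digizuiteas"
```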
Make sure that the resource you take the info from is of the API type "Face" (this can be found in the "Overview" of the resource under API type).
Video Transcription
In order to use the Video Transcription feature of the Cognitive Services, a videoindexer subscription key must be acquired and configured.
To acquire the key, go to https://api-portal.videoindexer.ai/products, and sign in with a Microsoft account.
Then go to https://api-portal.videoindexer.ai/profile and select the Product Authorization subscription.
Click Show on the Primary key and place that key in the VideoIndexerKey option in the appsettings.json file.
The VideoIndexerWatcherInterval is used to configure how often the cognitive service checks for new progress on the indexed videos. For example, "180" represents checking every 180 seconds (3 minutes).
To entirely disable the Video service, please set the VideoIndexerWatcherInterval to 0.
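Conceptually, the watcher behaves like the polling loop sketched below (a Python sketch of the behaviour, not the actual service code): an interval of 0 means the loop never starts.

```python
# Conceptual sketch of the watcher behaviour (not the actual service code):
# poll for indexing progress every VideoIndexerWatcherInterval seconds,
# and never start polling at all when the interval is 0.
import time

def run_video_indexer_watcher(interval_seconds: int, check_progress) -> None:
    if interval_seconds <= 0:
        return                   # VideoIndexerWatcherInterval = 0 disables the service
    while True:
        check_progress()         # query Video Indexer for progress on indexed videos
        time.sleep(interval_seconds)

# Example: with the default of 180, progress is checked every 3 minutes.
# run_video_indexer_watcher(180, check_progress=lambda: print("checking..."))
```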
Translation services
Translation services are available under the resource type "translation" in the Azure Portal. Once the resource has been created, the keys are available in the same way as described under Endpoint & subscription key above.
App settings in Digizuite
The following settings must be added correctly to damcenter/DigizuiteCore/cognitiveservice/appsettings.json:
"subscriptionKey": "..................................",
"AzureEndpointUrl": "https://northeurope.api.cognitive.microsoft.com/",
"FacesKey": "..................................",,
"FacesEndpointUrl": "https://digi-faces-dev.cognitiveservices.azure.com/",
"FacesPersonGroup": "digipersons",
"VisualFeatures": "Tags",
"VideoIndexerKey": "..................................",,
"VideoIndexerWatcherInterval": 180
}
and
"TranslationApi": {
"EndpointUrl": "https://api.cognitive.microsofttranslator.com",
"SubscriptionKey": "..................................",
"ApiVersion": "3.0"
}
If you want AI translation, then the `SubscriptionKey` has to be provided.
If you want video indexing, then the `VideoIndexerKey` has to be provided.
It's important to note that the Visual Features are based on the list from Azure and can be configured as needed; changing them will impact pricing.
To enable more Visual Features, simply separate them by a comma. For example: "VisualFeatures": "Tags,Adult,Description"
The SubscriptionKey for Translation is not the same as the one for computer vision.
Translation pricing is based on the number of characters translated and is on a pure pay-as-you-go basis. See the Azure pricing list here: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/ (under the "Language" tab).
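To verify the Translator subscription key (and see the per-character billing unit in practice), you can call the v3.0 endpoint directly. A minimal sketch assuming Python and the requests package; the commented region header is an assumption that only applies to regional (non-global) resources.

```python
# Minimal sketch: verify the Translator subscription key by translating a
# short string through the v3.0 API.
import requests

endpoint = "https://api.cognitive.microsofttranslator.com"  # TranslationApi EndpointUrl
subscription_key = "<your translator key>"                   # TranslationApi SubscriptionKey

response = requests.post(
    f"{endpoint}/translate",
    params={"api-version": "3.0", "to": "da"},
    headers={
        "Ocp-Apim-Subscription-Key": subscription_key,
        # "Ocp-Apim-Subscription-Region": "northeurope",  # only needed for regional resources
        "Content-Type": "application/json",
    },
    json=[{"Text": "Hello world"}],
)
response.raise_for_status()
print(response.json()[0]["translations"][0]["text"])
```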
OCR text extraction from PDF and image files
The details of this service (including setup and use) are described here: DC 5.10 Search in asset content (OCR)
Automatically translate a metafield
Automatic translation of metafields is available if Translation Services have been configured as described above and the field is either a "string" or "note" field. To have a field translated automatically, set the "Autotranslate_GOOGLE" column for the metafield to true in the database. There is currently no UI available for this.
After the field has been changed, recycle at least the LegacyService and main DAM app pools.
The cognitive auto-translate respects the normal AutoTranslate/AutoTranslateOverwriteExisting flags, so the following matrix explains when a translation will be done.
The matrix describes what happens when you edit a field in your own language, and what will happen to the same field in the other languages. The field in your own language is always the first field in the metadata editor in Media Manager.
| | Field has value | Field is empty |
|---|---|---|
| No autotranslate | Nothing | Nothing |
| Autotranslate | Nothing | Main value is translated and copied |
| AutotranslateOverwriteExisting | Main value is translated and copied | Main value is translated and copied |
If you are editing a value that is not in your logged-in language, the value will never be translated through Cognitive Services, though normal AutoTranslate behavior still applies.
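Expressed as a small sketch (Python, all names are illustrative and not actual Digizuite code), the matrix and the language caveat combine like this:

```python
# Illustrative sketch of the auto-translate decision matrix above
# (hypothetical names, not actual Digizuite code).
def should_autotranslate(flag: str, target_field_is_empty: bool,
                         editing_in_logged_in_language: bool) -> bool:
    if not editing_in_logged_in_language:
        return False                   # never translated via Cognitive Services
    if flag == "AutotranslateOverwriteExisting":
        return True                    # translate whether the target has a value or not
    if flag == "Autotranslate":
        return target_field_is_empty   # only fill empty target languages
    return False                       # no autotranslate flag: do nothing

# Examples matching the matrix:
# should_autotranslate("Autotranslate", True,  True)  -> True  (empty field is filled)
# should_autotranslate("Autotranslate", False, True)  -> False (existing value is kept)
```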
Validate using this guide:
MM5.6 AI Tagging Configuration