...

  • Content extraction has some limitations (see https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/concept-recognizing-text). In particular, be aware of the following:

    • Supported file formats: JPEG, PNG, BMP, PDF, and TIFF.

    • For PDF and TIFF files, only the first 2000 pages are processed (only the first two pages on the free tier).

    • The file size must be less than 50 MB (4 MB for the free tier), and the image dimensions must be at least 50 x 50 pixels and at most 10000 x 10000 pixels.

    • PDF page dimensions must be at most 17 x 17 inches, which corresponds to Legal or A3 paper sizes and smaller.
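
The limits above can be checked before a file is sent for extraction. The following is a minimal sketch, not part of the product: the function name check_extraction_limits and its parameters are hypothetical, the file extension is used as a stand-in for the actual file format, and 1 MB is assumed to mean 1024 x 1024 bytes. Pixel dimensions are supplied by the caller, since reading them requires an image library. Page count and PDF page size are not checked.

```python
from pathlib import Path
from typing import List, Optional

# Limits copied from the list above; verify them against the current Azure documentation.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff"}
MAX_BYTES_PAID = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
MAX_BYTES_FREE = 4 * 1024 * 1024
MIN_SIDE_PX = 50
MAX_SIDE_PX = 10000


def check_extraction_limits(
    filename: str,
    size_bytes: int,
    width_px: Optional[int] = None,
    height_px: Optional[int] = None,
    free_tier: bool = False,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means no limit was hit."""
    problems: List[str] = []

    # Assumption: the extension reflects the real file format.
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")

    # The file size must be strictly less than the tier limit.
    max_bytes = MAX_BYTES_FREE if free_tier else MAX_BYTES_PAID
    if size_bytes >= max_bytes:
        problems.append(f"file size {size_bytes} bytes is not below {max_bytes} bytes")

    # Pixel limits are only checked when the caller supplies both dimensions.
    if width_px is not None and height_px is not None:
        if min(width_px, height_px) < MIN_SIDE_PX:
            problems.append(f"dimensions {width_px} x {height_px} are below 50 x 50 pixels")
        if max(width_px, height_px) > MAX_SIDE_PX:
            problems.append(f"dimensions {width_px} x {height_px} exceed 10000 x 10000 pixels")

    return problems
```

For example, check_extraction_limits("scan.png", 1024, 800, 600) returns an empty list, while a 5 MB PDF fails the check only when free_tier=True.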

  • If an error occurs, an error message is placed in the CognitiveServices.error queue in RabbitMQ. To retry the content extraction, move the failed message to the CognitiveServices queue.

  • There is a cost associated with extracting contents from files. Please see https://azure.microsoft.com/en-us/pricing/details/cognitive-services/computer-vision/ for more details.

  • If the extracted contents are included in freetext searches, they are searched on equal terms with the asset titles. Thus, it might be more difficult to find an asset by title.

  • Each time an asset is published or republished, its contents are extracted again.

  • Repopulating search caches and assets might take significantly more time if content extraction has been enabled.

  • Freetext searches might be slower when asset contents are included. However, we do not expect the slowdown to be significant.
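
The retry described above, moving failed messages from the CognitiveServices.error queue back to the CognitiveServices queue, can also be scripted instead of done by hand. The following is a minimal sketch under stated assumptions: the function name requeue_failed_messages and the limit parameter are hypothetical, the channel is assumed to be a pika BlockingChannel (or any object with the same basic_get, basic_publish, and basic_ack methods), and the target queue is assumed to be reachable through the default exchange with the queue name as routing key.

```python
def requeue_failed_messages(
    channel,
    source_queue: str = "CognitiveServices.error",
    target_queue: str = "CognitiveServices",
    limit: int = 100,
) -> int:
    """Move up to `limit` messages from source_queue to target_queue; return the count moved."""
    moved = 0
    while moved < limit:
        # basic_get returns (None, None, None) when the queue is empty.
        method, properties, body = channel.basic_get(queue=source_queue, auto_ack=False)
        if method is None:
            break
        # Assumption: publishing to the default exchange ("") with the queue
        # name as routing key delivers the message to the target queue.
        channel.basic_publish(
            exchange="",
            routing_key=target_queue,
            body=body,
            properties=properties,
        )
        # Acknowledge only after republishing, so that a failure in between
        # leaves the message in the error queue.
        channel.basic_ack(delivery_tag=method.delivery_tag)
        moved += 1
    return moved


# Possible usage with pika (host name is a placeholder):
#
#   import pika
#   connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   print(requeue_failed_messages(connection.channel()))
#   connection.close()
```

The limit guards against an endless loop when messages keep failing and return to the error queue while the script is running.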

...