As you know, Office 365 comes with capabilities to help you protect your data and ensure compliance, namely Purview Information Protection and Purview Data Loss Prevention (DLP).
Well, good news: you can now enable Optical Character Recognition (OCR) for both features (in preview).
Once it is turned on, compliance policies like DLP and auto-labeling use OCR to scan and extract printed text from images (JPEG, PNG, BMP, TIFF, PDF) stored in the location(s) you have defined.
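To get a rough idea of how many files would be in scope before turning OCR on, you could count the files whose extension matches the formats listed above. The snippet below is only a minimal sketch: the helper names `is_ocr_candidate` and `count_ocr_candidates` are hypothetical, the `.jpg` and `.tif` aliases are an assumption, and checking the extension says nothing about whether the service will actually extract text from a given file.

```python
from pathlib import PurePath

# Extensions derived from the format list above (JPEG, PNG, BMP, TIFF, PDF).
# ".jpg" and ".tif" are assumed aliases, not taken from the article.
OCR_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".tif", ".pdf"}


def is_ocr_candidate(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return PurePath(filename).suffix.lower() in OCR_EXTENSIONS


def count_ocr_candidates(filenames) -> int:
    """Count how many of the given file names look like OCR candidates."""
    return sum(1 for name in filenames if is_ocr_candidate(name))
```

For example, `count_ocr_candidates(["invoice.PDF", "notes.docx", "scan.tiff"])` counts the first and last file only, since the comparison is case-insensitive and `.docx` is not in the list.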
To start using it, you must have an Azure subscription, set up Microsoft Syntex (from the Office 365 administration portal) with pay-as-you-go billing, and finally turn on the OCR capability in the Compliance/Purview portal.
To configure Syntex, connect to your Office 365 administration portal (https://admin.microsoft.com/) with either a Global Admin or SharePoint admin account and open the Setup blade. Scroll to the Files and content section and start the Use content AI with Microsoft Syntex task to configure billing as pay-as-you-go.
Then connect to your Compliance/Purview portal (https://compliance.microsoft.com/) and open the Settings\Optical character recognition (OCR) blade to turn on the capability.