Using VisionKit’s ImageAnalyzer to scan images for text

As I needed to run some text recognition within one of my apps, I thought it would be fun to share how easy it is to recognize text with VisionKit. Let’s jump right in and see how few lines it takes to make it work.

First, we should check whether the feature is available on the user’s device at all. Otherwise, your app might crash and leave people disappointed.

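/* imports */
import UIKit
import VisionKit

/* within your class */
func analyzeReceiptImage(image: UIImage) async -> String? {
    // Bail out early on devices that don't support image analysis.
    guard ImageAnalyzer.isSupported else {
        return nil
    }
    // (the function body continues in the snippets below)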

Besides UIImage, the ImageAnalyzer also supports NSImage, CGImage, and CVPixelBuffer, so you can cover a wide range of use cases with it.
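As a rough sketch of what one of those other input types could look like, here is a CGImage variant. The function name recognizeText(in:orientation:) is made up for this example; the CGImage overload of analyze additionally takes the image orientation, which is assumed to be .up here unless the caller says otherwise.

```swift
import CoreGraphics
import ImageIO
import VisionKit

// Hypothetical helper (not part of VisionKit): same idea as the UIImage
// version, but for a CGImage, which carries no orientation of its own.
func recognizeText(in cgImage: CGImage,
                   orientation: CGImagePropertyOrientation = .up) async -> String? {
    // Bail out early on devices that don't support image analysis.
    guard ImageAnalyzer.isSupported else { return nil }

    let analyzer = ImageAnalyzer()
    let configuration = ImageAnalyzer.Configuration([.text])

    // try? maps any thrown error to nil instead of propagating it.
    let analysis = try? await analyzer.analyze(cgImage,
                                               orientation: orientation,
                                               configuration: configuration)
    return analysis?.transcript
}
```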

As you can see, we’re also using the async-await syntax to increase legibility.

Within our function, we’re now creating the ImageAnalyzer and its configuration. If your users also use less common languages, it might be useful to check first whether the ImageAnalyzer supports them (ImageAnalyzer.supportedTextRecognitionLanguages lists the supported identifiers).

let analyzer = ImageAnalyzer()

var configuration = ImageAnalyzer.Configuration([.text])
configuration.locales = [Locale.current.identifier]

As we only want to recognize text, we’re passing .text as the only analysis type. Setting locales tells the analyzer which languages to use when recognizing text.
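The language check mentioned above could look roughly like this. It’s only a sketch: the helper names languageCode, isRecognitionLanguage, and makeTextConfiguration are made up for this example, and comparing only the language code (so "de_DE" matches "de-DE") is a simplifying assumption that ignores script and region differences.

```swift
import Foundation
import VisionKit

/// Hypothetical helper: extracts the lowercased language code,
/// e.g. "de" from "de_DE" or "de-DE".
func languageCode(of identifier: String) -> String {
    let lowered = identifier.lowercased()
    // Split on the separators that can follow the language code in an identifier.
    return lowered.split(whereSeparator: { "-_@".contains($0) }).first.map(String.init) ?? lowered
}

/// Hypothetical helper: true if any entry in `supported` shares the
/// language code of `identifier`.
func isRecognitionLanguage(_ identifier: String, in supported: [String]) -> Bool {
    let wanted = languageCode(of: identifier)
    return supported.contains { languageCode(of: $0) == wanted }
}

/// Builds a text-only configuration and sets the locale only if its
/// language appears in the analyzer's list of supported languages.
func makeTextConfiguration(localeIdentifier: String = Locale.current.identifier) -> ImageAnalyzer.Configuration {
    var configuration = ImageAnalyzer.Configuration([.text])
    if isRecognitionLanguage(localeIdentifier, in: ImageAnalyzer.supportedTextRecognitionLanguages) {
        configuration.locales = [localeIdentifier]
    }
    return configuration
}
```

If the language isn’t in the list, this sketch simply leaves locales untouched instead of setting an unsupported value.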

Let’s run the analyzer now:

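// try? turns any thrown error into nil, so failures fall through to "return nil".
let analysis = try? await analyzer.analyze(image, configuration: configuration)
if let success = analysis {
    return success.transcript
}

return nil
} // end of analyzeReceiptImage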

And that’s it. If the analyzer ran successfully, the function returns a string with all the text it recognized; otherwise, it returns nil.
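Putting the pieces together, the whole thing might look like the sketch below, including one way to call it from non-async code. The ReceiptScanner class and the printReceiptText function are made-up names for this example; the article only assumes the function lives "within your class".

```swift
import UIKit
import VisionKit

// Hypothetical wrapper class for the function built up in this article.
final class ReceiptScanner {
    private let analyzer = ImageAnalyzer()

    func analyzeReceiptImage(image: UIImage) async -> String? {
        // Bail out early on devices that don't support image analysis.
        guard ImageAnalyzer.isSupported else { return nil }

        // Only ask for text, and use the user's current locale.
        var configuration = ImageAnalyzer.Configuration([.text])
        configuration.locales = [Locale.current.identifier]

        // try? maps any thrown error to nil.
        guard let analysis = try? await analyzer.analyze(image, configuration: configuration) else {
            return nil
        }
        return analysis.transcript
    }
}

// Hypothetical call site: a Task bridges from synchronous code
// (for example a button handler) into the async function.
func printReceiptText(from image: UIImage) {
    Task {
        let text = await ReceiptScanner().analyzeReceiptImage(image: image)
        print(text ?? "No text recognized")
    }
}
```

Because try? swallows the error, the caller can’t tell an unsupported device from a failed analysis; if that distinction matters, a throwing variant would be the alternative.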