We compared the 3 best image analysis APIs: here’s what we learned
One of the flagship applications of modern machine learning has been working with images — training computers to analyze, classify, and alter different types of pictures. Last year, Google’s DeepDream software made waves by creating a series of terrifying, nightmare-inducing images:
While image and video classification (that is, telling us what is depicted in the chosen media) is not so cutting edge anymore, this is actually good news. A number of cheap services have sprung up to make image classification quite accessible.
Here at MuseFind we’ve been looking for ways to help both brands and social media influencers produce better content — and part of that effort is figuring out the correlations between the most successful posts and their content.
Here’s our example picture, from one of MuseFind’s influencer marketing campaigns with John Varvatos. Courtesy of @rusty.blade:
Below are the top three image/video analysis services we considered: Clarifai, Google Cloud Vision, and Amazon Rekognition. Let’s see what they can tell us about this image.
Clarifai
Clarifai, unlike the other APIs on this list, also offers video analysis. You can upload your own photos or try a link from the web in the demo at clarifai.com to see its image recognition predictions and probability scores.
For videos, Clarifai offers scene recognition. For images, it can also do sentiment analysis, text recognition, logo detection, and face detection, as well as a more robust version of Resemble’s image attribute detection: brightness, colour, dominant colour.
Here’s what it shows about our image:
It recognized some key concepts (the first five are fairly accurate), but it also added some irrelevant tags: there is no scarf, child, or woman in the picture.
Still, if we took the top five tags and used them to automatically categorize our image, we’d be in a good place.
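As a rough illustration of the "keep the top five tags" idea, here is a minimal sketch. It assumes the predictions have already been pulled out of the API response as (tag, probability) pairs. The `top_tags` helper and the sample values are hypothetical and made up for illustration; they are not the actual output for our picture.

```python
from typing import List, Tuple


def top_tags(predictions: List[Tuple[str, float]],
             n: int = 5,
             min_score: float = 0.0) -> List[str]:
    """Return the names of the n highest-scoring tags at or above min_score."""
    # Sort by probability, highest first, so the cut-off keeps the best tags.
    ranked = sorted(predictions, key=lambda pair: pair[1], reverse=True)
    return [name for name, score in ranked if score >= min_score][:n]


# Made-up (tag, probability) pairs, NOT real Clarifai output.
sample = [
    ("dog", 0.99), ("scarf", 0.71), ("street", 0.95), ("man", 0.97),
    ("child", 0.68), ("outdoors", 0.93), ("portrait", 0.90), ("woman", 0.65),
]

print(top_tags(sample))
```

Raising `min_score` is another way to drop low-confidence tags, though the right threshold would need to be tuned against real results.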
Price: 5,000 free operations and 10,000 free inputs per month; $1.20/$0.80 per 1,000 after that
Cloud Vision API
Cloud Vision by Google includes many of the key features of Clarifai (sentiment analysis, text recognition, logo detection, facial analysis) and adds a couple of bonuses: landmark detection and a simple REST API.
Unlike Clarifai, you can’t create your own models to test against — but you also have access to an API backed by Google, one they seem quite dedicated to constantly improving.
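To give a feel for what the REST interface involves, here is a minimal sketch of the JSON body for a Cloud Vision `images:annotate` request. It only builds the payload and does not send anything. The `build_annotate_request` helper and the example image URL are hypothetical; in practice the body would be POSTed to `https://vision.googleapis.com/v1/images:annotate` with your own credentials, and it is worth checking the official reference before relying on the exact field names.

```python
import json
from typing import Dict, List


def build_annotate_request(image_uri: str,
                           feature_types: List[str],
                           max_results: int = 10) -> Dict:
    """Build the JSON body for one image and a list of requested features."""
    return {
        "requests": [
            {
                # The image is referenced by URL rather than uploaded inline.
                "image": {"source": {"imageUri": image_uri}},
                # One entry per feature, e.g. LABEL_DETECTION or FACE_DETECTION.
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


# Placeholder URL, used only for illustration.
body = build_annotate_request(
    "https://example.com/photo.jpg",
    ["LABEL_DETECTION", "FACE_DETECTION", "LANDMARK_DETECTION"],
)
print(json.dumps(body, indent=2))
```

Asking for several features in one request keeps the client code simple, but each feature may be billed separately, so it makes sense to request only what you need.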
For our image, it gave us the following labels:
Cloud Vision was accurate in all but one tag (Spring) and managed to identify not just the dog but the activity of dog walking.
However, it did fail to detect the face in the image. Here’s a preview of its sentiment analysis with a different image:
Price: Free up to 1,000 per month; $1.50 per 1,000 beyond that
Amazon Rekognition
By now, the features will sound familiar: scene detection, sentiment analysis. Amazon boasts a more robust suite of facial analysis tools, including facial recognition across images (not offered by Google or Clarifai), detailed information such as the image quality of a face, beard recognition (yes), and facial comparison (how likely is it that two faces are the same person?).
It also promises integration with AWS services like S3 and Lambda.
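The S3 integration mostly means you can point Rekognition at an object in a bucket instead of uploading image bytes. The sketch below only builds the request arguments; the helper names, bucket, and key are hypothetical. With boto3 installed and AWS credentials configured, the resulting dictionary could be passed along the lines of `boto3.client("rekognition").detect_labels(**args)`, but treat that as an assumption to verify against the AWS documentation.

```python
from typing import Dict


def s3_image(bucket: str, key: str) -> Dict:
    """Describe an image stored in S3 in the shape Rekognition expects."""
    return {"S3Object": {"Bucket": bucket, "Name": key}}


def detect_labels_args(bucket: str, key: str,
                       max_labels: int = 10,
                       min_confidence: float = 75.0) -> Dict:
    """Build keyword arguments for a label (object and scene) detection call."""
    return {
        "Image": s3_image(bucket, key),
        "MaxLabels": max_labels,          # cap on the number of labels returned
        "MinConfidence": min_confidence,  # drop labels below this percentage
    }


# Hypothetical bucket and object key, for illustration only.
args = detect_labels_args("my-campaign-photos", "influencers/post-001.jpg")
print(args)
```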
Let’s start with object and scene detection. Here are the results:
It nailed the dog aspect (bonus points for ‘Adorable’) but failed to recognize any other aspect of the scene.
However, it did pick up the face:
That killer beard detection. A bit of a mixed bag (no sunglasses, no mustache), but also reasonable accuracy.
Price: $1 per 1,000 images processed (up to one million per month)
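To compare the three price lists at a given volume, a back-of-the-envelope calculation like the one below can help. It is a simplified sketch with several assumptions: one billable operation per image, Clarifai billed at $1.20 per 1,000 operations after its 5,000 free ones (input charges ignored), no free tier for Rekognition, and volumes under one million images per month. The `monthly_cost` helper and the 50,000-image volume are hypothetical.

```python
def monthly_cost(images: int, free_quota: int, price_per_thousand: float) -> float:
    """Cost in dollars for a month, after subtracting the free quota."""
    billable = max(0, images - free_quota)
    return billable / 1000 * price_per_thousand


# (free images per month, dollars per 1,000 after that), as quoted in this article.
PLANS = {
    "Clarifai": (5_000, 1.20),            # assumption: operations only
    "Google Cloud Vision": (1_000, 1.50),
    "Amazon Rekognition": (0, 1.00),      # assumption: no free tier
}

volume = 50_000  # hypothetical monthly image count
for name, (free, price) in PLANS.items():
    print(f"{name}: ${monthly_cost(volume, free, price):.2f}")
```

The ranking can flip at low volumes, where the free quotas dominate, so it is worth rerunning this with your own expected traffic.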
We haven’t settled on a service yet — we’re going to continue experimenting and clarifying our goals over the coming weeks.
But based on these results, we already have an idea of the strengths and weaknesses of the different APIs. Clarifai has the strongest concept modeling, Google the best scene detection and sentiment analysis, and Amazon the best facial analysis.
Let us know your thoughts on these APIs in the comments. And please like and share if this article was useful to you!