Product Image Optimisation With Chrome’s Convolutional Neural Network

Today I made a bulk image analysis tool for our team. It takes a URL, fetches the images on the page and runs them through Google’s shopping intent classifier, which determines whether each image is optimised for shopping intent or not.

This is particularly useful for image and on-page optimisation of product pages and categories on eCommerce websites.
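The ingestion step can be sketched in a few lines. This is a minimal, hypothetical version, not the actual tool’s code: it collects `<img>` sources from a page’s HTML and resolves them against the page URL (the sample HTML and function names are made up for illustration).

```python
# Hypothetical sketch of the image-ingestion step: collect <img> src
# attributes from a page and resolve them to absolute URLs.
from html.parser import HTMLParser
from urllib.parse import urljoin

class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag."""
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)

def extract_image_urls(html, base_url):
    parser = ImgCollector()
    parser.feed(html)
    # Resolve relative paths against the page URL
    return [urljoin(base_url, s) for s in parser.srcs]

# Made-up sample page for demonstration
page = ('<html><body>'
        '<img src="/img/shoe.jpg">'
        '<img src="https://cdn.example.com/bag.png">'
        '</body></html>')
print(extract_image_urls(page, "https://example.com/category"))
```

From there, each resolved URL is fetched and handed to the classifier.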

[Table: per-image classification results]

As I completed my first report, I started to wonder: what is it about image 117 that caused Google’s model to misclassify it as “no shopping intent” when it’s clearly a product image?

I don’t know either.

But I think it’s worth tweaking the image a little: perhaps an angle shot, or a different crop that shows greater detail.

Since the model didn’t come with label definitions, I had to reverse-engineer them by testing across various product types. As of now, they appear to be as follows:

  • LABEL_1: No Shopping Intent
  • LABEL_2: Fashion & Style
  • LABEL_3: Home & Garden
  • LABEL_4: Tools, Vehicles, Electronics & Appliances (General/Other?)
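Once you have the model’s 4-value shopping-intent output, mapping it back to these reverse-engineered names is a one-liner. A small sketch (the probability vector below is made up for illustration, and the label names are my own inferred definitions, not official ones):

```python
# Map the model's raw shopping-intent probabilities to the
# reverse-engineered label names. The names are inferred, not official.
SHOPPING_LABELS = {
    0: "LABEL_1: No Shopping Intent",
    1: "LABEL_2: Fashion & Style",
    2: "LABEL_3: Home & Garden",
    3: "LABEL_4: Tools, Vehicles, Electronics & Appliances",
}

def top_label(probs):
    """Return (label, probability) for the highest-scoring class."""
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return SHOPPING_LABELS[idx], probs[idx]

# Made-up example vector: LABEL_2 wins here
label, p = top_label([0.03, 0.91, 0.04, 0.02])
print(label, p)
```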

The next iteration of this tool will probably have a more robust image ingestion pipeline, such as crawl-based ingestion, image sitemaps or a client’s CDN data feed. It would be really interesting to run this type of classification on thousands of images and estimate the overall optimisation opportunity based on true-label mismatches.

Practical Application

There are at least three practical applications of this classifier. It can be used to determine whether the analysed image is:

  1. Misclassified by shopping intent (image 59)
  2. Misclassified by shopping category
  3. Ambiguous in both intent and category (images 10 and 35)

Is it really Google’s shopping intent classifier?

Yes. It’s Google’s custom convolutional neural network in TensorFlow Lite format (schema v3), based on the MobileNet V3 Small architecture.

Where did I get it from?

I extracted it from Chrome.

How does it work?

The model takes in a pre-processed image (224×224) and returns two sets of probabilities:

  1. shopping intent (4 labels)
  2. sensitive image (2 labels)
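The flow above can be sketched with the standard TensorFlow Lite Python interpreter. Two caveats: the model filename below is an assumption (name it whatever you saved the extracted file as), and the exact normalization Chrome applies isn’t documented, so scaling to [0, 1] is a guess; the metadata says the input is expected as UINT8 while the tensor itself is float32, so you may need to experiment.

```python
# Sketch of preprocessing + inference against the extracted model.
# Assumptions: model path, and [0, 1] input scaling (unverified).
import numpy as np

def preprocess(rgb_uint8):
    """uint8 224x224x3 image -> float32 [1, 224, 224, 3] tensor."""
    assert rgb_uint8.shape == (224, 224, 3)  # resize beforehand
    x = rgb_uint8.astype(np.float32) / 255.0  # assumed scaling
    return x[np.newaxis, ...]

def classify(image):
    # Deferred import so the preprocessing sketch runs without TF installed
    import tensorflow as tf
    interp = tf.lite.Interpreter(
        model_path="shopping_intent_x_sensitivity_classifier.tflite")
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    interp.set_tensor(inp["index"], preprocess(image))
    interp.invoke()
    # Two output heads: shopping_intent [1, 4] and sensitive [1, 2]
    return {d["name"]: interp.get_tensor(d["index"])
            for d in interp.get_output_details()}

dummy = np.zeros((224, 224, 3), dtype=np.uint8)
print(preprocess(dummy).shape)  # (1, 224, 224, 3)
```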

Technical Details

  • Model Name: shopping_intent_x_sensitivity_classifier
  • Checkpoint: mobilenet_v3_small_224_04132253_ckpt_3006395
  • Description: The model is MLIR Converted. Classifies if the image is sensitive or has shopping intent.
  • Model Author: lens-proactive-dev
  • Denotation: Image(RGB)
  • tensor: float32[1,224,224,3]

Inputs

name: normalized_input_image_tensor
tensor: float32[1,224,224,3]
denotation: Image(RGB)
Input image to be classified. The input is expected to be RGB image with type UINT8.
identifier: 0

Outputs

name: shopping_intent
tensor: float32[1,4]
denotation: Feature
Probability whether image is sensitive or has shopping intent
identifier: 222

name: sensitive
tensor: float32[1,2]
denotation: Feature
Probability whether image is sensitive or has shopping intent
identifier: 220

Model Architecture

Full model architecture is available as: PNG | SVG (right click, save as)

Dan Petrovic, the managing director of DEJAN, is Australia’s best-known name in the field of search engine optimisation. Dan is a web author, innovator and a highly regarded search industry event speaker.
ORCID iD: https://orcid.org/0000-0002-6886-3211
