Dog breed classifier

Distinguishes people and dogs. Classifies breed if dog is depicted. Suggests a resembling breed if a human is depicted.

Description

Before making a distinction between different dog breeds, the application tries to decide if human or dog is depicted on the image. OpenCV's implementation of Haar feature-based cascade classifiers is used to detect human faces in images.

A pre-trained ResNet-50 model is used, to detect dogs. The application downloads the ResNet-50 model, along with weights that have been trained on ImageNet, a large and popular dataset used for image classification and other computer vision tasks. ImageNet contains over 10 million URLs, each linking to an image containing an object from one of 1000 categories. Given an image, this pre-trained ResNet-50 model returns a prediction (derived from the available categories in ImageNet) for the object that is contained in the image.

Distinguishing between dog breeds is a complex task but Convolutional Neural Networks perform well on it. While it is possible to learn the network from scratch, it is reasonable to use transfer learning. The key advantage of transfer learning is the usage of bottleneck features of a pre-trained model. A new model consists mostly of fully-connected layers and takes bottleneck features as input. Only those layers are trained afterward. Models built like that, show excellent performance and spare a lot of time and computational power to be trained.

The pre-trained model named Xception is used to solve the problem. Global average pooling layer and Fully-connected layer provide a new model trained on available images. To make a classification, Xception takes an image as input and returns bottleneck features. The features go into the new model as input and new model makes the classification.

Image to analyze
Result of analysis: unknown

Results

Detecting faces

Current approach (OpenCV and Haar Cascades) is quite accurate. However, it seems to be not perfect for detecting faces from noisy input provided by real users of the application like this. An image's rotation of more than 19 degrees makes it impossible for Haar cascade to recognize a face.

OpenCV algorithms can be tuned by using more features. Except for frontal face images, right and left profiles are available. However, it still makes sense to try out other approaches such as CNNs. Convolutional neural networks can recognize patterns on images regardless of their positions.

Classifying dog breeds

The solution uses transfer learning for building the model. It obtains accuracy of 85% and takes only two minutes to train. Training Xception model from scratch could take days or even weeks depending on hardware.

Improvements

The Algorithm can be improved by using multiple models. For example, models can be combined in an ensemble that chooses predictions provided by the majority of models. Also, additional layers of the neural network can probably improve performance.