Dog or Cat: Deep Learning Convolutional Neural Network model
This is a Deep Learning Convolutional Neural Network (CNN) model, trained to identify images of dogs and cats with about 90% accuracy.
The model was trained on a total of over 8,000 images, split roughly in half: about 4,000 images of dogs and about the same number of images of cats.
How the CNN Model was created
Data Preprocessing
Initially, the training set images were transformed geometrically (data augmentation) to reduce overfitting by adding variation to the 8,000 images. This was done with the ImageDataGenerator class from the Keras library.
The images in both the training set and the test set were resized to the same dimensions.
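A minimal sketch of how this preprocessing step might look with Keras. The specific transformations (shear, zoom, horizontal flip), the pixel rescaling, the 64x64 target size, the batch size, and the `make_generators` helper with its directory arguments are all assumptions for illustration; the text above only states that geometric transformations were applied with ImageDataGenerator and that both sets were resized to match.

```python
# Assumed settings; the write-up does not state the actual values.
IMAGE_SIZE = (64, 64)  # (height, width) every image is resized to
BATCH_SIZE = 32

# Random geometric transformations are applied to the training images only.
TRAIN_AUGMENTATION = {
    "rescale": 1.0 / 255,      # map pixel values from [0, 255] to [0, 1]
    "shear_range": 0.2,
    "zoom_range": 0.2,
    "horizontal_flip": True,
}
# Test images get the same rescaling but no random transformations.
TEST_PREPROCESSING = {"rescale": 1.0 / 255}


def make_generators(train_dir, test_dir):
    """Return (training_set, test_set) iterators over two image folders.

    Each folder is expected to contain one sub-folder per class.
    """
    # Imported here so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    train_datagen = ImageDataGenerator(**TRAIN_AUGMENTATION)
    test_datagen = ImageDataGenerator(**TEST_PREPROCESSING)

    common = dict(target_size=IMAGE_SIZE, batch_size=BATCH_SIZE, class_mode="binary")
    training_set = train_datagen.flow_from_directory(train_dir, **common)
    test_set = test_datagen.flow_from_directory(test_dir, **common)
    return training_set, test_set
```

With `flow_from_directory`, class indices follow the alphanumeric order of the sub-folder names, so hypothetical folders named `cats` and `dogs` would map to 0 and 1 respectively.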
CNN Model Creation
- Initialized the CNN
- Created the initial CNN layer by:
  - Adding the convolution layers, including the feature detector arguments, the activation function (rectified linear unit), and the image size taken from the data preprocessing parameters
  - Adding pooling layers
- Added further convolution layers (this time without the image size) and pooling layers
- Applied flattening
- Created the full connection of all neurons, including the number of neurons and the activation function (rectified linear unit)
- Created the output layer with a sigmoid activation function because the output of this model is binary. If it were not a binary classification, the activation function to use would be softmax. Sigmoid is equivalent to a 2-element softmax in which one of the two inputs is fixed at zero.
- Compiled the CNN
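The steps above can be sketched with the Keras Sequential API roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the number of filters (32), kernel size (3), pooling size (2), dense units (128), the 64x64 RGB input, and the optimizer and loss passed to `compile` are not given in the text. The sketch also declares the image size through an `Input` layer; the text describes passing it to the first convolution layer instead, which older Keras code does with an `input_shape` argument.

```python
def build_cnn(image_size=(64, 64)):
    """Assemble and compile a small CNN for binary image classification."""
    # Imported lazily so that defining the function does not require TensorFlow.
    from tensorflow.keras import Sequential, layers

    model = Sequential()  # initialize the CNN

    # First convolution + pooling block; the image size comes from preprocessing.
    model.add(layers.Input(shape=(*image_size, 3)))
    model.add(layers.Conv2D(filters=32, kernel_size=3, activation="relu"))
    model.add(layers.MaxPooling2D(pool_size=2, strides=2))

    # Second convolution + pooling block (no image size needed here).
    model.add(layers.Conv2D(filters=32, kernel_size=3, activation="relu"))
    model.add(layers.MaxPooling2D(pool_size=2, strides=2))

    model.add(layers.Flatten())                             # flattening
    model.add(layers.Dense(units=128, activation="relu"))   # full connection
    model.add(layers.Dense(units=1, activation="sigmoid"))  # binary output

    # Assumed compile settings for a binary classifier.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Calling `build_cnn().summary()` prints the layer stack and parameter counts, which is a quick way to check that the shapes line up with the preprocessed image size.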
Training
Trained the CNN model using the training set and test set created during the data preprocessing phase, and specified the number of epochs (training cycles). For this model, 25 epochs proved sufficient for accurate predictions.
This model reached an accuracy of 91.60% at the 25th epoch, as can be seen in the following video.
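A hedged sketch of the training call. It assumes the test set is passed to Keras as validation data and that the model was compiled with accuracy as a tracked metric (so the history contains a `val_accuracy` entry); the text does not say whether the reported 91.60% is training or validation accuracy. The `train` helper is a hypothetical name.

```python
EPOCHS = 25  # number of training cycles reported in the text


def train(model, training_set, test_set, epochs=EPOCHS):
    """Fit the model and return the test-set accuracy after the last epoch."""
    history = model.fit(x=training_set, validation_data=test_set, epochs=epochs)
    # Keras records one value per epoch for every tracked metric.
    return history.history["val_accuracy"][-1]
```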
Prediction
The input image is converted into a NumPy array and scaled to match the data sets used during training, since this is the format the CNN model expects. We read the output of the CNN predict function as a binary result: in our model, it is a dog if the result equals 1 and a cat if the result equals 0.
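The prediction step might look like the sketch below. It assumes the image has already been resized to the training size, that training divided pixel values by 255, and that a sigmoid output of 0.5 or more counts as a dog; the helper names and the threshold are illustrative, not taken from the project.

```python
import numpy as np

CLASS_NAMES = {0: "cat", 1: "dog"}  # mapping stated in the text


def prepare_image(pixels):
    """Turn an H x W x 3 image into the 1 x H x W x 3 float batch the model expects."""
    array = np.asarray(pixels, dtype="float32") / 255.0  # assumed training rescale
    return np.expand_dims(array, axis=0)  # add the batch dimension


def predict_label(model, pixels, threshold=0.5):
    """Return "dog" or "cat" for a single image."""
    # The sigmoid output layer yields one score between 0 and 1 per image.
    score = float(model.predict(prepare_image(pixels))[0][0])
    return CLASS_NAMES[int(score >= threshold)]
```

Because the output is a single sigmoid score, the model always answers one of the two classes, even for an image that contains neither animal.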
Once the model has been created and tested, it is saved to our website and loaded every time there is a call to predict an image. The model is therefore built only once.
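One way to sketch the save-once, load-on-demand pattern is shown below. The file name and helper names are hypothetical. Note one deliberate variation: the text says the model is loaded on every prediction call, whereas this sketch caches the loaded model so the file is read only on the first call in each server process.

```python
from functools import lru_cache

MODEL_PATH = "dog_cat_model.h5"  # hypothetical file name


def save_model(model, path=MODEL_PATH):
    """Run once, locally, after training has finished."""
    model.save(path)


def load_from_disk(path):
    """Read a saved Keras model; TensorFlow is imported only when needed."""
    from tensorflow.keras.models import load_model
    return load_model(path)


@lru_cache(maxsize=None)
def get_model(path=MODEL_PATH, loader=load_from_disk):
    """Return the saved model, reading it from disk only on the first call."""
    return loader(path)
```

A request handler in the web application would then call `get_model()` before predicting, without ever retraining the network.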
Challenges
Processing
The model has to be created locally, and only the finished product can be uploaded to run on the website. Building the model on the web server is not viable because of the amount of data, processing power, and time required. Doing so would be detrimental to the web server and to the user experience.
Data
Good, abundant data is hard to find. This project used images of dogs and cats, which are very easy to find. I tried to expand the scope to multiple animals, but finding enough good-quality images of other animals proved far more challenging and time-consuming.
This is the main reason why this is a binary model, which will identify every input as either a dog or a cat, regardless of the actual content of the image provided.
Technologies used
- Python
- Flask
- HTML5
- CSS
- NumPy
- Keras
- TensorFlow