In this tutorial, we will learn how to create a deep learning model using a Convolutional Neural Network (CNN) to classify paddy diseases. Paddy (rice) is one of the most important crops worldwide, and diseases affecting it can lead to significant economic losses. Machine learning can help farmers detect diseases early by analyzing images of affected leaves.
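We will build a CNN model that can classify different paddy leaf diseases using a dataset of images. The process involves importing the libraries, loading and splitting the image dataset, normalizing the images, counting the classes, defining and compiling the CNN, training it with early stopping, plotting the accuracy, and saving the model.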
We start by importing the necessary libraries for building and training our CNN model.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense,Conv2D,MaxPooling2D,Flatten,BatchNormalization,Dropout
Pandas (pd): Used for data manipulation and analysis.
NumPy (np): Provides support for large, multi-dimensional arrays and matrices.
Matplotlib (plt): A plotting library to visualize data and model results.
Seaborn (sns): Based on Matplotlib, Seaborn provides a high-level interface for drawing attractive statistical graphics.
TensorFlow (tf): A deep learning library, providing functions to build and train machine learning models.
Keras (keras): A high-level neural networks API, included within TensorFlow, that simplifies the process of building models.
We will load the dataset from the given directory and preprocess it by resizing images and splitting it into training and validation sets.
seed = 40
train_ds = keras.utils.image_dataset_from_directory(
    directory='/kaggle/input/paddy-disease-dataset/paddy-disease-classification/train_images',
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(256, 256),
    validation_split=0.2,
    subset="training",
    seed=seed  # Add seed argument
)
validation_ds = keras.utils.image_dataset_from_directory(
    directory='/kaggle/input/paddy-disease-dataset/paddy-disease-classification/train_images',
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(256, 256),
    validation_split=0.2,
    subset="validation",
    seed=seed  # Add seed argument
)
Found 10407 files belonging to 10 classes.
Using 8326 files for training.
Found 10407 files belonging to 10 classes.
Using 2081 files for validation.
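The file counts reported above can be reproduced with a little arithmetic. The sketch below is not part of the Keras API; `split_counts` is a hypothetical helper, and it assumes the validation share is truncated to a whole number of files, which is consistent with the 8326 / 2081 split shown in the output.

```python
import math

def split_counts(num_files, validation_split, batch_size):
    """Return (train_files, val_files, train_batches, val_batches)."""
    # Assumption: the validation share is truncated to a whole number of files,
    # and the remaining files form the training subset.
    val_files = int(num_files * validation_split)
    train_files = num_files - val_files
    # A final, partially filled batch still counts as one step per epoch.
    train_batches = math.ceil(train_files / batch_size)
    val_batches = math.ceil(val_files / batch_size)
    return train_files, val_files, train_batches, val_batches

counts = split_counts(num_files=10407, validation_split=0.2, batch_size=32)
print(counts)  # (8326, 2081, 261, 66)
```

With a batch size of 32, the 8326 training files give 261 batches, which is the number of steps per epoch that appears in the training log later in this tutorial.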
image_dataset_from_directory: This function loads images directly from the directory and automatically labels them based on folder structure.
directory: Path to the folder containing the paddy disease images.
labels="inferred": Automatically infers labels from folder names.
label_mode="int": Labels are returned as integer-encoded values.
image_size=(256, 256): Each image is resized to 256x256 pixels to maintain uniformity.
validation_split=0.2: 20% of the data is reserved for validation, while 80% is used for training.
subset: Specifies whether this dataset is used for training or validation.
seed: Setting a random seed ensures that the dataset split is reproducible.
Before training the model, we need to normalize the pixel values of the images. Neural networks tend to perform better when the data is normalized, i.e., pixel values are scaled between 0 and 1 instead of their original range of 0 to 255.
# Normalizing the data
def process(image, label):
    image = tf.cast(image / 255, tf.float32)
    return image, label
train_ds = train_ds.map(process)
validation_ds = validation_ds.map(process)
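To make the scaling concrete, here is a framework-free sketch of the same arithmetic on a handful of made-up channel values. `normalize_pixels` is a hypothetical helper written for illustration only; the tutorial itself performs this step with `tf.cast` inside `process`.

```python
def normalize_pixels(pixels):
    """Scale 8-bit channel values (0-255) to floats in the range [0, 1]."""
    return [value / 255 for value in pixels]

# Made-up channel values covering the full 8-bit range.
sample = [0, 64, 128, 255]
scaled = normalize_pixels(sample)
print([round(v, 3) for v in scaled])  # [0.0, 0.251, 0.502, 1.0]
```

The darkest value maps to 0.0 and the brightest to 1.0, so every input the network sees lies in the same small range regardless of the original image.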
process(image, label): This function normalizes each image by dividing the pixel values by 255 (scaling them from 0-255 to 0-1).
tf.cast(image/255, tf.float32): Converts the pixel values to float32 data type for compatibility with TensorFlow models.
train_ds.map(process): Applies the normalization function to each image in the training dataset.
validation_ds.map(process): Applies the normalization function to each image in the validation dataset.
Now that our images are normalized, we can move on to building and compiling the CNN model.
We need to determine how many disease categories (classes) are present in the dataset. This will help us define the output layer of our CNN model correctly.
import os
# Path to the directory that holds one subfolder per class
dataset_path = '/kaggle/input/paddy-disease-dataset/paddy-disease-classification/train_images'
# Count the number of subdirectories (classes)
num_classes = len([name for name in os.listdir(dataset_path) if os.path.isdir(os.path.join(dataset_path, name))])
print("Number of classes:", num_classes)
Number of classes: 10
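Besides the count, it is often useful to know which folder corresponds to which integer label. The sketch below lists the class folders in sorted order, which, as far as the Keras documentation describes, is the order used when `class_names` is not given. `list_class_names` is a hypothetical helper, and the folder names in the demonstration are made up so that the example runs without the Kaggle dataset.

```python
import tempfile
from pathlib import Path

def list_class_names(dataset_path):
    """Return the names of the immediate subdirectories, sorted alphabetically."""
    root = Path(dataset_path)
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())

# Demonstration on a throwaway directory tree with made-up class names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("class_b", "class_a", "class_c"):
        (Path(tmp) / name).mkdir()
    (Path(tmp) / "notes.txt").write_text("not a class")  # stray file, ignored
    demo_classes = list_class_names(tmp)

print(demo_classes)       # ['class_a', 'class_b', 'class_c']
print(len(demo_classes))  # 3
```

Under that assumption, the position of a name in the returned list is the integer label the model will predict for that class.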
os.listdir(dataset_path): Lists all the directories and files in the given dataset path.
os.path.isdir(): Checks if the listed item is a directory (which corresponds to a class in our case).
num_classes: This variable stores the total number of subdirectories, each representing a disease class.
The number of classes is an important parameter when defining the output layer of the CNN model. Next, we will use this to finalize the architecture of our model.
We will now define the architecture of the Convolutional Neural Network (CNN) for paddy disease classification. The model consists of several convolutional layers for feature extraction, followed by pooling layers for downsampling, and dense layers for classification.
# create CNN model
model = Sequential()
model.add(Conv2D(128, kernel_size=(3,3), padding='valid', strides=1, activation='relu', input_shape=(256,256,3)))
model.add(MaxPooling2D(pool_size=(3,3), padding='valid'))
model.add(Conv2D(64, kernel_size=(3,3), padding='valid', strides=1, activation='relu'))
model.add(MaxPooling2D(pool_size=(3,3), padding='valid'))
model.add(Conv2D(32, kernel_size=(3,3), padding='same', strides=1, activation='relu'))
model.add(MaxPooling2D(pool_size=(3,3), padding='valid'))
model.add(Conv2D(16, kernel_size=(3,3), padding='same', strides=1, activation='relu'))
model.add(MaxPooling2D(pool_size=(3,3), padding='valid'))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(num_classes, activation='softmax'))
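It can help to trace how the spatial size of the feature maps shrinks through this stack. The following sketch applies the standard output-size formulas for 'valid' and 'same' padding to the layer settings used above. The helper names are hypothetical, and the pooling helper assumes the stride equals the pool size, which is the Keras default when `strides` is not given.

```python
def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1

def pool_output_size(size, pool):
    """Output size of 'valid' max pooling, assuming stride == pool size."""
    return (size - pool) // pool + 1

size = 256  # input images are 256x256
shapes = []
# (filters, padding) for the four Conv2D layers, each followed by 3x3 pooling.
for filters, padding in [(128, "valid"), (64, "valid"), (32, "same"), (16, "same")]:
    size = conv_output_size(size, kernel=3, padding=padding)
    shapes.append(("conv", size, size, filters))
    size = pool_output_size(size, pool=3)
    shapes.append(("pool", size, size, filters))

flattened = size * size * 16  # length of the vector produced by Flatten()

for shape in shapes:
    print(shape)
print("flattened:", flattened)  # flattened: 144
```

The final 3x3x16 feature map flattens to 144 values, which is the input size of the first dense layer.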
Conv2D(128, kernel_size=(3,3)): This is a 2D convolution layer with 128 filters and a kernel size of 3x3. It extracts features from the input images. The activation function used is ReLU.
MaxPooling2D(pool_size=(3,3)): The pooling layer reduces the spatial dimensions of the output, helping to reduce the number of parameters.
Flatten(): This layer flattens the stack of feature maps into a 1D vector, preparing it for the dense layers.
Dense(128, activation='relu'): A fully connected layer with 128 neurons. ReLU activation is used to introduce non-linearity.
Dropout(0.1): Dropout is applied to prevent overfitting. It randomly drops 10% of the neurons during training.
Dense(64, activation='relu'): Another fully connected layer with 64 neurons.
Dense(num_classes, activation='softmax'): The output layer with a number of neurons equal to the number of classes. Softmax is used for multi-class classification.
This architecture is designed to extract rich feature maps from the paddy disease images and classify them into the appropriate disease categories.
Once the architecture of the CNN model is defined, the next step is to compile it. During compilation, we specify the optimizer, loss function, and evaluation metrics.
from tensorflow.keras.optimizers.legacy import Adam
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 254, 254, 128)  │         3,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 84, 84, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 82, 82, 64)     │        73,792 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 27, 27, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 27, 27, 32)     │        18,464 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D)  │ (None, 9, 9, 32)       │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D)               │ (None, 9, 9, 16)       │         4,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_3 (MaxPooling2D)  │ (None, 3, 3, 16)       │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 144)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 128)            │        18,560 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 10)             │           650 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 127,930 (499.73 KB)
Trainable params: 127,930 (499.73 KB)
Non-trainable params: 0 (0.00 B)
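The parameter counts in the summary can be checked by hand. The sketch below uses the usual formulas: a convolution has one weight per kernel position, input channel, and filter, plus one bias per filter, and a dense layer has one weight per input-output pair plus one bias per unit. The helper names are hypothetical, and the size estimate assumes 4 bytes per float32 parameter.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units

layer_params = {
    "conv2d": conv2d_params(3, 3, 3, 128),    # RGB input, 128 filters
    "conv2d_1": conv2d_params(3, 3, 128, 64),
    "conv2d_2": conv2d_params(3, 3, 64, 32),
    "conv2d_3": conv2d_params(3, 3, 32, 16),
    "dense": dense_params(144, 128),          # 144 = 3 * 3 * 16 after Flatten
    "dense_1": dense_params(128, 64),
    "dense_2": dense_params(64, 10),          # 10 output classes
}
total_params = sum(layer_params.values())
size_kb = total_params * 4 / 1024  # assuming float32 parameters

print(layer_params)
print(total_params)       # 127930
print(round(size_kb, 2))  # 499.73
```

Pooling, flatten, and dropout layers have no trainable weights, which is why they show 0 in the summary.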
model.summary(): Displays a summary of the model architecture, showing the layers, output shapes, and the total number of trainable parameters.
optimizer='adam': The Adam optimizer is used for training the model. Adam is an efficient and widely used optimization algorithm that adapts the step size for each parameter using running estimates of the first and second moments of past gradients.
loss='sparse_categorical_crossentropy': This loss function is used for multi-class classification when the labels are integers (not one-hot encoded). It measures the performance of the model based on the difference between predicted and actual class labels.
metrics=['accuracy']: Accuracy is used as the evaluation metric to measure the fraction of correctly classified images during training and validation.
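To see what the softmax output and this loss compute for a single image, here is a minimal pure-Python sketch. The function names mirror the Keras loss name but are hypothetical re-implementations written for illustration, and the scores are made up.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probabilities, label):
    """Negative log-probability assigned to the true class (integer label)."""
    return -math.log(probabilities[label])

logits = [2.0, 1.0, 0.1]  # made-up scores for three classes
probs = softmax(logits)
loss_if_class_0 = sparse_categorical_crossentropy(probs, 0)  # true class is the top score
loss_if_class_2 = sparse_categorical_crossentropy(probs, 2)  # true class is the lowest score

# A model that spreads probability evenly over 10 classes:
uniform_loss = sparse_categorical_crossentropy([0.1] * 10, 3)

print([round(p, 3) for p in probs])
print(round(loss_if_class_0, 3), round(loss_if_class_2, 3))
print(round(uniform_loss, 3))  # 2.303
```

The loss is small when the true class receives a high probability and large otherwise. The uniform case gives ln(10) ≈ 2.303, which is close to the loss of 2.3086 reported at the very start of the first epoch in the training log.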
Now that the model is compiled, we proceed to the training phase. To prevent overfitting and ensure the model doesn't train for too many epochs, we employ Early Stopping. This will stop the training process when the validation loss stops improving for a certain number of epochs.
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=0,
    patience=5,
    verbose=0,
    mode="auto",
    baseline=None,  # Set to the value of val_loss at the desired epoch
    restore_best_weights=False,
)
history = model.fit(train_ds, validation_data=validation_ds, epochs=500, callbacks=[early_stopping])
Epoch 1/500
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1728655961.819403 82 service.cc:145] XLA service 0x7c73c400ded0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1728655961.819481 82 service.cc:153] StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
3/261 ━━━━━━━━━━━━━━━━━━━━ 11s 43ms/step - accuracy: 0.0816 - loss: 2.3086
I0000 00:00:1728655969.761492 82 device_compiler.h:188] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
261/261 ━━━━━━━━━━━━━━━━━━━━ 35s 96ms/step - accuracy: 0.1964 - loss: 2.1369 - val_accuracy: 0.3878 - val_loss: 1.7645
Epoch 2/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.3765 - loss: 1.7563 - val_accuracy: 0.4224 - val_loss: 1.6145
Epoch 3/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 50ms/step - accuracy: 0.4598 - loss: 1.5672 - val_accuracy: 0.5103 - val_loss: 1.4529
Epoch 4/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 53ms/step - accuracy: 0.5365 - loss: 1.3625 - val_accuracy: 0.5579 - val_loss: 1.3062
Epoch 5/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.6096 - loss: 1.1708 - val_accuracy: 0.6468 - val_loss: 1.0728
Epoch 6/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.6581 - loss: 1.0325 - val_accuracy: 0.6444 - val_loss: 1.0699
Epoch 7/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.7060 - loss: 0.8909 - val_accuracy: 0.7088 - val_loss: 0.9128
Epoch 8/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.7196 - loss: 0.8401 - val_accuracy: 0.7299 - val_loss: 0.8351
Epoch 9/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.7680 - loss: 0.7192 - val_accuracy: 0.7400 - val_loss: 0.8307
Epoch 10/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.7833 - loss: 0.6559 - val_accuracy: 0.7564 - val_loss: 0.8046
Epoch 11/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.8006 - loss: 0.5920 - val_accuracy: 0.7684 - val_loss: 0.7828
Epoch 12/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8176 - loss: 0.5438 - val_accuracy: 0.7674 - val_loss: 0.7860
Epoch 13/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.8381 - loss: 0.4842 - val_accuracy: 0.7746 - val_loss: 0.7670
Epoch 14/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8524 - loss: 0.4322 - val_accuracy: 0.7987 - val_loss: 0.7146
Epoch 15/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8601 - loss: 0.4153 - val_accuracy: 0.8160 - val_loss: 0.6475
Epoch 16/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.8770 - loss: 0.3655 - val_accuracy: 0.8097 - val_loss: 0.6803
Epoch 17/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8883 - loss: 0.3430 - val_accuracy: 0.7847 - val_loss: 0.7986
Epoch 18/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.8857 - loss: 0.3348 - val_accuracy: 0.8169 - val_loss: 0.7037
Epoch 19/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8796 - loss: 0.3517 - val_accuracy: 0.8169 - val_loss: 0.7019
Epoch 20/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8951 - loss: 0.3125 - val_accuracy: 0.8318 - val_loss: 0.7073
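The log above ends at epoch 20 out of a possible 500. The following sketch replays the patience rule on the `val_loss` values copied from that log to show why. `early_stopping_epoch` is a hypothetical, simplified re-implementation of the stopping rule, not the Keras callback itself.

```python
def early_stopping_epoch(val_losses, patience, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-indexed."""
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement: remember it and reset the patience counter.
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# val_loss per epoch, copied from the training log above.
val_losses = [
    1.7645, 1.6145, 1.4529, 1.3062, 1.0728, 1.0699, 0.9128, 0.8351, 0.8307, 0.8046,
    0.7828, 0.7860, 0.7670, 0.7146, 0.6475, 0.6803, 0.7986, 0.7037, 0.7019, 0.7073,
]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=5)
print(stop_epoch, best_epoch)  # 20 15
```

The lowest validation loss (0.6475) occurs at epoch 15, and the five epochs that follow do not beat it, so training stops after epoch 20.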
EarlyStopping: This callback monitors the model's performance on the validation set and halts the training if the monitored metric (in this case, val_loss) doesn't improve for a set number of epochs.
monitor="val_loss": The validation loss is monitored to determine when to stop training.
min_delta=0: The minimum change in the monitored quantity to qualify as an improvement.
patience=5: The training will stop if no improvement in val_loss is observed for 5 consecutive epochs.
restore_best_weights=False: The model keeps the weights from the last epoch trained. If set to True, the model will restore the weights from the epoch with the best validation loss.
model.fit(): This function begins the training of the CNN model.
train_ds: The training dataset.
validation_data=validation_ds: The validation dataset is used to monitor performance during training.
epochs=500: The model is trained for up to 500 epochs, far more than it needs, but early stopping halts it earlier once no improvement is observed.
callbacks=[early_stopping]: The early stopping callback is passed to stop training when necessary.
Training continues until the validation loss plateaus, which limits overfitting and saves training time.
After training the model, it’s essential to visualize the performance by plotting the accuracy on both the training and validation datasets across epochs. This will give us insights into how well the model is learning and whether it is overfitting.
# Plotting the training and validation accuracy
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.show()
history.history['accuracy']: Contains the training accuracy values for each epoch.
history.history['val_accuracy']: Contains the validation accuracy values for each epoch.
plt.plot(): Used to plot both training and validation accuracy over the epochs.
plt.xlabel('Epoch'): Sets the label for the x-axis, which represents the number of epochs.
plt.ylabel('Accuracy'): Sets the label for the y-axis, which represents the accuracy as a fraction between 0 and 1.
plt.title('Training and Validation Accuracy'): Adds a title to the plot.
plt.legend(): Displays the legend to differentiate between training and validation accuracy.
This plot will help you visually assess how well the model generalizes. If there’s a significant gap between training and validation accuracy, it could indicate overfitting.
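The same gap can be read off numerically. The sketch below uses the accuracy values of the last six epochs copied from the training log earlier in this tutorial. `accuracy_gaps` is a hypothetical helper, and in a live session the two lists would come from `history.history` instead.

```python
def accuracy_gaps(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [round(t - v, 4) for t, v in zip(train_acc, val_acc)]

# Epochs 15 to 20, copied from the training log.
train_acc = [0.8601, 0.8770, 0.8883, 0.8857, 0.8796, 0.8951]
val_acc = [0.8160, 0.8097, 0.7847, 0.8169, 0.8169, 0.8318]

gaps = accuracy_gaps(train_acc, val_acc)
widest_gap = max(gaps)
print(gaps)
print(widest_gap)
```

Over these epochs the training accuracy stays roughly 4 to 10 percentage points above the validation accuracy, a moderate gap that is worth watching if training were to continue.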
After training the model, it's important to save it for future use, allowing us to load it later without needing to retrain. This can save a significant amount of time, especially for deep learning models.
model.save('paddy_model.h5')
model.save('paddy_model.h5'): This function saves the entire model, including the architecture, weights, and training configuration, to a file named paddy_model.h5. The .h5 extension indicates that the model is saved in the HDF5 format, which is commonly used for storing large amounts of numerical data.
Once saved, the model can be easily loaded in the future for inference or further training, using:
from tensorflow.keras.models import load_model
loaded_model = load_model('paddy_model.h5')
This step ensures that your hard work in training the model is preserved for future use.
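In this tutorial, we successfully built a Convolutional Neural Network (CNN) to classify paddy diseases using image data. We went through the following key steps: importing the libraries, loading and splitting the dataset, normalizing the images, counting the classes, defining and compiling the CNN, training it with early stopping, plotting the training and validation accuracy, and saving the model.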
By following these steps, you can develop a robust machine learning model to classify paddy diseases, which can aid farmers and agricultural experts in identifying and managing crop health issues effectively.
Feel free to adapt this tutorial for other types of image classification tasks, as the principles of CNNs can be applied across various domains.
Thank you for following along, and happy coding!