Paddy Disease Classification Using CNN

In this tutorial, we will learn how to create a deep learning model using a Convolutional Neural Network (CNN) to classify paddy diseases. Paddy (rice) is one of the most important crops worldwide, and diseases affecting it can lead to significant economic losses. Machine learning can help farmers detect diseases early by analyzing images of affected leaves.

We will build a CNN model that can classify different paddy leaf diseases using a dataset of images. The process involves:

Loading and preprocessing the dataset: We will prepare our dataset by resizing images, normalizing pixel values, and splitting the data into training and validation sets.
Building the CNN architecture: We'll create a CNN with multiple convolutional, pooling, and dense layers.
Training the model: The model will be trained on the preprocessed images.
Evaluating and testing: Finally, we will evaluate the model's performance using accuracy and loss metrics.

Let's dive in and build a paddy disease classification model step by step!

Step 1: Importing Libraries

We start by importing the necessary libraries for building and training our CNN model.

import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
import seaborn as sns 
import tensorflow as tf 
from tensorflow import keras 
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, BatchNormalization, Dropout

Explanation

Pandas (pd): Used for data manipulation and analysis.
NumPy (np): Provides support for large, multi-dimensional arrays and matrices.
Matplotlib (plt): A plotting library to visualize data and model results.
Seaborn (sns): Built on Matplotlib, Seaborn provides a high-level interface for drawing attractive statistical graphics.
TensorFlow (tf): A deep learning library, providing functions to build and train machine learning models.
Keras (keras): A high-level neural networks API, included within TensorFlow, that simplifies the process of building models.
Sequential: A linear stack of layers in Keras for creating neural networks step by step.
Dense: A fully connected layer, used here for the classification part of the network.
Conv2D: Convolutional layers used for feature extraction from images.
MaxPooling2D: Used to reduce the spatial dimensions of the output after convolution.
Flatten: Converts the multi-dimensional feature maps into a one-dimensional vector.
BatchNormalization: Normalizes the outputs of the previous layer to stabilize and speed up training (imported here, but not used in the model below).
Dropout: A regularization technique to prevent overfitting by randomly dropping units during training.

Next, we'll set up the dataset preprocessing and architecture of the CNN model.

Step 2: Loading and Preprocessing the Dataset

We will load the dataset from the given directory and preprocess it by resizing images and splitting it into training and validation sets.

seed = 40
train_ds = keras.utils.image_dataset_from_directory(
    directory='/kaggle/input/paddy-disease-dataset/paddy-disease-classification/train_images',
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(256, 256),
    validation_split=0.2,
    subset="training",
    seed=seed,  # Must match the validation call so the two subsets do not overlap
)

validation_ds = keras.utils.image_dataset_from_directory(
    directory='/kaggle/input/paddy-disease-dataset/paddy-disease-classification/train_images',
    labels="inferred",
    label_mode="int",
    class_names=None,
    color_mode="rgb",
    batch_size=32,
    image_size=(256, 256),
    validation_split=0.2,
    subset="validation",
    seed=seed,  # Must match the training call so the two subsets do not overlap
)

Found 10407 files belonging to 10 classes.
Using 8326 files for training.
Found 10407 files belonging to 10 classes.
Using 2081 files for validation.

Explanation

image_dataset_from_directory: This function helps load images directly from the directory and automatically labels them based on folder structure.
directory: Path to the folder containing the paddy disease images.
labels="inferred": Automatically infers labels from folder names.
label_mode="int": Labels are returned as integer-encoded values.
image_size=(256, 256): Each image is resized to 256x256 pixels to maintain uniformity.
validation_split=0.2: 20% of the data is reserved for validation, while 80% is used for training.
subset: Specifies whether this dataset is used for training or validation.
seed: Setting a random seed makes the dataset split reproducible. The same seed must be passed to both calls; otherwise the training and validation subsets could overlap.
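
The file counts reported above follow from simple arithmetic on the split fraction. The sketch below reproduces them. It assumes that the validation count is the product of validation_split and the total, truncated to an integer (an assumption that matches the reported 2081 and 8326), and that the final partial batch is kept. The variable names are illustrative.

```python
import math

total_files = 10407        # reported by image_dataset_from_directory
validation_split = 0.2
batch_size = 32

# Assumption: the validation count is the truncated product.
num_val = int(validation_split * total_files)
num_train = total_files - num_val

# Batches per epoch, keeping the final partial batch.
train_batches = math.ceil(num_train / batch_size)
val_batches = math.ceil(num_val / batch_size)

print(num_train, num_val)          # 8326 2081
print(train_batches, val_batches)  # 261 66
```

The 261 training batches correspond to the 261/261 step counter that appears in the training log of Step 7.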

Step 3: Normalizing the Dataset

Before training the model, we need to normalize the pixel values of the images. Neural networks tend to perform better when the data is normalized, i.e., pixel values are scaled between 0 and 1 instead of their original range of 0 to 255.

# Normalizing the data 
def process(image,label):
    image = tf.cast(image/255,tf.float32)
    return image,label
train_ds = train_ds.map(process)
validation_ds = validation_ds.map(process)

Explanation

process(image, label): This function normalizes each image by dividing the pixel values by 255 (scaling them from 0-255 to 0-1).
tf.cast(image/255, tf.float32): Converts the pixel values to float32 data type for compatibility with TensorFlow models.
train_ds.map(process): Applies the normalization function to each image in the training dataset.
validation_ds.map(process): Applies the normalization function to each image in the validation dataset.

Now that our images are normalized, we can count the classes and then build the CNN model.
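
To make the scaling in this step concrete, here is a plain-Python sketch of the same arithmetic on a tiny 2x2 grayscale patch. The normalize helper and the sample pixel values are hypothetical and are not part of the pipeline above, which does the same division on whole image batches.

```python
def normalize(pixels):
    """Scale 8-bit pixel values (0-255) to floats in [0, 1]."""
    return [[value / 255 for value in row] for row in pixels]

patch = [[0, 51], [204, 255]]   # hypothetical pixel values
scaled = normalize(patch)
print(scaled)  # [[0.0, 0.2], [0.8, 1.0]]
```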

Step 4: Counting the Number of Classes

We need to determine how many disease categories (classes) are present in the dataset. This will help us define the output layer of our CNN model correctly.

import os

# Path to the training images (the same directory used to build the datasets above)
dataset_path = '/kaggle/input/paddy-disease-dataset/paddy-disease-classification/train_images'

# Count the number of subdirectories (classes)
num_classes = len([name for name in os.listdir(dataset_path) if os.path.isdir(os.path.join(dataset_path, name))])

print("Number of classes:", num_classes)

Number of classes: 10

Explanation

os.listdir(dataset_path): Lists all the directories and files in the given dataset path.
os.path.isdir(): Checks if the listed item is a directory (which corresponds to a class in our case).
num_classes: This variable stores the total number of subdirectories, each representing a disease class.

The number of classes is an important parameter when defining the output layer of the CNN model. Next, we will use this to finalize the architecture of our model.
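
As a side note on the counting step, the same directory scan can also return the class names. The sketch below runs against a throwaway directory tree so it can be tried without the dataset. The folder names are hypothetical, and list_class_dirs is an illustrative helper. The names are sorted because, as far as the Keras documentation describes, image_dataset_from_directory with labels="inferred" and class_names=None assigns integer labels in alphanumeric order of the folder names.

```python
import os
import tempfile

def list_class_dirs(dataset_path):
    """Return the sorted names of the subdirectories of dataset_path."""
    return sorted(
        name for name in os.listdir(dataset_path)
        if os.path.isdir(os.path.join(dataset_path, name))
    )

# Build a throwaway directory tree with hypothetical class names.
with tempfile.TemporaryDirectory() as root:
    for name in ["class_b", "class_a", "class_c"]:
        os.mkdir(os.path.join(root, name))
    # A stray file must not be counted as a class.
    open(os.path.join(root, "notes.txt"), "w").close()

    class_names = list_class_dirs(root)

print(class_names)       # ['class_a', 'class_b', 'class_c']
print(len(class_names))  # 3
```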

Step 5: Building the CNN Model

We will now define the architecture of the Convolutional Neural Network (CNN) for paddy disease classification. The model consists of several convolutional layers for feature extraction, followed by pooling layers for downsampling, and dense layers for classification.

# create CNN model

model = Sequential()


model.add(Conv2D(128,kernel_size=(3,3),padding='valid',strides = 1,activation='relu',input_shape=(256,256,3)))
model.add(MaxPooling2D(pool_size=(3,3),padding='valid'))

model.add(Conv2D(64,kernel_size=(3,3),padding='valid',strides = 1,activation='relu'))
model.add(MaxPooling2D(pool_size=(3,3),padding='valid'))

model.add(Conv2D(32,kernel_size=(3,3),padding='same',strides = 1,activation='relu'))
model.add(MaxPooling2D(pool_size=(3,3),padding='valid'))

model.add(Conv2D(16,kernel_size=(3,3),padding='same',strides = 1,activation='relu'))
model.add(MaxPooling2D(pool_size=(3,3),padding='valid'))

model.add(Flatten())

model.add(Dense(128,activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(64,activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(num_classes,activation='softmax'))

Explanation of the Model Layers:

Conv2D(128, kernel_size=(3,3)): This is a 2D convolution layer with 128 filters and a kernel size of 3x3. It extracts features from the input images. The activation function used is ReLU.
MaxPooling2D(pool_size=(3,3)): The pooling layer reduces the spatial dimensions of the output, which cuts computation and the number of parameters in later layers. Because the stride defaults to the pool size, each 3x3 pooling step shrinks the height and width roughly threefold.
Additional Conv2D and MaxPooling2D layers: The model adds more convolution and pooling layers with fewer filters to progressively learn complex features while reducing dimensions.
Flatten(): This layer flattens the 3D feature maps (3x3x16 at this point) into a 1D vector of 144 values, preparing it for the dense layers.
Dense(128, activation='relu'): A fully connected layer with 128 neurons. ReLU activation is used to introduce non-linearity.
Dropout(0.1): Dropout is applied to prevent overfitting. It randomly drops 10% of the neurons during training.
Dense(64, activation='relu'): Another fully connected layer with 64 neurons.
Dense(num_classes, activation='softmax'): The output layer with a number of neurons equal to the number of classes. Softmax is used for multi-class classification.

This architecture is designed to extract rich feature maps from the paddy disease images and classify them into the appropriate disease categories.
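
The spatial size of the feature maps after each layer can be worked out by hand. The sketch below applies the standard output-size formulas for 'valid' and 'same' padding to the layer stack defined above. The helper names conv_out and pool_out are illustrative, and the pooling stride is assumed to default to the pool size.

```python
import math

def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1

def pool_out(size, pool, stride=None):
    """Output size of 'valid' max pooling; stride defaults to the pool size."""
    stride = stride or pool
    return (size - pool) // stride + 1

size = 256
trace = [size]
# Padding modes of the four Conv2D layers, in order.
for padding in ["valid", "valid", "same", "same"]:
    size = conv_out(size, kernel=3, padding=padding)
    trace.append(size)
    size = pool_out(size, pool=3)
    trace.append(size)

flattened = size * size * 16   # 16 filters in the last convolution
print(trace)      # [256, 254, 84, 82, 27, 27, 9, 9, 3]
print(flattened)  # 144
```

These values agree with the output shapes that model.summary() prints in Step 6.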

Step 6: Compiling the CNN Model

Once the architecture of the CNN model is defined, the next step is to compile it. During compilation, we specify the optimizer, loss function, and evaluation metrics.

model.summary()
# The string 'adam' selects the Adam optimizer with its default settings
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Model: "sequential"

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 254, 254, 128)  │         3,584 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 84, 84, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 82, 82, 64)     │        73,792 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 27, 27, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 27, 27, 32)     │        18,464 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D)  │ (None, 9, 9, 32)       │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_3 (Conv2D)               │ (None, 9, 9, 16)       │         4,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_3 (MaxPooling2D)  │ (None, 3, 3, 16)       │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 144)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 128)            │        18,560 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 128)            │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 64)             │         8,256 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 10)             │           650 │
└─────────────────────────────────┴────────────────────────┴───────────────┘

 Total params: 127,930 (499.73 KB)

 Trainable params: 127,930 (499.73 KB)

 Non-trainable params: 0 (0.00 B)
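
The parameter counts in the summary can be reproduced by hand: a convolution layer has one weight per kernel position and input channel plus one bias, for each filter, and a dense layer has one weight per input plus one bias, for each unit. The sketch below uses illustrative helper names and assumes 4 bytes (float32) per parameter for the size estimate.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

counts = [
    conv2d_params(3, 3, 3, 128),    # conv2d
    conv2d_params(3, 3, 128, 64),   # conv2d_1
    conv2d_params(3, 3, 64, 32),    # conv2d_2
    conv2d_params(3, 3, 32, 16),    # conv2d_3
    dense_params(144, 128),         # dense
    dense_params(128, 64),          # dense_1
    dense_params(64, 10),           # dense_2
]
total = sum(counts)
size_kb = round(total * 4 / 1024, 2)   # assumes float32 weights

print(counts)   # [3584, 73792, 18464, 4624, 18560, 8256, 650]
print(total)    # 127930
print(size_kb)  # 499.73
```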

Explanation:

model.summary(): Displays a summary of the model architecture, showing the layers, output shapes, and the total number of trainable parameters.
optimizer='adam': The Adam optimizer is used for training the model. Adam is a widely used optimization algorithm that adapts the step size for each parameter using running averages of past gradients (momentum) and of their squares.
loss='sparse_categorical_crossentropy': This loss function is used for multi-class classification when the labels are integers (not one-hot encoded). It measures the performance of the model based on the difference between predicted and actual class labels.
metrics=['accuracy']: Accuracy is used as the evaluation metric to measure the percentage of correctly classified images during training and validation.
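
To illustrate what the loss measures, here is a minimal plain-Python sketch of softmax followed by sparse categorical crossentropy for a single example. It is a simplified illustration of the math, not the Keras implementation, and the function names are chosen here for readability.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]   # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label):
    """Negative log-probability assigned to the true integer label."""
    return -math.log(probs[label])

# A hypothetical untrained 10-class model that scores every class equally.
uniform = softmax([0.0] * 10)
baseline_loss = sparse_categorical_crossentropy(uniform, label=3)
print(round(baseline_loss, 4))  # 2.3026
```

A loss of about 2.30 (the natural log of 10) is therefore what random guessing over 10 classes produces, which gives a useful reference point when reading the loss values during training.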

Now that the model is compiled, we can move on to the training phase.

Step 7: Training the CNN Model with Early Stopping

To limit overfitting and avoid training for more epochs than necessary, we use early stopping. This stops the training process once the validation loss has not improved for a set number of epochs.

early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=0,
    patience=5,
    verbose=0,
    mode="auto",
    baseline=None,  # No baseline: training is not required to beat a preset val_loss
    restore_best_weights=False,
)
history = model.fit(train_ds, validation_data=validation_ds, epochs=500,callbacks=[early_stopping] )

Epoch 1/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 35s 96ms/step - accuracy: 0.1964 - loss: 2.1369 - val_accuracy: 0.3878 - val_loss: 1.7645
Epoch 2/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.3765 - loss: 1.7563 - val_accuracy: 0.4224 - val_loss: 1.6145
Epoch 3/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 50ms/step - accuracy: 0.4598 - loss: 1.5672 - val_accuracy: 0.5103 - val_loss: 1.4529
Epoch 4/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 53ms/step - accuracy: 0.5365 - loss: 1.3625 - val_accuracy: 0.5579 - val_loss: 1.3062
Epoch 5/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.6096 - loss: 1.1708 - val_accuracy: 0.6468 - val_loss: 1.0728
Epoch 6/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.6581 - loss: 1.0325 - val_accuracy: 0.6444 - val_loss: 1.0699
Epoch 7/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.7060 - loss: 0.8909 - val_accuracy: 0.7088 - val_loss: 0.9128
Epoch 8/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.7196 - loss: 0.8401 - val_accuracy: 0.7299 - val_loss: 0.8351
Epoch 9/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.7680 - loss: 0.7192 - val_accuracy: 0.7400 - val_loss: 0.8307
Epoch 10/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.7833 - loss: 0.6559 - val_accuracy: 0.7564 - val_loss: 0.8046
Epoch 11/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.8006 - loss: 0.5920 - val_accuracy: 0.7684 - val_loss: 0.7828
Epoch 12/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8176 - loss: 0.5438 - val_accuracy: 0.7674 - val_loss: 0.7860
Epoch 13/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.8381 - loss: 0.4842 - val_accuracy: 0.7746 - val_loss: 0.7670
Epoch 14/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8524 - loss: 0.4322 - val_accuracy: 0.7987 - val_loss: 0.7146
Epoch 15/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8601 - loss: 0.4153 - val_accuracy: 0.8160 - val_loss: 0.6475
Epoch 16/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 52ms/step - accuracy: 0.8770 - loss: 0.3655 - val_accuracy: 0.8097 - val_loss: 0.6803
Epoch 17/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8883 - loss: 0.3430 - val_accuracy: 0.7847 - val_loss: 0.7986
Epoch 18/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 14s 54ms/step - accuracy: 0.8857 - loss: 0.3348 - val_accuracy: 0.8169 - val_loss: 0.7037
Epoch 19/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8796 - loss: 0.3517 - val_accuracy: 0.8169 - val_loss: 0.7019
Epoch 20/500
261/261 ━━━━━━━━━━━━━━━━━━━━ 13s 51ms/step - accuracy: 0.8951 - loss: 0.3125 - val_accuracy: 0.8318 - val_loss: 0.7073

Explanation:

EarlyStopping: This callback monitors the model's performance on the validation set and halts the training if the monitored metric (in this case, val_loss) doesn't improve for a set number of epochs.
monitor="val_loss": The validation loss is monitored to determine when to stop training.
min_delta=0: The minimum change in the monitored quantity to qualify as an improvement.
patience=5: The training will stop if no improvement in val_loss is observed for 5 consecutive epochs.
restore_best_weights=False: The model keeps the weights from the last epoch that ran. Setting this to True would instead restore the weights from the epoch with the lowest validation loss (epoch 15 in the run above, rather than the final epoch 20).
model.fit(): This function begins the training of the CNN model.
train_ds: The training dataset.
validation_data=validation_ds: The validation dataset is used to monitor performance during training.
epochs=500: An intentionally generous upper bound. Early stopping ends training much sooner; in the run above it stopped after epoch 20.
callbacks=[early_stopping]: The early stopping callback is passed to stop training when necessary.

Training continues until the validation loss has failed to improve for 5 consecutive epochs. This limits overfitting and saves training time, although it does not rule out overfitting entirely.
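
The stopping point in the log can be checked with a small simulation of the patience rule. The function below is a simplified re-implementation written for illustration, not the Keras source, and it uses the val_loss values copied from the training log above.

```python
def simulate_early_stopping(val_losses, patience=5, min_delta=0.0):
    """Return (stopped_epoch, best_epoch), both 1-based.

    Simplified illustration of patience-based stopping; not Keras's code.
    """
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Improvement: remember it and reset the patience counter.
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# val_loss values copied from the training log above.
val_losses = [
    1.7645, 1.6145, 1.4529, 1.3062, 1.0728, 1.0699, 0.9128, 0.8351,
    0.8307, 0.8046, 0.7828, 0.7860, 0.7670, 0.7146, 0.6475, 0.6803,
    0.7986, 0.7037, 0.7019, 0.7073,
]
stopped_epoch, best_epoch = simulate_early_stopping(val_losses)
print(stopped_epoch, best_epoch)  # 20 15
```

The best validation loss occurs at epoch 15, and five epochs without improvement end training at epoch 20. With restore_best_weights=False, the model therefore ends up with the epoch 20 weights.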

Step 8: Plotting Training and Validation Accuracy

After training the model, it’s essential to visualize the performance by plotting the accuracy on both the training and validation datasets across epochs. This will give us insights into how well the model is learning and whether it is overfitting.

# Plotting the training and validation accuracy
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()
plt.show()

Explanation:

history.history['accuracy']: Contains the training accuracy values for each epoch.
history.history['val_accuracy']: Contains the validation accuracy values for each epoch.
plt.plot(): Used to plot both training and validation accuracy over the epochs.
plt.xlabel('Epoch'): Sets the label for the x-axis, which represents the number of epochs.
plt.ylabel('Accuracy'): Sets the label for the y-axis, which represents accuracy as a fraction between 0 and 1.
plt.title('Training and Validation Accuracy'): Adds a title to the plot.
plt.legend(): Displays the legend to differentiate between training and validation accuracy.

This plot will help you visually assess how well the model generalizes. If there’s a significant gap between training and validation accuracy, it could indicate overfitting.
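
The gap can also be read off numerically. The short sketch below uses accuracy values copied from epochs 5, 10, 15 and 20 of the training log in Step 7. The choice of those four epochs is arbitrary and only meant to show the trend.

```python
# Accuracy values copied from epochs 5, 10, 15 and 20 of the training log.
epochs = [5, 10, 15, 20]
train_acc = [0.6096, 0.7833, 0.8601, 0.8951]
val_acc = [0.6468, 0.7564, 0.8160, 0.8318]

# Positive gap: the model does better on training data than on validation data.
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]
for epoch, gap in zip(epochs, gaps):
    print(f"epoch {epoch:2d}: train - val = {gap:+.4f}")
```

In this run the gap grows from slightly negative at epoch 5 to roughly six percentage points at epoch 20, which suggests mild overfitting in the later epochs.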

Step 9: Saving the Trained Model

After training the model, it's important to save it for future use, allowing us to load it later without needing to retrain. This can save a significant amount of time, especially for deep learning models.

model.save('paddy_model.h5')

Explanation:

model.save('paddy_model.h5'): This function saves the entire model, including the architecture, weights, and training configuration, to a file named paddy_model.h5. The .h5 extension indicates that the model is saved in the HDF5 format, which is commonly used for storing large amounts of numerical data. Recent Keras versions treat HDF5 as a legacy format and recommend the native .keras format instead.

Once saved, the model can be easily loaded in the future for inference or further training, using:

from tensorflow.keras.models import load_model
loaded_model = load_model('paddy_model.h5')

This step ensures that your hard work in training the model is preserved for future use.

Conclusion

In this tutorial, we successfully built a Convolutional Neural Network (CNN) to classify paddy diseases using image data. We went through the following key steps:

Data Preparation: We loaded and normalized the paddy disease dataset, ensuring that the images were correctly processed for the model.
Model Architecture: We defined a CNN model with multiple convolutional and pooling layers, followed by fully connected layers for classification. This architecture enables the model to learn and extract important features from the input images effectively.
Model Compilation: We compiled the model using the Adam optimizer and sparse categorical crossentropy loss function, setting accuracy as our evaluation metric.
Training with Early Stopping: The model was trained with early stopping to prevent overfitting, monitoring the validation loss throughout the training process.
Evaluation: We visualized the training and validation accuracy to assess the model's performance and ensure it was learning effectively.
Model Saving: Finally, we saved the trained model for future use, making it easy to load and utilize without retraining.

By following these steps, you can develop a working baseline model to classify paddy diseases (about 83% validation accuracy in the run shown above), which can aid farmers and agricultural experts in identifying and managing crop health issues.

Feel free to adapt this tutorial for other types of image classification tasks, as the principles of CNNs can be applied across various domains.

Thank you for following along, and happy coding!