Difference between Loss, Accuracy, Validation loss, Validation accuracy in Keras

Upasana | August 11, 2022 | 3 min read | 8,750 views

What is model validation and its importance?

Once we have built a model, we would like to validate it on data it has not been trained on. Usually we are constrained by the amount of accurate data available for training. But validating the model is still necessary, so that we can rely on the model based on its evaluation on a validation dataset.

We evaluate the trained model on a validation dataset before testing it on the test dataset.

There are two ways of doing that:

1. Taking the validation dataset from the training dataset.

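This approach is widely used. A related idea appears in the well-known Random Forest algorithm, where each tree can be evaluated on the out-of-bag samples that were left out of its bootstrap sample.

We divide the training dataset into two parts with some x:y ratio. In Keras this split is made once, before training starts: the validation part is taken from the end of the data, before any shuffling, and stays the same for every epoch. Only the remaining training part is reshuffled at each epoch, so the model is never trained on the validation samples. If the data is ordered (for example, sorted by class), shuffle it yourself before calling fit, otherwise the validation part will not be representative.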

Now, let's see how this is done in Keras.

To use this approach, pass validation_split (here 0.3) and shuffle (here True) as arguments to fit.

validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling.
shuffle: Boolean (whether to shuffle the training data before each epoch) or str (for 'batch').

When we pass validation_split to fit while fitting a deep learning model, Keras sets aside that fraction of the data once, before training, and uses it as the validation data for every epoch. Since we are using shuffle as well, the remaining training data is reshuffled before each epoch; the validation data is not part of that shuffle. Keras trains the model on the training data and validates it on the validation data at the end of each epoch by computing its loss and accuracy. This way, we can evaluate the model while it is training.
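As a rough illustration of the arguments described above, here is a minimal sketch. The toy data, the layer sizes and the helper names (split_like_validation_split, fit_with_validation_split) are assumptions made for this example and do not come from the original article. The first helper only imitates, in plain Python, the documented behaviour that the validation samples are the last ones in the data, taken before shuffling; the exact rounding used by Keras is assumed. The second helper needs TensorFlow installed.

```python
import math


def split_like_validation_split(samples, validation_split):
    """Imitate the documented behaviour: the LAST fraction of the data,
    taken before any shuffling, becomes the validation set.
    The floor-based rounding is an assumption made for illustration."""
    split_at = math.floor(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


def fit_with_validation_split(epochs=5):
    """Train a tiny model with validation_split=0.3 and shuffle=True.
    Requires TensorFlow; the toy data and layer sizes are made up."""
    import numpy as np
    from tensorflow import keras

    # Hypothetical toy data: 1000 samples, 20 features, binary label.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1000, 20)).astype("float32")
    y = (x.sum(axis=1) > 0).astype("float32")

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # The last 30% of x and y is held out for validation; the other 70%
    # is reshuffled before each epoch.
    history = model.fit(x, y, epochs=epochs, batch_size=32,
                        validation_split=0.3, shuffle=True, verbose=0)

    # One value per epoch for loss, accuracy, val_loss and val_accuracy
    # (older Keras versions name the last two metrics acc and val_acc).
    return history.history
```

Calling fit_with_validation_split() returns a dictionary whose lists can be compared epoch by epoch to see how training and validation metrics move relative to each other.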

2. Keeping a separate validation set when splitting the main dataset.

This approach applies when we split the dataset into three separate datasets: training, validation and test.

The validation set is then passed to fit through the validation_data argument, as (x_val, y_val).
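A minimal sketch of this second approach might look as follows. The helper names (three_way_split, fit_with_validation_data), the 60/20/20 ratio, the toy data and the model are all assumptions chosen for illustration; only the validation_data=(x_val, y_val) argument comes from the text above. The second helper needs TensorFlow installed.

```python
import random


def three_way_split(n_samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Return shuffled, non-overlapping index lists for train/val/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_fraction)
    n_val = int(n_samples * val_fraction)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


def fit_with_validation_data(epochs=5):
    """Train on the training part, validate on a separate validation part,
    and only then score the untouched test part. Requires TensorFlow."""
    import numpy as np
    from tensorflow import keras

    # Hypothetical toy data: 1000 samples, 20 features, binary label.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(1000, 20)).astype("float32")
    y = (x.sum(axis=1) > 0).astype("float32")

    train_idx, val_idx, test_idx = three_way_split(len(x))
    x_train, y_train = x[train_idx], y[train_idx]
    x_val, y_val = x[val_idx], y[val_idx]
    x_test, y_test = x[test_idx], y[test_idx]

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Validation metrics are computed on (x_val, y_val) after each epoch.
    history = model.fit(x_train, y_train, epochs=epochs, batch_size=32,
                        validation_data=(x_val, y_val), verbose=0)

    # The test set is used once, after training has finished.
    test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
    return history.history, test_loss, test_accuracy
```

Keeping the split in its own function makes it easy to check that the three parts do not overlap before any training is done.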

Difference between accuracy, loss for training and validation while training (loss vs accuracy in keras)

When we train a model in Keras, the accuracy and loss on the validation data can behave differently from case to case. Usually, as the number of epochs increases, the training loss should go down and the training accuracy should go up.

But with val_loss (Keras validation loss) and val_acc (Keras validation accuracy), several cases are possible:

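val_loss starts increasing, val_acc starts decreasing. This means the model is memorizing the training data rather than learning patterns that generalize, i.e. it is overfitting.
val_loss starts increasing, val_acc also increases. This could be a case of overfitting, or of the predicted probabilities becoming more extreme when softmax is used in the output layer: accuracy only checks whether the most probable class is correct, while loss also penalizes how confident the wrong predictions are.
val_loss starts decreasing, val_acc starts increasing. This is fine, as it means the model is learning and generalizing to unseen data.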
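The second case above can look surprising, so here is a small worked sketch in plain Python. The labels and predicted probabilities are invented for illustration, and binary cross-entropy with a 0.5 threshold is assumed as the loss and accuracy rule.

```python
import math


def binary_cross_entropy(y_true, p_pred):
    """Mean binary cross-entropy; p_pred is the predicted probability of class 1."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for y, p in zip(y_true, p_pred)]
    return sum(losses) / len(losses)


def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = [int(p >= threshold) == y for y, p in zip(y_true, p_pred)]
    return sum(correct) / len(correct)


# Invented validation labels and predictions at two different epochs.
y_val = [1, 0, 1, 0]
early_epoch = [0.6, 0.4, 0.4, 0.6]    # mildly confident, two samples wrong
later_epoch = [0.9, 0.1, 0.9, 0.99]   # three right, one confidently wrong

for name, probs in [("early", early_epoch), ("later", later_epoch)]:
    print(name,
          "val_loss=%.3f" % binary_cross_entropy(y_val, probs),
          "val_acc=%.2f" % accuracy(y_val, probs))
```

In the later epoch more samples are classified correctly, so accuracy goes up, yet the mean loss is also higher because a single very confident wrong prediction contributes a large penalty.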

That’s all for now. Happy reading.

Here is a similar article worth having a look: https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras
