Difference between Loss, Accuracy, Validation loss, Validation accuracy in Keras
Carvia Tech | December 07, 2019 | 3 min read | 7,612 views
What is model validation and its importance?
Once we have built a model, we want to validate it on data that it has not seen during training. Usually we are constrained by the amount of accurately labeled data we have for training. Even so, validating the model is necessary, so that we can rely on it based on its evaluation on a validation dataset.
We evaluate the trained model on the validation dataset before testing it on the test dataset.
There are two ways of doing that:
1. Taking the validation dataset from the training dataset.
This approach is widely used. A similar idea appears in the Random Forest algorithm, where the samples left out of each tree's bootstrap sample (the out-of-bag samples) are used to estimate the error.
We divide the training dataset into two datasets with some x:y ratio, for example 70:30. In Keras, the split itself is static: it is made once, before training starts, and the validation part stays the same for every epoch. What changes from epoch to epoch is the order of the training part, which is reshuffled when shuffling is enabled. This way, we get better insight into the model's performance on data it was not trained on.
Now, let's see how this can be done in Keras.
In the call to model.fit, we specify the arguments validation_split as 0.3 and shuffle as True.
validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling.
shuffle: Boolean (whether to shuffle the training data before each epoch) or str (for 'batch').
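The two arguments described above can be passed to fit as in the following minimal sketch. The toy data, the layer sizes, and the `from tensorflow import keras` import are assumptions made for illustration; they are not taken from the original example.

```python
import numpy as np
from tensorflow import keras  # assumption: Keras as bundled with TensorFlow 2.x

# Hypothetical toy data: 100 samples, 4 features, 3 classes
rng = np.random.default_rng(0)
x_train = rng.normal(size=(100, 4)).astype("float32")
y_train = rng.integers(0, 3, size=100)

# Small illustrative classifier with a softmax output layer
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

history = model.fit(
    x_train, y_train,
    epochs=3,
    batch_size=16,
    validation_split=0.3,  # hold out the last 30% of the samples for validation
    shuffle=True,          # reshuffle the remaining training samples each epoch
    verbose=0,
)

# One value per epoch is recorded for each training and validation metric
print(sorted(history.history.keys()))
```

Recent Keras versions record the accuracy under the keys accuracy and val_accuracy, while older versions use acc and val_acc, so the printed key names depend on the installed version.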
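When we pass validation_split as a fit parameter while fitting a deep learning model, Keras splits the data into two parts, training data and validation data, once before training begins. The validation part is taken from the last samples of the data, before any shuffling, so it is the same for every epoch. Since we are using shuffle as well, the training data is reshuffled before each epoch. In each epoch, the model is trained on the training data and then validated on the validation data by checking its loss and accuracy. This way, we can evaluate the model while it is training. If the data is ordered (for example, sorted by class), shuffle it yourself before calling fit so that the validation part is representative.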
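To make the documented behaviour of validation_split concrete (the validation data is taken from the last samples, before shuffling), the sketch below mimics it with NumPy. The helper name split_like_validation_split is hypothetical, and this is an illustration of the described behaviour rather than Keras's own code.

```python
import numpy as np

def split_like_validation_split(x, y, validation_split):
    """Hypothetical helper: keep the first part for training, the last part for validation."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Ten samples whose values equal their position, so the split is easy to read
x = np.arange(10).reshape(10, 1)
y = np.arange(10)

(x_tr, y_tr), (x_val, y_val) = split_like_validation_split(x, y, 0.3)

# Only the training part is reshuffled from epoch to epoch
rng = np.random.default_rng(0)
for epoch in range(2):
    order = rng.permutation(len(x_tr))
    print("epoch", epoch, "train:", x_tr[order].ravel(), "val:", x_val.ravel())
```

In this sketch the validation samples are always 7, 8 and 9, while the order of the training samples changes in every epoch.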
2. Keeping a separate validation set while splitting the main dataset.
In this approach, we split the dataset into three different datasets: training, validation, and test.
In the call to model.fit, we specify the argument validation_data as (x_val, y_val).
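A minimal sketch of this second approach follows. The 60/20/20 ratio, the toy data, and the model are assumptions chosen for illustration; only the validation_data argument and the names x_val and y_val come from the text above.

```python
import numpy as np
from tensorflow import keras  # assumption: Keras as bundled with TensorFlow 2.x

# Hypothetical toy data: 200 samples, 4 features, 3 classes
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4)).astype("float32")
y = rng.integers(0, 3, size=200)

# Shuffle once, then cut into 60% training, 20% validation, 20% test
idx = rng.permutation(len(x))
x, y = x[idx], y[idx]
x_train, y_train = x[:120], y[:120]
x_val, y_val = x[120:160], y[120:160]
x_test, y_test = x[160:], y[160:]

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# The validation set is passed explicitly instead of being cut from the training data
history = model.fit(
    x_train, y_train,
    epochs=3,
    batch_size=16,
    validation_data=(x_val, y_val),
    verbose=0,
)

# The test set is touched only once, after training has finished
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print("test loss:", test_loss, "test accuracy:", test_acc)
```

Keeping the test set out of fit entirely means the final evaluation is not influenced by any decision made while watching the validation metrics.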
Difference between accuracy, loss for training and validation while training (loss vs accuracy in keras)
When we train a model in Keras, the accuracy and loss on the validation data can behave differently from case to case. Usually, with every epoch, the training loss should go down and the training accuracy should go up.
But with val_loss (Keras validation loss) and val_acc (Keras validation accuracy), several cases are possible:
val_loss starts increasing, val_acc starts decreasing. This means the model is memorizing the training data rather than learning to generalize, that is, it is overfitting.
val_loss starts increasing, val_acc also increases. This could be a case of overfitting, or of diverse probability values when softmax is used in the output layer: more samples are classified correctly, but some predictions are wrong with high confidence, and those raise the loss.
val_loss starts decreasing, val_acc starts increasing. This is fine: it means the model is learning and working well.
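The second case, where the loss and the accuracy rise together, can look surprising. The following pure-Python sketch uses made-up probabilities for a binary problem (all numbers are hypothetical) to show how one very confident wrong prediction can raise the mean cross-entropy even though more samples are classified correctly.

```python
import math

def mean_cross_entropy(p_true):
    """Mean negative log of the probability assigned to the correct class."""
    return -sum(math.log(p) for p in p_true) / len(p_true)

def accuracy(p_true, threshold=0.5):
    """A binary prediction is correct when the correct class gets more than the threshold."""
    return sum(p > threshold for p in p_true) / len(p_true)

# Probability the model assigns to the correct class of four validation samples
epoch_a = [0.9, 0.9, 0.4, 0.4]       # two confident hits, two narrow misses
epoch_b = [0.55, 0.55, 0.55, 0.01]   # three narrow hits, one very confident miss

loss_a, acc_a = mean_cross_entropy(epoch_a), accuracy(epoch_a)
loss_b, acc_b = mean_cross_entropy(epoch_b), accuracy(epoch_b)

print(f"epoch A: loss={loss_a:.3f} acc={acc_a:.2f}")
print(f"epoch B: loss={loss_b:.3f} acc={acc_b:.2f}")
```

Accuracy only counts which side of the threshold each prediction falls on, while the loss also measures how confident each prediction is, so the two metrics can move in different directions.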
That’s all for now. Happy Reading..