In a job interview you may be asked when a model is underfitting or overfitting and how to actually avoid it. Here are answers you can give when you get these questions:
First, what actually is underfitting? It means the model you created cannot handle the data, or in other words: the model is not able to learn the (complex) patterns in your data. A simple way to detect it is to look at the score on the training data itself: if the model cannot even fit the data it has already seen, it is too simple.
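A quick way to turn this into a check is to compare the training score against the test score. The thresholds below are illustrative assumptions, not fixed rules:

```python
def diagnose(train_score, test_score, floor=0.8, gap=0.1):
    """Rough heuristic (thresholds are illustrative, not universal):
    a low training score suggests underfitting, a large train/test
    gap suggests overfitting."""
    if train_score < floor:
        return "underfitting"
    if train_score - test_score > gap:
        return "overfitting"
    return "ok"

print(diagnose(0.62, 0.60))  # low train score  -> "underfitting"
print(diagnose(0.99, 0.75))  # large gap        -> "overfitting"
print(diagnose(0.92, 0.90))  # both fine        -> "ok"
```

The useful part in an interview is the reasoning, not the exact numbers: underfitting shows up on the training data, overfitting shows up in the gap between training and test performance.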
Overfitting is the other extreme: you have a model that is too complex, handles the training data very well, but fails on the test data.
To prevent overfitting, you have multiple options.
Make your model simpler if possible. How to do this depends on the model you are using. In general it is a good pattern to start with the simplest model possible and keep it as a baseline (or go even simpler by just randomly picking a value or always predicting the most frequent one).
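Such a baseline can be a few lines of code. This sketch always predicts the most frequent training label (scikit-learn's `DummyClassifier` does the same thing, but it is simple enough to write by hand):

```python
from collections import Counter

class MajorityBaseline:
    """Simplest possible 'model': always predict the most frequent
    training label. Any real model should beat this score."""

    def fit(self, X, y):
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]

y_train = ["spam", "ham", "ham", "ham", "spam"]
baseline = MajorityBaseline().fit(None, y_train)
print(baseline.predict([1, 2, 3]))  # ['ham', 'ham', 'ham']
```

If your complex model barely beats this baseline, the extra complexity is probably just fitting noise.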
In general you can reduce the number of features you pass to the model.
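One cheap way to do this, sketched here in plain Python, is to drop features that barely vary (a near-constant column carries no information and only adds parameters to fit):

```python
def drop_low_variance(rows, threshold=0.0):
    """Keep only feature columns whose variance exceeds `threshold`.
    A constant column cannot help the model separate anything."""
    cols = list(zip(*rows))
    keep = []
    for i, col in enumerate(cols):
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        if var > threshold:
            keep.append(i)
    return [[row[i] for i in keep] for row in rows]

X = [[1.0, 5.0, 0.2],
     [1.0, 3.0, 0.9],
     [1.0, 4.0, 0.4]]
print(drop_low_variance(X))  # the constant first column is removed
```

Libraries offer the same idea ready-made, e.g. scikit-learn's `VarianceThreshold`, plus more powerful selectors based on feature importance.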
More data will help as well. If your dataset is (heavily) imbalanced, every additional datapoint in the minority class will help your model a lot. But remember that the new data has to have the same quality as the existing dataset and must not introduce noise.
You can also use k-fold cross validation: split your data into the 5 classical buckets, where one bucket (about 20%) is used as the test set. This bucket rotates, and all metrics are averaged, so the evaluation of the model is more stable.
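The rotation can be sketched in a few lines (scikit-learn's `KFold` and `cross_val_score` do this for you; the metric below is a placeholder since no real model is trained here):

```python
def kfold_indices(n, k=5):
    """Yield (train_idx, test_idx) for k rotating buckets."""
    fold = n // k
    idx = list(range(n))
    for i in range(k):
        test = idx[i * fold:(i + 1) * fold] if i < k - 1 else idx[i * fold:]
        train = [j for j in idx if j not in test]
        yield train, test

scores = []
for train, test in kfold_indices(10, k=5):
    # Here you would fit on `train` and score on `test`;
    # we record a placeholder value instead of a real metric.
    scores.append(len(test) / 10)

print(sum(scores) / len(scores))  # average over the 5 rotating buckets
```

Each datapoint ends up in the test bucket exactly once, which is why the averaged score is a more honest estimate than a single train/test split.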
When your model no longer improves, interrupt the training (early stopping). When you train with the keras library, you can add a callback that is invoked after each epoch or batch and can terminate the training.
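In keras this is the `EarlyStopping` callback; the core logic is just a patience counter on the validation loss, sketched here in plain Python:

```python
def early_stop(val_losses, patience=3):
    """Return the epoch at which training stops: when the validation
    loss has not improved for `patience` consecutive epochs."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
        if since_best >= patience:
            return epoch  # interrupt training here
    return len(val_losses) - 1  # ran all epochs without stopping

losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64]
print(early_stop(losses))  # stops at epoch 5, three epochs after the best loss
```

With keras itself you would pass something like `keras.callbacks.EarlyStopping(monitor="val_loss", patience=3)` to `model.fit(...)` instead of writing this by hand.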
Regularization will also help to reduce overfitting. The loss function gets extended with a term that punishes large coefficient values, pushing the model towards simpler solutions.
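For L2 (ridge) regularization the extended loss looks like this, sketched in plain Python with an illustrative `alpha`:

```python
def ridge_loss(y_true, y_pred, coefs, alpha=1.0):
    """Mean squared error plus an L2 penalty: large weights make the
    loss bigger, so the optimizer is pushed towards small coefficients.
    `alpha` controls how strongly complexity is punished."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    penalty = alpha * sum(w ** 2 for w in coefs)
    return mse + penalty

# Perfect predictions, but the weights themselves still cost something:
print(ridge_loss([1.0, 2.0], [1.0, 2.0], coefs=[0.5, 0.5]))  # 0.5
```

L1 (lasso) regularization works the same way but sums the absolute values of the coefficients instead of their squares, which can drive some weights to exactly zero.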
Finally, data augmentation: the process of generating new data from existing data. This is very common when you train on images: you add small modifications to the existing images, like mirroring them or rotating them by a few degrees.
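Mirroring is the simplest case. Treating a tiny image as a list of pixel rows, a horizontal flip is one line (real pipelines use e.g. keras preprocessing layers or torchvision transforms for this):

```python
def mirror(image):
    """Horizontal flip: a cheap way to double an image dataset,
    valid whenever the label does not depend on left/right orientation."""
    return [row[::-1] for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
print(mirror(img))  # [[3, 2, 1], [6, 5, 4]]
```

The important caveat to mention in an interview: only use augmentations that preserve the label. Mirroring a cat is still a cat, but mirroring a handwritten "b" turns it into a "d".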