Should Machines Learn Only from Failures? Exploring the Pomodoro Technique
📝 Note: This is a summary of a pseudo-realistic scientific paper I wrote as a practice exercise. It explores an interesting concept: what if machines could learn from their successes, not just their failures?
The Core Problem
In supervised learning tasks, the goal is to minimise the prediction error. The error is measured by a loss function, which quantifies the difference between the true value and the estimated value. In standard machine learning, learning means choosing the parameters that minimise this loss. But what happens when the model predicts something correctly and confidently? The loss is near zero, the gradient all but vanishes, and essentially no update is made.
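This asymmetry is easy to see with binary cross-entropy. The sketch below (a minimal pure-Python illustration, not from the paper) computes the loss and the gradient with respect to the logit for a single prediction; for a confident correct prediction both are tiny, so the parameter update is negligible:

```python
import math

def binary_cross_entropy(y_true: int, p_hat: float) -> float:
    """Cross-entropy loss for a single binary prediction."""
    return -(y_true * math.log(p_hat) + (1 - y_true) * math.log(1 - p_hat))

def grad_wrt_logit(y_true: int, p_hat: float) -> float:
    """Gradient of the loss w.r.t. the logit z (with p_hat = sigmoid(z))
    simplifies to p_hat - y_true."""
    return p_hat - y_true

# Confident and correct: tiny loss, tiny gradient -> almost no update.
print(binary_cross_entropy(1, 0.99))  # ~0.01
print(grad_wrt_logit(1, 0.99))        # -0.01
# Confident but wrong: large loss, large gradient -> strong update.
print(binary_cross_entropy(1, 0.01))  # ~4.6
print(grad_wrt_logit(1, 0.01))        # -0.99
```

In other words, standard training spends nearly all of its updates on mistakes, which is exactly the behaviour the paper questions.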
Learning from Success
The approach of learning from successes, and not only from failures, has already been explored in different contexts. However, reinforcing the model where it already performs well can easily lead to one of the biggest problems in machine learning: overfitting. We address this issue in a binary supervised classification setting, where classes are often imbalanced. For instance, if the model initially predicts the majority class correctly, it may simply have guessed. Rewarding such a prediction without checking its reliability would introduce a dangerous bias: the model would be rewarded without actually learning.
Bayesian Neural Networks to the Rescue
Bayesian Neural Networks let us estimate how confident the model is in each prediction, and allow us to distinguish between epistemic (model) uncertainty and aleatoric (data) uncertainty. We can therefore apply learning from success only when the model is genuinely confident, ensuring that it reinforces meaningful knowledge rather than noise. Adding a secondary learning mechanism inevitably increases computation, so an efficient implementation is essential. In this paper, we address the general problem of how to exploit success cases during training without causing overfitting or excessive computational cost.
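One cheap way to approximate this uncertainty is Monte Carlo Dropout: run several stochastic forward passes with dropout active and treat the spread of the predictions as an (epistemic) uncertainty proxy. The toy one-layer classifier below is a hypothetical sketch of the idea, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_predict(x, W, b, n_passes=100, p_drop=0.5):
    """Run n_passes stochastic forward passes through a toy one-layer
    binary classifier with dropout on the weights, and summarise the
    resulting predicted probabilities."""
    probs = []
    for _ in range(n_passes):
        mask = rng.random(W.shape) > p_drop      # random dropout mask
        z = x @ (W * mask) / (1 - p_drop) + b    # inverted-dropout scaling
        probs.append(1 / (1 + np.exp(-z)))       # sigmoid
    probs = np.array(probs)
    # Mean over passes = confidence; std over passes = uncertainty proxy.
    return probs.mean(), probs.std()

W = rng.normal(size=(4, 1))
x = rng.normal(size=(1, 4))
confidence, uncertainty = mc_dropout_predict(x, W, b=0.0)
```

A success case would then be reinforced only when `confidence` is high and `uncertainty` is low; the thresholds for "high" and "low" are tuning choices.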
Introducing Pomodoro Learning
We propose Pomodoro Learning, a two-phase training cycle inspired by the idea of alternating work and reflection. First, the model trains on a quarter of the dataset using standard cross-entropy loss. Then, we apply Monte Carlo Dropout, a simple method for estimating uncertainty without converting the model into a full Bayesian neural network. We select the top-q% of correctly predicted samples per class – those with high confidence and low uncertainty – apply small data augmentations, and perform a short fine-tuning phase. This phase includes two regularisers:
- A margin term – which encourages larger absolute logits for selected samples, pushing them farther from the decision boundary
- A stochastic-consistency term – which aligns two dropout passes of the same input, making predictions more stable
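The two regularisers can be sketched as simple loss terms. The hinge form of the margin term, the margin value m, and the mean-squared-difference form of the consistency term are my assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def margin_term(logits, m=2.0):
    """Margin regulariser (sketch): hinge penalty that is zero once a
    selected sample's logit magnitude exceeds the margin m, encouraging
    predictions to sit farther from the decision boundary.
    (Hinge form and m=2.0 are assumptions.)"""
    return np.maximum(m - np.abs(logits), 0.0).mean()

def consistency_term(probs_pass1, probs_pass2):
    """Stochastic-consistency regulariser (sketch): mean squared difference
    between two independent dropout passes on the same inputs, penalising
    unstable predictions."""
    return ((probs_pass1 - probs_pass2) ** 2).mean()

# Toy usage on selected high-confidence samples (hypothetical numbers):
logits = np.array([3.1, 0.4, -2.7])
p1 = np.array([0.95, 0.60, 0.07])  # dropout pass 1
p2 = np.array([0.93, 0.55, 0.08])  # dropout pass 2
loss = margin_term(logits) + 0.5 * consistency_term(p1, p2)  # 0.5: assumed weight
```

Note how only the middle sample (logit 0.4, inside the margin) contributes to the margin term, while the consistency term stays small because the two passes nearly agree.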
Connections and Implications
Pomodoro Learning sits at the intersection of curriculum learning, active learning, and Bayesian uncertainty estimation. It introduces a lightweight and interpretable way to learn from success while maintaining calibration and generalisation. Limitations include sensitivity to selection hyperparameters such as confidence thresholds and the proportion of samples reinforced. Future work should evaluate its performance on larger datasets and deeper architectures, investigate behaviour under distribution shift, and explore adaptive hyperparameters that evolve during training.
The Bigger Picture
In a broader sense, Pomodoro Learning provides a structured framework that connects human-inspired reflection to data-driven optimisation, suggesting that learning can advance not only through correcting mistakes but also by understanding why success occurs.
💭 Personal Reflection: This exercise was really useful for practicing scientific writing and thinking critically about machine learning concepts. The idea of learning from successes is intriguing – it mirrors how humans often learn by reinforcing what works, not just correcting what doesn't.