
Improving Predictions in Deep Learning by Modelling Uncertainty
13/09/2018
At BBVA we have been working for some time to leverage our clients' transactional data and Deep Learning models to offer a personalized and meaningful digital banking experience. Our ability to foresee recurrent income and expenses in an account is unique in the sector. This kind of forecasting helps customers plan budgets, act upon a financial event, or avoid overdrafts. All of this while reinforcing the sense of "peace of mind" that a bank such as BBVA aims to convey.
The application of Machine Learning techniques to predict whether an event is recurrent, together with the amount of money involved, allowed us to develop this functionality. As a complement to this project, at BBVA Data & Analytics we are investing in research and development to study the feasibility of Deep Learning methods for forecasting1, as already discussed in the post "There is no such thing as a certain prediction". The goal was not simply to improve the current system, but to generate new knowledge to validate these novel techniques.
As a result, we have observed that Deep Learning helps reduce forecasting errors. Nonetheless, we have also seen that there are still cases in which certain expenses are not predictable. Indeed, plain Deep Learning for regression does not offer a mechanism to determine uncertainty and hence to measure the reliability of a prediction.
Making good predictions is as important as detecting the cases in which those predictions have a wide range of possible outcomes. Therefore, we would like to be able to include this uncertainty in the model. This would be useful not only for showing clients reliable predictions but also for prioritizing actions related to the results shown. This is why we are now researching Bayesian Deep Learning models that can measure the uncertainty associated with each forecast.
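As a rough illustration of the general idea (not of our production system), the sketch below shows one common way to make a regression network express uncertainty: instead of a single point estimate, it outputs a mean and a variance and is trained with the Gaussian negative log-likelihood. The class names, layer sizes, and feature layout are hypothetical assumptions for the example.

```python
import torch
import torch.nn as nn

class HeteroscedasticForecaster(nn.Module):
    """Predicts both the mean and the (log-)variance of the next transaction amount."""

    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
        )
        self.mean_head = nn.Linear(hidden, 1)     # point forecast
        self.log_var_head = nn.Linear(hidden, 1)  # uncertainty of that forecast

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.log_var_head(h)


def gaussian_nll(mean, log_var, target):
    """Negative log-likelihood of a Gaussian with predicted mean and variance.

    Large errors can be absorbed by a large predicted variance, while the
    log-variance term penalises claiming high uncertainty everywhere.
    """
    return (0.5 * torch.exp(-log_var) * (target - mean) ** 2 + 0.5 * log_var).mean()


# Illustrative training step on synthetic data (the feature layout is hypothetical).
model = HeteroscedasticForecaster(n_features=12)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(256, 12)  # e.g. features derived from a user's past transactions
y = torch.randn(256, 1)   # next transaction amount

mean, log_var = model(x)
loss = gaussian_nll(mean, log_var, y)
optimizer.zero_grad()
loss.backward()
loss_value = loss.item()
optimizer.step()
```

With this kind of output, the model can attach a confidence interval to each forecast instead of a bare number, which is what makes it possible to decide when a prediction is reliable enough to show to a client.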
Measure uncertainty to help clients
Detecting user behaviors that can be predicted with confidence requires analyzing the concept of the uncertainty of a prediction. But what is the source of that uncertainty? Although the clarification of this concept is still an open debate, in Statistics (and at times in Economics) it is usually classified into two categories: aleatoric uncertainty, i.e. the uncertainty that arises from the variability of the different possible correct answers given the same information, and epistemic uncertainty, i.e. the uncertainty related to our ignorance of which model is best suited to solve the problem, or even to new kinds of data that we were not able to observe in the past.

From a mathematical point of view, we could try to find a function that, based on certain input data (the logged transactions of a given user), returns the value of the next transaction in a time series as accurately as possible. Nevertheless, there are limits to this approach: in our case, given the same information (past transactions), the outcomes are not necessarily the same. Two clients with the same past transactional behavior will not necessarily make similar transactions in the future. The following figure visualizes the concept of uncertainty and tries to answer the question of what the value of the red dot would be during a certain time interval.
[Figure: possible values of the red dot over a certain time interval, illustrating the uncertainty of the prediction]
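To make the distinction more concrete, the sketch below shows one technique often used in Bayesian Deep Learning, Monte Carlo dropout: the spread between several stochastic forward passes approximates the epistemic part of the uncertainty, while the variance predicted by the network (as in the earlier sketch) captures the aleatoric part. This is an illustrative example only; the class name, hyperparameters, and number of samples are assumptions, not our actual model.

```python
import torch
import torch.nn as nn

class MCDropoutForecaster(nn.Module):
    """Mean/variance heads as before, plus dropout that stays active at test time."""

    def __init__(self, n_features: int, hidden: int = 64, p_drop: float = 0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p_drop),
        )
        self.mean_head = nn.Linear(hidden, 1)
        self.log_var_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.log_var_head(h)


@torch.no_grad()
def predict_with_uncertainty(model, x, n_samples: int = 50):
    """Monte Carlo dropout: repeat stochastic forward passes and decompose uncertainty."""
    model.train()  # keep dropout active so each pass samples a different sub-network
    means, variances = [], []
    for _ in range(n_samples):
        mean, log_var = model(x)
        means.append(mean)
        variances.append(torch.exp(log_var))
    means = torch.stack(means)        # shape: (n_samples, batch, 1)
    variances = torch.stack(variances)

    prediction = means.mean(dim=0)
    epistemic = means.var(dim=0)      # disagreement between the sampled models
    aleatoric = variances.mean(dim=0) # noise the model attributes to the data itself
    return prediction, epistemic, aleatoric
```

In this view, a large epistemic component flags cases the model has rarely seen before, while a large aleatoric component flags expenses that are inherently unpredictable even with a perfect model; the two call for different actions when deciding what to show to a client.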
Notes
- This research was synthesized in an article accepted for publication, which will be presented at ECML-PKDD 2018 and is available on the e-Print service arXiv. ↩︎