The world is changing, and doing so at high speed. We are witnessing a technological revolution of a magnitude never observed before.
This is not a transitory event. The paradigm shift rate (the rate at which new ideas are adopted) doubles every decade: the telephone took nearly half a century to be adopted and radio and television took several decades, whereas computers, the Internet and mobile phones caught on in under 10 years. By 2014, the number of mobile phones already equaled the number of people on the planet (7 billion), a third of them smartphones, and the number of Internet users had reached almost 3 billion.
Information technology doubles its capacity and price/performance ratio roughly every one to two years, in line with Moore's Law, which has held true to date. The result is exponential growth in the technology available and an equivalent reduction in its cost, regardless of the crises experienced over the past few years, and this trend is expected to continue in the coming decades.
But this technological revolution has taken on a new dimension in recent years. Increased technical performance has been accompanied by an increased capacity to generate, store and process information, also at an exponential rate, a situation that has been called the "big data" phenomenon. Some evidence of this:
- The total volume of data in the world doubles every 18 months.
- Over 90% of the data that exist today were created in the last two years.
- The per capita capacity to store data has doubled every 40 months since 1980 and its cost has decreased by more than 90%.
- Processing capacity has increased 300-fold since the year 2000, making it possible to process millions of transactions per minute.
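The growth figures above can be put on a common footing with some simple doubling-time arithmetic. The following is a minimal sketch in Python, not part of the original study: the function names are hypothetical, and the reference year of 2014 is an assumption taken from the year mentioned earlier in the text.

```python
import math


def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiple by which a quantity grows if it doubles every `doubling_months`."""
    return 2.0 ** (elapsed_months / doubling_months)


def implied_doubling_months(factor: float, elapsed_months: float) -> float:
    """Doubling period implied by an overall growth `factor` over `elapsed_months`."""
    return elapsed_months * math.log(2.0) / math.log(factor)


# Assumption: 2014 is used as the reference year for "today".
REFERENCE_YEAR = 2014

# Per capita storage capacity: doubling every 40 months since 1980.
storage_growth = growth_factor((REFERENCE_YEAR - 1980) * 12, 40)

# Processing capacity: 300-fold increase since 2000, expressed as a doubling period.
processing_doubling = implied_doubling_months(300, (REFERENCE_YEAR - 2000) * 12)

# Total data volume: doubling every 18 months, compounded over one decade.
volume_growth_decade = growth_factor(120, 18)

print(f"Storage capacity growth since 1980: about {storage_growth:,.0f}x")
print(f"Implied processing doubling period: about {processing_doubling:.1f} months")
print(f"Data volume growth over ten years: about {volume_growth_decade:,.0f}x")
```

Under these assumptions, a 40-month doubling period since 1980 compounds to a growth of roughly three orders of magnitude, the 300-fold increase in processing capacity corresponds to a doubling period of about 20 months, and an 18-month doubling period implies roughly a hundredfold increase in data volume per decade.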
The impact of this technological transformation is particularly significant in the financial industry because it comes on top of four other major trends that are shaping the industry:
- A macroeconomic environment characterized by weak growth, low inflation and low interest rates, which has long penalized the banking industry's profit margins in mature economies, and by uneven performance in emerging countries, with a trend towards slower growth and a rise in default levels.
- A more demanding and intrusive regulatory environment, where regulation is becoming global in terms of corporate governance, solvency, liquidity, bail-out limitation, consumer protection, fraud prevention and data and reporting requirements, among other areas.
- A profound change in customer behavior: consumers' financial literacy has improved and customers expect and demand excellent service, yet they are increasingly confused by the complexity and disparity of the products and services on offer, which makes them more dependent on opinion leaders.
- New competitors entering the financial market, some with new business models that challenge the status quo.
The combined effect of these four factors, together with the technological transformation, is leading industry players to focus on the efficient use of information, bringing into the financial industry a discipline that until now has been associated mainly with the IT industry: data science.
Data science is the study of the generalizable extraction of knowledge from data using a combination of machine learning, artificial intelligence, mathematics, statistics, databases and optimization, together with a deep understanding of the business context.
All of these disciplines were already used in the financial industry to varying degrees, but data science has features that make it essential for successfully navigating the industry transformation already underway.
More specifically, addressing all the elements of the complex financial industry environment described above requires large data volumes and elaborate analytical techniques, which is precisely the specialty of data science. Moreover, data science has become more prominent as a discipline as a result of the big data phenomenon, so data scientists are professionals qualified to handle massive quantities of unstructured data (such as data from social networks), which are increasingly significant for financial institutions.
This dramatic increase in data creation, access, processing and storage, as well as in data-based decision making, together with the other factors already described, has not gone unnoticed by regulators. Indeed, there is a global trend, exemplified among others by the Basel Committee on Banking Supervision (through BCBS 239), towards requiring a robust data governance framework that ensures data quality, integrity, traceability, consistency and replicability for decision-making purposes, especially (but not only) in the field of risk.
This trend has been complemented by regulations from the US Federal Reserve and the Office of the Comptroller of the Currency (OCC) requiring entities to implement robust model governance frameworks to control and mitigate the risk arising from the use of models, known as "model risk".
Financial institutions are taking decisive steps to develop these governance frameworks (data and models), which together constitute the governance of the data science capabilities.
In this changing environment, the transformation of financial institutions is not optional: it is necessary for survival. This transformation is closely linked to intelligence, which is ultimately the ability to receive, process and store information in order to solve problems.
Against this backdrop, the present study aims to provide a practical description of the role played by data science, particularly in the financial industry. The document is divided into three sections, each corresponding to one objective:
- Describing the IT revolution in which the financial industry is immersed and its consequences.
- Introducing the data science discipline, describing the characteristics of data scientists and analyzing the trends observed in this respect, as well as their impact on the governance frameworks for data and models in financial institutions.
- Providing a case study to illustrate how data science is used in the financial industry, consisting of the development of a credit scoring model for individuals using data from social networks.