AI Safety at Scale: Addressing the Complexities of AI Model Validation
The potential of AI in financial services is immense. With years of accumulated data and access to real-time transactional data, banks have a tremendous opportunity to deliver unrivalled client experiences and personalization.
However, along with its myriad benefits, AI brings a host of new challenges that call for stronger governance processes and validation tools to ensure it is deployed safely and effectively within the enterprise.
With our combined expertise in AI safety, regulation, and model governance, Borealis AI and RBC have been navigating the complexities of this space to develop a robust, comprehensive AI validation process.
Model validation has played an integral role in banks’ traditional data analytics for many years. It helps to ensure that models perform as expected, identifies potential limitations and assumptions, and assesses possible negative impacts. Guidance from the US Federal Reserve states that “all model components—inputs, processing, outputs, and reports—should be subject to validation.” Banks in Canada must adhere to similar regulations and have already developed extensive validation processes to meet these requirements and manage model risk appropriately. However, the advent of AI poses a number of challenges for traditional validation techniques.
First, it is costly to validate the large volume and variety of data used by AI models. AI models can make use of significantly more variables—referred to as “features” in AI parlance—than conventional quantitative models, and ensuring the integrity and suitability of these large datasets requires more computational power and more attention from validators. This challenge is particularly acute for AI models that use unstructured natural-language data like news feeds and legal or regulatory filings, which require new validation tools as well as more resources. Moreover, AI modelers often use “feature engineering” to transform raw data prior to training, which further increases the dimensionality of the data that must be validated.
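As a rough illustration of the kind of automated data-integrity check this implies, the sketch below computes a missing-value rate and an out-of-range count for each feature. It is a minimal example under stated assumptions, not a description of any bank's tooling: the function name feature_report, the row format (a list of dictionaries), the expected ranges, and the 5% missing-rate threshold are all hypothetical.

```python
import math


def feature_report(rows, expected_ranges, max_missing_rate=0.05):
    """Summarize missing and out-of-range values for each numeric feature.

    rows: list of dicts mapping feature name to value (hypothetical format).
    expected_ranges: dict mapping feature name to (low, high), set by the validator.
    max_missing_rate: illustrative threshold, not a regulatory figure.
    """
    report = {}
    for name, (low, high) in expected_ranges.items():
        missing = 0
        out_of_range = 0
        for row in rows:
            value = row.get(name)
            # Treat absent keys, None, and NaN as missing.
            if value is None or (isinstance(value, float) and math.isnan(value)):
                missing += 1
            elif not (low <= value <= high):
                out_of_range += 1
        missing_rate = missing / len(rows) if rows else 0.0
        report[name] = {
            "missing_rate": missing_rate,
            "out_of_range": out_of_range,
            "passed": missing_rate <= max_missing_rate and out_of_range == 0,
        }
    return report


# Made-up example rows, for illustration only.
rows = [
    {"income": 50000.0, "age": 34},
    {"income": None, "age": 29},
    {"income": 72000.0, "age": 210},
    {"income": 61000.0, "age": 45},
]
ranges = {"income": (0.0, 1e7), "age": (18, 120)}

for feature, result in feature_report(rows, ranges).items():
    print(feature, result)
```

A check like this scales linearly with the number of features, which is part of why wide AI feature sets demand more compute and validator attention than a model with a handful of inputs.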
Second, the complexity of AI methodologies makes it more difficult for validators to predict how AI models will perform after they are deployed. Compared to conventional models with relatively few features, it is harder to determine how AI models will behave—and why they behave this way—across the full range of inputs these models could face once deployed. AI models’ complexity can also make it more difficult to explain the reasons behind these models’ behavior, which in turn can make it harder to identify biased or unfair predictions. Ensuring that models do not lead some groups of customers to be treated unfairly is an important part of the validation process.
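One simple way to make the fairness concern concrete is to compare how often a model returns a favourable prediction for different customer groups. The sketch below computes the gap in selection rates, often called the demographic parity difference. It is one of several possible fairness metrics and is shown only as an illustration; the function names, the group labels, and the example data are hypothetical.

```python
def selection_rates(predictions, groups):
    """Return the share of positive predictions (value 1) within each group."""
    totals = {}
    positives = {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Made-up predictions for two hypothetical groups, "A" and "B".
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(selection_rates(predictions, groups))
print(demographic_parity_difference(predictions, groups))
```

A large gap does not prove unfair treatment on its own, but it flags where a validator should look more closely at the features and reasoning behind the model's predictions.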
Finally, the dynamic nature of many AI models also creates unique validation challenges. Conventional models are typically calibrated once using a fixed training dataset before being deployed. AI models, on the other hand, often continue to learn after deployment as more data become available, and model performance may degrade over time if these new data are distributed differently or are of lower quality than the data used during development. These models must be validated in a way that takes their adaptiveness into account and frequently monitored to ensure that they remain robust and reliable.
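To illustrate what monitoring for a shift in data distribution can look like, the sketch below computes the population stability index (PSI) for a single numeric feature, comparing a development-time sample with a more recent one. This is a minimal example assuming equal-width bins over the reference range; the function name, the bin count, and the synthetic data are illustrative choices. A common rule of thumb treats a PSI above roughly 0.25 as a significant shift, but appropriate thresholds depend on the model and the feature.

```python
import math
import random


def population_stability_index(reference, current, n_bins=10):
    """PSI between two samples of one numeric feature.

    Bins are equal-width over the range of the reference sample; values in
    the current sample that fall outside that range go into the edge bins.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp into the edge bins
            counts[idx] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Synthetic data: the "live" sample is shifted relative to the development sample.
random.seed(0)
development = [random.gauss(0.0, 1.0) for _ in range(2000)]
live = [random.gauss(1.0, 1.0) for _ in range(2000)]

print("PSI, development vs. itself:", population_stability_index(development, development))
print("PSI, development vs. live:  ", population_stability_index(development, live))
```

Running a check like this on a schedule for each input feature, and for the model's output scores, is one way to detect degradation early in a model that keeps learning after deployment.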
To meet these challenges, banks must develop new validation methods that are better equipped to deal with the scale, complexity, and dynamism of AI. Borealis AI and RBC’s model governance team have joined forces to research and develop a new toolkit that automates key parts of the validation process, provides a more comprehensive view of model performance, and explores new approaches in areas like adversarial robustness and fairness. The toolkit is designed from the ground up to address the challenges specific to AI. AI safety is central to everything we do at Borealis AI, just as strong governance and risk management practices are central to RBC. This research will help to support faster AI deployment and more agile model development, and it will provide validators with more comprehensive and systematic assessments of model performance.