These new metrics help grade AI models’ trustworthiness

Whether it’s diagnosing patients or driving cars, we want to know whether we can trust a person before assigning them a sensitive task. In the human world, we have different ways to establish and measure trustworthiness. In artificial intelligence, the mechanisms for establishing trust are still taking shape.

In recent years, deep learning has proven to be remarkably good at difficult tasks in computer vision, natural language processing, and other fields that were previously off-limits for computers. But we also have ample proof that placing blind trust in AI algorithms is a recipe for disaster: self-driving cars that miss lane dividers, melanoma detectors that look for ruler marks instead of malignant skin patterns, and hiring algorithms that discriminate against women are just a few of the many incidents reported in that time.

Recent work by scientists at the University of Waterloo and Darwin AI, a Toronto-based AI company, provides new metrics to measure the trustworthiness of deep learning systems in an intuitive and interpretable way. Trust is often a subjective issue, but their research, presented in two papers, provides clear guidelines on what to look for when evaluating the scope of situations in which AI models can and can’t be trusted.


How far do you trust machine learning?

For many years, machine learning researchers measured the trustworthiness of their models through metrics such as accuracy, precision, and F1 score. These metrics compare the number of correct and incorrect predictions made by a machine learning model in various ways. They can answer important questions, such as whether a model is making random guesses or whether it has actually learned something. But counting correct predictions doesn’t necessarily tell you whether a machine learning model is doing its job correctly.
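
To make this concrete, here is a minimal sketch of how these metrics are computed for a binary classifier. The labels and predictions are made-up toy data, included purely for illustration:

```python
# Toy ground-truth labels and model predictions (hypothetical values).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Count true positives, false positives, and false negatives.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

All of these numbers are aggregate counts of right and wrong answers; none of them says anything about why the model got an answer right, which is exactly the gap the new trust metrics aim to address.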