These new metrics help grade AI models’ trustworthiness

Whether it’s diagnosing patients or driving cars, we want to know whether we can trust a person before assigning them a sensitive task. In the human world, we have different ways to establish and measure trustworthiness. In artificial intelligence, the mechanisms for establishing trust are still taking shape.

In recent years, deep learning has proven to be remarkably good at difficult tasks in computer vision, natural language processing, and other fields that were previously off-limits for computers. But we also have ample proof that placing blind trust in AI algorithms is a recipe for disaster: self-driving cars that miss lane dividers, melanoma detectors that look for ruler marks instead of malignant skin patterns, and hiring algorithms that discriminate against women are just a few of the many incidents reported in that time.

Recent work by scientists at the University of Waterloo and Darwin AI, a Toronto-based AI company, provides new metrics to measure the trustworthiness of deep learning systems in an intuitive and interpretable way. Trust is often a subjective issue, but their research, presented in two papers, provides clear guidelines on what to look for when evaluating the scope of situations in which AI models can and can’t be trusted.


How far do you trust machine learning?

For many years, machine learning researchers measured the trustworthiness of their models through metrics such as accuracy, precision, and F1 score. These metrics compare the number of correct and incorrect predictions made by a machine learning model in various ways. They can answer important questions, such as whether a model is making random guesses or whether it has actually learned something. But counting correct predictions doesn’t necessarily tell you whether a machine learning model is doing its job correctly.
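
To make this concrete, here is a minimal sketch of how these metrics are computed for a binary classifier. The labels and predictions are made-up toy data, included purely for illustration:

```python
# Toy ground-truth labels and model predictions (hypothetical values).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Count true positives, false positives, and false negatives.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

All of these numbers are aggregate counts of right and wrong answers; none of them says anything about why the model got an answer right, which is exactly the gap the new trust metrics aim to address.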