MIT researchers recently made one of the boldest claims related to artificial intelligence we’ve seen yet: they believe they’ve built an AI that can identify a person’s race using only medical images. And, according to the popular media, they have no idea how it works!
Sure. And I’d like to sell you an NFT of the Brooklyn Bridge.
Let’s be clear up front. Per the team’s paper, the model can predict a person’s self-reported race:
In our study, we show that standard AI deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities.
Prediction and identification are two entirely different things. When a prediction is wrong, it’s still a prediction. When an identification is wrong, it’s a misidentification. These are important distinctions.
AI models can be fine-tuned to predict anything, even concepts that aren’t real.
Here’s an old analogy I like to pull out in these situations:
I can predict with 100% accuracy how many lemons in a lemon tree are aliens from another planet.
Because I’m the only person who can see the aliens in the lemons, I’m what you call a “database.”
I could stand there, next to your AI, and point at all the lemons that have aliens in them. The AI would try to figure out what it is about the lemons I’m pointing at that makes me think there are aliens in them.
Eventually the AI would look at a new lemon tree and try to guess which lemons I would think have aliens in them.
If it were 70% accurate at guessing that, it would still be 0% accurate at determining which lemons have aliens in them. Because lemons don’t have aliens in them.
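For the skeptics, here is a minimal sketch of the lemon experiment in Python. Everything in it is invented for illustration: the diameter feature, the 6 cm rule the labeler secretly follows, the 30% inconsistency rate, and the function name `simulate_lemon_labels`. The "model" is just a fixed threshold standing in for a trained classifier. It only shows that agreeing with a labeler and agreeing with reality are measured against two different things.

```python
import random


def simulate_lemon_labels(n_lemons=1000, labeler_noise=0.3, seed=0):
    """Compare agreement with a labeler against agreement with reality."""
    rng = random.Random(seed)

    # Hypothetical feature: lemon diameter in centimeters.
    diameters = [rng.uniform(4.0, 8.0) for _ in range(n_lemons)]

    # Ground truth: no lemon contains an alien.
    has_alien = [False] * n_lemons

    # The labeler points at big lemons, but is inconsistent some of the time.
    labels = []
    for d in diameters:
        pointed = d > 6.0
        if rng.random() < labeler_noise:
            pointed = not pointed
        labels.append(pointed)

    # Stand-in "model": the simplest rule that mimics the labeler.
    predictions = [d > 6.0 for d in diameters]

    # How often the model guesses what the labeler would have said.
    label_agreement = sum(p == l for p, l in zip(predictions, labels)) / n_lemons

    # How many of the lemons the model flags actually contain aliens.
    flagged = [i for i, p in enumerate(predictions) if p]
    true_hits = sum(has_alien[i] for i in flagged)
    precision_vs_reality = true_hits / len(flagged) if flagged else 0.0

    return label_agreement, precision_vs_reality


agreement, precision = simulate_lemon_labels()
print(f"agreement with labeler: {agreement:.2f}")
print(f"flagged lemons that really contain aliens: {precision:.2f}")
```

Under these assumptions the first number lands near 0.70 and the second is exactly zero, no matter how you tune the threshold.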
In other words, you can train an AI to predict anything as long as you:
- Don’t give it the option to say, “I don’t know.”
- Continue tuning the model’s parameters until it gives you the answer you want.
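The first condition is easy to see in code. The sketch below is a hypothetical illustration, not anything taken from the MIT paper: the score dictionary, the 0.9 threshold, and both function names are made up. It contrasts a classifier that must always pick a label with one that is allowed to abstain.

```python
def forced_choice(scores):
    """Always return the highest-scoring label, however weak the evidence."""
    return max(scores, key=scores.get)


def choice_with_abstain(scores, threshold=0.9):
    """Return the top label only if its score clears the (assumed) threshold."""
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else "I don't know"


# A near coin-flip still produces a confident-looking answer when forced.
scores = {"alien": 0.51, "no alien": 0.49}
print(forced_choice(scores))
print(choice_with_abstain(scores))
```

With scores this close, the forced version still answers "alien," while the abstaining version declines to guess.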
No matter how accurate an AI system is at predicting a label, if it cannot demonstrate how it arrived at its prediction, those predictions are useless for the purposes of identification, especially when it comes to matters relating to individual humans.
Furthermore, claims of “accuracy” don’t mean what the media seems to think they do when it comes to these kinds of AI models.
The MIT model achieves less than 99% accuracy on labeled data. This means, in the wild (looking at images with no labels), we can never be sure if the AI’s made the correct assessment unless a human reviews its results.
Even at 99% accuracy, MIT’s AI would still mislabel 79 million human beings if it were given a database with an image for every living human. And, worse, we’d have absolutely no way of knowing which 79 million humans it mislabeled unless we went around to all 7.9 billion people on the planet and asked them to confirm the AI’s assessment of their particular image. This would defeat the purpose of using AI in the first place.
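The arithmetic is simple enough to check yourself. This sketch assumes, as the paragraph above does, one image per person and an error rate that applies uniformly to everyone; the function name `expected_mislabels` is just for illustration.

```python
def expected_mislabels(population, accuracy):
    """Expected number of wrong labels at a given accuracy (0.0 to 1.0)."""
    return round(population * (1.0 - accuracy))


world_population = 7_900_000_000  # figure used in this article
print(expected_mislabels(world_population, 0.99))
```

The count scales with the size of the database: at the same 99% accuracy, even a modest set of 1,000 images would be expected to contain about 10 wrong labels.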
The important bit: teaching an AI to identify the labels in a database is a trick that can be applied to any database with any labels. It is not a method by which an AI can determine or identify a specific object in a database; it merely tries to predict — to guess — what label the human developers used.
The MIT team concluded, in their paper, that their model could be dangerous in the wrong hands:
The results from our study emphasise that the ability of AI deep learning models to predict self-reported race is itself not the issue of importance.
However, our finding that AI can accurately predict self-reported race, even from corrupted, cropped, and noised medical images, often when clinical experts cannot, creates an enormous risk for all model deployments in medical imaging.
It’s important for AI developers to consider the potential risks of their creations. But this particular warning has little grounding in reality.
The model the MIT team built can achieve benchmark accuracy on big databases but, as explained above, there’s absolutely no way to determine if the AI is correct unless you already know the ground truth.
Basically, MIT’s warning us that evil doctors and medical technicians could practice racial discrimination at scale using a system similar to this.
But this AI can’t determine race. It predicts labels in specific datasets. The only way this model (or any model like it) could be used to discriminate is with a wide net, and only when the discriminator doesn’t really care how many times the machine gets it wrong.
All you can be sure of is that you can’t trust an individual result without double-checking it against a ground truth. And the more images the AI processes, the more mistakes it’s certain to make.
In summation: MIT’s “new” AI is nothing more than a magician’s illusion. It’s a good one, and models like this are often incredibly useful when getting things right isn’t as important as doing them quickly, but there’s no reason to believe bad actors can use this as a race detector.
MIT could apply the exact same model to a grove of lemon trees and, using the database of labels I’ve created, it could be trained to predict which lemons have aliens in them with 99% accuracy.
This AI can only predict labels. It doesn’t identify race.