Earlier, I made a statement about the facial match results and how they might affect the likelihood that the group as a whole is the Knickerbockers (namely, that if each subject has a high match, the likelihood for the group as a whole increases greatly). Although I framed it as an "if A, then B" statement, many here took it as me claiming a 99.9999% probability that this is a Knickerbockers photo, despite my careful phrasing intended to avoid that conclusion.
Anyhow, I think this is worth revisiting now that I've had a chance to play around with their software a bit more. What I wrote earlier is quoted below for reference. Note my qualifier statements highlighted in bold.
Quote:
Originally Posted by Snowman
...if the probability of each person being a "match" is 90%, then the probability of the group being the Knickerbockers is equivalent to the 1 - (0.1^6) = 0.999999 or 99.9999% chance that this is the Knickerbockers. However, this is based on the assumption that a "90% match" actually means the individuals in two photos are 90% likely to be the same person. I don't know if this assumption holds true, and wouldn't be surprised at all if it didn't. I don't know enough about facial recognition software to make that claim...
Perhaps it's obvious to everyone by now, but I think it's worth noting that the output from the facial match software definitely does not indicate the probability that two people are the same person. This much is clear from the results you get when uploading random subjects or two images of the same person. There are any number of ways to build a facial matching algorithm, and the scoring output from those models can be set up almost arbitrarily. That's not to say the output of such a model is meaningless, though: the higher the match % two images get, the more likely they are to truly be a match. Still, one cannot make probabilistic estimates from these values as I proposed above, since the % match values don't actually represent probabilities. They're more of an arbitrary scoring system.
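For reference, here is a minimal Python sketch of the arithmetic in the quoted post. It rests on two assumptions that the discussion above calls into question: that a "90% match" is a true probability, and that the six comparisons are independent. The function names are made up for illustration. Under those assumptions, 1 - (0.1^6) is the chance that at least one of the six subjects is a true match; the chance that all six are true matches is shown alongside for comparison.

```python
def prob_at_least_one_match(p: float, n: int) -> float:
    """Chance that at least one of n independent comparisons is a true match.

    Assumes p is a genuine per-subject probability, which a raw match
    score is not.
    """
    return 1.0 - (1.0 - p) ** n


def prob_all_match(p: float, n: int) -> float:
    """Chance that all n independent comparisons are true matches."""
    return p ** n


if __name__ == "__main__":
    p, n = 0.90, 6  # hypothetical inputs taken from the quoted example
    print(f"at least one of {n}: {prob_at_least_one_match(p, n):.6f}")  # 0.999999
    print(f"all {n}:             {prob_all_match(p, n):.6f}")           # 0.531441
```

Either figure only means something if the 0.90 going in is a real probability, so feeding the software's match % into these formulas does not produce a usable estimate.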