Net54baseball.com Forums - View Single Post

Snowman · #17 08-11-2021, 01:46 AM

Part 1 of 3...

I apologize in advance for the ridiculously long string of posts that this is going to be, but I did warn you that this is a lengthy discussion, so I guess I'm at least partially delivering on that promise lol.

There is a lot of confusion here, and across the hobby, about what the term "AI" means and how it works. I'll try to clear up some of this confusion as this is my field of expertise. I write "AI" or "ML" code for work every day and have a strong understanding of how these algorithms work, why they work, where they are likely to go wrong and why.

First, I have to point out that while 'Artificial Intelligence' and 'Machine Learning' are perhaps kissing cousins, they're not quite the same thing. The problem of grading cards with computational algorithms is a 'Machine Learning' problem, not an 'Artificial Intelligence' problem. Artificial Intelligence is when you have recursive algorithms that make use of something like deep learning or ensemble models with deep neural networks where the algorithms can often learn from themselves. This would be something like DeepMind's AlphaGo or AlphaZero (arguably the best chess "player" in the world), or Tesla's self-driving cars that learn how to drive better through simulations of millions of miles. The people coding these algorithms set up the framework and outline the "rules" for the system to be able to learn on its own, and then sorta turn it loose on the problem. This is not applicable to the challenge of grading cards.

Grading cards is a 'Machine Learning' problem. Specifically, computer vision and classification. In order to understand why grading cards is not a problem well suited for machine learning, you have to first understand how these algorithms work and what their limitations are because there are some important limitations that I would argue render this technology borderline useless for the specific application of grading cards. I will go into this in more detail below, but for now, I'm just pointing out that I take issue with using the term 'AI' for grading. This is NOT an AI problem. It is a machine learning problem. However, "AI" sounds cooler, so "AI grading" it is, right? This is a marketing ploy. OK, I digress.

How computer vision works:

Imagine you have a photo of you and your family sitting down in the park having a picnic, surrounded by fields of green grass. Try to visualize the photo. You know how that image looks to you, but what does it "look like" to a computer? Everyone is familiar with the binary 1s and 0s at the operating system level of a computer, but let's pull back from that and try to interpret how a computer might see color and objects in a photo. Most of you are probably familiar with RGB colors, but if not, it's helpful to know that colors on your computer screen can be rendered using RGB color values, which range from 0 to 255 for each of Red, Green, and Blue. So every tiny little pixel in your family photo can actually be defined by its RGB color values. Those pixels in your photo that are part of the green grass surrounding you all look something like [0, 255, 0], meaning 0 parts red, 255 parts green, and 0 parts blue. Now imagine how that entire photo could be represented mathematically as a giant map. Break it down into 3 different matrices or grids: one for red values, one for green, and one for blue. The matrix for red would have a ton of ~0s in it (pretty much everywhere that the green grass is located) but would have higher values in the center of the matrix, which corresponds to where the people are sitting, since people have red color tones. The green matrix would have a ton of ~255 values in it since there is grass all around you, but it would have lower values in the center where the people are since people aren't green. Make sense?
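If it helps to see the three grids concretely, here is a minimal sketch in plain Python. The 5x5 "photo", the skin-tone value, and the function names are all made up for illustration; a real photo would have millions of pixels and would be loaded with an imaging library rather than built by hand.

```python
# Hypothetical 5x5 "photo": green grass with a skin-toned patch in the center.
GRASS = (0, 255, 0)
PERSON = (210, 160, 130)  # assumed skin-tone value, for illustration only


def make_photo(size=5):
    """Return a size x size grid of (R, G, B) pixels."""
    center = size // 2
    return [
        [PERSON if abs(r - center) <= 1 and abs(c - center) <= 1 else GRASS
         for c in range(size)]
        for r in range(size)
    ]


def split_channels(photo):
    """Split a grid of RGB pixels into separate red, green, and blue grids."""
    red = [[px[0] for px in row] for row in photo]
    green = [[px[1] for px in row] for row in photo]
    blue = [[px[2] for px in row] for row in photo]
    return red, green, blue


photo = make_photo()
red, green, blue = split_channels(photo)

# Print each channel as its own grid of numbers.
for name, channel in (("red", red), ("green", green), ("blue", blue)):
    print(name)
    for row in channel:
        print(" ".join(f"{value:3d}" for value in row))
```

In the printed grids, the red channel is 0 around the outside and higher in the middle, while the green channel is 255 around the outside and lower in the middle, which is the pattern described above.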

OK, so now we have 3 different matrices, or think of them as maps if that helps, where each pixel from the photo has a corresponding color value. These matrices full of numeric values are what enable computers to "learn" from photos. What sorts of things can a computer learn from an image? Quite a lot actually. One of the primary ways a computer can tell that something is different about a particular section of a photograph is through something known as "edge detection". Edge detection makes use of some fancy math to identify where the edges of an object are located in the photo. So in our example here, one of the "edges" would be where the green grass meets up against the people in the center of the photo. The mathematical values are different on each side of this "edge", which helps the computer to detect that this is an important location in the photo, and it learns to pay attention to it. Make sense? Great. If not, well, I apologize for being a crappy teacher. But this is the gist of how computers see an image and how they use mathematics to identify key aspects of a photo (or a scan in the case of grading cards). If you're perceptive and you can visualize how these matrices of numbers might look to the ML algorithms, you can probably already see how this could be problematic for grading cards. I'll get into that below. It's a pretty lengthy discussion though.
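To make the edge idea a little more concrete, here is a rough sketch of about the simplest edge check I can think of: compare each pixel to its neighbor on the right and flag big jumps. The grid, the function name, and the threshold of 50 are assumptions for illustration. Real computer vision code typically uses more robust operators (Sobel filters, Canny edge detection, and so on), so treat this as a toy and not as what any grading company actually runs.

```python
def horizontal_edges(channel, threshold=50):
    """Flag positions where a pixel differs sharply from its right-hand neighbor.

    Returns a grid one column narrower than the input: 1 marks an edge
    between column c and column c + 1, 0 marks no edge.
    """
    edges = []
    for row in channel:
        edge_row = []
        for c in range(len(row) - 1):
            jump = abs(row[c + 1] - row[c])
            edge_row.append(1 if jump > threshold else 0)
        edges.append(edge_row)
    return edges


# Hypothetical green channel: grass (255) around a person-shaped patch (160).
green = [
    [255, 255, 255, 255, 255],
    [255, 160, 160, 160, 255],
    [255, 160, 160, 160, 255],
    [255, 160, 160, 160, 255],
    [255, 255, 255, 255, 255],
]

edges = horizontal_edges(green)
for row in edges:
    print(row)
```

The 1s show up only where grass meets person on the left and right sides. This version looks only at horizontal jumps, so it misses the top and bottom boundaries of the patch; a second pass comparing each pixel to the one below it would catch those.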

How machine learning classification models work:

One of the most common machine learning problems that data scientists work on is building classification models, which aim to classify (or "categorize", or "label") something as belonging to a particular class. A simple example of this might be to build a model to predict whether someone is male or female based on a set of attributes that the computer learns from. So we might train that algorithm by feeding it the height, weight, hair length, ring finger to middle finger length ratios, hip measurements, shoe size, eyelash length, how fast they can run, and how much time they spend each month shopping or talking on the phone (stereotypes be damned). Then we feed all of that data to the algorithm for each person and tell the computer whether that person is a male or a female. The computer would then learn what each profile looks like and would be able to provide probabilistic estimates of someone being a male or a female for any new data you threw at it. So a person who is 6'2", 195 lbs, with a size 12 shoe and medium length hair, who runs a 5.1-second 40-yard dash, might get classified as having something like an 81% probability of being a male and a 19% probability of being a female according to the algorithm (note that these algorithms are almost never quite as confident as you might want them to be).

In addition to binary classes like 'Male' and 'Female', or 'yes' and 'no', or 'true' and 'false' type problems, there are also what are known as multi-class classification problems. So this might be something like classifying whether an animal is a fish, bird, mammal, reptile, or amphibian. The output for a model like this might be something like 3% probability of being a fish, 12% bird, 7% mammal, 46% reptile, and 32% amphibian if you were feeding it the data of a monitor lizard. Multi-class classification problems generally perform much worse than binary classification problems for a simple reason: more options lead to more variance, which leads to more uncertainty, which equals more errors made by the computer when classifying.
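For anyone wondering where those percentages come from, here is a minimal sketch of one common way a multi-class model turns its raw internal scores into probabilities, a function called softmax. The scores below are numbers I made up for the monitor lizard example; a real model would learn how to produce them from training data, and not every classifier uses softmax.

```python
import math


def softmax(scores):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - peak) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Hypothetical raw scores a trained model might assign to a monitor lizard.
scores = {
    "fish": -1.2,
    "bird": 0.2,
    "mammal": -0.4,
    "reptile": 1.5,
    "amphibian": 1.1,
}

probs = softmax(scores)
best = max(probs, key=probs.get)

# Print classes from most to least likely.
for label, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{label:10s} {p:.0%}")
print("predicted class:", best)
```

With these made-up scores the model picks reptile, but with less than 50% probability, because amphibian is scoring nearly as high. That is the situation described above: the more classes there are competing for the same slice of probability, the easier it is for the top pick to be wrong.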