AI Talks 2: Multiclass Classification


As you know already, the world around us is very complex. So, how do computers recognise all the objects around them?

In this episode, we’re going to try to understand how multiclass classification works. This technique allows computers to recognise multiple objects at the same time in a given scene.

In the last AI Talks blog post, we used images and jelly beans to understand how a computer represents visual information. In this episode we’ll do the same, but we’ll use differently coloured jelly beans to indicate pictures with different content.

In the memory of the computer, each jelly bean represents what we call a ‘feature vector’. A feature vector is a list of numbers that encodes the appearance of an image. Simply put, it translates the image into numerical data because that’s what the computer understands! The last ingredient is the label that is associated with each image: face, dog and building. Now we have everything on hand to train the computer to recognise these objects. For simplicity’s sake, we’re only keeping the jelly beans and we'll drop the rest.
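To make the idea of a feature vector concrete, here is a minimal sketch in Python. It assumes a made-up 2x2 colour image and a very simple feature (the average amount of red, green and blue). The function name `feature_vector` and the image itself are hypothetical; real systems use far richer features.

```python
# A tiny 2x2 colour "image": each pixel is a (red, green, blue) triple in 0..255.
# The pixel values are invented purely for illustration.
image = [
    [(255, 0, 0), (250, 10, 5)],
    [(245, 5, 0), (255, 0, 10)],
]


def feature_vector(img):
    """Average each colour channel over all pixels and scale it to 0..1."""
    pixels = [pixel for row in img for pixel in row]
    count = len(pixels)
    return [sum(p[channel] for p in pixels) / (count * 255) for channel in range(3)]


features = feature_vector(image)  # the "jelly bean": a list of three numbers
label = "face"                    # hypothetical label attached to this image
print(features, label)
```

The list of three numbers plays the role of one jelly bean, and the label tells the computer which colour that jelly bean should be.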

Now we're ready for our last step, which also happens to be the most important one. To effectively train the computer, we have to add more images so that the computer can start learning - practice makes perfect! At every iteration, the computer reorganises the beans into categories or groups. Then, we add a label.

So, each group gets a category that corresponds to the image type: face, dog or building. The last thing we need to do is to separate these beans or images. This step is called ‘classification’. We use lines to separate the groups from one another.

And that’s it - we now have three groups separated into three different regions.
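One simple way to get three regions separated by straight lines is a nearest-centroid classifier: find the centre of each group, then assign any point to the closest centre. The sketch below is only an illustration under that assumption, not necessarily the method used to produce the picture. The two-number feature vectors and the `classify` function are made up for the example.

```python
import math

# Hypothetical two-number feature vectors for each labelled group of jelly beans.
training = {
    "face": [(1.0, 1.0), (1.5, 0.5), (0.5, 1.5)],
    "dog": [(5.0, 1.0), (5.5, 0.5), (4.5, 1.5)],
    "building": [(3.0, 5.0), (3.5, 4.5), (2.5, 5.5)],
}


def centroid(points):
    """Return the average position of a group of 2D points."""
    count = len(points)
    return (sum(x for x, _ in points) / count, sum(y for _, y in points) / count)


# "Training": summarise each group by its centre.
centroids = {label: centroid(points) for label, points in training.items()}


def classify(point):
    """Return the label whose centre is closest to the point."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))


print(classify((1.2, 0.8)))
print(classify((3.0, 4.0)))
```

Because each point goes to its closest centre, the borders between regions are straight lines that sit halfway between neighbouring centres.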


When we classify a new image, the computer extracts features for that image and places the image in the correct position. In this case, the image lands on the boundary between the face and building regions. That makes sense because, as you’ve likely noticed, the image shows a person with a building in the background. So in this case, two labels are attached to the same image.
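One way to picture how an image near a boundary can end up with two labels is to keep every label whose group centre is almost as close as the nearest one. This is a rough sketch under that assumption; the centres, the `margin` value and the `labels_for` function are all hypothetical.

```python
import math

# Hypothetical group centres in a two-number feature space.
centroids = {"face": (1.0, 1.0), "dog": (5.0, 1.0), "building": (3.0, 5.0)}


def labels_for(point, margin=0.5):
    """Return every label whose centre is within `margin` of the nearest centre."""
    distances = {label: math.dist(point, centre) for label, centre in centroids.items()}
    nearest = min(distances.values())
    return sorted(label for label, d in distances.items() if d - nearest <= margin)


print(labels_for((2.0, 3.0)))  # a point halfway between two centres
print(labels_for((1.0, 1.0)))  # a point right on one centre
```

A point deep inside one region gets a single label, while a point sitting between two regions gets both.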

So, what did we learn this week? First of all, we learnt that each image is represented in the computer by a list of numbers called a feature vector. If the computer is trained with a lot of diverse data, it can learn to recognise multiple things at the same time. This more advanced technique is called multiclass classification.


If you liked this post, check out the rest of our blog and our YouTube channel!