How Computers Understand Images and Videos with AI


Artificial intelligence is one of the most talked-about yet least understood technologies in our world today. It’s no overstatement to say that AI is proving to be one of the fastest growing modern technologies.

Nonetheless, there’s still an aura of unnecessary confusion that clouds the entire industry. While in-depth insight can be beneficial, the vast majority of people will be put off by the prospect unless they have previous experience of AI. We can sympathise, so this simple explainer hopes to remedy that.

So, what is artificial intelligence? The Merriam-Webster dictionary defines it as “a branch of computer science dealing with the simulation of intelligent behavior in computers”. Interesting, but this definition doesn’t reveal much. In an effort to gently introduce AI, we’re going to briefly run through how it can allow a computer to ‘see’ and understand its environment.

With the help of an imaging device (read: camera), a computer can be taught to identify what it ‘sees’ from things it has already been ‘taught’. In practice, the user feeds the computer lots and lots of data, which it later uses to determine how closely a given thing matches that data. Thus, if one wants to use AI to automatically identify a certain thing, the computer analyses the object in question and compares it to the data it already has.

This is more easily demonstrated with an analogy. At 12 months old, a human baby can identify very simple objects, e.g. a ball, a dog, a parent. The baby sees basic colours, patterns and shapes, and it’s from past experience that he/she can determine what that ‘thing’ is. Computers are similar in this regard. They can identify things too, and they do it in a very similar way to babies: they compare what they see to what they’ve already seen in the past. Based on that information, it’s relatively simple for the computer to determine whether that thing is or isn’t a match.
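For readers who like to see an idea in code, here is a minimal sketch of ‘comparing what it sees to what it has already seen’. It is only an illustration, not how any particular AI product works: we assume every object has already been boiled down to a short list of numbers (a made-up ‘feature vector’), and we use cosine similarity as one possible way of scoring a match. The names `similarity`, `best_match_score` and all of the numbers are our own assumptions.

```python
import math

def similarity(a, b):
    """Cosine similarity between two equal-length lists of numbers.

    Values close to 1.0 mean the two lists point in almost the same direction.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match_score(seen, stored_examples):
    """Score the new object against every stored example and keep the best score."""
    return max(similarity(seen, example) for example in stored_examples)

# Hypothetical data the computer was 'taught' earlier (e.g. pictures of balls).
stored_balls = [[0.9, 0.1, 0.8], [0.8, 0.2, 0.9]]

# Two new objects the camera 'sees' (values invented for illustration).
looks_like_a_ball = [0.85, 0.15, 0.85]
something_else = [0.1, 0.9, 0.1]

print(best_match_score(looks_like_a_ball, stored_balls))  # high score: close match
print(best_match_score(something_else, stored_balls))     # low score: poor match
```

The higher the score, the more the new object resembles something the computer has already been shown.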

To further this explanation, consider a more realistic, albeit simplified, example: a researcher wishes to identify whether an object is or is not a cat with the aid of artificial intelligence. The researcher has to bear in mind that cats come in different colours, shapes, fur types, poses and even expressions. What he/she has to do is feed the computer data on all types of cats. To give the computer a better chance of succeeding, it has to receive thousands and thousands of cat pictures. This wealth of information, once the computer has processed it, is called the model. The model is the information that the computer can use to compare to the object it sees. While a baby would compare what he/she is seeing to what he/she has already seen, the computer has no past experience of its own, so it must determine whether the object it sees is or isn’t a match by comparing it to the model. It’s from this data that the computer makes a prototype. The prototype is just the ‘idea’ of what a cat is. It is the collection of data that epitomises a cat.
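To make the ‘prototype’ idea a little more concrete, here is a small sketch under heavy assumptions. We pretend each cat picture has already been reduced to three invented numbers, we build the prototype by simply averaging them, and we call something a match if it lies within a distance `threshold` that we picked arbitrarily. Real systems learn far richer representations; the function names and values below are hypothetical.

```python
import math

def build_prototype(examples):
    """Average the examples column by column to get one 'typical' example."""
    count = len(examples)
    return [sum(column) / count for column in zip(*examples)]

def is_match(seen, prototype, threshold=0.3):
    """Treat the object as a match if it is close enough to the prototype.

    math.dist (Python 3.8+) gives the straight-line distance between two points.
    The threshold of 0.3 is an arbitrary value chosen for this illustration.
    """
    return math.dist(seen, prototype) <= threshold

# Hypothetical numbers standing in for thousands of cat pictures.
cat_examples = [
    [0.9, 0.8, 0.2],
    [0.8, 0.9, 0.3],
    [1.0, 0.7, 0.1],
]
cat_prototype = build_prototype(cat_examples)

print(is_match([0.85, 0.80, 0.25], cat_prototype))  # a cat-like object
print(is_match([0.10, 0.20, 0.90], cat_prototype))  # something quite different
```

The more (and more varied) examples go into the average, the better the prototype captures what cats have in common, which is why the researcher needs so many pictures.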

In real-life cases, AI isn’t usually used just to identify whether or not an object is a cat, but the same technology can be applied to a variety of different categories including animals, buildings, food, etc. The concept is the same: the algorithm determines whether or not the object that it sees is a match to the model. If it doesn’t recognise the object, it will try to match the object in question with other models it has in its database. From this, an educated guess can be made, which may or may not be accurate. On top of this, recent advancements have allowed AI to work with videos as well as images. While the ability of these systems is somewhat limited, they can still be programmed to recognise several thousand objects. With regard to video, the concept is largely the same as in the above examples that used pictures.
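As a rough sketch of the ‘educated guess’ described above, the snippet below compares an object against several categories and picks the closest one, then does the same for every frame of a video. Everything here is an assumption made for illustration: the three categories, their invented prototype numbers, the use of straight-line distance, and the idea that a video is simply a list of frames that have each been reduced to three numbers.

```python
import math

# Hypothetical prototypes, one per category (values invented for illustration).
PROTOTYPES = {
    "cat": [0.9, 0.8, 0.2],
    "building": [0.1, 0.2, 0.9],
    "food": [0.5, 0.1, 0.4],
}

def best_guess(seen, prototypes):
    """Return the closest category and its distance.

    A large distance means the guess is shaky and may well be wrong.
    """
    scored = ((name, math.dist(seen, proto)) for name, proto in prototypes.items())
    return min(scored, key=lambda pair: pair[1])

def label_video(frames, prototypes):
    """Treat a video as a sequence of pictures and guess a label for each frame."""
    return [best_guess(frame, prototypes)[0] for frame in frames]

label, distance = best_guess([0.85, 0.75, 0.25], PROTOTYPES)
print(label, round(distance, 3))

# A two-frame 'video': the first frame resembles a cat, the second a building.
print(label_video([[0.9, 0.8, 0.2], [0.1, 0.2, 0.9]], PROTOTYPES))
```

Note that `best_guess` always returns something, even for an object unlike anything it knows, which is exactly why such a guess may or may not be accurate.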

That concludes our simple explanation of AI. Hopefully we managed to break it down well enough. There are more complicated avenues to explore, but for now we think this post is a good starting point for anyone interested in AI. Our next blog post will investigate how models are created and stored, so if you enjoyed this article, feel free to bookmark us so you can read the next post as soon as it’s published.