
Understanding AI: The Difference Between Knowledge and Prediction

In this article, Shama explains how the AI systems behind commonly used tools like ChatGPT work, and whether Generative AI can really be a direct substitute for search engines.


The following news articles are a sign that we live in a time where Artificial Intelligence (AI) is an undisputed part of our lives: 


With AI taking over every facet of our lives, from driving on the road to legal research, it would be inaccurate to say that AI does not affect us or is limited to certain parts of our lives. In fact, the diagram below shows the use of AI in various fields. The question then becomes: ‘To what extent should we use AI in our lives?’ To answer this, we have to determine how AI works, and from there our individual level of comfort in using it. In this explainer, we explore how AI functions and whether it should be seen as ‘an assistant’ or ‘a substitute’.


What is AI?

Intelligence is a human quality. When machines are programmed to display this same attribute, the phenomenon is termed AI. At its core, AI is a branch of computer science aimed at creating machines that can imitate and display human behaviour, specifically intelligence. So when ‘Siri’ cracks a joke upon your request, it is not just being funny, but also displaying a small slice of human-like behaviour. Today, AI is also used as a marketing buzzword to make software sound more appealing: a futuristic machine with supposed “human-like qualities” yet superhuman efficiency generates curiosity around the product. 


The term ‘AI’ was coined by John McCarthy; among the various other names proposed, including ‘automata studies’ and ‘complex information processing’, it was chosen partly to help secure research funding. The current level of AI development is considered the most preliminary one, termed ‘Artificial Narrow Intelligence’. (If that has piqued your curiosity, dive deeper here!)




Machine Learning, Cognitive Computing and AI: Same or Different?

AI comes with many more terms that we now use in our day-to-day conversations. We may believe that they are all synonymous, but that is not true. Artificial Intelligence, Cognitive Computing and Machine Learning are related, but it is crucial to understand their differences and what exactly each of them does. 


Artificial Intelligence (AI)

AI can be thought of as a specialised problem-solving tool, or a set of technologies implemented in a system to enable it to reason, learn, and act to solve a complex problem. AI systems are good at quickly breaking down huge amounts of data to recognise patterns and make autonomous decisions based on the inferred rules. 



Cognitive Computing (CC)

The use of the term “cognitive” is representative of what CC does - it simulates human thought processes. CC is a subset of AI. It sorts out vast, messy, and unstructured data (e.g., natural language, images, video) to understand context and nuance, similar to how a human understands the world. If you refer to the diagram above, you will see that CC can “suggest a career path”. This is more sophisticated than machine learning because it takes into account the user’s input (e.g., your preferences and requests) to generate output, and it can also explain its recommendations. 


Machine Learning (ML)

Machine Learning is also a subset of AI, and it focuses on optimizing algorithms to predict outcomes from large amounts of structured and quantitative data. It allows machines to use algorithms to perform specific tasks without explicit programming (meaning constant human programming). At this time, ML therefore lacks ‘general intelligence’. In other words, ML models do not ‘understand’; they are only good at ‘recognising’ patterns. There is a reason why Siri sometimes says ‘Sorry, but I do not understand’: the words do not fit the recognised and set patterns. 


There is more to ML than ChatGPT!

Before we rush to link ChatGPT with ML, let us understand that ML itself has further forms. 


  1. Supervised ML – This involves training the model on a labelled dataset. The training data contains inputs and outputs, and the model learns to map each input to the correct output. For example, if the model is to be trained on creditworthiness, the labelled input data will include asset ownership, sources of income, liabilities, investments, employment status, etc., and the output will be the likelihood of default, scored on a numeric or qualitative scale. 

  2. Unsupervised Learning – This form does not require the data to be labelled. It is closer to human learning, as the model identifies patterns in its training data on its own. 

  3. Reinforcement Learning – This can be simply understood as trial-and-error learning. In this form, the model interacts with the environment and receives feedback (rewards and penalties) which it uses to improve its performance. Examples where this model is used include algorithmic trading and autonomous vehicles (Read more here).
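To make the supervised case concrete, here is a minimal sketch in Python of a model learning from labelled examples. The credit figures and the 1-nearest-neighbour rule below are purely illustrative assumptions, not a real scoring method:

```python
# Toy supervised learning: a 1-nearest-neighbour classifier trained on
# labelled (input -> output) examples. All data is made up for illustration.

def distance(a, b):
    # Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Labelled training data: (income, liabilities) -> 'default' / 'no default'
training_data = [
    ((30, 25), "default"),
    ((90, 10), "no default"),
    ((40, 35), "default"),
    ((120, 20), "no default"),
]

def predict(features):
    # The "model" maps a new input to the label of its closest training example.
    nearest = min(training_data, key=lambda pair: distance(pair[0], features))
    return nearest[1]

print(predict((100, 15)))  # near the high-income examples -> "no default"
print(predict((35, 30)))   # near the low-income examples -> "default"
```

The "learning" here is trivially just memorising the labelled examples, but the essential shape is the same as in real supervised ML: labelled inputs in, a rule for mapping new inputs to outputs.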


While these are the broad forms of ML, ML can also be classified into two broad categories based on the output predictability which are:

  1. Probabilistic Models – They predict distributions over outcomes; thus, the same input can yield different yet plausible results. (Now we all know where ChatGPT falls!)

  2. Deterministic Models – As the name suggests, the output is fixed and predictable, i.e., for the same input the output will always be the same.  
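The distinction can be sketched in a few lines of Python; both functions and all probabilities below are invented purely for illustration:

```python
import random

def deterministic_model(x):
    # Same input always yields the same output.
    return 2 * x + 1

def probabilistic_model(x, seed=None):
    # The model defines a distribution over outcomes; sampling it can
    # yield different, yet plausible, results for the same input.
    rng = random.Random(seed)
    outcomes = [2 * x, 2 * x + 1, 2 * x + 2]
    weights = [0.2, 0.6, 0.2]  # made-up probabilities
    return rng.choices(outcomes, weights=weights)[0]

print(deterministic_model(5))   # always 11
print(probabilistic_model(5))   # usually 11, but sometimes 10 or 12
```

Calling `deterministic_model(5)` a thousand times gives a thousand identical answers; calling `probabilistic_model(5)` gives a spread of answers drawn from its distribution.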


This glossary of terms is incomplete without ‘Deep Learning’ and ‘Generative AI’ (GenAI). Both are subsets of ML. Deep Learning is a form of ML that utilizes neural networks, which simulate the working of the human brain. They are used for processing unstructured data; for example, finding all the photos you appear in within your gallery employs deep learning. GenAI uses probabilistic ML models to create new content, including text, audio and images. The training of GenAI models can employ deep learning methods. 
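As a rough illustration of the building block of such neural networks, here is a single artificial neuron in Python; the weights and inputs are arbitrary, made-up values:

```python
import math

# A toy single neuron, the building block of the neural networks used in
# deep learning. The weights below are made up for illustration; in real
# deep learning they are learned from data, not hand-picked.

def neuron(inputs, weights, bias):
    # Weighted sum of inputs passed through a sigmoid activation,
    # squashing the result into the range (0, 1).
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))

# e.g. two pixel intensities feeding one neuron
out = neuron([0.5, 0.8], [0.9, -0.4], 0.1)
print(round(out, 3))
```

A deep network simply stacks many layers of such neurons, which is what lets it pick out faces in photos from raw, unstructured pixel data.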


Now that we have many terms, let us just simplify them by understanding their relations in a diagrammatic form:



LLMs and Reliability: Confidence does not make it correct!

We live in a world where, even before a problem is fully narrated, we are ready with ChatGPT, Gemini, Claude and Perplexity to solve it. Now you may say, “What is the big deal? Aren’t they just replacing Google, Yahoo and Bing?” But that is not necessarily the case, and we attempt to explain why in the next section. 


What is the perception at large?

We conducted an informal survey with youth under the age of 30 to understand their reliance on GenAI tools and the purposes for which they use them (we should highlight that the majority of responses received are from university students). Our respondents indicated that most users of GenAI tools such as ChatGPT, Gemini and Perplexity used them ‘to get ideas’ and ‘for research’. These tools are equally well regarded among users for other purposes, such as ‘finding answers’, ‘verifying conclusions’ and ‘finding new perspectives’. Very few of our respondents rely on these tools for assistance in improving writing, creating summaries, generating images, etc. 


Despite knowing that these tools may not be accurate, users find them reliable for reasons such as efficiency and speed, because it is easier to build on a basic structure created by a GenAI tool than to work from scratch, and because they offer new and better perspectives. Distrust of these tools stems mainly from hallucinations and privacy concerns. Despite this, users preferred to continue using these tools because of their efficiency, for competitive reasons (since the majority will be using them), because using them is now an essential skill, and because they are a huge component of research work. Let us now unpack certain findings from the survey!


Why is an LLM and Google Search not substitutable?

Though the majority of respondents believed that GenAI tools cannot substitute search engines, some believe in the possibility as well. We would suggest that these are not direct substitutes, because a search engine like Google helps you find an answer, while an LLM simply gives you the answer. 


Google will scan the internet to find results that match your query so you can read and decide which information is relevant to you. To put it another way, Google is like a library. Separately, research has also highlighted the factors that govern users’ choice between a search engine and an LLM: LLMs were preferred when users wanted a more nuanced response than mere fact verification, while search engines were still preferred for fact-based enquiry. 


The reason why LLMs cannot be used for everything is that they are probabilistic in function; this means that even if we were to input the same prompt, the output may not necessarily be the same. A number of our respondents who rely on these tools indicated that they were unfamiliar with the fact that the AI uses a probabilistic model. 


Language processing works differently from a search engine’s indexing of pages. LLMs provide you (or at least claim to provide you) with the answer by predicting the most probable next word based on the preceding words. It is this probability-based model that causes hallucinations in LLMs, because the models are rewarded for guessing even when they do not know the answer. Thus, it is not possible to fix the problem by adding “no hallucinations” to your prompt, because hallucinations are a feature of the LLM. So, if you think Claude knows what it is saying, you are mistaken. It is only guessing its answers, just like a student who is not sure of the answer attempting MCQs. 
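A toy sketch of this next-word guessing, with an invented probability table (a real LLM learns billions of such probabilities from data rather than being given a hand-written table):

```python
import random

# Toy next-word prediction (not a real LLM): the "model" is just a table
# of probabilities for the next word given the previous word.
# All probabilities are made up for illustration.
next_word_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "law": 0.2},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"barked": 0.8, "sat": 0.2},
}

def generate(start, length, seed=None):
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        probs = next_word_probs.get(words[-1])
        if probs is None:
            break  # no known continuation for this word
        choices, weights = zip(*probs.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)

# Two runs with the same prompt can differ - that is the point.
print(generate("the", 2))
print(generate("the", 2))
```

Note that the model never checks whether "the dog barked" is true; it only picks a likely-sounding continuation, which is exactly why confident-sounding output is no guarantee of correctness.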


As indicated by the majority of the survey responses, no one can say ‘stop using AI’, for it indeed has its advantages, and refusing to use it in an increasingly AI-heavy workforce and world may put you at a disadvantage to your AI-savvier counterparts. However, to believe that AI can “think” and is hence 100% trustworthy would be a mistake. Knowing how AI works is the first step in ensuring you use it correctly and responsibly. Your ability to scope your queries well and then to verify the veracity of the AI’s generated output is what makes it useful to you. Even at its best, AI cannot rival you on your worst day in deciding what the answer should be! 

 
 
 
