What you can - and can't - hope to understand about how AI works (part three of a four-part series of blogs) |
---|
I'm sick of reading blogs pretending to explain how Large Language Models work, when the truth is that few people on earth can ever hope to understand how they are constructed. This blog is my attempt to explain as much as 99.99% of us can ever hope to understand about how AI works, and why the other bits will always remain opaque to most people.
|
So this is where our journey of understanding ends - well, almost.
Here's a simplistic (and believe me, simplicity is good when considering AI tools) view of the steps involved in building a large language model:
Step | Description |
---|---|
1 | Assemble lots of data (the more the better). One of the most obvious potential sources of data might be the text on the world's websites, although this raises questions of bias and of ownership (as discussed in the last part of this blog). |
2 | Parse it to build a large language model. This involves not just building up a list of all the tokens used (see previous part of this blog), but also creating a set of dimensions and working out where each token belongs within that space. Dimensions might capture things like how important a word is or where it might appear in a sentence, but in truth they are just numbers learned by the model (its parameters), and only the model itself is likely to "understand" what they all mean. |
3 | Train this model (for example by asking it questions and grading its responses to get it to fine-tune its parameters and data). |
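As a toy illustration of step 2, the sketch below builds a token list from a tiny piece of text and gives each token a position in a small vector space. Everything in it is an assumption made for illustration: real models split text into sub-word tokens rather than whole words, use thousands of dimensions rather than three, and learn the numbers during training rather than drawing them at random. The function names `build_vocabulary` and `make_embeddings` are hypothetical.

```python
import random


def build_vocabulary(text):
    """Return a dict mapping each distinct lower-cased word to an integer id."""
    vocabulary = {}
    for word in text.lower().split():
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary)
    return vocabulary


def make_embeddings(vocabulary, dimensions=3, seed=0):
    """Give every token a random starting vector.

    Assumption: random values stand in for the numbers a real model
    would learn (and keep adjusting) during training.
    """
    rng = random.Random(seed)
    return {
        token: [rng.uniform(-1.0, 1.0) for _ in range(dimensions)]
        for token in vocabulary
    }


text = "the owl saw the cat"
vocabulary = build_vocabulary(text)
embeddings = make_embeddings(vocabulary)

# Print each token's id and its (rounded) position in the 3-dimensional space
for token, token_id in vocabulary.items():
    print(token_id, token, [round(x, 2) for x in embeddings[token]])
```

Note that "the" appears twice in the text but only once in the token list: the list records which tokens exist, not how often they are used.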
The process of building and training a large language model is famously expensive, requiring huge inputs of data, computer processing power and electricity.
So how does an organisation like OpenAI, Google (the maker of Gemini) or Anthropic build a large language model?
Transformers provide the T in GPT (the acronym stands for Generative Pre-trained Transformer). Invented by Google as recently as 2017, a transformer involves recalibrating the input embedding vectors (remember them?) representing your question by applying the principle that some words in a sentence are more important than others. If you think that sounds straightforward, here's a core diagram from the Wikipedia page on attention in machine learning:
You're going to have to be good at vector spaces, tensors, probability distributions, vectors and matrices to understand this (and those are just the words I recognised in the articles on this subject). Most of these are things you would only cover properly as part of a maths or physics degree.
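For anyone who wants at least a flavour of the "some words are more important than others" idea, here is a minimal sketch of the weighting step, written in plain Python. It is a heavily simplified stand-in and not how any real model is implemented: the two-dimensional vectors are made up, and the sketch assumes that each word's vector is used directly as its query, key and value, whereas a real transformer first multiplies the vectors by learned weight matrices. The function name `self_attention` is hypothetical.

```python
import math


def softmax(scores):
    """Turn a list of scores into positive weights that sum to 1."""
    largest = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Simplified scaled dot-product self-attention.

    Assumption: queries, keys and values are the input vectors themselves;
    a real transformer derives them using learned weight matrices.
    """
    scale = math.sqrt(len(vectors[0]))
    outputs, all_weights = [], []
    for query in vectors:
        # Score how strongly this word relates to every word in the sentence
        scores = [dot(query, key) / scale for key in vectors]
        weights = softmax(scores)
        # Blend all the vectors together, weighted by those scores
        blended = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(len(query))
        ]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights


# Made-up 2-dimensional vectors standing in for three words
words = ["owl", "cat", "sat"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(vectors)

for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how much attention one word pays to every word in the sentence. Because the made-up vectors for "owl" and "cat" point in similar directions, "owl" gives more weight to "cat" than to "sat". The hard part, which this sketch skips entirely, is how a model learns vectors and weight matrices that make these weights useful.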
There are a lot of blogs trying to explain how large language models work, but they nearly all have the hallmark of having been created by AI tools, or just copying what is written elsewhere. I suspect that the number of people in the world who truly understand how large language models work could be numbered in the tens of thousands, or even just thousands.
If it sounds like I'm trying to put you off learning about this, that's because ... I am. I have no idea how many things work in life, beginning with our cat, but in the modern world you just have to accept that you're never going to understand many things and move on. I don't have the time or intellectual capacity to understand the maths behind the Transformer algorithm, and I would make a bet that you don't either!
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.