What are GenAI tools?

The difference between older AI tools such as those mentioned in the previous section and Generative Artificial Intelligence (GenAI) is that GenAI tools can generate outputs which are original in terms of the combination of words, images, or sounds. To understand how this works, you need to know a little about Large Language Models (LLMs).

LLMs are computer systems which have been trained to learn patterns and structures in text by analysing existing examples of documents, computer code, and digital image and audio information – anything which can be represented by letters and numbers. They learn which patterns of letters and numbers are most common, and how the patterns are usually structured in digital works. This process is called machine learning.

Learn more about this topic: If you want to understand how LLMs work in more detail, read this clearly written article by Timothy B. Lee and Sean Trott (2023).

The existing examples come from huge collections of ‘training data’ which people working for software companies have selected as representative of the data humans might want to read, see, produce, or edit. This selection process is a potential limitation which we will look at in module 2.

Next, companies have developed software which uses the LLM to predict the next word (or the next piece of computer code, image, or audio pattern) and so generate text, images, and sounds in response to simple prompts written in natural language. These tools use the patterns and structures learned from the training material to generate new outputs.
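The idea of ‘predicting the next word’ can be illustrated with a deliberately simplified sketch. Real LLMs use neural networks trained on vast amounts of data, not simple word counts; the sample text and function name below are invented purely for illustration.

```python
# Toy illustration of next-word prediction: count which word most often
# follows each word in a tiny sample text, then predict from those counts.
from collections import Counter, defaultdict

sample_text = "the cat sat on the mat and the cat slept on the mat"
words = sample_text.split()

# Learn the patterns: tally how often each word follows another.
following = defaultdict(Counter)
for current_word, next_word in zip(words, words[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the sample text."""
    return following[word].most_common(1)[0][0]

print(predict_next("on"))  # prints "the", since "the" always follows "on" here
```

An LLM does something conceptually similar, but instead of counting pairs of words it learns statistical patterns across enormous stretches of text, which is why its predictions can produce fluent new sentences rather than just echoing its input.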

These outputs are then checked to ensure that they make sense, and the results of the checking are fed back into the LLM so that mistakes are not repeated. This checking process is heavily dependent on humans. Most of us have probably done this kind of training work for free, by clicking on images containing buses, bridges, or some other feature ‘to prove we are human’ before we can proceed to another web page. These tests are called CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart, if you want the full title). This type of image checking has been used to train self-driving software for vehicles, so that the driving software can recognise different features.

[Image: screen capture of a CAPTCHA test showing nine photos of scenes which might be seen while driving, with an instruction to select all those containing crosswalks. Image CC licensed from Wikimedia.]

Note: one earlier type of CAPTCHA asked you to type in text displayed in distorted fonts and at odd angles, rather like handwriting. This was used to develop software to interpret handwriting, allowing the full digitisation of archives from pre-digital ages – a really useful application for researchers in the humanities.

With text outputs, people read through the outputs and identify sentences which don’t make sense, are factually inaccurate, or are offensive. We will think about this checking process in more detail in module 3, but in this module we will focus on the basics of the tools.

Note: Natural language means how we normally speak, rather than using special codes or clicking particular buttons in a program. You may already have experienced natural language processing if you use voice-activated tools such as Siri, Alexa, or Google Assistant. To return to the email example, clicking on ‘Send’ does the same thing as telling the program to ‘Send’ if you are using a voice-activated interface, and using natural language you might be able to say “Send an email to Peter Svensson with the subject ‘Fika tomorrow’.”

Software programs which use LLMs to generate new digital artefacts such as text, images, and sounds are collectively known as GenAI tools, which is what we will be focusing on in the rest of this resource.