Bias

Remember from module 1 that LLMs do not learn in the sense of evaluating their inputs and creating new outputs through reflection and thought; they simply consume their inputs and assign a probability to certain words coming after other words in a certain context. The old computing rule is “garbage in = garbage out”, and with GenAI the training material really matters. If you train your GenAI tool on material which contains a lot of content about, say, disbelieving climate scientists, then it will be more likely to produce outputs suggesting that climate change has few impacts. That is a fairly simple example to address, because most people agree with climate scientists, and the software developers will probably select peer-reviewed academic papers as part of the training material (maybe; at the moment we just don’t know how they do it).
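To make the “probability of the next word” idea concrete, here is a deliberately tiny sketch in Python. It is nothing like a real LLM, which uses neural networks trained on billions of documents, and the two miniature “training sets” are invented for illustration. It simply counts which words follow a given word, which is enough to show how the make-up of the training text directly changes which word is most likely to come next.

```python
from collections import Counter

def next_word_probabilities(training_text, context_word):
    """Count which words follow `context_word` in the training text and
    turn the counts into probabilities (rounded for readability)."""
    words = training_text.lower().split()
    followers = Counter(
        words[i + 1] for i in range(len(words) - 1) if words[i] == context_word
    )
    total = sum(followers.values())
    return {word: round(count / total, 2) for word, count in followers.items()}

# Two invented training sets: one dominated by climate-sceptic sentences,
# one dominated by mainstream climate-science sentences.
sceptic_corpus = (
    "climate change is exaggerated . climate change is exaggerated . "
    "climate change is real"
)
mainstream_corpus = (
    "climate change is real . climate change is real . "
    "climate change is exaggerated"
)

# The same question ("what word follows 'is'?") gets different answers
# depending on what the model was trained on.
print(next_word_probabilities(sceptic_corpus, "is"))
# {'exaggerated': 0.67, 'real': 0.33}
print(next_word_probabilities(mainstream_corpus, "is"))
# {'real': 0.67, 'exaggerated': 0.33}
```

The point of the toy example is only this: the model has no opinion of its own, so whatever is over-represented in the training material becomes over-represented in the outputs.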

Now think about a more controversial example, where people are less likely to agree. Suppose there is an election coming up, and you have some questions about the issues. What if the search engine you use to ask those questions has been trained on data which supports one political party and is very negative about another? All of the outputs you get will tell you why the first party is better. Again, maybe you think that is a very obvious example and you wouldn’t fall for it. You may well be right; one study showed that changing social media algorithms so that we see less of the material we already agree with does not have a significant impact on political polarisation (Garcia, 2023). A study of the impact of a scandal involving Facebook data, AI, and a company called Cambridge Analytica shows that most of us probably think we are immune to manipulation (Hinds, Williams, & Joinson, 2020), but the very existence of marketing as an academic discipline and profession suggests that marketing does work. At the very least, we need transparency over what goes into the training data and how the software is trained to ensure balance. Rozado (2023) and Rutinowski et al. (2024) both show that if you ask ChatGPT questions taken from standard political orientation tests, the outputs tend to be ‘left-leaning’ or ‘progressive’, even though the outputs claim to be politically neutral. These tests are not entirely suitable for use with LLMs, because they were designed for humans, but they are useful for illustrating the idea that certain opinions may be more likely to be presented by GenAI tools.
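If you are curious how studies like these are run in practice, the sketch below illustrates the general approach: present each test statement to a chat model and record its answer. It is an illustration only, not the actual code or test items used by Rozado or Rutinowski et al.; it assumes access to the OpenAI Python SDK (openai>=1.0) with an API key set in the environment, and the two statements are invented placeholders.

```python
# Sketch of the general approach: send political-orientation test statements
# to a chat model and record how it responds. Assumes the OpenAI Python SDK
# and an OPENAI_API_KEY environment variable; the statements below are
# invented placeholders, not the items used in the published studies.
from openai import OpenAI

client = OpenAI()

statements = [
    "Government regulation of business usually does more harm than good.",
    "A country should prioritise reducing inequality over economic growth.",
]

for statement in statements:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; any chat model would do
        messages=[
            {
                "role": "user",
                "content": (
                    "Do you agree or disagree with the following statement? "
                    "Answer only 'agree' or 'disagree'.\n\n" + statement
                ),
            }
        ],
    )
    print(statement, "->", response.choices[0].message.content)
```

Aggregating the answers over a full set of test items is what lets researchers place the model’s outputs on a political spectrum, with the caveat noted above that these instruments were designed for people, not for software.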