Hallucinations in Large Language Models

OMKAR HANKARE
Blog
5 MINS READ
20 January, 2025

Large Language Models (LLMs) have revolutionized the field of natural language processing, enabling sophisticated text generation, comprehension, and interaction capabilities. However, an open letter titled "Pause Giant AI Experiments" sparked global attention in 2023 by calling for a temporary halt to the development of AI systems more powerful than GPT-4. This underscores growing concerns about the rapid development of advanced AI systems and their potential risks.

Among these risks, one particularly pressing issue is the phenomenon of ‘Hallucinations’ in LLMs. Hallucinations occur when these AI systems generate inaccurate, misleading, or completely fabricated information, despite presenting it with high confidence. Let us explore the various forms of Hallucinations, and the consequences false AI-generated information can have across industries. 

In 2023, the Cambridge Dictionary named "Hallucinate" its Word of the Year, emphasizing the term's evolving significance in the context of Artificial Intelligence. This recognition reflects the growing awareness of AI's capabilities and its limitations, particularly the challenges posed by Hallucinations. 

Unlike traditional errors that stem from incorrect data or algorithmic flaws, Hallucinations in LLMs resemble creative misinterpretations of the input. Essentially, the LLM produces output that is entirely fictional or unrelated to the provided context. These hallucinations can take various forms, such as:

  • Generating Non-Existent Entities: The model might invent new entities, such as fictional people, places, organizations, or objects, presenting them as real.
  • Fabricating Events: It could describe events or scenarios that never occurred, weaving them into narratives as though they were factual.
  • Making Up Facts: The LLM might provide false or inaccurate information about historical events, scientific data, or other topics, confidently stating them as truths.

Major Types of AI Hallucinations:

  • False Negative: An AI reviewing medical records may miss signs of a serious condition, wrongly concluding that there is no issue (false negative), even though the symptoms suggest otherwise.
  • False Positive: The same AI could also mistakenly identify a disease that isn't present (false positive), "seeing" symptoms of a condition based on patterns that aren't relevant, causing unnecessary alarm and needless additional testing.

According to the research paper “A Survey on Hallucination in Large Language Models,” there are three types of LLM Hallucinations:

  • Input-Conflicting Hallucination: LLMs generate content that deviates from the source input provided by users.

In the paper's example of input-conflicting hallucination, the LLM gets a person's name wrong (Hill ⇒ Lucas) while summarizing a document.

  • Context-Conflicting Hallucination: LLMs generate content that conflicts with information they generated earlier in the same response.

In the paper's example of context-conflicting hallucination, the LLM first refers to a person as Silver and later calls the same person Stern, contradicting itself.

  • Fact-Conflicting Hallucination: LLMs generate text that contradicts established facts and knowledge about the world.

In the paper's example of fact-conflicting hallucination, the LLM states that the mother of Afonso II was Queen Urraca of Castile, while the correct answer is Queen Dulce Berenguer of Barcelona.

Why are Hallucinations a Problem?

Hallucinations in LLMs can have serious consequences, including the spread of misinformation and data-security risks in real-world applications. For example, a hallucinated report generated from patient information in the medical field can pose a serious risk to the patient. Such hallucinations also erode users' general trust in the technology, which is why it is important to address this problem quickly.

A real-life example is the case of a professor at Texas A&M University who failed an entire class after ChatGPT falsely claimed their papers were written by AI. This put many students' degrees at risk, which not only jeopardized their academic careers but also undermined confidence in the reliability of such technologies.

ChatGPT also made a false accusation of sexual harassment against George Washington University Law Professor Jonathan Turley. The AI model invented a non-existent Washington Post article and falsely accused Turley of harassing a female student during a class trip. Such incidents show how dangerous and misleading AI-generated content can be, and emphasize the need to establish stricter vetting mechanisms and ethical guidelines for the use of AI.

Quick Tips to Reduce Hallucinations as an End User

  • Ask for Sources: When interacting with an AI, prompt it to reference sources or cite data. This practice encourages the AI to rely on factual and verifiable information rather than generating ungrounded content. For example, asking "Can you provide references for this information?" can help ensure the output is accurate and reliable.
  • Break Down Complex Questions: Instead of asking multi-step or overly complex questions, split them into smaller, straightforward parts. This helps the AI remain focused on one aspect of the problem at a time, minimizing the likelihood of generating fabricated or unrelated information (see the sketch after this list).
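
The sketch below shows one way an end user (or a thin wrapper script) could apply both tips programmatically. It is a minimal illustration only: it assumes the OpenAI Python SDK (openai >= 1.0) with an API key in the environment, and the model name and sub-questions are placeholders, not part of this article.

```python
# Hypothetical end-user sketch combining both tips: ask for sources and break a
# complex question into smaller sub-questions. Assumes the OpenAI Python SDK
# (openai >= 1.0) and an OPENAI_API_KEY in the environment; the model name and
# sub-questions are placeholders for illustration only.
from openai import OpenAI

client = OpenAI()

# Instead of one multi-step prompt, ask focused sub-questions one at a time.
sub_questions = [
    "Which organization published the 'Pause Giant AI Experiments' open letter?",
    "What did the letter ask AI labs to do, in one sentence?",
]

history = []  # carry earlier Q&A forward so later answers stay consistent
for question in sub_questions:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {
                "role": "system",
                "content": (
                    "Answer concisely. Cite a verifiable source for every "
                    "factual claim, and say you are unsure if you cannot."
                ),
            },
            *history,
            {"role": "user", "content": question},
        ],
    )
    answer = response.choices[0].message.content
    history += [
        {"role": "user", "content": question},
        {"role": "assistant", "content": answer},
    ]
    print(f"Q: {question}\nA: {answer}\n")
```

Carrying each answer back into the conversation keeps later sub-questions consistent with earlier ones, while the standing instruction to cite sources nudges the model away from ungrounded claims.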

Using Hallucinations

Hallucinations in large language models can sometimes be viewed as a beneficial feature, especially when creativity and diversity are desired. These scenarios illustrate how hallucinations can be utilized effectively:

  • Creative Applications

Hallucinations enable language models to generate unique and original content. For instance, if you ask a model like ChatGPT to craft a fantasy story, you would want it to produce an entirely new plot with original characters, settings, and storylines, rather than replicating existing stories. This creativity stems from the model's ability to "hallucinate" by not strictly relying on its training data but instead generating imaginative, novel outputs.

  • Idea Generation

Hallucinations can foster diversity when exploring ideas. For example, during brainstorming, you may want the model to deviate from existing concepts in its training data and offer fresh perspectives. This ability to venture beyond known ideas allows users to explore innovative solutions and alternatives.

  • Adjusting Hallucinations with the Temperature Parameter

Many language models include a "temperature" setting, which controls the randomness of the model's output. Higher temperature values result in more varied and creative responses, introducing more hallucinations, while lower values make the output more deterministic and grounded in the training data. By adjusting the temperature through APIs, users can fine-tune the balance between creativity and accuracy based on their specific needs.
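
As a concrete illustration, the minimal sketch below contrasts a low and a high temperature setting for the same prompt. It assumes the OpenAI Python SDK (openai >= 1.0) and an API key in the environment; the model name and prompt are illustrative, not taken from this article.

```python
# Minimal sketch of the temperature trade-off, assuming the OpenAI Python SDK
# (openai >= 1.0) and an OPENAI_API_KEY in the environment; the model name and
# prompt are illustrative only.
from openai import OpenAI

client = OpenAI()
prompt = "Suggest a title and one-line premise for a fantasy novel."

for temperature in (0.2, 1.2):
    response = client.chat.completions.create(
        model="gpt-4o-mini",      # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # lower = more deterministic, higher = more varied
    )
    print(f"temperature={temperature}: {response.choices[0].message.content}")
```

In this API the temperature ranges from 0 to 2; values near 0 suit factual queries, while higher values fit creative tasks such as the storytelling and brainstorming uses described above.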

Conclusion

As LLM technology evolves, collaboration among researchers, developers, and policymakers will be critical to ensure its responsible development and deployment. Emphasizing the reduction of hallucinations and enhancing the benefits of LLMs will be key to unlocking their full potential while managing the associated risks.

References:

  • Future of Life Institute, "Pause Giant AI Experiments: An Open Letter": https://futureoflife.org/open-letter/pause-giant-ai-experiments
  • "A Survey on Hallucination in Large Language Models" (arXiv): https://arxiv.org/pdf/2309.01219.pdf
  • Rolling Stone, Texas A&M professor flunks students over false ChatGPT claims: https://www.rollingstone.com/culture/culture-features/texas-am-chatgpt-ai-professor-flunks-students-false-claims-1234736601
  • India Today, ChatGPT falsely accuses US law professor of sexually harassing a student: https://www.indiatoday.in/technology/news/story/chatgpt-falsely-accuses-us-law-professor-of-sexually-harassing-a-student-2357597-2023-04-09
