The views expressed herein do not constitute research, investment advice or trade recommendations, do not necessarily represent the views of all AB portfolio-management teams and are subject to change over time.
Artificial intelligence (AI) is a transformative technology, but as with any technology, it isn’t foolproof. AI tends to hallucinate, generating results not grounded in reality. Organizations can’t fully eliminate hallucinations, but they can take concrete steps to manage the risk. And they might find positive uses for AI hallucinations.
As AI’s huge strides redefine tools, processes and techniques across industries, we’d all prefer that it be 100% accurate all the time. But that’s not realistic, so firms must work to identify and reduce hallucinations. It’s a big reason why we believe AI strategies must combine AI’s prowess with human oversight. Here are five ways human experts can work with AI to tackle the challenge.
While AI models are extremely powerful, much of their effectiveness comes down to an age-old technology truism: user input makes a big difference. That means prompting with clear instructions that can guide models toward grounded answers and discourage them from filling in gaps with plausible-sounding but unsupported information.
A key aspect of good prompting is letting the model know what to do in the case of uncertainty. For example, human experts should instruct it to distinguish facts from assumptions, flag unknown values and ask for further clarification rather than speculating about what the user is trying to achieve or the context of the request.
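As a concrete illustration, these uncertainty-handling instructions can be baked into a reusable prompt template. This is a minimal sketch; the rule wording and function names are our own illustrative choices, not a canonical prompt.

```python
# Illustrative system-prompt rules telling the model how to handle
# uncertainty: label assumptions, flag unknowns, ask before guessing.
UNCERTAINTY_RULES = """\
Follow these rules when answering:
1. Separate facts (with sources) from assumptions, and label each.
2. If a value is unknown or not in the provided material, write
   "UNKNOWN" rather than estimating it.
3. If the request is ambiguous, ask a clarifying question instead of
   guessing the user's intent or context."""

def build_prompt(task: str) -> str:
    """Combine a user task with the uncertainty-handling rules."""
    return f"{UNCERTAINTY_RULES}\n\nTask: {task}"

prompt = build_prompt("Summarize Q3 revenue drivers from the attached filings.")
```

The same template can then be reused across every request in a workflow, so the guardrails don't depend on each user remembering to type them.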
One effective way to reduce AI hallucinations is to create a “fence” of sorts that defines the specific knowledge base AI will use to provide answers. Pointing AI models to a focused set of thoroughly vetted documents helps reduce freelancing and ensure that answers come directly from trusted sources.
This approach, known as retrieval-augmented generation (RAG), combines document retrieval with the model’s response generation. It helps ensure that AI doesn’t simply guess or make up information beyond the scope of the curated documents. RAG is especially critical for AI agents, which handle many tasks on their own with minimal human involvement; human experts still review the output, but RAG helps ensure it’s on point.
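The retrieval step of a RAG pipeline can be sketched as follows. Production systems typically use embedding-based search; the term-overlap scoring below is a deliberate simplification, and the document names and prompt wording are hypothetical.

```python
import re

def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return ids of the k documents sharing the most terms with the query.
    (A stand-in for embedding search in a real RAG system.)"""
    q_terms = set(re.findall(r"\w+", query.lower()))
    scores = {doc_id: len(q_terms & set(re.findall(r"\w+", text.lower())))
              for doc_id, text in documents.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def grounded_prompt(query: str, documents: dict[str, str]) -> str:
    """Fence the model's answer to the retrieved, vetted passages only."""
    context = "\n".join(f"[{d}] {documents[d]}" for d in retrieve(query, documents))
    return ("Answer using ONLY the passages below. If they do not contain "
            f"the answer, reply 'NOT IN SOURCES'.\n{context}\n\nQuestion: {query}")

docs = {"memo-1": "Widget revenue grew 8% in 2023 on pricing.",
        "memo-2": "Headcount was flat year over year.",
        "memo-3": "Widget margins compressed on input costs."}
prompt = grounded_prompt("What happened to widget revenue?", docs)
```

The explicit "NOT IN SOURCES" escape hatch matters: it gives the model a sanctioned way to decline rather than fabricate an answer outside the fence.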
We’ve all seen AI-generated summary answers in internet search engines that provide specific citations to source material. It makes sense to include the same feature in an organization’s AI governance: users can easily check a model’s answers by clicking through to the original content.
Requiring citations helps promote accountability and verification. If a model cites a macroeconomic result or summarizes the performance of a company’s product line, validation is a click away. Citations also encourage AI models to tap credible sources and avoid fabricating answers to satisfy their human colleagues’ requests.
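A citation requirement is also mechanically checkable: every citation in an answer can be validated against the vetted source set before the answer reaches a user. This sketch assumes a simple bracketed-id citation format of our own invention.

```python
import re

def check_citations(answer: str, known_sources: set[str]) -> list[str]:
    """Return cited ids (formatted like [memo-1]) not in the vetted set."""
    cited = re.findall(r"\[([\w-]+)\]", answer)
    return [c for c in cited if c not in known_sources]

answer = "Revenue grew 8% [memo-1], while margins fell [memo-9]."
unverified = check_citations(answer, {"memo-1", "memo-2", "memo-3"})
# unverified == ["memo-9"]: an id outside the vetted set, so the claim
# it supports should be flagged for human review before use
```

Catching an unrecognized citation doesn't prove the claim is wrong, but it routes exactly the right passages to a human checker.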
In financial services, the “maker-checker” framework is an operational control in which one person performs a task and a second person independently reviews and approves it. Applying that framework to the AI world translates into human experts checking models’ output for quality and accuracy.
Let’s say an analyst prompts ChatGPT to analyze several documents and provide specific financial projections for a company. That user might decide to ask the model the same question several times and compare all the responses before moving forward with a conclusion.
Other models can get in on the quality-control act, too. An operations expert might receive an answer from ChatGPT, and then turn around and provide the same input to another model, say Gemini, to assess how the models answer. Users can also extend this “multi-model validation” to another dimension by asking one AI model to directly check the output of another model’s work.
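A crude version of this cross-checking can be automated before a human ever looks at the output: compare whether two answers cite the same figures, and escalate disagreements. The canned strings below stand in for live responses from two different models, and the numeric-overlap test is an illustrative heuristic, not a full consistency check.

```python
import re

def extract_numbers(text: str) -> set[str]:
    """Pull all numeric figures (e.g. 4.2, 18, 2025) out of an answer."""
    return set(re.findall(r"-?\d+(?:\.\d+)?", text))

def answers_agree(answer_a: str, answer_b: str) -> bool:
    """Crude multi-model validation: do both answers cite the same figures?"""
    return extract_numbers(answer_a) == extract_numbers(answer_b)

# Canned text standing in for output from two different models:
a = "Projected 2025 revenue: 4.2 billion, margin 18%."
b = "We project revenue of 4.2 billion with an 18% margin in 2025."
needs_review = not answers_agree(a, b)  # False here: the figures match
```

When the figures diverge, nothing is auto-corrected; the disagreement simply becomes a signal that a human expert should adjudicate.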
One great thing about organizations that cultivate high-level human talent is that there’s no shortage of in-house experts across all types of disciplines. In the era of AI, those experts are a potent source of feedback on how AI models are delivering in terms of quality and accuracy.
Think of an economist who reviews a model’s macroeconomic analysis and finds that the answer lacks nuance and may be misinterpreting an underlying relationship. If developers are equipped with that feedback, they can identify weaknesses, correct inaccuracies and help the model better grasp a complex topic. This cycle helps the model continue to improve how it interprets its knowledge base.
Given the robust effort to reduce AI hallucinations, it might be natural to assume that they’re all counterproductive. But we’ve found several ways that hallucinations offer creative and practical value.
Yes, hallucinations create risk in terms of factual accuracy, but they can also foster innovative ideas the same way that human brainstorming functions. One out-of-the-box idea, even if it’s not completely feasible, may spur related ideas that might not have surfaced without the spark of the hallucination.
Helping fill gaps in incomplete or missing data is another way hallucinations can contribute. Researchers frequently incorporate historical data into their analysis, and data may be unavailable for certain time periods—the dreaded broken time series. Rather than throw away the series, analysts can use AI models to create fill-in data that, in the human expert’s judgment, may be sensible given the context.
As organizations embrace AI’s potential, a close integration of human expertise and machine intelligence is essential in developing new AI uses and refining the tools. A blend of effective techniques and expert insight, in our view, can help tame hallucinations and, in some cases, put them to good use.
Investment involves risk. The information contained here reflects the views of AllianceBernstein L.P. or its affiliates and sources it believes are reliable as of the date of this publication. AllianceBernstein L.P. makes no representations or warranties concerning the accuracy of any data. There is no guarantee that any projection, forecast or opinion in this material will be realized. Past performance does not guarantee future results. The views expressed here may change at any time after the date of this publication. This article is for informational purposes only and does not constitute investment advice. AllianceBernstein L.P. does not provide tax, legal or accounting advice. It does not take an investor's personal investment objectives or financial situation into account; investors should discuss their individual circumstances with appropriate professionals before making any decisions. This information should not be construed as sales or marketing material or an offer of solicitation for the purchase or sale of, any financial instrument, product or service sponsored by AllianceBernstein or its affiliates. This presentation is issued by AllianceBernstein Hong Kong Limited (聯博香港有限公司) and has not been reviewed by the Securities and Futures Commission.