I’ve spoken to several people around this topic in the past few weeks.
One guy I met who owns a range of medical clinics around Australia was in the process of implementing LLMs for the reconciliation process for his accounting software.
Another one was leveraging LLMs to serve up sensitive health data to his internal team as a knowledgebot.
This is not the correct way to approach LLMs.
Simply slotting an LLM into these use-cases can cause massive business catastrophes, with a pile-up of errors you don’t even know exist.
Even worse, LLMs are so good at appearing truthful that you WON’T know these problems exist until much later on.
The core issue is something called hallucinations.
What are hallucinations? Well, you know that time you were at a party and some dude told you all this grand stuff about himself that was suspiciously probably not true? Sometimes an LLM is THAT guy. LLMs sometimes only appear to know everything, when really they’re reaching into their back pocket to make up something that YOU will believe, but that has no factual evidence whatsoever.
Here’s the proof:
Can you see the “Factual Consistency Rate” above? Yeah, that’s not a made up stat. This is a team of professionals who essentially run LLMs through fact-check exercises, and they found that 2.5% of the time, GPT 4 Turbo, the BEST LLM in the WORLD, MAKES UP FACTS.
Now, assume your business is running these at scale. A business that processes 800 invoices a day and runs its reconciliation process through an LLM can expect 20 errors a day (2.5% of 800). That’s 100 errors over a five-day working week!
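Here’s that back-of-the-envelope maths as a tiny Python sketch, in case you want to plug in your own numbers. The function name and the five-day working week are my own assumptions for illustration, and it treats the 2.5% figure as a flat per-invoice error rate, which is a simplification.

```python
def expected_errors(items_per_day, error_rate, working_days=5):
    """Rough expected error counts, assuming a flat per-item error rate.

    working_days=5 is an assumption (a standard working week).
    """
    per_day = items_per_day * error_rate
    per_week = per_day * working_days
    return per_day, per_week


# Figures from the example above: 800 invoices a day, 2.5% error rate.
daily, weekly = expected_errors(800, 0.025)
print(f"Expected errors per day:  {daily:.0f}")
print(f"Expected errors per week: {weekly:.0f}")
```

Swap in your own daily volume and you will quickly see whether the error count is something your team could realistically catch by hand.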
The Solution
Well, there’s no 100% way to deal with hallucinations, as they’re akin to human error. As humans, we haven’t quite found a way to NEVER MAKE MISTAKES, but we are very good at setting up closed environment systems that get pretty close. (Consider a manufacturing plant that screws lids onto bottles, and how rigorously it is tested to achieve close to perfect results.)
So, just like with our closed environment system, there are two things you can do:
First method: Do it yourself
- Find a machine learning engineer to fine-tune or train a model specifically for your use case. You will need plenty of data, and plenty of reliable examples without any mistakes.
- Your machine learning engineer will need to be well trained to understand your use-case, and will need to understand how to transform the data to obtain reliable outputs. To achieve close to 100% reliable outputs you may need hundreds of thousands of rows of data.
- Your machine learning engineer will then test your model against your data (ideally a portion held back from training) to ensure it is answering correctly. Again, a lot of data is preferred here to produce a more accurate model.
- You will then need a software engineer to build the model into a process that fits with your process flow. If you have one already then great! If you don’t, then you will need to find one that can also understand how a business works. (Could be hard to find).
- Lastly, you need your machine learning engineer and software engineer to come up with an intelligent way to detect errors in these responses, so they can be flagged for manual review.
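To give a feel for that last step, here is a minimal sketch of one way to flag suspicious LLM outputs for manual review in the reconciliation example. It assumes the LLM proposes an invoice-to-transaction match and that you can check the proposal against your own records with plain, deterministic rules. Every name here (ProposedMatch, needs_manual_review, the sample IDs and amounts) is hypothetical and made up for illustration. A real system would need far more checks than this.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ProposedMatch:
    """A match suggested by the LLM (hypothetical structure)."""
    invoice_id: str
    transaction_id: str
    amount: Decimal  # the amount the model claims was reconciled


def needs_manual_review(match, invoices, transactions):
    """Return a list of reasons to flag the match; empty means checks passed."""
    reasons = []
    invoice_amount = invoices.get(match.invoice_id)
    txn_amount = transactions.get(match.transaction_id)

    # A made-up ID is a classic hallucination, so check it exists at all.
    if invoice_amount is None:
        reasons.append("invoice id not found in ledger")
    if txn_amount is None:
        reasons.append("transaction id not found in bank feed")

    # Only compare amounts when both records actually exist.
    if invoice_amount is not None and txn_amount is not None:
        if not (invoice_amount == txn_amount == match.amount):
            reasons.append("amounts do not agree")
    return reasons


# Made-up sample records standing in for your accounting data.
invoices = {"INV-001": Decimal("120.00"), "INV-002": Decimal("75.50")}
transactions = {"TXN-9": Decimal("120.00"), "TXN-10": Decimal("75.00")}

proposals = [
    ProposedMatch("INV-001", "TXN-9", Decimal("120.00")),
    ProposedMatch("INV-002", "TXN-10", Decimal("75.50")),
    ProposedMatch("INV-404", "TXN-9", Decimal("120.00")),
]

flagged = {p.invoice_id: needs_manual_review(p, invoices, transactions)
           for p in proposals}
for invoice_id, reasons in flagged.items():
    status = "REVIEW: " + "; ".join(reasons) if reasons else "ok"
    print(invoice_id, status)
```

The idea is that the LLM never gets the final say on its own: anything the simple rules cannot confirm goes to a human. This won’t catch every hallucination, but it stops the obvious ones from piling up unnoticed.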
Second method: Find a company that understands the business process, the generative AI models, and the software engineering components.
Yes, this is a bit of self promotion. But everything in the “do-it-yourself” section is something we can help you with. We would probably advise against doing something like reconciliation with LLMs, due to the technical nature of it (and because someone will soon come up with a way to do this more reliably and at greater scale).
But let’s say the cost benefit to your business was there (you’re processing hundreds of thousands of transactions). Then you would temporarily hire us as part of your team. We would work with your team to understand exactly what the process should look like, map it out, and put it through rigorous testing.
See, this is where I feel our place is in the world for businesses: helping them with problems that are almost “one-off” solutions, and helping them maintain those solutions for a low fee, without them having to over commit. Then we go around the industry helping other businesses with the same kind of issue.
So…that’s us! Hope you enjoyed this.