SHAKO

("Shay-ko") 

Chat with your PDFs

We've been exploring new ways to use OpenAI's tools for everyday business tasks. One intriguing idea is to use AI to chat with your organization's internal documents. This tool lets you ask questions in plain language and get chat-style written answers about specific information contained in the document you choose. 

Often, reading and comprehending an entire document is time-consuming, especially with product manuals or detailed spec documents that pack in loads of information and span hundreds of pages. Public Chatbots like ChatGPT have no knowledge of your company's internal documents. However, we have implemented a process called "Fine Tuning" a Large Language Model (LLM), which lets you customize the Chatbot with your own set of documents. 

Below is an example where I "tuned" a Chatbot with the manual for a Lincoln Electric Torchmate Plasma Cutting Table, which is 56 pages long and approximately 9.5 MB, packed with setup information, run parameters, maintenance details, and more. The Chatbot is written in Python. The process works like this: you provide the Chatbot with the location of the local PDF document, it digests all the information within the PDF, and then it responds to your questions naturally. Let's put it to the test!



Pretty helpful resource! 

Taking it a step further, you can also fine-tune with multiple documents and obtain comprehensive answers based on information spanning several sources. A simple use case for this is learning how to integrate components together. You could fine-tune the Chatbot with manuals for two or more components and then prompt it with questions like, "Explain how to connect the Lincoln Electric Torchmate Plasma Table to the NETGEAR Multi-Gigabit Ethernet Switch." Alternatively, you could fine-tune the Chatbot with your company's training documents and use it as a resource for company procedures. 

This powerful approach allows you to quickly access summarized information from multiple complex documents relevant to your project or task. If you're curious about how to do this, read on for more information in the bonus section! Or, you can send me an email at justin@shako.tech.

If you enjoyed this content, hit the subscribe button below so you never miss a post. It's free!

Please help us out by sharing this article with friends and colleagues!

BONUS!

How it works - To understand how fine tuning works, it helps to first understand, in general terms, how an AI Chatbot is built.

GPT is the name of the AI model used, and it stands for Generative Pre-Trained Transformer. Let's break that down. 

The word "transformer" refers to a programming architecture that takes text input and generates text output. 

Think of a transformer like a pet parrot. A parrot has the ability to hear language and generate language, but it doesn't know what to say right out of the box. You have to train a parrot over time by giving it examples of what to say when it hears a particular prompt.

GPT has been "pre-trained", meaning it has been given a vast amount of diverse text data from the internet. It has learned to understand the structure, patterns, and context of language in a generalized way. Being pre-trained means that GPT does not have specific knowledge about anything in particular; instead, it serves as a foundation for understanding language.

Right, so when you fine-tune an AI model like GPT this way, there are a few steps. First, you use GPT to process the words and phrases in the PDF, which it recognizes from its training. It basically gives each piece of the PDF text a list of scores that reflect how relevant it is to language the model already knows about.
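My script isn't reproduced here, but the sketch below shows roughly what that first step could look like, assuming the pypdf library for reading the PDF and OpenAI's embeddings endpoint for the scoring. The file name, chunk size, and model name are illustrative choices, not necessarily what I used.

```python
# A rough sketch of the first step: read the PDF, split it into chunks, and have
# OpenAI score (embed) each chunk. Assumes the pypdf and openai packages; the
# file name, chunk size, and model name below are illustrative.
from pypdf import PdfReader
from openai import OpenAI

client = OpenAI()  # reads your OPENAI_API_KEY environment variable

def load_pdf_text(path: str) -> str:
    """Pull the raw text out of every page of the PDF."""
    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def chunk_text(text: str, size: int = 1000) -> list[str]:
    """Split the document into pieces small enough to score individually."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(texts: list[str]) -> list[list[float]]:
    """Return one embedding (the 'list of scores') per chunk of text."""
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in response.data]

manual_chunks = chunk_text(load_pdf_text("torchmate_manual.pdf"))  # hypothetical file name
chunk_scores = embed(manual_chunks)
print(f"Scored {len(manual_chunks)} chunks of the manual.")
```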

For example, if GPT is scoring the text "beautiful", that text would receive high scores in relation to text like "sunset", "bride", or "song".
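Under the hood, "high scores in relation to" is usually measured by comparing embeddings, typically with cosine similarity. Here's a toy sketch of that comparison, again assuming the openai package; the model name and word list are just the example above plus an unrelated word, and the exact numbers depend on the model, but related words should score noticeably higher.

```python
# A toy sketch of how two pieces of text are compared: embed both, then take the
# cosine similarity of their score lists. Assumes the openai package; the model
# name and word list are illustrative.
import math
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> list[float]:
    return client.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

beautiful = embed("beautiful")
for word in ["sunset", "bride", "song", "spreadsheet"]:
    print(word, round(cosine_similarity(beautiful, embed(word)), 3))
```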

Then, when you ask GPT a question, it does the same thing with the language in your question - it applies scores to the question text based on its relevance to the language it has been trained on. 

Finally, you match the question's scores against the document's scores to find the most relevant information. From that subset of information, GPT generates the highest-scoring natural language response.
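Here's a sketch of those last two steps, assuming the openai package. It re-creates the small embedding and similarity helpers from the sketches above so it stands on its own, scores the question, ranks the document chunks against it, and hands the top matches to a chat model as the only context for the answer. The example chunks, model names, and prompt wording are all illustrative, not taken from the actual manual or my actual code.

```python
# A sketch of the final steps: score the question, rank the document chunks
# against it, and ask a chat model to answer from the top matches only.
# Assumes the openai package; example chunks, model names, and prompt are illustrative.
import math
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in response.data]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Stand-ins for the scored chunks from the earlier sketch (not real manual text).
manual_chunks = [
    "Example chunk about setting cut speed and amperage for mild steel.",
    "Example chunk about cleaning torch consumables during maintenance.",
]
chunk_scores = embed(manual_chunks)

question = "What settings should I use to cut mild steel?"
question_score = embed([question])[0]

# Match the question's scores against each chunk's scores and keep the best matches.
ranked = sorted(zip(manual_chunks, chunk_scores),
                key=lambda pair: cosine_similarity(question_score, pair[1]),
                reverse=True)
context = "\n\n".join(chunk for chunk, _ in ranked[:3])

# Hand the most relevant chunks to GPT and let it write the natural language answer.
reply = client.chat.completions.create(
    model="gpt-4o-mini",  # any OpenAI chat model would work here
    messages=[
        {"role": "system", "content": "Answer using only the provided manual excerpts."},
        {"role": "user", "content": f"Manual excerpts:\n{context}\n\nQuestion: {question}"},
    ],
)
print(reply.choices[0].message.content)
```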

In the example I built above, I applied GPT to my PDF document and only that document. So even though GPT is pre-trained with internet data, it doesn't consider any factual information from the internet when generating responses. It only uses its pre-trained data for language comprehension. It's using its training to read for you.

The diagram below depicts how the Chatbot program flows. GPT is used three times to process language - comprehending the PDF text, comprehending the question, and generating the answer.

This program can actually be written in only a couple dozen lines of Python code. However, you need access to a GPU to process your documents quickly, and you have to pay each time you use OpenAI's models. 

About the author: Justin Pratt