Converting PDF Documents to Text Using AI
In today's video, we're discussing how to convert PDF documents into text and then use that text for various purposes. For instance, let's say you have a contract and you want to pull out the line items. Traditionally, you would have to do this manually, which can be time-consuming, especially if you have multiple contracts to go through.
Introduction to Make.com
We're using a platform called Make.com to achieve this. If you're new to Make.com, there's a crash course available that takes you through everything you need to know. However, if you want to continue with this video, we'll break it down so you can easily follow along.
Converting PDF documents to text using OCR
The blueprints for this process are available for free in the description. All you have to do is download the JSON file, click the three dots, and import it into your Make.com account. Within 30 seconds, you'll have this exact scenario ready to go.
Triggers for the Workflow
There are typically three or four places you might want to get your documents from: Gmail, Google Drive, or your CRM, for example whenever a new document is added to it. Any of these can start the workflow automatically, and the extracted contract data can then be inserted into something like Google Sheets or added to QuickBooks to log a payment.
Triggering the workflow from Google Drive
Using OCR to Convert PDF to Text
The first step in the workflow is using OCR (Optical Character Recognition) to convert the PDF document into plain text. The response looks like this, with the contract for Jon O Catly and the automatable contract and pricing. It does a decent job pulling out numbers, emails, and other important information.
Results of using OCR to convert PDF to text
Extracting Important Information
Once we have the text, we want to extract all the important information from the document. This is where we use the OpenAI ChatGPT module. We ask it to return the information as key-value pairs: the keys, such as invoice number, stay the same from document to document, while the values change every time.
Extracting important information from the document
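To make the key-value idea concrete, here is a minimal Python sketch of what checking such a reply could look like outside Make.com. It assumes the model was asked to reply in JSON. The field names (`invoice_number`, `line_items`), the helper `parse_extraction`, and the sample reply are all hypothetical and made up for illustration; the video only names "invoice number" as an example key.

```python
import json

# Hypothetical keys; the video only mentions "invoice number" as an example.
REQUIRED_KEYS = ("invoice_number", "line_items")


def parse_extraction(raw_reply: str) -> dict:
    """Parse a JSON reply and check that the expected keys are present."""
    data = json.loads(raw_reply)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of key-value pairs")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    if not isinstance(data["line_items"], list):
        raise ValueError("line_items should be a list")
    return data


# Made-up reply, for illustration only.
sample_reply = (
    '{"invoice_number": "INV-001", '
    '"line_items": [{"description": "Setup", "quantity": 1, "amount": 100.0}]}'
)
parsed = parse_extraction(sample_reply)
print(parsed["invoice_number"])
```

Checking for fixed keys before using the reply is one way to catch a document where the model could not find a value, rather than passing an incomplete record on to the next step.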
Using an Iterator to Post Line Items
We use something called an iterator to go through each line item one by one and post it into the Google Sheet. We start with the first line item, post it into the Google Sheet, and then move on to the second line item, and so on, until we've gone through all nine line items.
Posting line items into the Google Sheet using an iterator
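Conceptually, the iterator described above behaves like a simple loop: one row is posted per line item, in order. The Python sketch below illustrates that idea only. The function `post_line_items`, the column names, and the in-memory `sheet_rows` list standing in for the spreadsheet are assumptions for illustration, not Make.com's or Google Sheets' actual API.

```python
from typing import Callable


def post_line_items(line_items: list[dict], append_row: Callable[[list], None]) -> int:
    """Call append_row once per line item, in order, and return how many were posted."""
    posted = 0
    for item in line_items:
        # Hypothetical columns: description, quantity, amount.
        row = [item.get("description", ""), item.get("quantity", ""), item.get("amount", "")]
        append_row(row)
        posted += 1
    return posted


# In-memory stand-in for the sheet; a real workflow would add rows to Google Sheets.
sheet_rows: list[list] = []
items = [
    {"description": "Setup", "quantity": 1, "amount": 100.0},
    {"description": "Support", "quantity": 2, "amount": 50.0},
]
count = post_line_items(items, sheet_rows.append)
print(count, sheet_rows)
```

Passing the row-writing step in as a function keeps the loop itself independent of where the rows end up, which mirrors how the iterator module feeds whichever module comes after it.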
Conclusion
In conclusion, we've learned how to convert PDF documents to text using OCR and then extract the important information from that text with the OpenAI ChatGPT module. We've also learned how to use an iterator to post line items into a Google Sheet. The whole process can be automated using Make.com, and the blueprints are available for free in the description.
Thank you for watching, and I hope you found this valuable. If you did, make sure to subscribe and leave your thoughts in the comment section.