PDF Invoice Extractor (AI) End-to-end pipeline: Watch Drive ➜ Download PDF ➜ OCR text ➜ AI normalize to JSON ➜ Upsert Buyer (Account) ➜ Create Opportunity ➜ Map Products ➜ Create OLI via Composite API ➜ Archive to OneDrive. Node by node (what it does & key setup) 1) Google Drive Trigger Purpose : Fire when a new file appears in a specific Google Drive folder. Key settings : Event: fileCreated Folder ID: google drive folder id Polling: everyMinute Creds: googleDriveOAuth2Api Output : Metadata { id, name, ... } for the new file. 2) Download File From Google Purpose : Get the file binary for processing and archiving. Key settings : Operation: download File ID: ={{ $json.id }} Creds: googleDriveOAuth2Api Output : Binary (default key: data ) and original metadata. 3) Extract from File Purpose : Extract text from PDF (OCR as needed) for AI parsing. Key settings : Operation: pdf OCR: enable for scanned PDFs (in options ) Output : JSON with OCR text at {{ $json.text }} . 4) Message a model (AI JSON Extractor) Purpose : Convert OCR text into strict normalized JSON array (invoice schema). Key settings : Node: @n8n/n8n-nodes-langchain.openAi Model: gpt-4.1 (or gpt-4.1-mini ) Message role: system (the strict prompt; references {{ $json.text }} ) jsonOutput : true Creds: openAiApi Output (per item) : $.message.content → the parsed JSON (ensure it’s an array). 5) Create or update an account (Salesforce) Purpose : Upsert Buyer as Account using an external ID. Key settings : Resource: account Operation: upsert External Id Field: tax_id__c External Id Value: ={{ $json.message.content.buyer.tax_id }} Name: ={{ $json.message.content.buyer.name }} Creds: salesforceOAuth2Api Output : Account record (captures Id ) for downstream Opportunity . 6) Create an opportunity (Salesforce) Purpose : Create Opportunity linked to the Buyer (Account). Key settings : Resource: opportunity Name: ={{ $('Message a model').item.json.message.content.invoice.code }} Close Date: ={{ $('Message a model').item.json.message.content.invoice.issue_date }} Stage: Closed Won Amount: ={{ $('Message a model').item.json.message.content.summary.grand_total }} AccountId: ={{ $json.id }} (from Upsert Account output) Creds: salesforceOAuth2Api Output : Opportunity Id for OLI creation. 7) Build SOQL (Code / JS) Purpose : Collect unique product codes from AI JSON and build a SOQL query for PricebookEntry by Pricebook2Id . Key settings : pricebook2Id (hardcoded in script): e.g., 01sxxxxxxxxxxxxxxx Source lines: $('Message a model').first().json.message.content.products Output : { soql, codes } 8) Query PricebookEntries (Salesforce) Purpose : Fetch PricebookEntry.Id for each Product2.ProductCode . Key settings : Resource: search Query: ={{ $json.soql }} Creds: salesforceOAuth2Api Output : Items with Id , Product2.ProductCode (used for mapping). 9) Code in JavaScript (Build OLI payloads) Purpose : Join lines with PBE results and Opportunity Id ➜ build OpportunityLineItem payloads. Inputs : OpportunityId: ={{ $('Create an opportunity').first().json.id }} Lines: ={{ $('Message a model').first().json.message.content.products }} PBE rows: from previous node items Output : { body: { allOrNone:false, records:[{ OpportunityLineItem... }] } } Notes : Converts discount_total ➜ per-unit if needed (currently commented for standard pricing). Throws on missing PBE mapping or empty lines. 10) Create Opportunity Line Items (HTTP Request) Purpose : Bulk create OLIs via Salesforce Composite API. Key settings : Method: POST URL: https://<your-instance>.my.salesforce.com/services/data/v65.0/composite/sobjects Auth: salesforceOAuth2Api (predefined credential) Body (JSON): ={{ $json.body }} Output : Composite API results (per-record statuses). 11) Update File to One Drive Purpose : Archive the original PDF in OneDrive. Key settings : Operation: upload File Name: ={{ $json.name }} Parent Folder ID: onedrive folder id Binary Data: true (from the Download node) Creds: microsoftOneDriveOAuth2Api Output : Uploaded file metadata. Data flow (wiring) Google Drive Trigger → Download File From Google Download File From Google → Extract from File → Update File to One Drive Extract from File → Message a model Message a model → Create or update an account Create or update an account → Create an opportunity Create an opportunity → Build SOQL Build SOQL → Query PricebookEntries Query PricebookEntries → Code in JavaScript Code in JavaScript → Create Opportunity Line Items Quick setup checklist 🔐 Credentials : Connect Google Drive, OneDrive, Salesforce, OpenAI. 📂 IDs : Drive Folder ID (watch) OneDrive Parent Folder ID (archive) Salesforce Pricebook2Id (in the JS SOQL builder) 🧠 AI Prompt : Use the strict system prompt; jsonOutput = true . 🧾 Field mappings : Buyer tax id/name → Account upsert fields Invoice code/date/amount → Opportunity fields Product name must equal your Product2.ProductCode in SF. ✅ Test : Drop a sample PDF → verify: AI returns array JSON only Account/Opportunity created OLI records created PDF archived to OneDrive Notes & best practices If PDFs are scans, enable OCR in Extract from File . If AI returns non-JSON, keep “ Return only a JSON array ” as the last line of the prompt and keep jsonOutput enabled. Consider adding validation on parsing.warnings to gate Salesforce writes. For discounts/taxes in OLI: Standard OLI fields don’t support per-line discount amounts directly; model them in UnitPrice or custom fields. Replace the Composite API URL with your org’s domain or use the Salesforce node’s Bulk Upsert for simplicity.