Portuguese Invoice Extraction: What Finance Teams Actually Need
Most invoice extraction advice assumes one tax rate per document. Portuguese invoices (faturas) can carry three mainland IVA rates (23%, 13%, and 6%), and each one needs its own column. Capture the ATCUD code, separate the IVA by rate, keep the NIF intact. If your spreadsheet structure is wrong, your periodic VAT return is wrong. If it's right, the rest is fast.
Key Takeaways
- Most extraction tools were designed for a world with one tax column.
- A Portuguese fatura can carry up to three IVA rates, and your VAT return demands each one in its own column.
- Name your columns by what the field means and one structure works for every supplier invoice every month.
What Makes Portuguese Invoices (Faturas) Different
A Portuguese invoice (fatura) follows EU invoicing standards, but adds layers of domestic compliance that most generic extraction tools were never designed to parse. Three differences matter for anyone building an extraction workflow.
Every Portuguese invoice carries a tax identification number (NIF — Número de Identificação Fiscal), a unique document code (ATCUD), and a QR code encoding key tax fields — all mandated under Decreto-Lei n.º 28/2019 and Portaria n.º 195/2020. A generic invoice extraction workflow that only captures "VAT Number" and "Total" skips the data that makes the document traceable under Portuguese tax law.
The NIF is more than a VAT number. A Portuguese NIF is a 9-digit tax identifier issued to every taxable entity: individuals, companies, and non-residents with a Portuguese fiscal obligation. It serves as both the VAT registration number and the general tax ID, so it appears in multiple roles on the same invoice: as the supplier's identifier, the customer's identifier, and sometimes on the payment reference. One format, one number, but context determines what it means. This differs from jurisdictions like Germany, where USt-IdNr. and Steuernummer are separate identifiers for separate purposes.
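Because the NIF is a fixed 9-digit format, an extracted value can be sanity-checked before it reaches the spreadsheet. The sketch below applies the standard mod-11 check digit (weights 9 down to 2 on the first eight digits). The function name and the handling of a leading "PT" prefix are illustrative assumptions, not part of any particular extraction tool.

```python
def is_valid_nif(nif: str) -> bool:
    """Return True if `nif` has 9 digits and a correct mod-11 check digit."""
    digits = nif.strip().replace(" ", "")
    # Assumption: tolerate the "PT" country prefix used in VAT-number form.
    if digits.upper().startswith("PT"):
        digits = digits[2:]
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights 9, 8, ..., 2 applied to the first eight digits.
    total = sum(int(d) * w for d, w in zip(digits[:8], range(9, 1, -1)))
    remainder = total % 11
    expected = 0 if remainder < 2 else 11 - remainder
    return int(digits[8]) == expected


if __name__ == "__main__":
    for candidate in ("123456789", "PT 999999990", "123456780", "12345678"):
        print(candidate, is_valid_nif(candidate))
```

A check like this catches OCR digit swaps, but it says nothing about whether the NIF is actually registered with the AT.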
The IVA rates come in three tiers. On mainland Portugal, the standard rate (taxa normal) is 23%, the intermediate rate (taxa intermédia) is 13%, and the reduced rate (taxa reduzida) is 6%. A single invoice often carries items at different rates — a restaurant supply order might include food at 13% and equipment at 23% on the same document. If your extraction output has one "Tax Amount" column, you cannot file the declaração periódica de IVA (periodic VAT return) without manually re-splitting every mixed-rate invoice. The tax authority (Autoridade Tributária e Aduaneira, AT) requires separate reporting by rate.
The ATCUD (Código Único do Documento) is a validation anchor. Since January 2023, every invoice must carry an ATCUD: a unique identifier that combines a validation code the AT issues to the supplier for each document series with the document's sequential number within that series. The accompanying QR code encodes the key tax fields (supplier NIF, ATCUD, tax base per rate, tax amount per rate) in a standardized format. These are not decorative; they make the invoice independently verifiable against the AT's database. In an extraction context, capturing the ATCUD gives you a direct link from every row in your spreadsheet back to the legally registered document.
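If a tool hands you the decoded QR text, splitting it into named fields is straightforward. The sketch below assumes the payload is a string of `code:value` pairs separated by `*`, and the field-code mapping (A, H, I3 to I8, and so on) reflects my reading of the AT's QR specification; verify it against the current technical document before relying on it. The sample payload is invented.

```python
from decimal import Decimal

# Assumed field codes; check against the AT's QR code specification.
QR_FIELD_NAMES = {
    "A": "supplier_nif",
    "B": "customer_nif",
    "F": "issue_date",
    "G": "invoice_number",
    "H": "atcud",
    "I3": "base_reduced",
    "I4": "iva_reduced",
    "I5": "base_intermediate",
    "I6": "iva_intermediate",
    "I7": "base_standard",
    "I8": "iva_standard",
    "N": "iva_total",
    "O": "total",
}


def parse_qr_payload(payload: str) -> dict:
    """Split a decoded QR string into a dict of named fields."""
    raw = {}
    for part in payload.strip().split("*"):
        code, sep, value = part.partition(":")
        if sep:  # skip fragments that are not "code:value"
            raw[code] = value
    named = {}
    for code, name in QR_FIELD_NAMES.items():
        if code in raw:
            is_amount = name.startswith(("base_", "iva_")) or name == "total"
            named[name] = Decimal(raw[code]) if is_amount else raw[code]
    return named


# Invented payload for illustration only.
SAMPLE = (
    "A:123456789*B:999999990*C:PT*D:FT*E:N*F:20230115*G:FT A/47*"
    "H:ABCD1234-47*I1:PT*I5:150.00*I6:19.50*I7:200.00*I8:46.00*"
    "N:65.50*O:415.50*Q:abcd*R:0000"
)

if __name__ == "__main__":
    for name, value in parse_qr_payload(SAMPLE).items():
        print(f"{name}: {value}")
```

Rates that do not appear on the invoice are simply absent from the payload, so the result only contains the fields that were present.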
For finance teams processing Portuguese supplier invoices, this means the extraction problem isn't just "read the PDF." It's "read the PDF in a way that produces a spreadsheet the AT will accept as evidence during an audit" — which is a higher bar than most generic invoice OCR tools are built for.
The Spreadsheet Structure Your VAT Return Actually Needs
Before you extract anything, decide what your output looks like. In Portugal, a supplier invoice spreadsheet that only has "Date," "Supplier," and "Total" is halfway done — useful for payment scheduling, useless for tax filing. The structure below is what a contabilista certificado (certified accountant) needs to reconcile against e-Fatura, prepare the periodic VAT return, and import into accounting software.
The non-negotiable minimum: supplier NIF, invoice number, issue date, taxable base per IVA rate, IVA rate percentage, IVA amount per rate, and invoice total. If a single column tries to hold all tax information, you are redoing the work manually every VAT period.
| Column | Why It Matters for Portugal |
|---|---|
| Supplier NIF | Unique identifier for every Portuguese taxable entity. Required for e-Fatura reconciliation and for verifying that the supplier is registered with the AT. |
| Supplier Name | Full legal name as it appears on the invoice, not the trading name. Cross-referenced with the NIF for supplier master data validation. |
| Invoice Number | Unique per supplier. Combined with the supplier NIF, this forms the composite key for matching against e-Fatura records. |
| Issue Date | Determines which VAT period (monthly or quarterly) the invoice belongs to. Format: YYYY-MM-DD. |
| Due Date | Feeds treasury management and accounts payable aging. Not filed with the AT but essential for payment scheduling. |
| Taxable Base — 23% | Net amount before IVA at the standard rate. Feeds directly into the corresponding field of the periodic VAT return. |
| Taxable Base — 13% | Net amount before IVA at the intermediate rate. |
| Taxable Base — 6% | Net amount before IVA at the reduced rate. |
| IVA Amount — 23% | Tax amount at standard rate — the deductible IVA figure for this bracket. |
| IVA Amount — 13% | Tax amount at intermediate rate. |
| IVA Amount — 6% | Tax amount at reduced rate. |
| Total | Gross invoice total. Used for quick reconciliation and cross-checking that base + IVA = total. |
| ATCUD | The unique document code. Capturing this gives every row a direct audit trail back to the registered document at the AT. Optional but high-value for compliance. |
| Source File / Page | Reference to the original PDF and page number. Non-negotiable for audit readiness — any inspector will ask you to trace a row back to its source document. |
For most monthly supplier invoice batches, one row per invoice (header-level extraction) is sufficient. If you need cost analysis by item or cost-center allocation, drop to line-item level: each invoice line becomes its own row, with the invoice number and NIF repeated. The column structure stays the same; the granularity changes.
Mixed-rate invoices need special handling. When a single invoice contains items at different IVA rates (common in hospitality, food distribution, and construction supplies), forcing all tax into one column loses the rate breakdown. With the per-rate column pairs above, a mixed-rate invoice still fits in one header row, with each base and IVA amount in its own rate column. If your destination system expects one rate per row, either split the invoice into multiple rows (one per rate, repeating the invoice number and NIF) or go directly to line-item extraction, where each row carries its own rate.
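When extraction runs at line-item level, the per-rate columns from the table above can be rebuilt by summing the lines for each rate. This is a minimal sketch: the `(rate, net, iva)` tuple shape and the function name are assumptions for illustration, and the column labels follow the article's table with shortened names.

```python
from collections import defaultdict
from decimal import Decimal

MAINLAND_RATES = (Decimal("23"), Decimal("13"), Decimal("6"))


def pivot_lines_by_rate(lines):
    """Collapse (rate_percent, net_amount, iva_amount) lines into one header row."""
    bases = defaultdict(Decimal)  # Decimal() starts at 0
    taxes = defaultdict(Decimal)
    for rate, net, iva in lines:
        rate = Decimal(str(rate))
        if rate not in MAINLAND_RATES:
            raise ValueError(f"unexpected IVA rate: {rate}%")
        bases[rate] += Decimal(str(net))
        taxes[rate] += Decimal(str(iva))
    row = {}
    for rate in MAINLAND_RATES:
        row[f"Taxable Base {rate}%"] = bases[rate]
        row[f"IVA Amount {rate}%"] = taxes[rate]
    row["Total"] = sum(bases.values(), Decimal("0")) + sum(taxes.values(), Decimal("0"))
    return row


# A restaurant supply order: food at 13%, equipment at 23%.
LINES = [
    ("13", "100.00", "13.00"),
    ("23", "200.00", "46.00"),
    ("13", "50.00", "6.50"),
]

if __name__ == "__main__":
    for column, value in pivot_lines_by_rate(LINES).items():
        print(f"{column}: {value}")
```

The sketch sums the IVA amounts as printed on the invoice instead of recomputing them from the base, so line-level rounding on the original document is preserved.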
How to Extract Portuguese Invoice Data Into Excel
With the column structure defined, the extraction itself is straightforward. The approach described here uses Custom Column Extraction: you type the field names you want — "Supplier NIF," "Taxable Base 23%", "ATCUD" — and the AI reads each invoice, locates the corresponding values, and populates the columns. Unlike template-based OCR, which requires you to draw bounding boxes for each field on each supplier's layout, this works across any invoice format without pre-configuration.
Gather Your Supplier Invoices
Collect all supplier invoices (faturas) for the period — native PDFs, scanned documents, or photos from mobile. Portuguese suppliers use a mix: large distributors typically send native PDFs from certified software like Primavera or PHC; smaller suppliers may email scanned paper invoices or even WhatsApp photos of printed documents. A capable extraction tool should handle all three without requiring you to pre-sort by format.
Define Your Extraction Columns
This is the step that determines whether your output is usable or not. Enter the column names from the structure above: "Supplier NIF (Número de Identificação Fiscal)," "Supplier Name," "Invoice Number," "Issue Date," "Due Date," "Taxable Base 23%", "Taxable Base 13%", "Taxable Base 6%", "IVA Amount 23%", "IVA Amount 13%", "IVA Amount 6%", "Total," "ATCUD." You can also add an Inferred Column like "IVA Rate Category (23% / 13% / 6%)" to have the AI classify each extracted line automatically.
Upload and Process the Batch
Upload all invoices in one batch. The AI processes each document in parallel, typically taking 5-10 seconds per page. For a batch of 50 single-page invoices, expect results in 2-3 minutes. The output is a structured spreadsheet with the columns you defined, one row per invoice (or per line item, depending on the granularity you chose).
Review and Validate
Spot-check a sample: verify that base + IVA = total on 5-10 invoices, confirm NIFs are 9 digits, check that mixed-rate invoices show the correct breakdown. The source file/page column lets you jump directly from any suspicious row back to the original PDF. A 10-minute review on a batch of 100 invoices is typically enough to catch any systematic issues — far less time than manually entering the data.
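The spot-check can also be run across the whole batch. The sketch below assumes each row is a dict keyed by the column names used in this article, with amounts already normalized to period decimals, and that the invoice has no exempt amounts outside the three rate columns. The two-cent tolerance is an arbitrary choice to absorb rounding.

```python
from decimal import Decimal

RATE_LABELS = ("23%", "13%", "6%")
TOLERANCE = Decimal("0.02")  # assumed rounding allowance


def check_row(row: dict) -> list:
    """Return a list of problems found in one extracted invoice row."""
    problems = []
    nif = str(row.get("Supplier NIF", ""))
    if not (len(nif) == 9 and nif.isascii() and nif.isdigit()):
        problems.append("NIF is not 9 digits")
    computed = Decimal("0")
    for label in RATE_LABELS:
        base = Decimal(str(row.get(f"Taxable Base {label}") or "0"))
        iva = Decimal(str(row.get(f"IVA Amount {label}") or "0"))
        computed += base + iva
        # The IVA in each bracket should be close to base * rate.
        expected_iva = base * Decimal(label.rstrip("%")) / 100
        if abs(iva - expected_iva) > TOLERANCE:
            problems.append(f"IVA at {label} does not match its base")
    total = Decimal(str(row.get("Total") or "0"))
    if abs(computed - total) > TOLERANCE:
        problems.append("base + IVA does not equal total")
    return problems


GOOD_ROW = {
    "Supplier NIF": "123456789",
    "Taxable Base 23%": "200.00", "IVA Amount 23%": "46.00",
    "Taxable Base 13%": "150.00", "IVA Amount 13%": "19.50",
    "Total": "415.50",
}

if __name__ == "__main__":
    print(check_row(GOOD_ROW))
    print(check_row(dict(GOOD_ROW, Total="451.50")))
```

Rows that come back with a non-empty list are the ones worth opening against the source PDF.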
Files are processed securely and not stored.
The process above handles most invoices cleanly, but three edge cases deserve attention when working with varied invoice formats:
- Exempt invoices and reverse charge. Invoices exempt under Article 9 of CIVA or subject to reverse charge (autoliquidação, common in construction subcontracting and intra-EU acquisitions) show no IVA charged but must carry a specific legal mention. Add a column for "IVA Regime / Legal Reference" and instruct the AI to capture the exact exemption or reverse charge notation. Without it, your accountant cannot distinguish an exempt invoice from one that simply omitted IVA.
- Multi-page PDFs with multiple invoices. Some suppliers concatenate several invoices into one PDF. The extraction tool should detect document boundaries and create separate rows per invoice, not one row per file. Pages like cover sheets and delivery confirmations should be ignored automatically.
- Handwritten or photographed invoices. Smaller Portuguese suppliers — local tradespeople, independent farmers, small service providers — often issue handwritten invoices that get photographed by phone. Modern AI extraction handles handwriting and low-quality mobile photos with reasonable accuracy, but test a sample batch from these suppliers before processing a full month's worth of handwritten documents.
Getting Extracted Data Into Portuguese Accounting Software
Extraction is half the job. The other half is getting the spreadsheet into the software your accountant uses — and in Portugal, that software must be certified by the AT. The major platforms all support CSV or Excel import, but each has its own expectations for column mapping.
| Software | Import Format | What to Watch For |
|---|---|---|
| Primavera BSS / Jasmin | CSV, Excel import via "Importar Documentos" | Requires account code (conta SNC) per row. Map supplier NIF to account code via VLOOKUP before import. Jasmin Express is free up to €30,000 annual turnover. |
| PHC Software (CS / GO) | Excel, XML import | Expects separate columns for each IVA rate's base and amount. If your extraction already gives you three IVA rate columns, no rework needed before import. |
| Sage Portugal | CSV, Excel import | Date format must be YYYY-MM-DD; decimal separator must be a period. Sage Portugal validates NIF length on import — 9-digit NIFs pass, anything else gets rejected. |
| TOConline | Excel (.xlsx) | Provided by the OCC (Ordem dos Contabilistas Certificados) to its members. Has built-in SAF-T PT export. Column order matters — match the import template exactly. |
| InvoiceXpress / Moloni | CSV import | Primarily designed as issuing (sales) software, but both support supplier invoice imports for purchase ledger tracking. Simpler import templates than ERP-grade software. |
The common thread: fresh imports need account code mapping. Most Portuguese accounting platforms require a conta SNC (chart of accounts code) on every imported line. Build this mapping once per supplier — either as a separate lookup table or directly in the extraction column definition using an Inferred Column that maps NIF to account code — and every subsequent month's extraction inherits it automatically.
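As a lookup table, the NIF-to-account mapping can be as simple as a dictionary applied before export. In this sketch the account codes are placeholders, not real chart-of-accounts entries, and the "Conta SNC" column name is an assumption about what your import template expects.

```python
# Placeholder mapping: replace with your own supplier accounts.
ACCOUNT_BY_NIF = {
    "123456789": "2211001",
    "999999990": "2211002",
}


def add_account_codes(rows, account_by_nif):
    """Attach an account code to each row; return (mapped, unmapped) lists."""
    mapped, unmapped = [], []
    for row in rows:
        code = account_by_nif.get(row["Supplier NIF"])
        if code is None:
            # New supplier: needs a mapping before import.
            unmapped.append(row)
        else:
            mapped.append({**row, "Conta SNC": code})
    return mapped, unmapped


ROWS = [
    {"Supplier NIF": "123456789", "Invoice Number": "FT A/47"},
    {"Supplier NIF": "500000000", "Invoice Number": "FT B/12"},
]

if __name__ == "__main__":
    ready, pending = add_account_codes(ROWS, ACCOUNT_BY_NIF)
    print("ready for import:", ready)
    print("needs mapping:", pending)
```

Keeping unmapped rows in a separate list means a new supplier shows up as an exception to resolve, not as a rejected import.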
For teams processing invoices from multiple countries, the platform that receives your Portuguese data may not be Portuguese. If your shared service center uses SAP, Oracle NetSuite, or Microsoft Dynamics, the column structure defined earlier still applies — just map the IVA rate columns to the corresponding tax code fields in your ERP's import template. The structural work (three tax columns instead of one) is the same; only the destination field names change.
Validating Extraction Results Against e-Fatura and ATCUD
Portugal's e-Fatura system gives finance teams a built-in reconciliation mechanism. Every month, suppliers communicate their invoice data to the AT by the 5th of the following month, under the SAF-T PT invoicing requirements. Your job as the buyer is to check that what you extracted matches what the supplier reported. If it doesn't, the AT sees a mismatch, and the invoice may be excluded from your deductible IVA.
The ATCUD code is the bridge. If you extracted it during processing, you can match each row in your spreadsheet directly to the supplier's registered document — no manual searching, no guessing which "Fatura 2026/0047" from "Fornecedor X" is the right one when the supplier reuses numbering sequences.
The validation workflow, once your extraction is complete:
- Export your spreadsheet with NIF, invoice number, date, total, and ATCUD columns.
- Log into the e-Fatura portal and filter by the period you processed. The portal shows all invoices where your NIF was registered as the buyer.
- Cross-reference by NIF and invoice number. Sort both lists by supplier NIF, then by invoice number. Mismatches — invoices on your sheet but not in e-Fatura, or vice versa — become immediately visible.
- Flag and investigate discrepancies. A missing invoice in e-Fatura means the supplier didn't submit it — contact them before the reporting deadline. A total mismatch means the supplier reported a different amount.
- For ATCUD-enabled rows, use the QR code data (if you extracted it) or the ATCUD itself to verify document authenticity through the AT's validation service.
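The cross-referencing steps above reduce to a set comparison on the composite key (supplier NIF, invoice number). The sketch below assumes both your extraction and the e-Fatura listing have already been loaded into lists of dicts with the hypothetical keys `nif`, `invoice_number`, and `total`; how you get data out of the portal is outside its scope.

```python
from decimal import Decimal


def _key(row):
    # Normalize spacing and case so "ft a/47" and "FT A/47 " match.
    return (row["nif"], row["invoice_number"].strip().upper())


def reconcile(extracted, efatura):
    """Compare two invoice lists and report the three kinds of exception."""
    ours = {_key(r): r for r in extracted}
    theirs = {_key(r): r for r in efatura}
    shared = ours.keys() & theirs.keys()
    return {
        "missing_in_efatura": sorted(ours.keys() - theirs.keys()),
        "missing_in_sheet": sorted(theirs.keys() - ours.keys()),
        "total_mismatch": sorted(
            k for k in shared if ours[k]["total"] != theirs[k]["total"]
        ),
    }


EXTRACTED = [
    {"nif": "123456789", "invoice_number": "FT A/47", "total": Decimal("415.50")},
    {"nif": "123456789", "invoice_number": "FT A/48", "total": Decimal("100.00")},
    {"nif": "999999990", "invoice_number": "FT B/3", "total": Decimal("61.50")},
]
EFATURA = [
    {"nif": "123456789", "invoice_number": "ft a/47 ", "total": Decimal("415.50")},
    {"nif": "123456789", "invoice_number": "FT A/48", "total": Decimal("110.00")},
    {"nif": "999999990", "invoice_number": "FT B/4", "total": Decimal("12.30")},
]

if __name__ == "__main__":
    for kind, keys in reconcile(EXTRACTED, EFATURA).items():
        print(kind, keys)
```

If you captured the ATCUD, it could replace the invoice number in the key, which would avoid false matches when a supplier reuses numbering sequences.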
This reconciliation used to be the most time-consuming part of monthly closing in Portugal — opening each PDF individually to cross-check against the e-Fatura portal, one document at a time. With structured extraction output, it becomes a spreadsheet operation: sort, filter, spot the exceptions, fix only those. The same principle applies to any document type that needs periodic reconciliation against a regulatory dataset.
FAQ
Can an AI extraction tool read the QR code on Portuguese invoices?
It depends on the tool. The Portuguese QR code encodes structured tax data (supplier NIF, ATCUD, tax base per rate, tax amount per rate) in a format specified by Portaria 195/2020. Tools that include QR decoding can extract this structured data directly, bypassing OCR for those fields entirely. If the tool does not decode QR codes, the same data can still be extracted by reading the visible text on the invoice — the QR simply provides a second, machine-readable source for cross-validation.
Do I need to extract the ATCUD for every invoice?
Not legally required for the buyer, but it's the single most useful field for audit readiness. The ATCUD uniquely identifies the document in the AT's system. If an auditor asks you to produce the original for any row in your spreadsheet, having the ATCUD means the lookup is instant. Without it, you are searching by supplier name and date range. For compliance-focused finance teams, it's worth the extra column.
How do I handle invoices from Madeira or the Azores, which have different IVA rates?
The autonomous regions apply their own rates: Madeira uses 22% (standard), 12% (intermediate), and 5% (reduced); the Azores use 16%, 9%, and 4%. If you receive invoices from suppliers in these regions, add separate column pairs for the regional rates in your extraction template. The rate itself tells you the origin — a 22% rate on a Portuguese invoice almost certainly means Madeira.
What's the difference between extracting from a PDF invoice and parsing a SAF-T PT XML file?
SAF-T PT is an XML audit file the supplier exports from their certified invoicing software and submits to the AT. It contains structured data you can parse directly. But as the buyer, you rarely have access to your supplier's SAF-T file. What you have is the PDF invoice they sent you. Extraction tools bridge this gap: they read the PDF and produce structured data comparable to what the SAF-T file would contain, without requiring the supplier to share their internal XML export.
How accurate is AI extraction on Portuguese invoices with mixed IVA rates?
Field-level accuracy for header data (NIF, invoice number, dates, totals) typically exceeds 95% with modern AI extraction tools. Line-item accuracy on mixed-rate invoices is lower, roughly 85-90% for correctly assigning individual line items to the right IVA rate, because the visual distinction between rate columns on a densely printed invoice can be subtle. The practical approach: process the batch, then spot-check mixed-rate invoices specifically. If 5-10 invoices in a batch of 100 need manual correction, that is still far less work than entering all 100 by hand.
From Monthly Grind to Monthly Export
A Portuguese supplier invoice is not just a document to be read — it's a structured tax record that carries legal weight. Treating it as a generic PDF to be OCR'd is like treating a tax return as a piece of paper to be photographed. The structure matters because the downstream systems — e-Fatura, the periodic VAT return, the accounting software import — all expect data in a specific shape.
The extraction column structure you define this month is the structure you'll use every month. Once the columns are right — three IVA rate pairs, NIF, ATCUD, source reference — the process becomes mechanical: upload the batch, export the spreadsheet, reconcile against e-Fatura, import into your accounting platform. The intellectual work happened when you decided what the spreadsheet should look like.
Test the extraction on a sample Portuguese invoice. See whether your current process (3 minutes per invoice of manual typing) becomes 10 seconds per invoice in a batch whose output is ready to reconcile against e-Fatura.