Low-Cost NF-e Data Extraction for Brazilian Small Business
A small manufacturing company in greater São Paulo receives 80 supplier NF-e invoices a month. The finance person opens each DANFE PDF, finds the CNPJ do emitente, the CFOP code, and the ICMS base and rate (there are 40+ fields in the XML but only a dozen printed on the page), and types them into Excel or Omie. Five minutes per document, or about six and a half hours a month. At a labor cost of roughly R$25 per hour, that's about R$162 a month spent reading PDFs and pressing keys. The tools with published prices that could automate this, such as Omie and Conta Azul, start at R$99 to R$300 a month. For a company processing 80 invoices, the software costs almost as much as the problem.
Key Takeaways
- Brazil built one of the most advanced e-invoicing systems in the world — every NF-e XML contains 40+ structured fields authorized by SEFAZ in real time — and yet a small business in São Paulo still spends R$162 a month typing the same dozen fields from DANFE PDFs.
- The XML download path requires a R$150–400/year digital certificate, SEFAZ web service integration, and the 44-digit chave de acesso from each invoice — infrastructure a 20-employee company does not have, which is why the person opens the PDF and types instead.
- ImageToTable.ai reads every supplier's DANFE layout for R$50/month and feeds Excel data into Omie, Conta Azul, or the contador's spreadsheet — no digital certificate, no XML download, and the same "CNPJ emitente" column finds its target on a Gerdau invoice and a local wholesaler's form.
The Two Worlds of NF-e: XML Automation vs DANFE Typing
Brazil's Nota Fiscal Eletrônica (NF-e) — established by Ajuste SINIEF 07/2005 and mandatory for goods transactions since 2008 — is built on two parallel documents. The XML file, submitted to and authorized by SEFAZ (Secretaria da Fazenda) in real time, contains the complete transaction record: CNPJ of both parties, CFOP (Código Fiscal de Operações e Prestações) classifying the transaction type, NCM product codes, ICMS base and rate broken down to the line-item level, PIS/COFINS contributions, IPI where applicable, freight data, and the 44-digit chave de acesso (access key) that uniquely identifies every NF-e in the country. The XML holds roughly 10 times the data visible on the paper version.
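As a rough illustration of what "structured" means here, the sketch below reads a few of those fields from an NF-e XML using only Python's standard library. The embedded sample is a heavily trimmed, made-up document (real files carry many more groups and are often wrapped in an authorization envelope), and the element names follow the public NF-e layout as commonly documented, so treat them as assumptions to check against the current schema.

```python
import xml.etree.ElementTree as ET

# Namespace used by NF-e documents.
NS = {"nfe": "http://www.portalfiscal.inf.br/nfe"}

# Made-up, heavily trimmed sample for illustration only.
SAMPLE_XML = """<NFe xmlns="http://www.portalfiscal.inf.br/nfe">
  <infNFe versao="4.00">
    <emit><CNPJ>11222333000181</CNPJ><xNome>Fornecedor Exemplo Ltda</xNome></emit>
    <det nItem="1">
      <prod><NCM>72142000</NCM><CFOP>5101</CFOP><vProd>1000.00</vProd></prod>
      <imposto><ICMS><ICMS00>
        <CST>00</CST><vBC>1000.00</vBC><pICMS>18.00</pICMS><vICMS>180.00</vICMS>
      </ICMS00></ICMS></imposto>
    </det>
    <total><ICMSTot><vBC>1000.00</vBC><vICMS>180.00</vICMS><vNF>1000.00</vNF></ICMSTot></total>
  </infNFe>
</NFe>"""


def _text(node, path):
    """Return the text at `path` under `node`, or None if either is missing."""
    return None if node is None else node.findtext(path, namespaces=NS)


def parse_nfe(xml_text):
    """Pull issuer, total, and per-item CFOP/ICMS values out of an NF-e XML string."""
    root = ET.fromstring(xml_text)
    inf = root.find(".//nfe:infNFe", NS)
    if inf is None:
        raise ValueError("no infNFe element found")
    items = []
    for det in inf.findall("nfe:det", NS):
        group = det.find("nfe:imposto/nfe:ICMS", NS)
        # The ICMS group wraps a single child whose tag depends on the CST
        # (ICMS00, ICMS40, ...), so take whichever child is present.
        icms = group[0] if group is not None and len(group) else None
        items.append({
            "cfop": _text(det, "nfe:prod/nfe:CFOP"),
            "ncm": _text(det, "nfe:prod/nfe:NCM"),
            "cst": _text(icms, "nfe:CST"),
            "icms_base": _text(icms, "nfe:vBC"),
            "icms_rate": _text(icms, "nfe:pICMS"),
            "icms_value": _text(icms, "nfe:vICMS"),
        })
    return {
        "issuer_cnpj": _text(inf, "nfe:emit/nfe:CNPJ"),
        "total": _text(inf, "nfe:total/nfe:ICMSTot/nfe:vNF"),
        "items": items,
    }


invoice = parse_nfe(SAMPLE_XML)
print(invoice["issuer_cnpj"], invoice["total"], invoice["items"][0]["cfop"])
```

Values are kept as strings on purpose; converting them to `Decimal` is a separate decision that depends on what the receiving spreadsheet or ERP expects.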
The DANFE (Documento Auxiliar da Nota Fiscal Eletrônica) — the printed or PDF visual representation that accompanies goods during transport — is a summary. It shows what the SEFAZ portal calls "the minimum fields necessary for transit": issuer and recipient CNPJ, the chave de acesso as a barcode, total value, and a few tax totals. The line-item ICMS breakdown that fills 200 lines in the XML? The DANFE might show three aggregate numbers. The CFOP code determining whether this is a taxable purchase or a transfer between branches? Sometimes printed, sometimes not.
For a large company running SAP or Totvs with a dedicated IT team, this split is invisible. Their ERP generates the XML, submits it to SEFAZ via web services, receives the authorization protocol, and automatically posts the invoice to accounts payable — no DANFE involved on the issuing side. On the receiving side, they use the same XML infrastructure: download the XML via SEFAZ's distribution web service using an A1 or A3 digital certificate (certificado digital e-CNPJ, R$150–400 per year), parse the structured data, and post it to the ERP. The DANFE is a paper receipt for the truck driver.
But a Brazilian small business with 20 employees and 40 active suppliers doesn't have an XML integration. It doesn't have a digital certificate configured for automated XML downloads. Its suppliers email PDF DANFEs — or send them via WhatsApp, which in Brazil is as common for business documents as email. The finance person opens each PDF, reads the printed fields, and types. This is where the two worlds split: one processes structured data that was born digital; the other reads a PDF and retypes a fraction of the data that already exists in machine-readable form elsewhere.
Every supplier NF-e you receive as a DANFE PDF is a document whose complete, structured data already exists as an XML on a SEFAZ server — but the path from your inbox to that XML requires a digital certificate, a SEFAZ web service integration, and per-supplier download logic. Most small businesses skip the XML path entirely and process the PDF instead. The question is whether the tool reading the PDF can extract data as if it had the XML.
Why Brazilian SMEs Still Type Invoice Data by Hand in 2026
The answer is not resistance to technology. Brazil has one of the most advanced e-invoicing systems in the world — the national NF-e portal processes billions of documents across the entire country, and as of June 2026, Nota Técnica 2025.002 v1.50 has already updated NF-e layouts to accommodate the incoming IBS and CBS taxes that will replace ICMS, ISS, PIS, and COFINS under the Reforma Tributária. The infrastructure exists. The problem is structural: the tools that make extraction easy are built for companies with IT departments, and the tools priced for small businesses don't actually solve the extraction problem.
Three structural barriers explain why typing persists:
Barrier 1: XML download requires infrastructure most SMEs don't have. To download an incoming NF-e XML from SEFAZ, you need an e-CNPJ digital certificate (A1 or A3 model, R$150–400/year), software that can call the SEFAZ web services (Distribuição de DF-e), and the 44-digit chave de acesso for each invoice you want to download. The chave is printed on the DANFE — so you're already reading the PDF to get the key to download the XML. The circle is closed before it starts. For a small business, the cost of setting up automated XML downloading exceeds the cost of just typing the data for another year.
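Whether the chave is typed by hand or extracted from the PDF, it can be sanity-checked before it is used anywhere. The sketch below splits a 44-digit key into its parts and verifies the final check digit with the modulo-11 rule (weights 2 to 9, applied right to left) described in public NF-e documentation. The field names are our own labels, and the code assumes an all-numeric key, so it is a starting point rather than a complete validator.

```python
import re
from itertools import cycle

# Field widths of the 44-digit key, in order (labels are illustrative).
LAYOUT = [
    ("uf_code", 2), ("year_month", 4), ("issuer_cnpj", 14), ("model", 2),
    ("series", 3), ("number", 9), ("emission_type", 1), ("random_code", 8),
    ("check_digit", 1),
]


def check_digit(first_43):
    """Modulo-11 check digit over the first 43 digits of the key."""
    weights = cycle(range(2, 10))  # 2, 3, ..., 9, repeating, from the right
    total = sum(int(d) * w for d, w in zip(reversed(first_43), weights))
    remainder = total % 11
    return 0 if remainder < 2 else 11 - remainder


def parse_chave(raw):
    """Strip spacing, validate length and check digit, and split the key."""
    digits = re.sub(r"[^0-9]", "", raw)  # DANFEs often print the key in groups
    if len(digits) != 44:
        raise ValueError(f"expected 44 digits, got {len(digits)}")
    if check_digit(digits[:43]) != int(digits[43]):
        raise ValueError("check digit mismatch: key was probably mistyped")
    fields, pos = {}, 0
    for name, width in LAYOUT:
        fields[name] = digits[pos:pos + width]
        pos += width
    return fields
```

A mistyped digit usually shows up as a check digit mismatch, which is cheaper to catch here than after a failed lookup.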
Barrier 2: Brazilian tools solve issuance, not inbound extraction. NF-e API providers like NFe.io (R$119/month for 120 documents), Focus NFe (R$109/month for 200 documents), and Spedy (R$89/month for 150 documents) are built for companies that issue NF-e — they generate XMLs, submit them to SEFAZ, and return the authorization protocol. Their pricing is per-issued-document. They do not read PDF DANFEs from your suppliers and output structured data. When a small business searches "automatizar NF-e entrada" (automate inbound NF-e), they find tools built for outbound flow.
Barrier 3: ERPs that do handle inbound NF-e are priced for scale. Omie, the cloud ERP that serves 180,000 Brazilian companies and processes R$35 billion in invoices monthly, captures NF-e XMLs automatically — but only for companies on its platform that have configured the digital certificate integration. Omie starts at R$99/month for basic plans, with costs scaling by module: fiscal modules, contador integration, multi-user access all add to the price. Conta Azul offers automated NF-e capture on its Controle plan (R$309.90/month for businesses billing R$81K–360K/year). Bling, popular with e-commerce sellers, starts at R$30–80/month but its NF-e module is built for issuance and marketplace integration, not supplier invoice processing. These tools are full ERPs — they replace your accounting stack, your inventory management, and your sales pipeline. If all you need is someone to read the DANFE, you're buying a building for the lobby.
What Brazilian NF-e Tools Actually Cost
The pricing landscape for NF-e handling in Brazil splits into four categories, and only one of them addresses the inbound PDF extraction problem directly:
| Tool | Starting Price (Monthly) | What It Does | Handles Inbound DANFE PDF? | Per-Document Cost at 80 Invoices/Month |
|---|---|---|---|---|
| NF-e API Providers: NFe.io, Focus NFe, Spedy, Nota Fácil | R$89–129 | Issue NF-e, NFC-e, NFS-e; submit XML to SEFAZ; return authorization | No (issuance only) | N/A (wrong direction) |
| NF-e Management Platforms: Qive (formerly Arquivei) | Bespoke pricing (contact sales) | Download and manage NF-e XMLs from SEFAZ; capture supplier invoices automatically via XML; AP workflow; data intelligence | Partial: XML-based capture; PDF DANFE without XML still manual | Not publicly listed |
| Full ERPs: Omie, Conta Azul, Bling | R$30–720 | Complete business management: NF-e issuance, inventory, financial, contador integration, bank reconciliation | Partial: XML capture via digital certificate; PDF DANFE still requires manual entry or external extraction | R$0.38–9.00 (plan cost ÷ 80 invoices, but plan includes non-extraction features) |
| AI Extraction Layer: ImageToTable.ai | $9/month (~R$50) for 150 pages | Reads any DANFE PDF, photo, or screenshot; extracts defined fields by semantic understanding; outputs Excel | Yes (format-agnostic) | ~R$0.63 at Basic (R$50 ÷ 80 invoices); ~R$0.33 per page if all 150 Basic pages are used; ~R$0.25 per page on a fully used Pro plan |
The gap in the table is the one small businesses fall into: tools that handle inbound NF-e exist, but they're either full ERPs (Omie/Conta Azul at R$99–720/month) or enterprise platforms (Qive with bespoke pricing). There's no R$50/month tool in the Brazilian market that does one thing — read a DANFE PDF and output the data — without bundling it with inventory management, sales pipelines, and contador portals. For the broader context on how this pricing architecture affects buyers across markets, see the 2026 guide to AI document extraction pricing.
The Per-Image Math: R$0.33 vs a R$99 Monthly Commitment
Let's put numbers on three real-world scenarios for a Brazilian small business processing inbound supplier NF-e DANFEs.
| Business Profile | Monthly NF-e Volume | Manual Labor Cost (R$25/hr, 5 min/DANFE) | Omie Entry Plan (R$99/mo) | ImageToTable.ai Basic ($9/mo ≈ R$50, 150 pg) | ImageToTable.ai Pro ($19/mo ≈ R$100, 400 pg) |
|---|---|---|---|---|---|
| Food distributor, 15 staff, 25 active suppliers, São Paulo interior | 40 DANFEs | R$83 | R$99 | R$50 | R$100 |
| Construction materials retailer, 30 staff, 50 suppliers, Minas Gerais | 90 DANFEs | R$188 | R$99 | R$50 (60 pages left over) | R$100 (310 pages left over) |
| Auto parts distributor, 45 staff, 80 suppliers, Paraná | 160 DANFEs | R$333 | R$99 + manual catch-up on excess | R$50 (exceeds limit; needs upgrade) | R$100 (240 pages left over) |
At 40 DANFEs a month, the math is straightforward: R$50 for extraction vs R$83 for manual labor, with 110 pages left over on the Basic plan that can be used for NFS-e documents, expense receipts, or delivery notes. At 90 DANFEs, the savings widen: R$50 vs R$188 in labor, and Omie at R$99 is priced in the same ballpark but buys an entire ERP, not an extraction tool. At 160 DANFEs, the Pro plan at ~R$100 replaces R$333 in manual labor — a 3.3x return while leaving pages for other document types.
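The arithmetic behind these scenarios is simple enough to rerun with your own numbers. The sketch below uses the article's assumptions (R$25 per hour, 5 minutes per DANFE, the approximate plan prices in reais) and treats each DANFE as one page unless told otherwise; the function and variable names are ours.

```python
HOURLY_RATE = 25.0        # R$ per hour, the article's labor assumption
MINUTES_PER_DANFE = 5     # manual typing time per document

# Plan name -> (approximate R$ per month, pages included), from the table.
PLANS = {"Basic": (50, 150), "Pro": (100, 400)}


def manual_cost(danfes):
    """Monthly labor cost in R$ of typing `danfes` documents by hand."""
    return danfes * MINUTES_PER_DANFE / 60 * HOURLY_RATE


def plan_fit(danfes, plan, pages_per_danfe=1):
    """Price, leftover pages, and whether the volume fits the plan."""
    price, pages = PLANS[plan]
    pages_left = pages - danfes * pages_per_danfe
    return {"price": price, "pages_left": pages_left, "fits": pages_left >= 0}


for volume in (40, 90, 160):
    labor = manual_cost(volume)
    for name in PLANS:
        fit = plan_fit(volume, name)
        status = f"{fit['pages_left']} pages left" if fit["fits"] else "over the limit"
        print(f"{volume} DANFEs: labor R${labor:.0f} vs {name} R${fit['price']} ({status})")
```

Passing `pages_per_danfe=2` shows how quickly multi-page invoices eat into a page allowance.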
The key variable is what you're already paying for. If you already use Omie or Conta Azul for inventory and sales management, adding their NF-e capture module may make sense as an integrated step — but you're still configuring the digital certificate integration and paying the ERP subscription regardless of extraction volume. If you don't have an ERP, or if your ERP's NF-e module doesn't handle PDF-only DANFEs from suppliers who don't send the XML, the extraction-layer approach costs less and starts today. For a cost comparison across the broader invoice extraction market, see the 2026 invoice extraction tools price comparison.
The R$49 gap between R$50 (extraction) and R$99 (ERP) is small in absolute terms — about one pizza delivery in São Paulo. But the ERP commits you to R$1,188 per year for a platform that does a dozen things you might not need, while the extraction tool commits you to R$600 per year for one thing you definitely do. The question isn't which costs less per month. It's which one you're still paying for six months from now.
How Semantic Extraction Reads Any DANFE Layout
A DANFE from a large industrial supplier like Gerdau or Suzano looks nothing like a DANFE from a regional food wholesaler in Goiânia. The CNPJ might be top-left on one and bottom-right on another. The CFOP code might be printed next to the product description on one and buried in a "Dados Adicionais" (additional information) block on another. The ICMS breakdown — base de cálculo (taxable base), alíquota (rate), valor (amount) — might appear as three labeled fields on one DANFE and as a single line in a tax summary table on another.
A template-based OCR tool — the kind built into most Brazilian accounting software — needs a different template for each supplier's layout. When Fornecedor A changes its DANFE format because of a Nota Técnica update (and NT 2025.002 v1.50 is exactly that kind of update, restructuring NF-e layouts for the IBS/CBS tax reform taking effect August 2026), the template silently breaks. It doesn't produce an error. It maps the old field positions to new content, and the data in your spreadsheet is wrong.
ImageToTable.ai uses Custom Column Extraction instead. You type the field names you want extracted — "CNPJ emitente" (issuer CNPJ), "Número NF-e" (NF-e number), "CFOP", "Valor Total", "Base ICMS", "Alíquota ICMS", "Valor ICMS", "Valor PIS", "Valor COFINS" — and the AI locates each value on the page by understanding what the field means, not where it sits. A Gerdau DANFE with the CNPJ in the header and a local supplier DANFE with the same field in the footer are read by the same column definition, with zero template work. The mechanism behind this — semantic understanding vs coordinate-based extraction — is the same one detailed in our analysis of AI extraction vs traditional OCR.
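Whatever produces the "CNPJ emitente" column, the value can be checked locally before it reaches the ERP. The sketch below applies the standard two-check-digit rule for numeric CNPJs; it is independent of any extraction tool, and it does not attempt to handle an alphanumeric CNPJ format.

```python
import re

# Weights for the first and second check digits of a numeric CNPJ.
W1 = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
W2 = [6] + W1


def _check_digit(digits, weights):
    """Modulo-11 check digit for `digits` under the given weights."""
    remainder = sum(int(d) * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder


def is_valid_cnpj(raw):
    """True if `raw` holds 14 digits whose last two are correct check digits."""
    digits = re.sub(r"[^0-9]", "", raw)  # drop dots, slash, and hyphen
    if len(digits) != 14 or digits == digits[0] * 14:
        return False  # wrong length, or a repeated-digit placeholder
    d1 = _check_digit(digits[:12], W1)
    d2 = _check_digit(digits[:12] + str(d1), W2)
    return digits[12:] == f"{d1}{d2}"


print(is_valid_cnpj("11.222.333/0001-81"))  # a commonly used test value
```

Rows that fail this check are good candidates for a second look at the source DANFE, since a single misread digit is enough to break the check digits.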
For Brazilian NF-e specifically, this matters because of what the DANFE doesn't show. The XML contains fields like CST (Código de Situação Tributária — tax situation code) that determine whether the ICMS on this purchase is creditable (CST 00, taxable) or non-creditable (CST 40, isento — exempt). Some DANFEs print the CST; some don't. If yours does, defining a column for "CST ICMS" lets the AI find it regardless of position. If it doesn't, the column stays empty and you know to look it up. Either way, you're not retyping the fields that are present.
You can also use Computed Columns during extraction. If the DANFE shows the base de cálculo and alíquota but not the final ICMS value, define a column like "Valor ICMS (Base ICMS × Alíquota ICMS)" and the AI performs the multiplication during extraction, outputting the result rather than the raw inputs. For DANFEs where the ICMS rate varies between products (e.g., 12% for one line item, 18% for another), comparing the computed value with the printed one catches arithmetic errors the supplier may have made.
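The same base-times-rate check can be repeated on the exported rows as a second line of defense. The sketch below assumes each row is a dict keyed by the column names used above, with values already normalized to dot-decimal strings; the rounding rule and the one-centavo tolerance are our assumptions, not a statement of how suppliers or SEFAZ round.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def expected_icms(base, rate_percent):
    """ICMS amount implied by a taxable base and a percentage rate."""
    value = Decimal(base) * Decimal(rate_percent) / Decimal(100)
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def icms_mismatch(row, tolerance=CENT):
    """Return printed minus expected ICMS if it exceeds `tolerance`, else None."""
    expected = expected_icms(row["Base ICMS"], row["Alíquota ICMS"])
    diff = Decimal(row["Valor ICMS"]) - expected
    return diff if abs(diff) > tolerance else None


rows = [
    {"Base ICMS": "1000.00", "Alíquota ICMS": "18", "Valor ICMS": "180.00"},
    {"Base ICMS": "1000.00", "Alíquota ICMS": "18", "Valor ICMS": "120.00"},
]
for row in rows:
    print(row["Valor ICMS"], "->", icms_mismatch(row))
```

Using `Decimal` instead of floats keeps centavo comparisons exact, which matters when the goal is to flag a discrepancy rather than create one.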
Files are processed securely and not stored.
The Exchange Rate Reality: Paying in Dollars for a Real-Denominated Problem
Every dollar-denominated tool sold into Brazil carries an exchange rate risk that local tools don't. At the June 2026 rate of roughly R$5.15 to the dollar, a $9/month Basic plan costs about R$46, which this article rounds up to ~R$50. If the real weakens to R$5.50 or R$5.80 six months from now, the same $9 plan costs R$50 to R$52. The currency moves quickly: it has traded in that range more than once in the last 12 months, and in June alone it went from R$5.06 early in the month to R$5.17 by mid-month. That's a small absolute swing for a single subscription. But it's worth acknowledging: you're pricing your document processing in a different currency than your revenue.
This is where the pay-per-use option matters. ImageToTable.ai's Starter pack ($6 for 50 images, ~R$31) is a one-time purchase: you buy the credits, they don't expire for a year, and exchange rate fluctuations after purchase don't matter. For a small business that receives 40 DANFEs a month during the busy season and 20 during the off-season, the Starter pack handles one to two months of processing at a fixed cost of R$31 — no monthly commitment, no auto-renewal, and the exchange rate is locked in at purchase. For higher-volume users, the Pro plan at $19/month (~R$100) with 400 pages covers 160–400 DANFEs depending on whether invoices are single or multi-page.
The comparison that matters isn't dollars-to-reais. It's per-document delivered cost. A R$50/month extraction tool processing 80 DANFEs works out to R$0.63 per document in tool cost plus ~R$0.08 in labor (the seconds spent reviewing results instead of the minutes spent typing). A R$99/month ERP processing the same 80 DANFEs works out to R$1.24 per document in tool cost alone, even before factoring in the time spent configuring digital certificate integrations. The labor you're replacing (typing at R$25/hour, 5 minutes per DANFE) costs R$2.08 per document. In all three volume scenarios in the table above, extraction costs less than manual labor, and at 90 and 160 DANFEs it costs less than half, even with the exchange rate baked in. For a fuller analysis of per-document cost across extraction tiers, see our ranking of the most affordable AI document extraction tools.
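To see how much the exchange rate actually moves the per-document figure, the sketch below converts the $9 plan at a few rates and divides by monthly volume. The rates are the ones mentioned in this section and the function names are ours. Note that it uses the exact conversion rather than the rounded ~R$50, so its figures come out slightly lower.

```python
def plan_cost_brl(usd_price, brl_per_usd):
    """Monthly plan cost in reais at a given exchange rate."""
    return usd_price * brl_per_usd


def per_document_cost(monthly_cost_brl, documents):
    """Tool cost per document for a given monthly volume."""
    return monthly_cost_brl / documents


LABOR_PER_DANFE = 25 * 5 / 60  # R$25/hour, 5 minutes per DANFE

for rate in (5.15, 5.50, 5.80):
    monthly = plan_cost_brl(9, rate)
    each = per_document_cost(monthly, 80)
    print(f"R${rate:.2f}/USD: Basic = R${monthly:.2f}/month, "
          f"R${each:.2f} per DANFE at 80/month "
          f"(manual typing: R${LABOR_PER_DANFE:.2f})")
```

Even at the weakest rate in that range, the swing is a few centavos per document, which is small next to the gap between tool cost and typing cost.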
When an ERP Makes More Sense Than an Extraction Layer
This article argues that standalone extraction is the right tool for the inbound DANFE problem. But honest comparison means acknowledging when it's not.
If your business issues its own NF-e (even a few per month), an ERP or NF-e API tool becomes necessary — you cannot issue NF-e from a document extraction tool. If your contador (accountant) works inside Omie or Conta Azul and your books run through that system end-to-end, adding the ERP's NF-e capture module keeps the workflow integrated: data enters the same system where it's posted, without an intermediate Excel export-import step. If you process more than 500 documents a month across all types (NF-e, NFS-e, CT-e, boletos, comprovantes), an enterprise platform like Qive with its automated SEFAZ XML download becomes cost-effective despite the bespoke pricing.
But these scenarios describe businesses that have already crossed the threshold where ERP investment pays for itself. The companies this article is written for — small manufacturers, distributors, retailers, and service firms with 10 to 50 employees — have not. They're still typing DANFE data into Excel and emailing the spreadsheet to the contador. For them, a R$50/month extraction tool that produces a clean Excel file is not a compromise between manual and automated. It's the first automated step they've ever had.
The Tax Reform Variable: Why Now Is the Moment to Stop Typing
Brazil's Reforma Tributária (Tax Reform), enacted through Lei Complementar nº 214/2024, begins replacing the five existing consumption taxes — ICMS (state), ISS (municipal), PIS, COFINS, and IPI (federal) — with two new taxes: IBS (Imposto sobre Bens e Serviços, a dual VAT managed jointly by states and municipalities) and CBS (Contribuição sobre Bens e Serviços, a federal VAT). The transition runs from 2026 to 2032. Between August 3, 2026 and 2032, NF-e XMLs must carry both the old and new tax fields simultaneously (NT 2025.002 v1.50, published June 3, 2026). NF-e without IBS/CBS fields will be rejected by SEFAZ starting August 3, 2026.
For a small business manually typing DANFE data, this transition adds concrete pain. The DANFE you receive from your supplier in August 2026 will have more fields printed on it than the DANFE you received in July 2026, because the dual tax regime requires both old (ICMS) and new (IBS) tax lines. A DANFE that took 5 minutes to manually type before now takes 6 or 7, because there are more numbers to transcribe and more tax codes to verify. A contador doing this for 10 clients, each with 60 DANFEs a month, handles 600 DANFEs per billing cycle, so even a single new field per document means 600 additional values to type.
Template-based extraction tools — where you've defined parsing rules per supplier — now need all rules updated because the DANFE layout changed. Semantic extraction doesn't: the field called "Base ICMS" still says "Base ICMS", and the new field called "Base IBS" is recognized as a new column alongside it. The tax reform doesn't break the extraction. It just adds columns to your output spreadsheet. For a small business facing a 7-year dual tax regime, that matters: your extraction tool doesn't need reconfiguration every time Nota Técnica revisions change the layout. For a deeper look at how the German Mittelstand handles a similar extraction-layer approach with its own e-invoicing mandates, see how German SMEs are separating extraction from ERP.
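On the receiving end, the script or import routine that consumes the spreadsheet should tolerate the new columns in the same way. The sketch below is one way to do that, assuming a hypothetical CSV export with a semicolon delimiter, Brazilian number formatting ("1.234,56"), and the column names used in this article; the actual export format may differ, so adjust the parsing to match what you receive.

```python
import csv
import io
from decimal import Decimal

REQUIRED = ["CNPJ emitente", "Base ICMS", "Valor ICMS"]
OPTIONAL = ["Base IBS", "Valor IBS"]  # appear only once suppliers print them


def parse_brl(text):
    """Convert '1.234,56' to Decimal('1234.56'); blank or missing gives None."""
    text = (text or "").replace("R$", "").strip()
    if not text:
        return None
    # Assumes '.' is a thousands separator and ',' the decimal mark.
    return Decimal(text.replace(".", "").replace(",", "."))


def load_rows(csv_text):
    """Read extracted rows, filling absent optional tax columns with None."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for raw in reader:
        row = {"CNPJ emitente": raw["CNPJ emitente"]}
        for col in REQUIRED[1:] + OPTIONAL:
            row[col] = parse_brl(raw.get(col))
        rows.append(row)
    return rows


before = "CNPJ emitente;Base ICMS;Valor ICMS\n11222333000181;1.000,00;180,00\n"
after = ("CNPJ emitente;Base ICMS;Valor ICMS;Base IBS;Valor IBS\n"
         "11222333000181;1.000,00;180,00;1.000,00;1,00\n")
print(load_rows(before)[0])
print(load_rows(after)[0])
```

The same loader handles both files, so the point at which suppliers start printing IBS values does not require a code change, only a decision about what to do with the new numbers.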
FAQ
Does ImageToTable.ai integrate directly with SEFAZ to download NF-e XMLs?
No. ImageToTable.ai reads the visual content of documents — PDFs, images, screenshots — and extracts data from what it sees on the page. It does not connect to SEFAZ web services, download XMLs, or require a digital certificate. If you have the XML and can parse it programmatically, you don't need extraction at all. The tool is built for the scenario where you only have the DANFE PDF and need the data from it. The output is Excel (XLSX) or CSV — the universal format that Omie, Conta Azul, Bling, and any contador's system can import.
Can it extract all 40+ fields that exist in the NF-e XML from a DANFE PDF?
It can extract every field that is visible on the DANFE. Fields that exist only in the XML and are not printed on the DANFE cannot be extracted from the PDF — there is nothing on the page to read. This is a DANFE limitation, not an extraction limitation. If your process requires fields that only exist in the XML (such as per-line-item NCM codes or freight-specific tax scenarios that the DANFE summarizes), you need XML access. For the dozen-plus fields typically visible on a DANFE — CNPJ emitente/destinatário, número NF-e, data de emissão, CFOP, valor total, base ICMS, valor ICMS, valor PIS/COFINS, chave de acesso — the extraction covers what's on the page.
What about NFS-e (Nota Fiscal de Serviço Eletrônica)?
ImageToTable.ai handles NFS-e the same way as NF-e: it reads the PDF and extracts the fields you define. Since January 1, 2026, NFS-e uses a unified national XML standard under Lei Complementar 214/2024, replacing the fragmented municipal systems. The visual NFS-e layout varies by municipality, but semantic extraction reads the field values regardless of position. For a broader comparison of how one extraction tool handles multiple Brazilian document types, see our cost comparison across NF-e, NFS-e, and Holerite extraction.
Does the tool handle scanned paper DANFEs, or only digital PDFs?
It handles both. A photographed or scanned paper DANFE, a PDF generated from an ERP, a screenshot of a DANFE viewed on the SEFAZ portal, or a WhatsApp image of a DANFE sent by a supplier — all are processed the same way. The AI reads the visual content of the page whatever the source. Handwritten annotations on a DANFE (such as a contador's margin notes) are readable, though recognition accuracy on handwriting depends on legibility.
What happens when the DANFE layout changes due to Nota Técnica updates?
Nothing needs to change on your end. Semantic extraction doesn't rely on field positions or templates, so when a supplier's DANFE layout changes because of a Nota Técnica update, the same column definitions ("CNPJ emitente", "Valor Total", "Base ICMS") continue to find their targets because the AI is looking for what the field means, not where it sits. The only change you might need to make is adding new columns for new fields — for example, adding "Base IBS" and "Valor IBS" as the tax reform introduces them on DANFEs.
Is the tool compliant with Brazilian data privacy law (LGPD)?
ImageToTable.ai processes files in memory during extraction and does not retain them afterward. It is not a document archive and does not store NF-e data long-term. For the 5-year archiving requirement mandated by Ajuste SINIEF 07/2005, you must maintain your own archive — whether that's the ERP system, a DMS (document management system), or the contador's records. The extraction tool reads and outputs data. Archiving remains your responsibility under your existing compliance framework.
Brazilian small businesses have been typing NF-e data from DANFE PDFs since the electronic invoicing mandate went live in 2008. The technology to stop has existed all along, but it was always bundled with things they didn't need: ERP modules, digital certificate configurations, XML web service integrations. Separating the extraction step from the accounting stack it feeds changes the per-document cost from R$2.08 in labor to as little as R$0.33 in tool cost. Test it on your own supplier DANFEs and see if a column called "CNPJ emitente" finds its target on every layout in your inbox.