Japanese Qualified Invoice Extraction: Complete Guide to 適格請求書 Data Processing
Since October 2023, every invoice issued in Japan that a buyer wants to use for consumption tax credit has carried six mandatory fields — including a T+13-digit registration number, a tax amount calculated per rate bracket, and line items that distinguish 8% from 10% taxable supplies. The system affects roughly 4.6 million registered businesses, from sole proprietors issuing handwritten 請求書 on A4 paper to corporations exchanging structured XML over the Peppol network. This guide covers the complete landscape: what makes a qualified invoice legally valid, what each field means and where to find it, why Japanese invoice layouts create extraction challenges that most automation tools don't solve, and how to build a processing workflow that handles any format — digital, scanned, or handwritten.
What the Qualified Invoice System Actually Changed
Before October 2023, Japan's consumption tax operated on a ledger-based method. A business claiming input tax credit did not need an invoice to match specific tax rates — it maintained accounting books that recorded transaction totals, and the credit was calculated from those books. The government introduced the Qualified Invoice System (適格請求書等保存方式, or インボイス制度) under Article 57-2 of the Consumption Tax Act (消費税法第57条の2) because the ledger-based approach became unworkable after Japan introduced a dual consumption tax rate in 2019 — 10% standard and 8% reduced for certain food and newspaper purchases. When two rates coexist on the same transaction, an invoice must specify which rate applies to which line item, and the buyer must retain that evidence to claim the correct credit.
The system shifted Japan from ledger-based to invoice-based input tax credit, the same architecture used by VAT regimes in Europe. It is also similar in structure to Mexico's CFDI system, though Japan's approach is less rigid about electronic formats (see the complete guide to Mexican CFDI extraction for a comparison with a different national invoice system). Six fields became mandatory on every invoice that supports a tax credit claim. Suppliers who want their buyers to claim full input credits must register with the National Tax Agency (NTA) and include their registration number on every invoice. As of March 2025, the NTA reported approximately 4.61 million registered Qualified Invoice Issuers (QIIs), including roughly 2.2 million sole proprietors and 2.41 million corporations.
The practical effect for anyone processing Japanese supplier invoices is unambiguous: an invoice missing any of the six mandatory fields cannot support a full input tax credit claim. The NTA maintains a qualified invoice issuer public registry where AP teams can verify registration numbers — and every invoice that comes in requires that verification step. That verification did not exist before 2023. It is now a per-invoice operational cost.
To understand how this fits into the broader picture of automated invoice processing, see the overview of what invoice data extraction actually is and how it works — the concepts apply to Japanese qualified invoices the same way they apply to any structured document, with the additional compliance layer that the Japanese system adds.
The Six Mandatory Fields of a Qualified Invoice
The NTA requires exactly six data points on every qualified invoice. Missing any one of them means the buyer cannot claim full input tax credit (仕入税額控除) on that transaction. The six fields are defined in the NTA's official specification document (in Japanese):
| # | Field | Japanese | New Oct 2023? | Extraction Notes |
|---|---|---|---|---|
| 1 | Issuer name and registration number | 発行事業者の氏名又は名称及び登録番号 | Yes — registration number did not exist before 2023 | Format: T + 13 digits (e.g. T1234567890123). Must verify against NTA registry. |
| 2 | Transaction date | 取引年月日 | No | Often uses Japanese era format (令和8年3月10日). May include Western date in parentheses. |
| 3 | Transaction details and reduced-rate indication | 取引内容(軽減税率の対象品目である旨) | No (but reduced-rate flagging is new) | Items at 8% must be explicitly marked, often with ※ (komejirushi, a reference mark) or 軽 (kei, "light"). |
| 4 | Total amount per tax rate category | 税率ごとに区分して合計した対価の額 | Yes — dual-rate breakdown | Separate subtotals for 10% and 8% taxable amounts. Can be tax-exclusive or tax-inclusive. |
| 5 | Consumption tax amount per rate | 税率ごとの消費税額 | Yes — per-rate tax calculation | Calculated in whole yen. Fractional yen amounts are truncated or rounded per issuer's method. |
| 6 | Recipient name | 書類の交付を受ける事業者の氏名又は名称 | No | The buyer's registered business name. Often followed by 御中 (onchū, "to the company"). |
The registration number in field 1 and the per-rate figures in fields 4 and 5 are the new requirements: they did not appear on Japanese invoices before the reform. Field 3 existed but now carries an additional requirement: items subject to the reduced 8% rate must be explicitly identified. The phrase 軽減税率対象 (keigen zeiritsu taishō, "reduced tax rate applicable") or a simple ※ mark next to the item is sufficient.
For small retail and consumer-facing transactions — restaurants, taxi services, vending machines — a qualified simplified invoice (適格簡易請求書) is permitted. It omits the recipient's name and allows a condensed field set, but still requires the registration number and tax rate breakdown.
The T-Number: Verification and Extraction
The 登録番号 (tōroku bangō, registration number) is the single most critical field on a qualified invoice. Without it, the invoice cannot support a full input tax credit claim. The format is consistent: the letter T followed by 13 numeric digits.
For corporations, the T-number is T + the company's existing 13-digit Corporate Number (法人番号, hōjin bangō). For sole proprietors, the NTA assigns a dedicated 13-digit number that is distinct from the MyNumber (個人番号) to protect personal privacy. Eligible foreign businesses without a permanent establishment in Japan can also register and receive a T-number.
The NTA's public registry at invoice-kohyo.nta.go.jp allows anyone to search by registration number or business name. The lookup returns the issuer's registered name, the date of registration, and the active status. If the number returns no result or the name does not match the invoice, the document cannot be treated as a qualified invoice for tax credit purposes.
Extraction challenge: The T-number can appear anywhere on the document — near the header, in a footer block, alongside the issuer's registered address, or buried in fine print near the bank transfer information (振込先欄). Unlike European VAT numbers, which typically appear in a predictable header position, Japanese T-numbers follow no standard placement convention. Semantic extraction — where the system reads every number on the page and identifies the T-prefixed 14-character string by what it means, not where it sits — is the only reliable approach for multi-supplier workflows.
Some Japanese suppliers print the T-number as T1234567890123 (one continuous string). Others break the 13 digits into hyphenated groups, for example T1-2345-6789-0123. Both formats are valid. The NTA's registry lookup accepts the 13 digits without the T prefix.
Invoice data extraction systems that rely on position-based rules often fail on this field because the T-number layout varies so widely across suppliers. A tool that locates it by recognizing the T-prefix pattern and the 13-digit sequence — regardless of where on the page it appears — can handle invoices from any registered supplier without per-vendor configuration.
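The pattern-based recognition described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the page text has already been obtained from OCR or a vision model; the function name `find_t_numbers` and the decision to tolerate a single space or hyphen between digits are choices made for this example, not part of any NTA specification.

```python
import re
import unicodedata

# "T" followed by exactly 13 digits, allowing one space or hyphen between digits.
# The lookarounds stop the match from starting inside a longer alphanumeric
# token or from silently dropping a 14th digit.
T_NUMBER = re.compile(r"(?<![A-Za-z0-9])T[ -]?(\d(?:[ -]?\d){12})(?!\d)")


def find_t_numbers(text: str) -> list[str]:
    """Return every T-number found in the text, normalized to T + 13 digits."""
    # NFKC folds full-width letters, digits, and hyphens to their ASCII forms.
    normalized = unicodedata.normalize("NFKC", text)
    found: list[str] = []
    for match in T_NUMBER.finditer(normalized):
        candidate = "T" + re.sub(r"\D", "", match.group(1))
        if candidate not in found:
            found.append(candidate)
    return found
```

Because the function scans the whole text, it returns the same normalized value whether the number sat in the header, the footer, or next to the bank details. A string that fails to match (for example, one with 14 digits) is better routed to review than truncated to fit.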
Tax Rate Classification: When 8% Applies vs 10%
Japan's dual consumption tax rate has been in effect since October 2019. The breakdown on a qualified invoice must separate amounts into two distinct categories, each with its own subtotal and calculated tax:
| Rate | Category | What It Applies To | Invoice Label |
|---|---|---|---|
| 10% | 標準税率 (hyōjun zeiritsu) | All goods and services not eligible for the reduced rate. Includes alcohol, eating out, and general merchandise. | 10%対象 |
| 8% | 軽減税率 (keigen zeiritsu) | Food and beverages (excluding alcohol and dining out). Newspapers published under periodic subscription contracts. | 8%対象 / 軽減 |
On a qualified invoice, the NTA requires two separate calculations per rate bracket:
- The taxable amount (対価の額) — the price of goods or services, shown either tax-exclusive (税抜, zeinuki) or tax-inclusive (税込, zeikomi). The invoice must state which convention is being used.
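- The consumption tax amount (消費税額), calculated separately for the 10% and 8% brackets. Fractions of a yen may be truncated, rounded up, or rounded to the nearest yen at the issuer's discretion, but rounding is applied once per rate bracket per invoice, not line by line.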
A typical qualified invoice displays these as distinct line blocks:
10%対象 ¥100,000
消費税(10%) ¥10,000
8%対象 ¥50,000
消費税(8%) ¥4,000
合計 ¥164,000
The total line (合計, gōkei) is usually the tax-inclusive grand total: the taxable amounts for both brackets plus both tax amounts, as in the ¥164,000 figure above. Some invoices instead show the tax-exclusive total first with a separate consumption tax line below. The variety in how these two rate categories are presented is one of the primary reasons template-based extraction fails on Japanese invoices: the layout changes not just between suppliers, but depending on whether the transaction involves items at both rates, a single rate, or mixed-rate line items within the same document.
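The arithmetic behind the sample block can be reproduced with a short sketch. It assumes tax-exclusive bracket subtotals and rounds once per bracket; the function names and the `rounding` parameter are illustrative, and truncation is used as the default only because it is a common choice, not because it is required.

```python
from decimal import Decimal, ROUND_DOWN

RATES = {"10%": Decimal("0.10"), "8%": Decimal("0.08")}


def bracket_tax(taxable_excl: int, rate_label: str, rounding=ROUND_DOWN) -> int:
    """Tax for one rate bracket, rounded once on the bracket subtotal."""
    exact = Decimal(taxable_excl) * RATES[rate_label]
    return int(exact.quantize(Decimal("1"), rounding=rounding))


def invoice_total(subtotals_excl: dict[str, int], rounding=ROUND_DOWN) -> int:
    """Tax-inclusive grand total (合計) from tax-exclusive bracket subtotals."""
    return sum(
        amount + bracket_tax(amount, label, rounding)
        for label, amount in subtotals_excl.items()
    )
```

Calling `invoice_total({"10%": 100_000, "8%": 50_000})` gives 164,000, matching the sample. Passing `ROUND_UP` or `ROUND_HALF_UP` from the `decimal` module models an issuer that uses a different rounding method.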
Why Japanese Invoices Pose Unique Extraction Challenges
Japanese qualified invoices present four structural challenges that make them fundamentally harder to process than invoices from most other markets. These are not edge cases — they affect a significant proportion of the invoices your AP team will handle from Japanese suppliers.
Vertical Layout (縦書き) — When Text Reads Top to Bottom, Right to Left
While most Japanese business documents now use horizontal writing (横書き, yokogaki), a substantial number of traditional suppliers — particularly smaller firms, construction companies, and older sole proprietors — still issue invoices in vertical writing (縦書き, tategaki). In a vertical layout, text flows from top to bottom in columns that progress right to left across the page. Field labels that would normally appear to the left of their values in horizontal layout instead appear above or below them. Line-item tables in vertical invoices often place the column headers on the right side of the table, reading inward.
Standard OCR engines — including most API-based document AI services — assume a left-to-right, top-to-bottom reading order. When they encounter a vertically formatted Japanese invoice, they typically output characters in the wrong sequence, turning a structured document into an unrecoverable jumble. The reading order problem is severe enough that dedicated Japanese OCR models like Sarashina2.2 have been developed specifically to handle vertical text — a testament to how poorly general-purpose OCR handles this format.
Vision-based AI extraction (as opposed to traditional OCR) addresses this differently: instead of reading characters in a fixed sequence, the model looks at the entire page, understands the document structure visually, and extracts fields by semantic meaning. A T-number printed vertically alongside the issuer's name is still recognizable as a T-number because the model understands what a registration number looks like — not because it read the page in the correct order.
Handwritten Invoices — Still Common for Small Businesses
Japan's approximately 3.36 million small and medium enterprises — 99.7% of all businesses — range from fully digitized to entirely paper-based. The NTA explicitly allows handwritten qualified invoices as long as they contain all six mandatory fields. No requirement exists for electronic generation, digital signatures, or structured formats (though Peppol JP PINT is recommended as an e-invoicing standard).
This means that a supplier — a local construction subcontractor in Osaka, a family-run noodle shop, a freelance IT consultant — can handwrite their registration number, handwrite the tax rate breakdown, and handwrite the total on a pre-printed 請求書 form. The invoice is legally valid. And it is almost impossible for a template-based OCR system to process accurately.
The extraction problem with handwritten Japanese qualified invoices is not just that handwriting varies between individuals. Japanese handwriting mixes Arabic numerals (算用数字, san'yō sūji) with formal kanji numerals (大字, daiji: 壱, 弐, 参 instead of 1, 2, 3) and, in some traditional contexts, account titles (勘定科目) written in semi-cursive script. A single field on an invoice might combine printed kanji headers with handwritten quantities and prices, and the AI must distinguish between them reliably.
Hanko Stamp Interference
The 印鑑 (hanko, personal or company seal) remains a standard element of Japanese business documentation. Many invoices carry a red seal impression, stamped with vermilion ink paste (朱肉, shuniku), over the issuer's name or total amount block. The red ink frequently overlaps printed or handwritten text, creating visual interference that degrades OCR accuracy, particularly when the stamp crosses numeric fields like amounts or the registration number.
This is not a defect of the document. The stamp is an intentional authentication mechanism. But for an extraction system, it creates a localized occlusion that traditional OCR cannot resolve: when a circular red impression covers digits in the total amount, the OCR reads partial character shapes and outputs incorrect values. Extraction tools that operate at the visual-semantic level, treating the whole page as an image that a multimodal AI model interprets, can often infer the obscured text from surrounding context or, at minimum, flag the area for human review instead of silently emitting a wrong value.
Era Date Formatting — When Reiwa 8 Is 2026
Japanese invoices commonly use the era date system (元号, gengō) alongside or instead of the Western Gregorian calendar. The current era, Reiwa (令和), began in 2019. Dates appear in formats such as:
令和8年3月10日 (Reiwa 8 = 2018 + 8 = 2026, March 10)
R8.3.10 (abbreviated era format)
令和8年(2026年)3月10日 (both formats, common for clarity)
H30.12.1 (Heisei 30 = 1988 + 30 = 2018 — still appears on archived documents)
An extraction system that processes Japanese invoices must convert era dates to Western equivalents automatically — ideally outputting ISO 8601 (2026-03-10) directly into the spreadsheet. Most global OCR tools do not handle this conversion. General-purpose document AI platforms treat "令和8年" as a string of characters without understanding that it represents a date. Field-level extraction with semantic date parsing is required to make the output usable in any downstream system that expects a standard date format.
When the invoice shows both era and Western dates side by side — such as "令和8年(2026年)3月10日" — the extraction system should prioritize the Western date as the reliable value and use the era date for cross-verification.
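A minimal sketch of this conversion, using the era offsets given in this guide (Reiwa + 2018, Heisei + 1988, Showa + 1925). The function name `to_iso_date` and the two regular expressions are illustrative assumptions; the sketch prefers a four-digit Western year when one is present and falls back to the era date otherwise.

```python
import re
import unicodedata
from datetime import date
from typing import Optional

# Offset added to the era year to obtain the Gregorian year.
ERA_OFFSETS = {"令和": 2018, "R": 2018, "平成": 1988, "H": 1988, "昭和": 1925, "S": 1925}

# Era date: 令和8年3月10日, R8.3.10, H30.12.1 (元 marks the first year of an era).
ERA_DATE = re.compile(
    r"(令和|平成|昭和|[RHS])\s*(元|\d{1,2})\s*[年.]\s*(\d{1,2})\s*[月.]\s*(\d{1,2})\s*日?"
)
# Western date, optionally closing a parenthesis: 2026年3月10日 or (2026年)3月10日.
WESTERN_DATE = re.compile(r"(\d{4})年\)?\s*(\d{1,2})月\s*(\d{1,2})日")


def to_iso_date(text: str) -> Optional[str]:
    """Convert a Japanese invoice date to ISO 8601, or None if no date is found."""
    s = unicodedata.normalize("NFKC", text)  # fold full-width digits and brackets
    m = WESTERN_DATE.search(s)
    if m:
        year, month, day = (int(g) for g in m.groups())
        return date(year, month, day).isoformat()
    m = ERA_DATE.search(s)
    if m:
        era, era_year, month, day = m.groups()
        year = ERA_OFFSETS[era] + (1 if era_year == "元" else int(era_year))
        return date(year, int(month), int(day)).isoformat()
    return None
```

`date(...)` raises `ValueError` for impossible dates such as month 13, which is a convenient signal that a digit was misread and the field needs review.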
How to Extract Qualified Invoice Data: A Practical Workflow
Building a reliable extraction workflow for Japanese qualified invoices means solving four distinct problems: locating the registration number, separating the two tax rate brackets, converting era dates, and handling the non-standard layouts that handwritten and vertically formatted invoices introduce. Here is a practical five-step process that works across the full range of qualified invoice formats.
Step 1: Define the Output Columns
Rather than configuring rules for how each supplier's document looks, define what data you need. For a qualified invoice, the column list should include both the standard invoice fields and the Japan-specific fields required for compliance:
登録番号 (T+13桁), 発行日, 請求書番号, 発行者名, 発行者住所,
宛名, 10%対象金額, 8%対象金額,
消費税額(10%), 消費税額(8%), 合計額,
品目1, 数量1, 単価1, 金額1, ... (for each line item)
In ImageToTable.ai's Custom Column Extraction model, these column names become the instructions the AI follows: it reads the invoice, locates each value by semantic meaning (not by coordinates), and fills the corresponding cell. A column named "登録番号 (T+13桁)" tells the AI to find the 14-character T+digits pattern anywhere on the page — regardless of whether it appears horizontally in the header, vertically in the margin, or stamped in red over the issuer's address — and extract it into that column. No template setup, no zone drawing, no per-supplier configuration.
Step 2: Upload All Supplier Formats in One Batch
Because the extraction is layout-independent, there is no need to sort invoices by supplier before processing. A batch upload of 50 supplier invoices — half from large corporations using structured PDFs, a quarter from small businesses using handwritten forms, and the rest from mid-size companies with varied layouts — can be processed together. The AI reads each document independently and extracts the same column set.
This is the practical difference between semantic extraction and template-based OCR. A template approach would require 10-15 supplier-specific configurations for this batch — creating zones for each layout, adjusting for vertical vs horizontal, tuning for handwritten vs printed. Semantic extraction processes them all in one pass because it reads by meaning, not by position.
Step 3: Verify Registration Numbers Against the NTA Registry
After extraction, the T-numbers appear in a single column. The verification workflow becomes a lookup: export the column of extracted registration numbers and cross-reference them against the NTA's public registry at invoice-kohyo.nta.go.jp. For low-volume workflows, this can be done manually — entering the 13 digits (without the T prefix) into the search form. For higher volumes, the NTA registry can be queried programmatically.
Any T-number that does not return a matching business name should be flagged. The most common cause is an extraction error: a misread digit from a blurred or stamped-over registration field. The AP clerk corrects the extracted value against the original document and re-checks.
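Many misread digits can be caught before the registry lookup. The 13-digit Corporate Number begins with a check digit computed from the other twelve digits, so a corporate T-number with one wrong digit usually fails the checksum. The sketch below implements that check; whether numbers assigned to sole proprietors follow the same checksum is an assumption here, so a failure should be treated as a reason to review the field, not as proof that the number is invalid.

```python
def has_valid_check_digit(t_number: str) -> bool:
    """Check the leading digit of a T-number against the Corporate Number checksum."""
    digits = t_number.removeprefix("T")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    check, body = int(digits[0]), digits[1:]
    # Walk the 12 body digits from the rightmost one:
    # odd positions are weighted 1, even positions are weighted 2.
    weighted = sum(
        int(ch) * (1 if pos % 2 == 1 else 2)
        for pos, ch in enumerate(reversed(body), start=1)
    )
    return check == 9 - (weighted % 9)
```

Numbers that pass still need the registry lookup, since the checksum says nothing about registration status or the registered name.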
Step 4: Normalize the Output for Your Accounting System
The extracted data should undergo three normalization steps before it enters freee, MoneyForward, Yayoi, or any other accounting platform:
- Date normalization: Convert all era dates (令和8年, R8, etc.) to ISO 8601 (2026-03-10) or your accounting system's preferred format.
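- Tax amount cross-check: Verify that the extracted 消費税額(10%) and 消費税額(8%) are consistent with the extracted taxable amounts. For tax-exclusive amounts the tax is the amount × 10% or × 8%; for tax-inclusive amounts it is the amount × 10/110 or × 8/108. Allow for the issuer's rounding method, and flag any discrepancy for review.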
- Tax-exclusive or tax-inclusive standardization: If some invoices use 税抜 (tax-exclusive) and others use 税込 (tax-inclusive), convert all values to a single convention for your GL reporting.
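The tax amount cross-check can be sketched as a single predicate. This example assumes amounts are whole yen and accepts any tax value between the truncated and the rounded-up result, since the issuer's rounding method is usually unknown at extraction time; the function name and parameters are illustrative.

```python
import math
from fractions import Fraction


def tax_is_plausible(
    taxable: int, tax: int, rate_percent: int, tax_inclusive: bool = False
) -> bool:
    """True if `tax` matches the bracket amount under any rounding method."""
    if tax_inclusive:
        # Tax portion of a tax-inclusive amount: amount * rate / (100 + rate).
        exact = Fraction(taxable * rate_percent, 100 + rate_percent)
    else:
        exact = Fraction(taxable * rate_percent, 100)
    # Truncation gives the floor, rounding up gives the ceiling,
    # and rounding to nearest lands on one of the two.
    return math.floor(exact) <= tax <= math.ceil(exact)
```

Using `Fraction` keeps the comparison exact, so a one-yen difference caused by binary floating point never triggers a false flag. Rows where the predicate returns `False` go to the review queue.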
Step 5: Import into Accounting Software
The normalized spreadsheet can be imported into Japanese accounting platforms via CSV. Both freee and MoneyForward Cloud support CSV import for invoice payable data, and Yayoi (弥生会計) provides an import function for desktop and cloud versions. The key requirement is that the CSV columns match the accounting system's import template — which is straightforward when you have already defined the output columns at Step 1.
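As a rough illustration of the export step, the sketch below writes normalized rows to CSV text with Python's standard `csv` module. The column list is a subset of the Step 1 columns and is an assumption for this example; real import templates differ per platform, so the header names, column order, and file encoding should be taken from the target system's own template.

```python
import csv
import io

# Hypothetical column set, taken from the Step 1 list; adjust to the import template.
COLUMNS = [
    "登録番号", "発行日", "発行者名", "宛名",
    "10%対象金額", "8%対象金額", "消費税額(10%)", "消費税額(8%)", "合計額",
]


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize extracted invoice rows to CSV text with a fixed header."""
    buffer = io.StringIO(newline="")
    # Extra keys are dropped and missing keys become empty cells.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Fixing the header in one place means every batch produces the same column order, regardless of which fields happened to be present on a given invoice.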
For teams that use Google Sheets as their working environment, ImageToTable.ai's Google Sheets add-on allows extraction results to land directly into an active spreadsheet without intermediate file exports — the AI reads the invoices and writes the data into the sheet in a single operation.
How Japanese Accounting Software Supports the Qualified Invoice System
Japan's three dominant accounting platforms — freee, MoneyForward Cloud, and Yayoi (弥生会計) — all support the Qualified Invoice System natively. Each generates compliant invoices with T-numbers and rate-split totals, and each handles the consumption tax return calculations for filing. However, the gap between these platforms' invoice generation capabilities and their invoice extraction capabilities is significant:
| Platform | Price (Monthly) | Built-in OCR Designed For | Qualified Invoice Support | Invoice Extraction Limitation |
|---|---|---|---|---|
| freee | ¥1,980 (Starter) | Receipts (レシート) — short, single-format thermal slips | Full — generates compliant invoices and handles tax filing | Struggles with multi-format supplier invoices. No custom field extraction for vendor-specific layouts. |
| MoneyForward Cloud | ¥1,078 (Mini) + metered OCR | Receipts and bank statement feeds | Full — with 2,000+ bank integrations | Invoice payable management requires additional paid modules. Metered OCR costs add up. |
| Yayoi (弥生会計) | ¥11,000-33,000/year | Desktop-based receipt entry | Full — longest-established platform (~3.4M users) | Desktop plans lack cloud-native API integrations. OCR is receipt-focused. |
The common pattern across all three: their built-in OCR was designed for receipts (レシート) — short, uniform thermal paper slips with predictable layouts — not for multi-format supplier invoices with T-number extraction, dual tax rate breakdowns, and line-item tables that span multiple pages. This is not a failure of the accounting platforms; it is a product of their design. Receipts are high-volume, low-complexity documents. Supplier invoices are low-volume, high-complexity documents with compliance implications. The OCR requirements are sufficiently different that a single engine rarely handles both well.
The practical workflow for Japanese finance teams is therefore a two-tier approach: use the accounting platform for its strengths (bank reconciliation, payroll, tax filing, receipt OCR) and a dedicated invoice data extraction tool for multi-format supplier invoice processing, then bridge the two via CSV import.
The Transitional Measures Timeline — What Changes Through 2031
The Qualified Invoice System includes phased transitional measures that gradually reduce the input tax credit available on purchases from unregistered (tax-exempt) suppliers. These are not minor adjustments — they change the effective cost of doing business with unregistered vendors at every step, and they trigger ERP reconfiguration, AP retraining, and supplier renegotiation each time they change.
| Period | Input Credit on Purchases from Unregistered Suppliers | What Changes for AP |
|---|---|---|
| Oct 2023 – Sep 2026 | 80% of the purchase tax amount deductible | First step — suppliers who remain unregistered cost the buyer ~2% of taxable amount. Most AP teams began tracking registration status. |
| Oct 2026 – Sep 2029 | 50% of the purchase tax amount deductible | Effective cost penalty rises to ~5%. Urgency increases to convert unregistered suppliers. ERP systems must update the new deduction rate. |
| From Oct 2029 | 0% — no credit available | Full 10% consumption tax becomes unrecoverable cost on purchases from unregistered suppliers. Registration effectively mandatory for any B2B supplier. |
For AP teams processing Japanese invoices, this timeline creates a practical requirement: every invoice must be classified as coming from a registered or unregistered supplier, and the deduction rate applied to each invoice must match the applicable period. An invoice from a previously registered supplier whose registration lapses must be handled at the transitional rate — not the full credit rate. This makes systematic T-number verification not just a compliance step but a financial accuracy requirement.
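The classification rule can be expressed as a small lookup keyed on the transaction date. The sketch below simply mirrors the periods and percentages in the table above; the function names are illustrative, and because transitional schedules can be amended by later tax reforms, the dates should be confirmed against current NTA guidance before use.

```python
from datetime import date
from fractions import Fraction

SYSTEM_START = date(2023, 10, 1)


def deductible_ratio(transaction_date: date, supplier_registered: bool) -> Fraction:
    """Share of the consumption tax that is creditable, per the table above."""
    if transaction_date < SYSTEM_START:
        raise ValueError("date precedes the Qualified Invoice System")
    if supplier_registered:
        return Fraction(1)
    if transaction_date < date(2026, 10, 1):
        return Fraction(80, 100)
    if transaction_date < date(2029, 10, 1):
        return Fraction(50, 100)
    return Fraction(0)


def unrecoverable_tax(
    tax_amount: int, transaction_date: date, supplier_registered: bool
) -> Fraction:
    """Consumption tax that becomes a cost because it cannot be credited."""
    ratio = deductible_ratio(transaction_date, supplier_registered)
    return tax_amount * (1 - ratio)
```

For a ¥100,000 purchase at 10% from an unregistered supplier in 2025, `unrecoverable_tax(10_000, date(2025, 4, 1), False)` is 2,000, which is the roughly 2% cost penalty noted in the table.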
The Japan Chamber of Commerce and Industry tracked the system's operational impact through two consecutive surveys. The 2024 survey of 3,149 member businesses found 48.8% reporting increased costs and 82.2% reporting increased administrative burden. The 2025 follow-up of 2,710 businesses showed those figures at 45.8% and 73.4% — improving but still affecting nearly three-quarters of respondents. The single largest source of new work: "supplier registration status verification and management," cited by 74.8% of respondents.
Frequently Asked Questions
Can a handwritten Japanese invoice be a valid qualified invoice?
Yes. The NTA does not require qualified invoices to be electronic or machine-printed. Any document that contains all six mandatory fields — including the registration number and tax rate breakdown — is a valid qualified invoice, whether handwritten, printed, or generated as a PDF. Handwritten invoices from small suppliers are common and legally valid.
What does a qualified invoice registration number look like?
The format is the letter T followed by exactly 13 digits, for example T1234567890123. For corporations, the 13 digits are the company's Corporate Number (法人番号). For sole proprietors, the NTA assigns a separate 13-digit number. The T prefix is mandatory and distinguishes the qualified invoice registration number from other identifiers.
How should I handle Japanese era dates in extracted data?
Era dates (令和8年3月10日, R8.3.10, H30.12.1) should be converted to ISO 8601 (2026-03-10) during extraction. The conversion formulas are: Reiwa year + 2018, Heisei year + 1988, Showa year + 1925. When an invoice shows both era and Western dates (e.g. "令和8年(2026年)3月10日"), use the Western date directly.
What if an invoice only has one tax rate?
If a qualified invoice involves only 10% items or only 8% items, the supplier must still clearly indicate which rate applies and show the consumption tax amount for that single rate. Displaying only one rate is acceptable as long as the document makes it unambiguous that no items at the other rate exist. Some invoices display a "0" or dash for the unused rate bracket.
Does the extraction tool handle vertical (縦書き) invoices?
Template-based OCR tools do not handle vertical invoices reliably — they read characters in the wrong sequence. Vision-based extraction that reads the document as a whole (rather than scanning left-to-right line by line) can handle vertical layouts because it identifies fields by semantic meaning rather than reading order. When evaluating an extraction tool for Japanese invoices, test it specifically on a vertical-format document — not every tool that claims "Japanese support" handles vertical text.
Is Peppol JP PINT mandatory for qualified invoices in Japan?
No. Peppol JP PINT is the recommended e-invoicing standard by the Japan Digital Agency, but it is not mandatory. Qualified invoices can be issued on paper, as PDFs, or in any electronic format, as long as they contain all six mandatory fields. However, for high-volume B2B transactions, Peppol adoption is growing because it enables automated exchange of structured data without manual data entry or extraction.
How do I verify a supplier's T-number?
The NTA maintains a public registry at invoice-kohyo.nta.go.jp. Enter the 13-digit number (without the T prefix) or search by business name. The registry returns the issuer's registered name, registration date, and current status. This verification step should be part of every AP workflow processing Japanese qualified invoices.
How long must qualified invoices be retained?
Qualified invoices must be kept for 7 years, counted from the day two months after the end of the relevant taxable period (in practice, from the consumption tax filing deadline). This applies to both the issuer and the recipient. Digital storage (scanned copies) of paper invoices is permitted as long as the digital copy preserves all mandatory fields clearly.
Does a qualified invoice use tax-exclusive or tax-inclusive amounts?
Both are permitted. The invoice must state clearly whether the amounts shown are tax-exclusive (税抜, zeinuki) or tax-inclusive (税込, zeikomi). Under NTA guidance, an invoice that leaves the convention ambiguous may not be accepted for credit purposes. When extracting data, ensure the output records which convention each invoice uses and standardizes to a single convention for your accounting system.
The Japanese Qualified Invoice System adds a compliance layer to every invoice that crosses your desk — three new data points on every document, a per-supplier registration check, and a tax rate breakdown that changes depending on what the supplier sold. The tools that handle this best are the ones that read invoices the way a person reads them: by understanding what each field means, not by memorizing where it sits on a specific supplier's layout. If you are processing Japanese supplier invoices, try extracting a qualified invoice with ImageToTable.ai — upload a PDF or image of any Japanese invoice, name the columns you need, and see what comes back in about 10 seconds.