How to Extract Korean Transaction Statements into Excel

South Korea processes over 600 million electronic tax invoices per year through its NTS e-Tax system. Yet the document that actually accompanies every physical shipment — the transaction statement (거래명세서) — sits in a digital blind spot. It has no legally mandated format. It is not transmitted to any government system. And for the procurement team receiving 40 of them a week from 20 different suppliers, each one arrives as a PDF, a printout, or a mobile photo — all with different layouts, all needing the same item-level data extracted into the same receiving spreadsheet.

This guide covers what a Korean transaction statement actually contains, why conventional OCR tools struggle with it, and how to extract line-item data from any supplier's 거래명세서 into Excel — in one pass, without per-supplier setup.

Key Takeaways

  1. 600 million electronic tax invoices flow through Korea's NTS system every year — but the 거래명세서 you actually receive with every shipment sits outside that system entirely, with no legally mandated format and no digital transmission path.
  2. The bottleneck isn't Korean character recognition. Every supplier designs their own layout because no law tells them otherwise. Template-based extraction assumes consistent formatting — that assumption breaks the moment a second supplier sends you a document.
  3. A single column definition — Supplier Name, Item Name, Quantity, Unit Price — reads any 거래명세서 by meaning, not position. ImageToTable.ai locates fields by decoding what the document says rather than relying on where a supplier placed things.

What a Korean Transaction Statement Actually Contains

A transaction statement (거래명세서, literally "transaction detail document") is the document that travels with a shipment between Korean businesses. Unlike a tax invoice (세금계산서), which is a legally regulated document governed by Article 32 of Korea's VAT Act and transmitted to the National Tax Service, a 거래명세서 has no mandatory format. It is a private verification document (사적 증빙) — not eligible for input tax deduction, not filed with any government authority, and not required by any law.

That regulatory vacuum creates the extraction problem. Because no standard format exists, every supplier designs their own layout. A transaction statement from a packaging supplier in Incheon places the buyer information top-left and the item table bottom-center. One from a steel distributor in Pohang puts the company stamp (직인) across the header and the item rows in landscape orientation. Neither is wrong — because there is no template to be wrong against.

A typical 거래명세서 contains these fields:

| Field | Korean Name | Purpose |
|---|---|---|
| Supplier info | 공급자 | Business name, registration number, address, contact |
| Buyer info | 공급받는자 | Recipient business name, registration number, address |
| Transaction date | 거래일자 | Date the goods were delivered |
| Item name | 품목명 | Name or code of each delivered item |
| Specification | 규격 | Dimensions, model numbers, grade |
| Quantity | 수량 | Number of units delivered |
| Unit price | 단가 | Price per unit |
| Amount | 금액 | Line total (수량 × 단가) |
| Supply value | 공급가액 | Sum of all line amounts, excluding VAT |
| Tax amount | 세액 | VAT (10% of supply value) |
| Remarks | 비고 | Additional notes: delivery conditions, partial shipment flags, PO reference |
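
Two of the fields above are defined arithmetically: each 금액 is 수량 × 단가, and 공급가액 is the sum of the line amounts. The sketch below shows one way to check those relationships on data that has already been extracted. The `LineItem` class, the `check_statement` function, and the sample items are illustrative assumptions, not part of any tool described in this guide.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LineItem:
    item_name: str        # 품목명
    spec: str             # 규격
    quantity: Decimal     # 수량
    unit_price: Decimal   # 단가
    amount: Decimal       # 금액


def check_statement(items, supply_value):
    """Return a list of problems found; an empty list means the numbers agree."""
    problems = []
    for number, item in enumerate(items, start=1):
        expected = item.quantity * item.unit_price
        if expected != item.amount:
            problems.append(f"line {number}: amount {item.amount} != {expected}")
    total = sum((item.amount for item in items), Decimal("0"))
    if total != supply_value:
        problems.append(f"supply value {supply_value} != sum of lines {total}")
    return problems


# Made-up sample data for illustration only.
items = [
    LineItem("Carton box", "400x300x200", Decimal("100"), Decimal("1200"), Decimal("120000")),
    LineItem("Packing tape", "48mm", Decimal("20"), Decimal("1500"), Decimal("30000")),
]
print(check_statement(items, Decimal("150000")))
```

`Decimal` is used instead of `float` so that currency amounts compare exactly.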

The document is typically issued in duplicate: a red copy for the supplier (공급자용) and a blue copy for the buyer (공급받는자용). In the procurement workflow, it arrives with the physical shipment. The receiving team checks it against the purchase order and the actual goods. Together with the tax invoice that follows later, this check is part of 3-way matching, which must pass before payment is authorized.

In Korean ERP systems like ECOUNT (이카운트), Douzone (더존), and iQuest's 얼마에요, transaction statements are typically generated automatically from sales records — a solved problem on the outbound side. The inbound side is where things break down: these same systems offer no native way to ingest data from the transaction statements you receive from your own suppliers.

Why Template-Based Tools Can't Handle 거래명세서

Most document extraction tools operate on one of two principles: coordinate-based templates or trained models. In a coordinate-based system, you draw rectangles around fields — "supplier name is at (120, 340)" — and the tool reads whatever text appears in that zone. In a trained model system, you annotate 10 to 50 sample documents so the model learns where each field usually appears.

Both approaches break on 거래명세서 for the same reason: there is no "usually." Because the document has no standard layout, coordinates and field positions are different for every supplier. Train a model on transaction statements from Supplier A, and it will fail on Supplier B's format. Add Supplier B's format to the training set, and Supplier C's layout introduces a new failure mode. This is not a training-data problem — it is a structural mismatch between template-bound extraction and a document type that was never designed to be template-bound.

The alternative is semantic extraction: instead of telling the tool where to look, you tell it what to look for. You type the column names you want — "Item Name," "Quantity," "Unit Price," "Supply Value" — and the AI reads the document visually, locating each field by understanding what it means rather than where it sits on the page. This approach, called column-name extraction, means one column definition processes every supplier's transaction statement — regardless of layout, orientation, or whether the supplier name is top-left, centered, or embedded in a header block.
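
To make the contrast with coordinate templates concrete, here is a deliberately simplified sketch of lookup by label rather than by position. It assumes the page has already been reduced to (label, value) pairs, and it uses a small hand-written synonym table (`LABEL_TO_COLUMN`, a hypothetical name). A real semantic extractor would not depend on a fixed list like this; the point is only that the result does not change when a supplier reorders the fields.

```python
# Hypothetical synonym table mapping labels printed on a statement
# to the column names the user asked for.
LABEL_TO_COLUMN = {
    "품목명": "Item Name",
    "품명": "Item Name",
    "수량": "Quantity",
    "단가": "Unit Price",
    "공급가액": "Supply Value",
}


def extract_by_label(fields, wanted):
    """fields: list of (label, value) pairs in whatever order the page uses."""
    row = {column: None for column in wanted}
    for label, value in fields:
        column = LABEL_TO_COLUMN.get(label.strip())
        if column in row:
            row[column] = value
    return row


# Two made-up suppliers with the same data in a different field order.
supplier_a = [("품목명", "Carton box"), ("수량", "100"), ("단가", "1200")]
supplier_b = [("단가", "1200"), ("품명", "Carton box"), ("수량", "100")]
wanted = ["Item Name", "Quantity", "Unit Price"]

print(extract_by_label(supplier_a, wanted) == extract_by_label(supplier_b, wanted))
```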

This distinction matters because the alternative — maintaining a separate template for every supplier — doesn't scale. You onboard 3 suppliers, it works. You hit 30, and maintaining the template library becomes its own administrative task. Column-name extraction bypasses that maintenance entirely: the same column definition works on the first supplier's document and the 30th.

Template-based OCR assumes document layouts are consistent. Column-name extraction assumes they are not — and that is the correct assumption for any Korean B2B document that lacks a legally mandated format.

Step-by-Step: Extract 거래명세서 Data into Excel

Here is the full workflow, from receiving a transaction statement PDF to having structured data in a spreadsheet. Each step is one action — no coordinate drawing, no model training, no per-supplier configuration required.

  1. Upload your transaction statements. Drag and drop PDFs, scans, or photos into the upload area. If a supplier emails you a PDF, save it and upload. If a delivery driver hands you a paper copy, take a photo with your phone and upload the image. JPG, PNG, WebP, and PDF are all supported. For daily batch processing, upload every statement from today's deliveries at once.
  2. Define your columns. Type the field names you want extracted into your spreadsheet. For procurement reconciliation, a typical column set is: Supplier Name, Supplier Registration Number (사업자등록번호), Transaction Date (거래일자), Item Name (품목명), Specification (규격), Quantity (수량), Unit Price (단가), Line Amount (금액), Supply Value (공급가액), VAT Amount (세액), PO Reference Number. The column names you type become the exact headers in your Excel output, in the order you typed them.
  3. Process and review. The AI reads each document, locates every requested field by semantic meaning (not by position), and populates the output table. Line items from the transaction statement's item table are extracted row by row, with header-level fields like supplier name repeated for each line. Review the extracted data on screen; you can edit any cell if needed before exporting.
  4. Download as Excel. Export the full table as XLSX, CSV, or JSON. Each item row from every transaction statement becomes one row in your spreadsheet. Header fields are populated across all rows. The output is ready for import into your ERP, whether that is ECOUNT, Douzone, or a custom system, with no post-processing required.
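
The output shape described in the steps above (one row per line item, with header fields repeated on every row) can be sketched as follows. This is a minimal illustration using Python's standard `csv` module. The input dictionary layout, the `flatten` and `to_csv` helpers, and the sample statement are assumptions made for the example, not the tool's actual export format.

```python
import csv
import io

COLUMNS = ["Supplier Name", "Transaction Date", "Item Name",
           "Quantity", "Unit Price", "Line Amount"]


def flatten(statement):
    """Repeat the header-level fields on every line-item row."""
    header = {
        "Supplier Name": statement["supplier_name"],
        "Transaction Date": statement["date"],
    }
    return [{**header, **item} for item in statement["items"]]


def to_csv(statements):
    """Consolidate any number of statements into one CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for statement in statements:
        writer.writerows(flatten(statement))
    return buffer.getvalue()


# Made-up statement for illustration only.
statement = {
    "supplier_name": "Example Packaging Co.",
    "date": "2024-05-02",
    "items": [
        {"Item Name": "Carton box", "Quantity": 100, "Unit Price": 1200, "Line Amount": 120000},
        {"Item Name": "Packing tape", "Quantity": 20, "Unit Price": 1500, "Line Amount": 30000},
    ],
}
print(to_csv([statement]))
```

Because `to_csv` accepts a list, the same function covers a single statement and a day's batch from several suppliers.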

The extraction engine reads each field by meaning: it recognizes a supplier registration number (사업자등록번호) by its 10-digit pattern with hyphens, a supply value (공급가액) by its relationship to the quantity and unit price columns, and an item name (품목명) by its context in the item table. This is the difference between optical character recognition and visual language understanding: one reads text; the other reads documents.
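
As a small illustration of the 10-digit pattern mentioned above, the sketch below finds candidate registration numbers in plain text and normalizes them to the hyphenated 3-2-5 grouping. It only checks the shape of the number. The function name and the sample numbers are made up, and this is not a description of how the extraction engine works internally.

```python
import re

# Three digits, two digits, five digits, with optional hyphens between groups.
BRN_PATTERN = re.compile(r"\b(\d{3})-?(\d{2})-?(\d{5})\b")


def find_registration_numbers(text):
    """Return every 10-digit candidate, normalized to the XXX-XX-XXXXX form."""
    return ["-".join(match.groups()) for match in BRN_PATTERN.finditer(text)]


# Placeholder numbers, not real businesses.
sample = "공급자 등록번호 123-45-67890 / 공급받는자 등록번호 9876543210 / TEL 010-1234-5678"
print(find_registration_numbers(sample))
```

The phone number in the sample is skipped because its digit grouping does not fit the 3-2-5 shape.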

For a deeper look at the mechanics behind this approach, see our guide on extracting specific fields by column name.

3-Way Matching: PO vs 거래명세서 vs 세금계산서

Extracting data from a transaction statement in isolation is useful. Extracting it as part of the procurement reconciliation cycle is where the real operational leverage sits. The standard Korean procurement workflow follows a predictable sequence: purchase order (발주서) is sent to the supplier → goods are delivered with a transaction statement (거래명세서) → receiving inspection is performed → a tax invoice (세금계산서) is issued and transmitted to the NTS → payment is made.

At the receiving dock, the task is 3-way matching: comparing what was ordered (PO), what was delivered (거래명세서), and what was invoiced (세금계산서). Discrepancies caught at the dock cost a phone call. Discrepancies found during month-end reconciliation cost hours of backtracking through delivery records, supplier emails, and ERP screens.

Here is the data each document type provides for matching:

| Document | Source | Key Fields for Matching |
|---|---|---|
| Purchase Order (발주서) | Your ERP | Ordered quantity, agreed unit price, requested delivery date |
| Transaction Statement (거래명세서) | Supplier (paper/PDF with delivery) | Delivered quantity, item description, unit price on statement |
| Tax Invoice (세금계산서) | Supplier (via NTS e-Tax) | Billed quantity, billed unit price, supply value, VAT, NTS approval number |

The bottleneck is the middle row: the transaction statement data is almost never digital. It arrives on paper clipped to a box or as a PDF attachment buried in a delivery notification email. Until that data is in spreadsheet form, in the same format as your PO data from the ERP and your tax invoice data from the NTS system, 3-way matching cannot be automated.

Once the transaction statement data is extracted into Excel, you can use computed columns to perform the match directly in the output. Define a column like Diff: Ordered Qty vs Delivered Qty (PO Qty - Quantity) and the AI calculates the difference for every line item during extraction. Any row with a non-zero difference is flagged before the spreadsheet is even opened. If you process purchase orders through the same extraction pipeline, you can extract PO data to Excel as well — bringing both document types into the same structured format for direct comparison.
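
The same difference can also be computed after export, once PO and statement data sit side by side. The sketch below is one possible way to do that, assuming both sides have been reduced to a mapping of item code to quantity. The function name, item codes, and quantities are hypothetical.

```python
def match_quantities(po_lines, statement_lines):
    """Compare ordered and delivered quantities per item code.

    Both arguments map an item code to a quantity. Items missing on
    one side are treated as quantity 0 so they show up as a difference.
    """
    result = []
    for item in sorted(set(po_lines) | set(statement_lines)):
        ordered = po_lines.get(item, 0)
        delivered = statement_lines.get(item, 0)
        result.append({
            "Item": item,
            "PO Qty": ordered,
            "Delivered Qty": delivered,
            "Diff": ordered - delivered,
        })
    return result


# Made-up quantities for illustration only.
po = {"BOX-400": 100, "TAPE-48": 20}
delivered = {"BOX-400": 100, "TAPE-48": 15}

flagged = [row for row in match_quantities(po, delivered) if row["Diff"] != 0]
print(flagged)
```

Only rows with a non-zero `Diff` need a follow-up call to the supplier.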

For a parallel scenario — extracting tax invoice data specifically — see our guide on extracting Korean tax invoice data to Excel, which covers the seven mandatory fields and quarterly VAT filing workflow. If you are processing large volumes of invoices for VAT reporting, the batch tax invoice processing guide covers the throughput side.

Batch Processing: Handling Daily Delivery Statements

Individual extraction solves the per-document problem. Batch processing solves the daily-volume problem — and it does so without requiring you to repeat any of the setup work.

The column definition you created in Step 2 above is not document-specific. It describes the fields you want, not where those fields appear. This means you can upload 20 transaction statements from 15 different suppliers — all with different layouts — in a single batch. The same column definition is applied to every document. The output is one consolidated spreadsheet where each line item from every statement occupies one row, with the supplier name and date populated across all rows from the same document.

A procurement team receiving 30 transaction statements per week saves approximately 3–4 hours of manual data entry per week. At an estimated hourly cost of KRW 18,000–25,000 for a procurement clerk, this represents roughly KRW 220,000–400,000 in monthly savings from a single workflow change. That figure does not include the error-correction time eliminated, which in practice can equal or exceed the entry time itself.
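
The savings range can be reproduced with simple arithmetic. The sketch below assumes four working weeks per month, an assumption that is not stated in the text but that yields figures close to the range quoted above.

```python
WEEKS_PER_MONTH = 4  # assumption; the text does not state the factor it used


def monthly_savings(hours_per_week, hourly_cost_krw, weeks=WEEKS_PER_MONTH):
    """Labor cost of the manual entry time avoided in one month, in KRW."""
    return hours_per_week * hourly_cost_krw * weeks


low = monthly_savings(3, 18_000)    # 3 h/week at KRW 18,000/h
high = monthly_savings(4, 25_000)   # 4 h/week at KRW 25,000/h
print(low, high)
```

Under this assumption the low end comes to KRW 216,000, which rounds to the quoted 220,000, and the high end is exactly KRW 400,000.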

For suppliers who consistently send paper statements with deliveries, a collection link (수집 링크) can shift the digitization step upstream. Generate a shareable URL — no login required for the supplier — and include it in your supplier onboarding instructions. The supplier opens the link on their phone, enters a short verification code, and uploads a photo or PDF of the transaction statement directly into your processing queue. The document arrives already digitized, and your team extracts the data instead of retyping it. This is the same mechanism described in our Korean receipt extraction guide, adapted for the B2B procurement context.

When batch processing transaction statements alongside other delivery documents, the extraction workflow is identical — define columns once, upload everything, download one spreadsheet. The delivery note extraction tool covers the same pattern for carrier-issued delivery dockets, which often accompany transaction statements on the same shipment.

FAQ

Do I need a separate setup for each supplier's transaction statement format?

No. Column-name extraction locates fields by meaning, not by position. The same column definition — Supplier Name, Item Name, Quantity, Unit Price, Supply Value — works across every supplier's layout because the AI reads the document visually and understands what each field represents, regardless of where the supplier placed it on the page.

Can it handle handwritten transaction statements?

Yes, for printed template forms with handwritten entries — which is the most common paper format in Korean logistics, where suppliers print a blank 거래명세서 template and fill in quantities and dates by hand. Fully handwritten documents with no printed structure are more challenging and accuracy will be lower. The system handles stamps (직인) and signatures as visual elements — it can detect their presence but does not extract text from them.

How does it distinguish between 공급가액 (supply value) and 세액 (tax amount)?

The AI understands the semantic hierarchy of the document. 공급가액 is the total before VAT. 세액 is typically 10% of that figure (Korean VAT rate). When you define both columns, the AI uses their relationship to validate the extraction: if a detected value labeled 공급가액 does not match the expected relationship with 세액, the system flags it. You can also define a computed column like Verify VAT (세액 / 공급가액) to output the ratio for every row.
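
The same 10% relationship can be re-checked on exported data. The following is a minimal sketch, assuming a standard-rated transaction and a tolerance of KRW 1 for rounding. The function name and the tolerance are illustrative choices, and zero-rated or exempt transactions would legitimately fail this check.

```python
from decimal import Decimal

VAT_RATE = Decimal("0.10")  # standard Korean VAT rate, as stated above


def vat_looks_right(supply_value, tax_amount, tolerance=Decimal("1")):
    """True if 세액 is within `tolerance` KRW of 10% of 공급가액."""
    expected = supply_value * VAT_RATE
    return abs(expected - tax_amount) <= tolerance


print(vat_looks_right(Decimal("150000"), Decimal("15000")))
print(vat_looks_right(Decimal("150000"), Decimal("1500")))
```

A failed check most often means that the two values were swapped or that a digit was dropped during extraction.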

Can I extract data into ECOUNT or Douzone instead of Excel?

The direct output format is Excel (XLSX), CSV, or JSON. Both ECOUNT and Douzone support Excel import for transaction data — export as XLSX and use your ERP's import function to load the data. The column names in your Excel output can be configured to match your ERP's import field names, eliminating the mapping step.
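
If an export already exists with headers that differ from your ERP's import template, the headers can be renamed before import. The sketch below shows the idea. The ERP field names in `ERP_FIELD_NAMES` are invented placeholders, not actual ECOUNT or Douzone field names, so check your ERP's import template for the real ones.

```python
# Hypothetical mapping from export headers to ERP import field names.
ERP_FIELD_NAMES = {
    "Item Name": "ITEM_NM",
    "Quantity": "QTY",
    "Unit Price": "UNIT_PRICE",
}


def rename_columns(rows, mapping):
    """Return copies of the rows with headers renamed; unmapped headers are kept."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


# Made-up row for illustration only.
rows = [{"Item Name": "Carton box", "Quantity": 100, "Unit Price": 1200, "Remarks": ""}]
print(rename_columns(rows, ERP_FIELD_NAMES))
```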

What about documents with mixed Korean and English fields?

The AI reads both languages. Many Korean transaction statements include English item names alongside Korean descriptions, especially for imported goods or multinational supply chains. Define your columns in whichever language you need the output in — the AI locates the corresponding values regardless of the source document's language.

Next Steps

The gap in Korea's B2B document infrastructure is not on the issuance side — ECOUNT, Douzone, and the NTS e-Tax system have already digitized tax invoices end to end. The gap is on the receipt side: the documents your suppliers send you that were never designed to be machine-read. Transaction statements sit at the center of that gap because they touch procurement reconciliation, inventory receiving, and payment authorization — three workflows that downstream finance teams depend on.

Closing that gap does not require changing how your suppliers issue documents. It requires changing how your team processes what arrives. The same transaction statement that took 8 minutes to retype into your receiving log can be extracted in seconds — with the same column definition working on tomorrow's statement from a different supplier in a different format.
