How Tax Preparers Get W-2 & 1099 Data
Into Google Sheets — Without Typing Each Box
The average small tax firm processes between 100 and 1,000 returns per year. Of the 791,469 current PTIN holders as of May 2026, over 127,000 firms fall into that middle band — and every one of them faces the same bottleneck every January: client W-2s and 1099s arrive as PDFs, scanned images, or smartphone photos, and someone has to type Box 1 wages, Box 2 federal tax withheld, Box 3 Social Security wages, Box 4 Social Security tax, Box 5 Medicare wages, Box 6 Medicare tax — plus the EIN from Box b, the employee's name and SSN from Boxes e and a — into a tax prep spreadsheet or directly into Drake, UltraTax, or ProConnect.
That someone is often the most expensive person in the room: a preparer billing $55–$150 per hour, spending 8 minutes per form on an activity that adds zero analytical value. Across 300 W-2s, that's 40 hours — an entire work week — spent reading numbers off a page and typing them into a screen. The Google Sheets add-on shortens that to roughly 10 seconds per form.
Key Takeaways
- 40 hours — a full work week — vanishes every tax season typing W-2 box values into software, during the 12 weeks when every hour determines your firm's annual profitability.
- Template scanners work perfectly on the W-2s you wish clients sent — pristine ADP digital files — not the grainy photocopies, mismatched payroll formats, and handwritten corrections that fill your actual inbox.
- The column headers you already type — "Box 1 Wages," "Employer EIN" — become extraction instructions: ImageToTable.ai reads labels by meaning, finding each value on any W-2 layout from any payroll provider.
Tax Season Is a Data Entry Sprint — and W-2s and 1099s Are the Longest Leg
The National Association of Tax Professionals reports that 65% of its 23,000+ member firms' gross revenue is earned during tax season — a concentrated window from late January through April 15 where every hour determines the firm's profitability for the year. According to Wolters Kluwer, 70% of firms handled increased tax season workload by asking staff to work longer hours. That strategy works until it doesn't — burnout in the accounting profession has been well-documented, and the U.S. accounting workforce has declined by nearly 17% since 2020, with over 300,000 accountants and auditors leaving the profession.
For a 3-to-5-person tax firm, a typical season might involve 200 to 500 individual W-2 forms spread across 30 to 80 business clients. Each W-2 demands 12 to 15 fields: the core numbered boxes (1 through 6), employer EIN (Box b), employee name and SSN (Boxes e and a), and — for clients who worked in multiple states — state wage and withholding data from Boxes 15 through 17. That's before you touch 1099-NEC forms (Box 1 nonemployee compensation, Box 4 federal tax withheld, payer and recipient TINs) or 1099-MISC forms (Box 1 rents, Box 2 royalties, Box 3 other income).
The math compounds fast. At a conservative 8 minutes per form — which covers locating the document in the client file, opening the correct input screen in the tax software, typing 12–15 fields, and cross-checking Box 3 × 6.2% against Box 4 — 300 W-2s consume 40 staff hours. That's one full week of a preparer's time during the busiest 12 weeks of the year, spent on work that requires no professional judgment whatsoever. On a tax preparer forum, one practitioner described the experience plainly: "I feel like I'm just doing data entry" while working through 1040s with brokerage statements and K-1s — and that was someone who wasn't yet responsible for the firm's bottom line.
The W-2 and 1099 Fields That Matter (and the Ones You Can Skip)
Before you set up an extraction workflow, you need to know which fields carry forward into a tax return and which ones are noise for most small-firm workflows. The IRS Form W-2 (governed by IRS Publication 15, Circular E, and the 2026 General Instructions for Forms W-2 and W-3) contains 20 numbered boxes, six lettered identifier fields, Box 12 with up to four coded entries, and the newly split Box 14a/14b for 2026. Most firms need only a subset.
| Field | What It Reports | Why You Extract It |
|---|---|---|
| Box a / Box e | Employee SSN and name | Matches the return to the taxpayer; IRS Publication 1345 requires preparers to manually key-enter TINs — extraction output lets you verify against what the client provided |
| Box b | Employer Identification Number (EIN) | Verifies employer identity; used to match W-2 against employer records and detect 1099 vs W-2 misclassification |
| Box 1 | Wages, tips, other compensation | Flows to Form 1040, Line 1 — the single most important field on the form |
| Box 2 | Federal income tax withheld | Credited against total tax liability; aggregated across all W-2s for a taxpayer |
| Box 3 | Social Security wages | Capped at $176,100 for 2026; used to cross-check Box 4 (see verification section below) |
| Box 4 | Social Security tax withheld | Should equal Box 3 × 6.2% up to the cap — one of the fastest error catches in a return |
| Box 5 | Medicare wages and tips | No wage cap; wages above $200,000 also subject to Additional Medicare Tax (0.9%) |
| Box 6 | Medicare tax withheld | Should equal Box 5 × 1.45% for most taxpayers |
| Box 12 (codes D, E, AA, etc.) | Retirement plan contributions, Roth deferrals, etc. | Code D (401(k)) and Code E (403(b)) affect AGI calculations; Code AA (Roth) matters for future tax planning |
| Box 13 checkboxes | Statutory employee, Retirement plan, Third-party sick pay | Retirement plan checkbox determines IRA deduction eligibility for the taxpayer |
| Boxes 15–17 | State wages and state income tax withheld | Critical for multi-state filers — a client who worked in two states may have two W-2s or a single W-2 with split state reporting |
For 1099-NEC forms (governed by the Instructions for Forms 1099-MISC and 1099-NEC), the essential fields are tighter: Box 1 (nonemployee compensation), Box 4 (federal income tax withheld, if any), payer TIN, and recipient TIN. 1099-MISC forms add Box 1 (rents), Box 2 (royalties), and Box 3 (other income) depending on the client's situation. No single 1099-MISC will contain all of these — you extract the columns your client base actually needs.
The IRS processed roughly 245 million W-2 forms from 6.9 million employers annually. Every one of those forms contains these same fields — and every tax preparer who touch-types them by hand is performing the same redundant motion millions of other preparers performed the same week. This is where the mechanical work ends and the extraction workflow begins.
How the Add-On Reads Tax Forms: Custom Column Extraction, Not Template Matching
Tools like GruntWorx and SurePrep use template-based recognition: they know where Box 1 sits on a standard W-2, where the EIN field is on a 1099-NEC, and they scan for text at those coordinates. This works for pristine, digitally generated tax forms — the kind you get from ADP or Paychex. It breaks down when a client hands you a W-2 that was printed, signed with a pen, scanned at a low resolution, and emailed as a grainy PDF. It also struggles with employer-specific W-2 layouts that deviate from the exact IRS form geometry, or 1099 forms from smaller payers who use substitute statements with their own formatting.
The Google Sheets add-on uses Custom Column Extraction: instead of programming coordinates, you type the field names you want — "Box 1 Wages", "Box 2 Federal Tax Withheld", "Employer EIN" — into column headers in Google Sheets. The AI locates each value anywhere on the page by understanding what the label means, not where it sits. A Box 1 wage amount that appears in the top-right quadrant of one employer's W-2 and the middle-left of another's gets found because the AI recognizes "this is the dollar amount labeled as wages" — regardless of position, font, or surrounding clutter.
This distinction matters particularly for tax forms. A W-2 from a mid-size employer using ADP's payroll system looks nothing like a W-2 from a small restaurant using QuickBooks payroll — the box positions, font sizes, and label formatting differ. A 1099-NEC issued by a real estate management company uses a completely different layout than one from a construction contractor. Template-based tools need a separate template for each. Custom Column Extraction reads them all the same way: by understanding what each box label represents, then locating the corresponding value.
One forum post by a tax preparer evaluating GruntWorx captured the appeal of any automated extraction: "My productivity has soared since we started. The data entry is extremely accurate. My office admin does all of the initial scanning, uploading, and getting it ready for import. I cannot imagine running my practice without it." But another preparer noted the cost barrier: "I looked at automation tools like SurePrep/GruntWorx, but they seem expensive for a small startup." The add-on model — running inside Google Sheets, billed as part of your existing subscription — changes the math for smaller firms.
Step 1 — Set Up Your Tax Prep Spreadsheet in Google Sheets
The add-on works within a Google Sheet — the same spreadsheet many firms already use as a work-in-progress workbook between document collection and tax software data entry. If you don't yet have a tax prep spreadsheet, open a new Google Sheet and set up column headers based on the field list above. A typical layout for W-2 processing looks like this:
| Employee Name | SSN | Employer EIN | Box 1 Wages | Box 2 Fed Tax | Box 3 SS Wages | Box 4 SS Tax | Box 5 Medicare Wages | Box 6 Medicare Tax | Box 12 Code D | Box 15 State | Box 16 State Wages | Box 17 State Tax |
|---|
These column headers serve a dual purpose. In a standard workflow, they label the data you'll later import into Drake Tax or UltraTax CS. With the add-on, they become the extraction instructions: the AI reads each header and finds the corresponding value on the uploaded document. You can add computed columns for cross-check formulas directly in the sheet — for example, a column labeled "SS Tax Check (Box 3 × 6.2%)" will compute the expected Social Security tax from the extracted Box 3 value, letting you spot mismatches before the data enters the return.
Install the add-on from the Google Workspace Marketplace, then open it from Extensions → ImageToTable → Start. The first time you open it, you'll paste your API key (generated from the ImageToTable.ai profile page). This connects your Google Sheets session to your account — extraction history, saved column templates, and usage quotas all sync across web and add-on.
Step 2 — Upload W-2s and 1099s Through the Sidebar
With the sidebar open on the right side of your Google Sheet, you can drag-and-drop or click to upload W-2 and 1099 files. Supported formats include PDF (the most common format for client-submitted tax forms), JPG and PNG (for smartphone photos of paper forms), and WebP. The add-on accepts multiple files in a single upload session — drop all 30 W-2s from a single client's small business payroll into one batch, and they'll queue for processing together.
This batch capability is where the add-on alignment with tax season becomes clear. A business client with 20 employees generates 20 W-2s. Upload them once, let the extraction run, and every employee's data populates as a separate row in your sheet. Each row contains the same set of columns — Box 1 through Box 17 for every employee — giving you a single view of that client's entire wage reporting before it enters the tax software.
Files uploaded through the sidebar are processed on ImageToTable.ai's servers using the same AI engine that powers the web application. Once extraction completes — typically within 5 to 10 seconds per page for tax forms — the results write directly into the active sheet starting from the row below your header row. The sidebar also tracks processing status, so you can upload files from one client, switch to another sheet tab for a different client, and check back when the batch is done.
Step 3 — Define Your Extraction Columns for Tax Forms
This step is where the add-on's flexibility separates it from rigid tax-form scanners. In the sidebar, you'll see a "Columns" panel. The column names you entered as sheet headers appear there — and the add-on uses them as extraction targets. For a W-2 batch, your column list might look like:
Name columns exactly as they appear on the form. "Box 1 Wages" works better than "Wages" because the AI uses the label 'Box 1' as context. Similarly, "Box 4 Social Security Tax" produces more reliable extraction than "SS Tax" — the semantic match between your column name and the text on the form is what the AI relies on to locate values.
Include computed columns for verification. Add a column like "SS Tax Expected (Box 3 × 6.2%)" — the AI understands the formula notation and calculates the expected value from the Box 3 wage amount it extracted. Similarly, "Medicare Tax Expected (Box 5 × 1.45%)" gives you a row-by-row audit trail before the data hits the tax return.
Set up separate column groups for 1099-NEC and 1099-MISC. These forms share little structure with W-2s — a 1099-NEC has Box 1 (nonemployee compensation), Box 4 (federal tax withheld), payer TIN, and recipient TIN. Create a separate sheet tab for 1099 extraction with headers matching those fields. You can upload a mix of W-2s and 1099s to the same batch, but extracting them into separate sheets keeps the output clean and prevents column misalignment.
Save column templates for repeat use. Once you've defined the column set for W-2 extraction, save it as a template. Next January, open a new sheet, load the template, and your entire extraction setup — headers, computed columns, verification formulas — is ready before the first client W-2 arrives.
The key input-side habit: name your columns after what's printed on the form, not what your firm calls it internally. A column labeled "FWH" (your internal shorthand for "federal withholding") gives the AI almost nothing to work with. "Box 2 Federal Income Tax Withheld" — matching the exact IRS box label — gives it the semantic anchor it needs to locate that value on any layout of the W-2, from ADP's laser-printed version to a hand-corrected copy from a small bakery.
Step 4 — Run Cross-Checks on Extracted Data Before It Enters the Return
No extraction tool — AI or template-based — should send data directly into a tax return without a verification step. The add-on's output goes into Google Sheets first, not into your tax software, which turns out to be an advantage: you have a natural review point between extraction and filing. Here are the cross-checks worth running on every batch:
| Cross-Check | Formula | What a Mismatch Means |
|---|---|---|
| Social Security withholding | Box 3 × 6.2% ≈ Box 4 (within $1 rounding) | If Box 3 > $176,100, Box 4 should not exceed $10,918.20. A mismatch on a sub-cap wage could mean the AI misread Box 3 or Box 4 — or the employer made an error on the original form. |
| Medicare withholding | Box 5 × 1.45% ≈ Box 6 (plus 0.9% on wages > $200,000) | A mismatch without a wage-over-$200k explanation usually indicates a digit transposition in the extraction — one of the most common error types from degraded scans. |
| Box 1 ≤ Box 3 (common case) | Box 1 should be ≤ Box 3 for most employees due to pre-tax deductions (401(k), Section 125) | Box 1 > Box 3 is unusual and often means the client has taxable fringe benefits — not an extraction error, but worth flagging for review. |
| 1099-NEC Box 1 threshold | If Box 1 < $600, verify payer intent | Payments under $600 generally don't require a 1099-NEC — a sub-$600 form may be voluntary or may have a Box 1 misread. |
| Multi-state W-2 totals | Sum of all Box 16 values ≤ Box 1 | If state wages across multiple states exceed federal wages, either a Box 16 was misread or the employer's allocation needs review. |
For firms processing returns in Drake Tax, the verification stage in Google Sheets serves a particular purpose. Drake's data entry screen treats W-2 fields as individual input cells — if a Social Security wage amount was extracted incorrectly and imported without review, the preparer might not catch it until the diagnostic check at the end, when DoubleCheck flags the SS tax mismatch. Catching it in Google Sheets first means the flagged item never reaches the diagnostic stage — it's resolved before import.
TaxSlayer Pro users benefit similarly. At $20 per return for PPR, every minute spent chasing extraction errors in the tax software is a minute spent inside a metered system. Verifying in Google Sheets — outside the paid software — keeps the per-return cost where it belongs: on preparation and review, not on data cleanup.
What the Add-On Handles — and What Still Needs Your Eyes
Every extraction method has boundary conditions. The add-on performs well on standard printed W-2s, 1099-NEC forms, and 1099-MISC forms in PDF or clear image format. It handles common payroll provider outputs — ADP, Paychex, Gusto, QuickBooks Payroll — without issue because the AI reads form labels semantically, not by template coordinates.
The following scenarios call for extra manual review — and in some cases, manual entry remains the safer choice:
- Hand-corrected W-2s with pen-and-ink changes. IRS Publication 1345 specifically flags altered W-2s as requiring non-standard form code entry. If a client's employer crossed out a box value and wrote a different number by hand, the AI may read either the printed number, the handwritten correction, or neither — depending on which is more visually dominant. Manually verify any W-2 with visible pen alterations.
- Heavily degraded scans or dark photocopies. A W-2 that's been photocopied three times and scanned at 100 DPI has lost significant visual information. The AI's accuracy drops on these inputs — not to zero, but below the threshold where verification time starts to rival manual entry time.
- 1099-MISC with forms involving crop insurance proceeds, fishing boat proceeds, or Section 409A deferrals (Boxes 8, 9, 10, 12, 14, 15b). These are rare for small-firm client bases and the add-on can extract them, but the field labels are less standardized than the common boxes. Expect to verify each extraction on these less common fields.
- Handwritten W-2s. IRS Publication 1345 identifies entirely handwritten or typed W-2s as non-standard forms. The add-on's handwriting recognition handles neat, block-printed numbers but struggles with cursive or heavily stylized handwriting — the same limitation that applies to any OCR-based system.
The key distinction: the add-on reduces the manual entry workload from "enter every field on every form" to "review extracted fields and fix the 1–2% that need correction." That shift — from primary data entry to exception-based review — is where the time savings live. For a firm processing 300 W-2s, it's the difference between 40 hours of typing and roughly 3 hours of verification and spot-correction.
FAQ
How accurate is the extraction for W-2 box values?
On clean, printed W-2 forms — the kind generated by ADP, Paychex, Gusto, or QuickBooks Payroll — extraction accuracy reaches up to 99% for numeric fields. Dollar amounts (Boxes 1–6, 16–17) and EIN/SSN numbers extract reliably. Degraded scans, handwritten corrections, or photocopied forms reduce accuracy. The safest workflow: treat the extraction as a first draft, run the cross-check formulas in Step 4, and visually spot-check forms with known quality issues before importing into tax software.
Can the add-on read handwritten W-2s?
Yes — with limitations. Neatly block-printed numbers on a W-2 are extractable. Cursive handwriting, heavily stylized numbers, or forms where the handwriting overlaps printed text reduce accuracy significantly. IRS Publication 1345 classifies entirely handwritten W-2s as non-standard forms requiring special handling. If more than 2–3 fields on a W-2 are handwritten, manual entry may be faster than extraction-followed-by-correction. Use the add-on as a first pass — if the output looks garbled on a particular form, enter that one by hand.
What's the difference between 1099-MISC and 1099-NEC for extraction purposes?
They're different forms with different box layouts. 1099-NEC (introduced for tax year 2020) reports nonemployee compensation in Box 1 — this replaced Box 7 on the old 1099-MISC. 1099-MISC now covers rents (Box 1), royalties (Box 2), other income (Box 3), and several less common payment types. When setting up your extraction columns in Google Sheets, use a separate sheet tab for each form type. Column headers like "Box 1 Nonemployee Compensation" for 1099-NEC and "Box 1 Rents" for 1099-MISC help the AI distinguish which form it's looking at and which value to extract.
How does the add-on handle multi-state W-2s?
A W-2 with multiple state entries (Boxes 15–17 repeated for each state) presents a two-dimensional data structure: one employee, one federal set of boxes, but potentially multiple rows of state data. The add-on extracts the first state row (Boxes 15–17) into your column layout. For the second state row, you can define additional columns — "Box 15 State 2", "Box 16 State Wages 2", "Box 17 State Tax 2" — in the same sheet. Or process the W-2 twice with different state column configurations. Most tax software handles this through separate state input screens anyway, so the extraction just needs to capture all state data points — the arrangement is flexible.
Can I combine extracted W-2 data with my tax software's import feature?
Yes — and this is the intended endpoint. Once the add-on populates your Google Sheet with extracted W-2 data, you have a structured dataset. From here, you have several integration paths: export the sheet as a CSV and use Drake Tax's Import Data function (Drake supports CSV import for certain data), copy-paste the verified values into ProConnect Tax's input screens, or — for TaxSlayer Pro users — use the data to populate the software's W-2 worksheet directly. The Google Sheet becomes the intermediary: the add-on handles extraction and verification, the tax software handles preparation and filing. Neither tool needs to do the other's job.
Does the add-on require per-form setup or template creation?
No. Unlike template-based tax document scanners, the add-on uses Custom Column Extraction — you define column headers once per form type (W-2, 1099-NEC, 1099-MISC), save the column set as a template, and reuse it for every batch. The AI reads each document fresh, locating values by understanding the box labels rather than matching a stored template. An employer who switched payroll providers mid-year — and whose W-2 layout changed as a result — gets extracted using the same column headers with no reconfiguration.
What happens to client tax data after extraction?
Files uploaded through the add-on are processed on ImageToTable.ai's servers and deleted after processing. Extracted data resides in your Google Sheet — which lives in your Google Workspace, under your firm's data controls. The add-on does not store or retain uploaded tax forms beyond the processing window. For firms subject to IRS data security requirements (Publication 4557, Safeguarding Taxpayer Data), this architecture means sensitive W-2 and 1099 information never accumulates on a third-party server. The sheet is your data; the add-on is the extraction engine that fills it.
Next: From One Form at a Time to an Entire Season's Worth
Tax season compresses the year's most labor-intensive work into 12 weeks. Every minute not spent typing Box 1 through Box 6 is a minute available for the work that clients actually pay for: reviewing for missed deductions, identifying estimated tax payment gaps, or advising on entity structure changes for next year. The numbers on a W-2 are the inputs to that work — inputs that don't require professional judgment to transcribe. The add-on separates the mechanical step from the analytical one.
For firms looking to systematize the entire tax form intake pipeline — from client document drop-off through data verification to the final review screen — the collection-to-review workflow covers the full process. For a deeper look at the cost side, the cost of manual W-2 and 1099 entry breaks down the per-form economics. And if you're already extracting W-2 data into Excel rather than Sheets, the Excel-based extraction guide covers the web application workflow.
Install the add-on from the Google Workspace Marketplace, paste your API key, and run your first W-2 batch. One form or 50 — the sidebar handles the extraction while you handle the return.
Set up your API key