K-1 Test Fixture: Scanned PDF
================================
This placeholder documents the data expected from a scanned (image-based) K-1 PDF.
Replace it with an actual scanned PDF before running integration tests.
Expected Extraction Method: azure (Tier 2) or tesseract (Tier 2 fallback)
Expected Confidence: MEDIUM (0.60-0.84) for most fields due to OCR uncertainty
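The MEDIUM band above can be illustrated with a small mapping from score to label. This is only a sketch: the function name confidence_band is hypothetical, and the HIGH (0.85 and above) and LOW (below 0.60) bands are assumptions inferred from the MEDIUM range stated here, not values taken from the extraction pipeline.

```python
def confidence_band(score: float) -> str:
    """Map a 0-1 confidence score to a band label.

    Only the MEDIUM range (0.60-0.84) comes from this fixture.
    The HIGH and LOW cut-offs are assumptions for illustration.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    if score >= 0.85:  # assumed lower bound for HIGH
        return "HIGH"
    if score >= 0.60:  # MEDIUM lower bound, from the fixture
        return "MEDIUM"
    return "LOW"  # assumed: anything below MEDIUM
```

An integration test against a real scan could then check that most extracted fields map to "MEDIUM" rather than comparing raw scores.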
--- Form Header ---
Schedule K-1 (Form 1065)
Partner's Share of Income, Deductions, Credits, etc.
Tax Year: 2023
Partnership EIN: 55-1234567
Partnership Name: Scanned Capital Fund, LP
Partner Name: Member Entity Inc.
Partner EIN: 77-9876543
--- Part III: Partner's Share ---
Box 1 - Ordinary business income (loss): -32,500
Box 2 - Net rental real estate income (loss): 0
Box 3 - Other net rental income (loss): 0
Box 4 - Guaranteed payments for services: 0
Box 5 - Interest income: 2,100
Box 6a - Ordinary dividends: 5,800
Box 6b - Qualified dividends: 4,200
Box 7 - Royalties: 0
Box 8 - Net short-term capital gain (loss): -1,500
Box 9a - Net long-term capital gain (loss): 18,750
Box 9b - Collectibles (28%) gain (loss): 0
Box 9c - Unrecaptured section 1250 gain: 0
Box 10 - Net section 1231 gain (loss): 0
Box 11 - Other income (loss): 0
Box 12 - Section 179 deduction: 0
Box 13 - Other deductions: -2,800
Box 14 - Self-employment earnings (loss): 0
Box 15 - Credits: 0
Box 16 - Foreign transactions: 0
Box 17 - Alternative minimum tax (AMT) items: 0
Box 18 - Tax-exempt income and nondeductible expenses: 750
Box 19a - Distributions (cash): 25,000
Box 19b - Distributions (property): 0
Box 20 - Other information: 0
Box 21 - Foreign taxes paid or accrued: 350
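The box lines above follow a regular "Box N - label: amount" layout, so a test helper could load them as expected values. The sketch below assumes that layout holds for every line; BOX_LINE and parse_box_line are hypothetical names, not part of the project.

```python
import re

# Assumed line layout: "Box <id> - <label>: <amount>", e.g. "Box 9a - ...: 18,750"
BOX_LINE = re.compile(
    r"^Box (?P<box>\d+[a-c]?) - (?P<label>.+): (?P<amount>-?[\d,]+)$"
)


def parse_box_line(line: str):
    """Return (box id, label, integer amount), or None if the line is not a box line."""
    match = BOX_LINE.match(line.strip())
    if match is None:
        return None
    # Drop thousands separators before converting; a leading "-" marks a loss.
    amount = int(match["amount"].replace(",", ""))
    return match["box"], match["label"], amount
```

For example, parse_box_line("Box 1 - Ordinary business income (loss): -32,500") yields ("1", "Ordinary business income (loss)", -32500), and header lines such as "--- Form Header ---" yield None.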
--- OCR Simulation Notes ---
This fixture simulates a scanned PDF where:
- Some numeric values may have OCR artifacts (e.g., "l" vs "1", "O" vs "0")
- Confidence scores should reflect Tier 2 extraction uncertainty
- The Azure DI or tesseract extractor is expected to resolve these ambiguities
- Most fields are expected to come back with MEDIUM confidence (0.60-0.84)
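The character confusions noted above ("l" vs "1", "O" vs "0") can be illustrated with a small clean-up step applied to tokens already known to be numeric. This is a sketch under that assumption, not the behavior of the Azure DI or tesseract extractors; normalize_ocr_amount is a hypothetical helper.

```python
# Only the two confusions named in this fixture are mapped.
OCR_DIGIT_FIXES = str.maketrans({"l": "1", "O": "0"})


def normalize_ocr_amount(raw: str) -> int:
    """Convert an OCR'd numeric token to an int, repairing l/1 and O/0 swaps.

    Apply this only to fields expected to hold amounts; running it on names
    or labels would corrupt them. Raises ValueError if the cleaned token is
    still not a valid integer.
    """
    cleaned = raw.strip().translate(OCR_DIGIT_FIXES).replace(",", "")
    return int(cleaned)
```

With this helper, an OCR reading of "l8,75O" for Box 9a would be compared against the expected 18,750 after normalization.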