Document Types

Understanding document classification and supported document types

The Email Ingestion service automatically classifies incoming documents into specific types for appropriate processing and NetSuite record creation.

Supported Types

Vendor Bill

NetSuite Record: vendorbill

A vendor bill is an invoice FROM a vendor/supplier TO your company. You are the buyer receiving this invoice and will need to pay it.

Key Indicators:

  • "Invoice" or "Bill" in header
  • Your company listed as "Bill To" or "Ship To"
  • Supplier/vendor letterhead and branding
  • Payment terms (Net 30, Due on Receipt, etc.)
  • Items or services you purchased

Extracted Fields:

| Field | Description | Required |
| --- | --- | --- |
| vendorName | Name of the vendor/supplier | Yes |
| invoiceNumber | Vendor's invoice number | Yes |
| invoiceDate | Date on the invoice | Yes |
| dueDate | Payment due date | No |
| poNumber | Related purchase order number | No |
| subtotal | Amount before tax | Yes |
| taxAmount | Tax amount | No |
| total | Total amount due | Yes |
| currency | Currency code (USD, EUR, etc.) | No |
| lineItems | Individual line items | No |

Line Item Fields:

interface VendorBillLineItem {
  itemName: string;
  description?: string;
  quantity: number;
  rate: number;
  amount: number;
  taxCode?: string;
}
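
As a minimal sketch of how this line item shape might be used after extraction, the check below recomputes each line's amount and compares the sum against the extracted subtotal. The helper name `checkVendorBillTotals`, the `tolerance` parameter, and the assumptions that `amount` equals `quantity * rate` and that `subtotal` equals the sum of line amounts are illustrative; they are not part of the service.

```typescript
interface VendorBillLineItem {
  itemName: string;
  description?: string;
  quantity: number;
  rate: number;
  amount: number;
  taxCode?: string;
}

// Hypothetical helper: returns a list of problems, empty when the numbers agree.
function checkVendorBillTotals(
  lineItems: VendorBillLineItem[],
  subtotal: number,
  tolerance = 0.01
): string[] {
  const problems: string[] = [];
  let sum = 0;
  for (const item of lineItems) {
    // Assumption: a line's amount is quantity multiplied by rate.
    const expected = item.quantity * item.rate;
    if (Math.abs(expected - item.amount) > tolerance) {
      problems.push(`${item.itemName}: amount ${item.amount} != quantity * rate ${expected}`);
    }
    sum += item.amount;
  }
  // Assumption: the subtotal is the sum of all line amounts.
  if (Math.abs(sum - subtotal) > tolerance) {
    problems.push(`subtotal ${subtotal} != sum of line amounts ${sum}`);
  }
  return problems;
}

// Example: two consistent lines that add up to the subtotal.
const problems = checkVendorBillTotals(
  [
    { itemName: 'Widget', quantity: 2, rate: 10, amount: 20 },
    { itemName: 'Setup fee', quantity: 1, rate: 50, amount: 50 }
  ],
  70
);
```

A mismatch does not necessarily mean the extraction is wrong (discounts or shipping lines can break the arithmetic), so a check like this is better suited to flagging a document for review than to rejecting it.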

Customer Invoice

NetSuite Record: invoice

A customer invoice is an invoice FROM your company TO a customer. You are the seller billing the customer.

Key Indicators:

  • Your company logo and letterhead
  • Customer name in "Bill To" section
  • Your terms and conditions
  • Your bank details for payment

Extracted Fields:

| Field | Description | Required |
| --- | --- | --- |
| customerName | Name of the customer | Yes |
| invoiceNumber | Your invoice number | Yes |
| invoiceDate | Date you issued the invoice | Yes |
| dueDate | When payment is expected | No |
| subtotal | Amount before tax | Yes |
| taxAmount | Tax charged | No |
| total | Total amount billed | Yes |
| lineItems | Products/services sold | No |

Purchase Order

NetSuite Record: purchaseorder

A purchase order is a request to purchase goods or services. This is NOT an invoice - it's created before the purchase is fulfilled.

Key Indicators:

  • "Purchase Order" or "PO" prominently displayed
  • PO number (e.g., "PO-2024-001")
  • Delivery/shipping instructions
  • No payment terms (yet)
  • Status indicators (Draft, Pending, Approved)

Extracted Fields:

| Field | Description | Required |
| --- | --- | --- |
| poNumber | Purchase order number | Yes |
| vendorName | Vendor to order from | Yes |
| orderDate | Date PO was created | Yes |
| expectedDate | Expected delivery date | No |
| shipToAddress | Delivery address | No |
| subtotal | Order subtotal | Yes |
| total | Order total | Yes |
| lineItems | Items being ordered | Yes |

Expense Report

NetSuite Record: expensereport

An expense report is a collection of receipts and expenses submitted by an employee for reimbursement.

Key Indicators:

  • Multiple receipts or transactions
  • Employee name and department
  • Expense categories (Travel, Meals, Office Supplies)
  • Reimbursement request form
  • Approval signatures

Extracted Fields:

| Field | Description | Required |
| --- | --- | --- |
| employeeName | Employee submitting expenses | Yes |
| reportDate | Date of submission | Yes |
| reportPeriod | Period covered (e.g., "Nov 2024") | No |
| totalAmount | Total reimbursement amount | Yes |
| expenseItems | Individual expenses | Yes |

Expense Item Fields:

interface ExpenseItem {
  date: string;
  merchant: string;
  category: string;    // 'travel' | 'meals' | 'supplies' | 'other'
  description?: string;
  amount: number;
  currency: string;
  receiptAttached: boolean;
}
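
To illustrate how the expense items relate to the report-level `totalAmount`, here is a small sketch that totals the items, groups them by category, and counts items without a receipt. `summarizeExpenses` and `ExpenseSummary` are hypothetical names, and the sketch assumes all items share one currency.

```typescript
interface ExpenseItem {
  date: string;
  merchant: string;
  category: string; // 'travel' | 'meals' | 'supplies' | 'other'
  description?: string;
  amount: number;
  currency: string;
  receiptAttached: boolean;
}

interface ExpenseSummary {
  total: number;
  byCategory: Record<string, number>;
  missingReceipts: number;
}

// Hypothetical helper; assumes every item is in the same currency.
function summarizeExpenses(items: ExpenseItem[]): ExpenseSummary {
  const summary: ExpenseSummary = { total: 0, byCategory: {}, missingReceipts: 0 };
  for (const item of items) {
    summary.total += item.amount;
    summary.byCategory[item.category] =
      (summary.byCategory[item.category] ?? 0) + item.amount;
    if (!item.receiptAttached) summary.missingReceipts += 1;
  }
  return summary;
}

const summary = summarizeExpenses([
  { date: '2024-11-04', merchant: 'Airline', category: 'travel', amount: 120.5, currency: 'USD', receiptAttached: true },
  { date: '2024-11-05', merchant: 'Cafe', category: 'meals', amount: 30, currency: 'USD', receiptAttached: false }
]);
```

Comparing `summary.total` with the extracted `totalAmount` is one way to spot a report where an item was missed during extraction.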

Receipt

No NetSuite Record (may attach to expense report)

A receipt is a simple proof of purchase, typically from a retail transaction.

Key Indicators:

  • Single transaction
  • Point-of-sale format
  • Store name and address
  • Date, time, and transaction ID
  • Payment method shown

Extracted Fields:

| Field | Description | Required |
| --- | --- | --- |
| merchantName | Store/vendor name | Yes |
| transactionDate | Date of purchase | Yes |
| transactionTime | Time of purchase | No |
| subtotal | Amount before tax | No |
| tax | Tax amount | No |
| total | Total paid | Yes |
| paymentMethod | Cash, Card ending in XXXX | No |
| items | Items purchased | No |

General Request

No NetSuite Record

This category is used when:

  • The email has no attachments
  • The document type cannot be determined
  • The confidence score is too low
  • The document doesn't match any known type

Handling:

  • Sent to the review queue
  • No automatic processing
  • Manual classification required
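
The type-to-record mapping described across the sections above can be summarized in one lookup table. This is a sketch only: the identifiers `vendor_bill`, `purchase_order`, `expense_report`, and `receipt` appear elsewhere on this page, while `customer_invoice`, `general_request`, and the helper `netSuiteRecordFor` are assumed names.

```typescript
type DocumentType =
  | 'vendor_bill'
  | 'customer_invoice' // assumed identifier
  | 'purchase_order'
  | 'expense_report'
  | 'receipt'
  | 'general_request'; // assumed identifier

// null means no NetSuite record is created for that type.
const NETSUITE_RECORD_BY_TYPE: Record<DocumentType, string | null> = {
  vendor_bill: 'vendorbill',
  customer_invoice: 'invoice',
  purchase_order: 'purchaseorder',
  expense_report: 'expensereport',
  receipt: null,
  general_request: null
};

// Hypothetical helper that looks up the record type for a classified document.
function netSuiteRecordFor(type: DocumentType): string | null {
  return NETSUITE_RECORD_BY_TYPE[type];
}
```

Typing the table as `Record<DocumentType, ...>` makes the compiler reject a new document type that has no mapping entry.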

Classification Process

AI Classification Pipeline

flowchart TD
    A[Email Received] --> B{Has Attachments?}
    B -->|No| C[General Request]
    B -->|Yes| D[Extract Document]
    D --> E[Email Metadata Analysis]
    E --> F[Document Content Analysis]
    F --> G[AI Classification]
    G --> H{Confidence >= 0.7?}
    H -->|Yes| I[Assign Document Type]
    H -->|No| J[Human Review Queue]
    I --> K[Route to Extractor]
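
The branching in the flowchart can be sketched as a single routing function. The names `routeEmail`, `Route`, and `ClassificationResult` are hypothetical; only the two decisions (attachments present, confidence at least 0.7) come from the diagram.

```typescript
interface ClassificationResult {
  documentType: string;
  confidence: number;
}

type Route =
  | { kind: 'general_request' }
  | { kind: 'extractor'; documentType: string }
  | { kind: 'human_review' };

const CONFIDENCE_THRESHOLD = 0.7;

// Hypothetical sketch of the flowchart's two decision points.
// `classify` is passed as a function so it only runs when there are attachments.
function routeEmail(
  hasAttachments: boolean,
  classify: () => ClassificationResult
): Route {
  if (!hasAttachments) {
    return { kind: 'general_request' };
  }
  const result = classify();
  if (result.confidence >= CONFIDENCE_THRESHOLD) {
    return { kind: 'extractor', documentType: result.documentType };
  }
  return { kind: 'human_review' };
}

// Example: a confident vendor bill is routed to its extractor.
const route = routeEmail(true, () => ({ documentType: 'vendor_bill', confidence: 0.92 }));
```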

Metadata Analysis

The classifier first analyzes email metadata:

// Quick classification from email only
const quickResult = classificationService.classifyByEmailMetadata({
  subject: "Invoice #12345 from Acme Corp",
  senderEmail: "billing@acme.com",
  hasAttachments: true
});
// Returns: { documentType: 'vendor_bill', confidence: 0.6 }

Pattern Matching:

| Pattern | Inferred Type | Confidence |
| --- | --- | --- |
| Subject contains "invoice" or "bill" | vendor_bill | 0.6 |
| Subject contains "purchase order" | purchase_order | 0.7 |
| Subject contains "expense" | expense_report | 0.6 |
| Sender contains "billing@" | vendor_bill | 0.5 |
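
A rule table like this one could be implemented roughly as follows. This is not the service's actual implementation: `classifyByRules`, the case-insensitive matching, the choice to keep the highest-confidence match when several rules fire, and returning `null` when nothing matches are all assumptions.

```typescript
interface EmailMetadata {
  subject: string;
  senderEmail: string;
  hasAttachments: boolean;
}

interface QuickResult {
  documentType: string;
  confidence: number;
}

interface Rule {
  matches: (email: EmailMetadata) => boolean;
  documentType: string;
  confidence: number;
}

// One entry per row of the pattern table; matching is case-insensitive (assumption).
const RULES: Rule[] = [
  { matches: (e) => /invoice|bill/i.test(e.subject), documentType: 'vendor_bill', confidence: 0.6 },
  { matches: (e) => /purchase order/i.test(e.subject), documentType: 'purchase_order', confidence: 0.7 },
  { matches: (e) => /expense/i.test(e.subject), documentType: 'expense_report', confidence: 0.6 },
  { matches: (e) => e.senderEmail.toLowerCase().includes('billing@'), documentType: 'vendor_bill', confidence: 0.5 }
];

// Hypothetical helper: returns the highest-confidence matching rule, or null.
function classifyByRules(email: EmailMetadata): QuickResult | null {
  let best: QuickResult | null = null;
  for (const rule of RULES) {
    if (rule.matches(email) && (best === null || rule.confidence > best.confidence)) {
      best = { documentType: rule.documentType, confidence: rule.confidence };
    }
  }
  return best;
}

const quick = classifyByRules({
  subject: 'Invoice #12345 from Acme Corp',
  senderEmail: 'billing@acme.com',
  hasAttachments: true
});
```

Under these assumptions, the example email matches both the subject rule (0.6) and the sender rule (0.5), and the subject rule wins.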

Content Analysis

For higher confidence, the AI analyzes document content:

const result = await classificationService.classify({
  document: pdfBuffer,
  mimeType: 'application/pdf',
  emailSubject: subject,
  emailBody: body,
  senderEmail: sender,
  orgId: 'org_123'
});

Analysis includes:

  • Document layout and structure
  • Header text and titles
  • Key phrases and terminology
  • Address blocks (Bill To vs Ship To)
  • Amount patterns and totals
  • Company logos and branding

Confidence Scoring

| Score Range | Meaning | Action |
| --- | --- | --- |
| 0.9 - 1.0 | Very high confidence | Auto-process |
| 0.7 - 0.9 | High confidence | Process with flag |
| 0.5 - 0.7 | Medium confidence | Review recommended |
| 0.0 - 0.5 | Low confidence | Human review required |
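
The score bands can be expressed as a small lookup function. The table does not say which band a boundary value such as 0.9 belongs to, so this sketch assumes each lower bound is inclusive, in line with the `>= 0.7` check in the pipeline diagram. The `Action` labels and `actionForConfidence` are hypothetical names.

```typescript
type Action =
  | 'auto_process'
  | 'process_with_flag'
  | 'review_recommended'
  | 'human_review_required';

// Hypothetical helper; lower bounds are treated as inclusive (assumption).
function actionForConfidence(score: number): Action {
  if (Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`confidence must be between 0 and 1, got ${score}`);
  }
  if (score >= 0.9) return 'auto_process';
  if (score >= 0.7) return 'process_with_flag';
  if (score >= 0.5) return 'review_recommended';
  return 'human_review_required';
}
```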

Distinguishing Similar Documents

Vendor Bill vs Customer Invoice

The key distinction is the direction of money flow:

| Aspect | Vendor Bill | Customer Invoice |
| --- | --- | --- |
| You are | The buyer | The seller |
| Bill To | Your company | Customer |
| Letterhead | Vendor's | Your company's |
| Action | You pay them | They pay you |

Classification Hints:

  • Check which company's logo/letterhead appears
  • Look at the "Bill To" address
  • Identify who is requesting payment
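
As a rough illustration of these hints, the sketch below compares the "Bill To" name and the letterhead name against your own company name. It is a naive heuristic, not the AI classifier: `inferInvoiceDirection`, `normalizeCompanyName`, and the input shape are hypothetical, and real documents would need fuzzier name matching.

```typescript
interface InvoiceParties {
  billTo: string;     // name found in the "Bill To" block
  letterhead: string; // name found in the letterhead/logo area
}

type Direction = 'vendor_bill' | 'customer_invoice' | 'unknown';

// Lowercase and collapse punctuation so "Acme Corp." and "acme corp" compare equal.
function normalizeCompanyName(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
}

// Hypothetical heuristic based on the hints above.
function inferInvoiceDirection(parties: InvoiceParties, ownCompany: string): Direction {
  const own = normalizeCompanyName(ownCompany);
  const billedToUs = normalizeCompanyName(parties.billTo) === own;
  const issuedByUs = normalizeCompanyName(parties.letterhead) === own;
  if (billedToUs && !issuedByUs) return 'vendor_bill';      // we are the buyer
  if (issuedByUs && !billedToUs) return 'customer_invoice'; // we are the seller
  return 'unknown'; // ambiguous or no match: leave to full classification
}

const direction = inferInvoiceDirection(
  { billTo: 'My Company, Inc.', letterhead: 'Acme Corp' },
  'My Company Inc'
);
```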

Vendor Bill vs Purchase Order

| Aspect | Vendor Bill | Purchase Order |
| --- | --- | --- |
| Status | After purchase | Before purchase |
| Purpose | Request payment | Request goods |
| Contains | Payment terms | Delivery dates |
| Created by | Vendor | Buyer |

Receipt vs Vendor Bill

| Aspect | Receipt | Vendor Bill |
| --- | --- | --- |
| Format | Point-of-sale | Business document |
| Items | Retail products | Services/supplies |
| Payment | Already paid | Payment due |
| Detail | Minimal | Comprehensive |

Custom Document Types

For specialized documents, create custom extractors:

// Create a custom extractor for a specialized document type
const customExtractor = await fetch('/api/extractors', {
  method: 'POST',
  // Declare the JSON body so the server can parse it
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    name: 'Contract Agreement',
    documentType: 'custom',
    fieldDefinitions: [
      { name: 'contractNumber', type: 'text', required: true },
      { name: 'parties', type: 'array', required: true },
      { name: 'effectiveDate', type: 'date', required: true },
      { name: 'termLength', type: 'text', required: false }
    ]
  })
});
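
To show what the `fieldDefinitions` in this request imply for extracted data, here is a sketch of a validator that checks a result against them. `validateExtraction` is a hypothetical helper, and treating a `date` as any string that `Date.parse` accepts is an assumption rather than documented behavior.

```typescript
type FieldType = 'text' | 'array' | 'date';

interface FieldDefinition {
  name: string;
  type: FieldType;
  required: boolean;
}

// Hypothetical helper: returns one message per violated definition.
function validateExtraction(
  definitions: FieldDefinition[],
  data: Record<string, unknown>
): string[] {
  const errors: string[] = [];
  for (const def of definitions) {
    const value = data[def.name];
    if (value === undefined || value === null) {
      if (def.required) errors.push(`${def.name}: missing required field`);
      continue;
    }
    const ok =
      def.type === 'text' ? typeof value === 'string'
      : def.type === 'array' ? Array.isArray(value)
      // Assumption: dates arrive as strings that Date.parse understands.
      : typeof value === 'string' && !Number.isNaN(Date.parse(value));
    if (!ok) errors.push(`${def.name}: expected ${def.type}`);
  }
  return errors;
}

const contractFields: FieldDefinition[] = [
  { name: 'contractNumber', type: 'text', required: true },
  { name: 'parties', type: 'array', required: true },
  { name: 'effectiveDate', type: 'date', required: true },
  { name: 'termLength', type: 'text', required: false }
];

const errors = validateExtraction(contractFields, {
  contractNumber: 'C-1',
  parties: ['Acme Corp', 'My Company'],
  effectiveDate: '2024-11-01'
});
```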

Best Practices

Improving Classification Accuracy

  1. Use clear email subjects: include document type keywords such as "invoice" or "purchase order".
  2. Maintain sender consistency: have each vendor send from a consistent email address.
  3. Send high-quality documents: clear, readable scans classify more reliably.
  4. Configure overrides: map known senders to known document types.
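
The override idea in item 4 can be pictured as a lookup keyed by sender address that is consulted before classification runs. This page does not describe how overrides are actually configured, so `SENDER_OVERRIDES`, `overrideForSender`, and the example addresses are purely illustrative.

```typescript
// Hypothetical override table: sender address -> document type.
const SENDER_OVERRIDES = new Map<string, string>([
  ['billing@acme.com', 'vendor_bill'],
  ['orders@example.com', 'purchase_order']
]);

// Returns the configured type for a known sender, or undefined to fall
// back to normal classification. Matching ignores case and whitespace.
function overrideForSender(senderEmail: string): string | undefined {
  return SENDER_OVERRIDES.get(senderEmail.trim().toLowerCase());
}

const overridden = overrideForSender('Billing@Acme.com');
```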

Handling Low Confidence

When classification confidence is low:

  1. The document goes to the review queue.
  2. A user manually selects the document type.
  3. The system learns from the correction.
  4. Similar documents are classified more accurately in the future.

Multi-Document Emails

If an email contains multiple document types, each attachment is classified independently:

// Each attachment is classified independently
{
  emailId: 'email_123',
  attachments: [
    { filename: 'invoice.pdf', documentType: 'vendor_bill', confidence: 0.92 },
    { filename: 'receipt.jpg', documentType: 'receipt', confidence: 0.88 }
  ]
}
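
Because each attachment carries its own confidence, one email can produce both processable and review-bound documents. The sketch below splits a result of the shape shown above using the 0.7 threshold from the classification pipeline; `partitionAttachments` and the interface names are hypothetical.

```typescript
interface ClassifiedAttachment {
  filename: string;
  documentType: string;
  confidence: number;
}

interface ClassifiedEmail {
  emailId: string;
  attachments: ClassifiedAttachment[];
}

// Hypothetical helper: splits attachments into those that can proceed
// and those that need human review, based on a confidence threshold.
function partitionAttachments(
  email: ClassifiedEmail,
  threshold = 0.7
): { ready: ClassifiedAttachment[]; needsReview: ClassifiedAttachment[] } {
  const ready: ClassifiedAttachment[] = [];
  const needsReview: ClassifiedAttachment[] = [];
  for (const attachment of email.attachments) {
    (attachment.confidence >= threshold ? ready : needsReview).push(attachment);
  }
  return { ready, needsReview };
}

const split = partitionAttachments({
  emailId: 'email_123',
  attachments: [
    { filename: 'invoice.pdf', documentType: 'vendor_bill', confidence: 0.92 },
    { filename: 'receipt.jpg', documentType: 'receipt', confidence: 0.88 }
  ]
});
```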