# AI Finance Classification Agent

## Overview

ExpenseWise includes an AI agent that automatically classifies business expenses by assigning:
- **Department codes** (business unit responsible)
- **Account codes** (type of expense)
- **Split suggestions** (when transactions should be divided between departments)

The agent uses Google Vertex AI (Gemini) and **learns from user corrections** to improve over time.

---

## How It Works

### 1. **On Statement Import**
When a CSV statement is imported, each transaction is automatically queued for AI classification in the background:

```php
// Triggered automatically in StatementImportService
ClassifyTransaction::dispatch($transaction->id, hasReceipt: false);
```

### 2. **On Receipt Match**
When a receipt is matched to a transaction, the AI **re-classifies with receipt context**:

```php
// Triggered automatically in ReceiptMatchingService
ClassifyTransaction::dispatch($transaction->id, hasReceipt: true);
```

The receipt provides:
- Line-by-line item descriptions
- Item categories
- More detailed merchant information
- Discount information

### 3. **User Learning**
When users correct AI suggestions, the system **learns and remembers**:

- Merchant patterns: "This merchant is always Department X"
- Category patterns: "This category is usually Account Y"
- Location patterns: Implicit learning from foreign transactions
- User-specific preferences

Learning data is cached per-user and persists for 1 year.

---

## AI Classification Logic

### Input Data
The AI agent receives:
- **Transaction details**: Merchant name, amount, date, location, MCC, category
- **Receipt data** (if matched): Line items, amounts, categories, notes
- **User learning history**: Past corrections and patterns
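
As a rough illustration, the three inputs above could be assembled into a single payload before the model is prompted. This is a minimal sketch in plain PHP: `buildClassificationInput` and the array key names are hypothetical and are not taken from `FinanceClassificationAgent`.

```php
<?php
// Hypothetical helper: the function name and field names are assumptions for illustration.
function buildClassificationInput(array $transaction, ?array $receipt, array $learning): array
{
    // Keep only the transaction fields the agent is documented to receive.
    $allowed = ['merchant', 'amount', 'date', 'location', 'mcc', 'category'];

    return [
        'transaction' => array_intersect_key($transaction, array_flip($allowed)),
        'has_receipt' => $receipt !== null,
        'receipt'     => $receipt,  // line items, amounts, categories, notes (or null)
        'learning'    => $learning, // past corrections and patterns
    ];
}

$input = buildClassificationInput(
    ['merchant' => 'EXAMPLE CAFE', 'amount' => 12.40, 'mcc' => '5812', 'internal_id' => 99],
    null, // no receipt matched yet
    ['corrections' => [], 'patterns' => []]
);
```

Filtering to an allow-list keeps unrelated columns (here the made-up `internal_id`) out of the prompt.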

### Output
```json
{
  "department_id": 3,
  "account_id": 12,
  "confidence": 0.85,
  "reasoning": "Business travel to conference venue",
  "should_split": true,
  "suggested_splits": [
    {
      "department_id": 3,
      "account_id": 12,
      "amount": 150.00,
      "description": "Conference registration",
      "reasoning": "Training expense"
    },
    {
      "department_id": 2,
      "account_id": 8,
      "amount": 75.00,
      "description": "Hotel stay",
      "reasoning": "Travel accommodation"
    }
  ]
}
```
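
A response in this shape is worth validating before it is applied. The sketch below is one possible check in plain PHP. `parseClassification` is a hypothetical helper, and the rule that split amounts must add up to the transaction total is an assumption, not something this document guarantees.

```php
<?php
// Hypothetical validator; the "splits must sum to the total" rule is an assumption.
function parseClassification(string $json, float $transactionTotal): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        throw new InvalidArgumentException('Expected a JSON object.');
    }

    foreach (['department_id', 'account_id', 'confidence'] as $field) {
        if (!array_key_exists($field, $data)) {
            throw new InvalidArgumentException("Missing field: {$field}");
        }
    }

    if ($data['confidence'] < 0 || $data['confidence'] > 1) {
        throw new InvalidArgumentException('Confidence must be between 0 and 1.');
    }

    $splits = $data['suggested_splits'] ?? [];
    if (!empty($data['should_split']) && $splits !== []) {
        // Compare in minor units (pence) to avoid floating-point drift.
        $toPence = fn ($amount): int => (int) round($amount * 100);
        $sum = array_sum(array_map(fn (array $s): int => $toPence($s['amount']), $splits));
        if ($sum !== $toPence($transactionTotal)) {
            throw new UnexpectedValueException('Split amounts do not add up to the transaction total.');
        }
    }

    return $data;
}
```

For the example above, the two splits (150.00 and 75.00) pass the check against a transaction total of 225.00.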

### Confidence Thresholds
- **High (0.8 and above)**: Auto-applied to transaction
- **Medium (0.5 up to 0.8)**: Suggested but requires review
- **Low (below 0.5)**: Not applied, flagged for manual review

Only high-confidence classifications (≥ 0.8) are auto-applied.
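
The bands listed above can be expressed as a small lookup. This is a sketch only: `confidenceAction` and its return values are hypothetical names, and the cut-offs are parameters so they can follow whatever thresholds the application actually enforces.

```php
<?php
// Hypothetical helper; the defaults mirror the High/Medium/Low bands in this section.
function confidenceAction(float $confidence, float $autoApplyAt = 0.8, float $reviewAt = 0.5): string
{
    if ($confidence < 0.0 || $confidence > 1.0) {
        throw new InvalidArgumentException('Confidence must be between 0 and 1.');
    }

    if ($confidence >= $autoApplyAt) {
        return 'auto_apply';    // high: applied to the transaction
    }
    if ($confidence >= $reviewAt) {
        return 'suggest';       // medium: shown, but needs review
    }

    return 'manual_review';     // low: not applied, flagged
}
```

Using `>=` comparisons avoids gaps between bands, so a score such as 0.795 still falls into exactly one band.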

---

## Smart Split Detection

The AI automatically suggests splits when:

1. **Receipt has multiple departments**
   - Example: Fuel (Operations) + Shop items (Staff welfare)
   - Example: Office supplies (Admin) + Cleaning (Facilities)

2. **Mixed business categories**
   - Example: Conference registration (Training) + Hotel (Travel)
   - Example: Hardware (IT) + Software subscription (SaaS)

3. **User patterns indicate splitting**
   - Learned from past user split behavior
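
The first condition (a receipt spanning multiple departments) can be approximated without a model call. The sketch below assumes each receipt line has already been assigned a department and account, and that amounts are held in pence. `suggestSplits` and the field names are hypothetical, and the agent itself may decide splits differently.

```php
<?php
// Hypothetical helper: assumes each receipt line already carries a
// department_id, an account_id, and an amount in pence.
function suggestSplits(array $receiptLines): array
{
    $groups = [];
    foreach ($receiptLines as $line) {
        $key = $line['department_id'] . ':' . $line['account_id'];
        $groups[$key] ??= [
            'department_id' => $line['department_id'],
            'account_id'    => $line['account_id'],
            'amount_pence'  => 0,
        ];
        $groups[$key]['amount_pence'] += $line['amount_pence'];
    }

    // A single group means one department/account covers the whole receipt.
    return count($groups) > 1 ? array_values($groups) : [];
}

$splits = suggestSplits([
    ['department_id' => 1, 'account_id' => 10, 'amount_pence' => 4000], // fuel
    ['department_id' => 5, 'account_id' => 22, 'amount_pence' => 350],  // shop item
    ['department_id' => 5, 'account_id' => 22, 'amount_pence' => 150],  // shop item
]);
```

Here the two shop items collapse into one suggested split alongside the fuel line.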

---

## User Interface Indicators

### AI Suggestion Badge
Transactions with AI classifications show a purple lightbulb icon:

```
[OH ▼] 💡 (85% confidence)
```

Hover to see:
- Confidence level
- Triggering event (import/receipt_match)
- Reasoning

### Split Suggestions
When AI detects multi-department transactions:
1. "Use receipt lines" button auto-creates splits
2. Each split shows suggested department/account
3. Users can accept, modify, or reject

---

## Learning System

### How It Learns

The `StatementTransactionObserver` watches for user corrections:

```php
// User changes department from 1 → 3
// System records:
//   - Merchant: "AMAZON WEB SERVICES"
//   - Original classification: Department 1
//   - User correction: Department 3
//   - Context: Category "Cloud Services", Amount ~£50-200
```

### Pattern Recognition

After ~3-5 corrections for the same merchant/category:
- Future classifications prioritize user's preference
- Confidence scores increase
- Similar merchants benefit from learned patterns
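
One way to picture the threshold behaviour is a count of corrections per merchant. This is a minimal sketch, not the agent's actual logic: `preferredDepartment`, the correction field names, and the default minimum of 3 are assumptions chosen to match the "~3-5 corrections" figure above.

```php
<?php
// Hypothetical helper; field names and the default minimum are assumptions.
function preferredDepartment(array $corrections, string $merchant, int $minCorrections = 3): ?int
{
    // Count how often the user moved this merchant to each department.
    $counts = [];
    foreach ($corrections as $correction) {
        if ($correction['merchant'] === $merchant) {
            $dept = $correction['corrected_department_id'];
            $counts[$dept] = ($counts[$dept] ?? 0) + 1;
        }
    }

    if ($counts === []) {
        return null;
    }

    arsort($counts);                  // most frequent department first, keys preserved
    $top = array_key_first($counts);

    // Only report a preference once it has been seen often enough.
    return $counts[$top] >= $minCorrections ? $top : null;
}
```

Returning `null` below the minimum leaves the decision to the model until the pattern is established.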

### Data Storage

Learning data is stored in Laravel cache:
```
Key: finance_learning:user:{user_id}
TTL: 1 year
Structure:
- corrections[]: Array of past corrections
- patterns{}: Merchant/category → department/account mappings
```
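
The key format and structure above can be sketched in plain PHP. Only the key pattern, the one-year TTL, and the two top-level fields come from this document. The helper names and the fields inside each correction are hypothetical, and the sketch deliberately leaves out the actual cache call.

```php
<?php
// Hypothetical helpers; only the key format, TTL, and top-level structure are documented.
const LEARNING_TTL_SECONDS = 365 * 24 * 60 * 60; // 1 year

function learningCacheKey(int $userId): string
{
    return "finance_learning:user:{$userId}";
}

function emptyLearningData(): array
{
    return ['corrections' => [], 'patterns' => []];
}

function appendCorrection(array $learning, array $correction): array
{
    // Arrays are copied by value, so the caller's original is left untouched.
    $learning['corrections'][] = $correction;

    return $learning;
}

$learning = appendCorrection(emptyLearningData(), [
    'merchant'                => 'AMAZON WEB SERVICES',
    'original_department_id'  => 1,
    'corrected_department_id' => 3,
]);
```

In the application, the resulting array would be written back under `learningCacheKey($userId)` with the TTL shown.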

---

## Technical Implementation

### Key Components

1. **`FinanceClassificationAgent`** (`app/Services/AI/`)
   - Main classification logic
   - Prompt engineering
   - Learning data integration

2. **`ClassifyTransaction` Job** (`app/Jobs/`)
   - Background queue processing
   - Handles classification requests
   - Updates transaction records

3. **`StatementTransactionObserver`** (`app/Observers/`)
   - Watches for user edits
   - Captures corrections
   - Feeds learning system

4. **Integration Points**
   - `StatementImportService`: Triggers on import
   - `ReceiptMatchingService`: Triggers on match
   - `AppServiceProvider`: Registers observer

### Configuration

```php
// config/vertex.php
'models' => [
    'classify' => env('VERTEX_MODEL_CLASSIFY', 'gemini-1.5-flash'),
],
```

### Queue Processing

Make sure queue workers are running:
```bash
php artisan queue:work
```

---

## Example Use Cases

### Case 1: Import with Immediate Classification
1. User imports December 2024 statement (50 transactions)
2. 50 background jobs queued
3. AI classifies each based on:
   - Merchant category (e.g., "Restaurants" → Staff Welfare)
   - MCC codes (e.g., 5812 = Eating Places)
   - Location (e.g., Foreign = likely Travel)
4. High-confidence results auto-applied
5. Low-confidence flagged for review

### Case 2: Receipt Match Improves Classification
1. Transaction: "AMAZON" £45.99 → Auto-classified as "Office Supplies"
2. User uploads receipt → Shows "AWS Lambda compute"
3. AI re-classifies → "IT Services / Cloud Infrastructure"
4. Future "AMAZON" transactions checked against learned patterns

### Case 3: User Correction Learning
1. AI classifies "STARBUCKS" as "Staff Welfare"
2. User corrects to "Client Entertainment"
3. System learns: This user treats coffee shops as client expenses
4. Next "STARBUCKS" transaction → AI suggests "Client Entertainment"
5. Confidence increases with each confirmation

---

## Performance & Scaling

- **Background processing**: No user-facing delays
- **Batch processing**: Can handle 100+ transactions/minute
- **Caching**: Learning data cached to minimize DB queries
- **Cost-efficient**: Uses fast Gemini models (~$0.001/transaction)

---

## Future Enhancements

Potential improvements:
- [ ] Global learning (cross-user patterns with privacy)
- [ ] Seasonal pattern detection (Q4 = conferences)
- [ ] Amount-based classification (£5 vs £500 = different categories)
- [ ] Multi-language merchant name handling
- [ ] Export classification reports for auditing

---

## Monitoring

Inspect recent classifications and the stored learning data:
```bash
# View recent classifications
tail -f storage/logs/laravel.log | grep "Transaction classified"

# Check learning data
php artisan tinker
>>> Cache::get('finance_learning:user:1');
```

---

## Troubleshooting

### Classification Not Running
```bash
# Start a queue worker if none is running (queued jobs are not processed without one)
php artisan queue:work

# Check for failed jobs
php artisan queue:failed
```

### Low Accuracy
- Ensure Vertex AI API key is configured
- Check learning data has sufficient corrections (~5-10 minimum)
- Review AI reasoning in transaction metadata

### Clear Learning Data
```php
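// Reset learning data for one user ($userId holds that user's ID).
// Double quotes are required so that {$userId} is interpolated into the key.
Cache::forget("finance_learning:user:{$userId}");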
```

---

Built with ❤️ using Laravel, Livewire, and Google Vertex AI (Gemini)

