Beyond Search: The Recommendation Problem
Traditional e-commerce search works like this: user types keywords, system returns matching products. It's functional but limited.
An AI product recommender understands context:
"I'm hosting a dinner party for 8 people this weekend, budget about $200 for everything"
A search engine sees keywords: "dinner", "party", "8", "$200". An AI recommender understands: hosting event, group size, budget constraint, occasion type — and can recommend a coherent set of products that work together.
Architecture
The recommender system has four components:
User Query → Intent Classification → Product Retrieval (with category alias mapping) → LLM Recommendation → Response
1. Intent Classification
The fine-tuned LLM classifies the user's intent:
- Direct search — "Do you have product X?" → Search for specific product
- Category browse — "Show me red wines" → Filter by category
- Recommendation — "What goes well with steak?" → Multi-factor recommendation
- Comparison — "What's the difference between A and B?" → Retrieve both, compare
- Education — "How does product X work?" → Use knowledge from training data
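The classification step can be sketched as a small prompt-plus-parser pair. This is a minimal illustration, not the article's actual implementation: the prompt wording, label set, and the `parse_intent` helper (including its "recommendation" fallback) are assumptions.

```python
# The five intent labels from the list above. "recommendation" is an
# illustrative safe default for when the model replies with something unexpected.
INTENTS = {"direct_search", "category_browse", "recommendation",
           "comparison", "education"}

CLASSIFY_PROMPT = (
    "Classify the user's shopping intent as exactly one of: "
    "direct_search, category_browse, recommendation, comparison, education.\n"
    "Reply with the label only.\n\nUser: {query}\nIntent:"
)

def parse_intent(raw: str, default: str = "recommendation") -> str:
    """Normalize the model's raw reply to a known label, else fall back."""
    label = raw.strip().lower().split()[0] if raw.strip() else ""
    return label if label in INTENTS else default
```

Defensive parsing matters here: even a fine-tuned model occasionally pads its answer, so the router should never crash on an off-label reply.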
2. Multi-field Product Retrieval
The retrieval system uses weighted BM25 search across multiple product fields:
Field weights:
- name: 3x (highest priority)
- brand: 2x
- category: 2x
- style/type: 2x
- taste/flavor: 1x
- description: 1x
For a query like "something smooth and fruity", the system searches across taste profiles and style descriptions, not just product names.
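One simple way to realize per-field weights with a standard BM25 scorer is to repeat each field's tokens weight-many times before indexing. The toy scorer below is a from-scratch sketch for illustration; in production you would more likely use a library such as rank_bm25 or a search engine, and the field names mirror the weight table above.

```python
import math
from collections import Counter

# Field weights from the table above; token repetition applies the weight.
FIELD_WEIGHTS = {"name": 3, "brand": 2, "category": 2,
                 "style": 2, "taste": 1, "description": 1}

def doc_tokens(product: dict) -> list:
    """Flatten a product into tokens, repeating each field's tokens by its weight."""
    tokens = []
    for field, weight in FIELD_WEIGHTS.items():
        tokens.extend(str(product.get(field, "")).lower().split() * weight)
    return tokens

def bm25_scores(query: str, products: list, k1: float = 1.5, b: float = 0.75) -> list:
    """Score every product against the query with plain BM25."""
    docs = [doc_tokens(p) for p in products]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter()                      # document frequency per term
    for d in docs:
        df.update(set(d))
    terms = query.lower().split()
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in terms:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
            score += idf * norm
        scores.append(score)
    return scores
```

Because taste and description tokens are indexed alongside the name, a query like "something smooth and fruity" scores products by their flavor profile even when the product name shares no words with the query.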
3. Category Alias Mapping
Users don't speak in database categories. A mapping layer translates natural language:
{
  "something refreshing": ["sparkling", "light wines", "citrus"],
  "for cooking": ["cooking wines", "oils", "vinegars"],
  "gift idea": ["gift sets", "premium", "popular"],
  "budget option": ["sort: price_asc", "value category"]
}
This layer catches intent that keyword search alone would miss.
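A minimal sketch of how that layer can work: expand the query with alias categories before it hits the BM25 index. The alias table echoes the example mapping above; the `expand_query` helper is illustrative.

```python
# Alias table mirroring the example mapping above (entries are illustrative).
CATEGORY_ALIASES = {
    "something refreshing": ["sparkling", "light wines", "citrus"],
    "for cooking": ["cooking wines", "oils", "vinegars"],
    "gift idea": ["gift sets", "premium", "popular"],
}

def expand_query(query: str) -> str:
    """Append alias categories for any natural-language phrase the query contains."""
    extra = []
    lowered = query.lower()
    for phrase, categories in CATEGORY_ALIASES.items():
        if phrase in lowered:
            extra.extend(categories)
    return query if not extra else query + " " + " ".join(extra)
```

Substring matching is deliberately crude but cheap; a fuzzier matcher (stemming, embeddings) can be swapped in later without changing the pipeline shape.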
4. LLM-Powered Recommendation
The LLM receives retrieved products and generates a contextual recommendation:
[System] You are a product expert. Use the provided products to answer.
[Context] Products: [{name, price, category, description, taste}...]
[User] I'm hosting a dinner party for 8...
[Assistant] For a dinner party of 8, I'd recommend...
The model's fine-tuning teaches it how to recommend — considering complementary products, occasion appropriateness, and budget constraints.
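Assembling that prompt is mechanical; a sketch of the message construction, assuming an OpenAI-style chat format (the system wording and helper name are illustrative):

```python
import json

SYSTEM_PROMPT = "You are a product expert. Use the provided products to answer."

def build_messages(products: list, user_query: str) -> list:
    """Assemble the chat turns: system rules, retrieved product context, user query."""
    context = json.dumps(products, ensure_ascii=False)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "system", "content": "Products: " + context},
        {"role": "user", "content": user_query},
    ]
```

Keeping the product context in its own system turn makes it easy to swap retrieval results in and out without touching the instruction prompt.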
Training the Recommendation Engine
The key is diverse training data covering recommendation scenarios:
Occasion-Based Recommendations (~400 samples)
User: "I need something for a birthday celebration"
Assistant: "For a birthday, here are some festive options..."
Budget-Conscious Recommendations (~200 samples)
User: "Best option under $30?"
Assistant: "In that range, I'd recommend [product] because..."
Taste-Based Recommendations (~300 samples)
User: "I prefer something not too sweet"
Assistant: "Based on your preference, these have a drier profile..."
Comparison Scenarios (~200 samples)
User: "What's the difference between A and B?"
Assistant: "Great comparison! A is [characteristics] while B is [characteristics]. Choose A if..., choose B if..."
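Whatever the scenario mix, each sample ends up in the same on-disk shape. A sketch of serializing one user/assistant pair as a chat-format JSONL line, assuming the common `{"messages": [...]}` convention (the exact schema your trainer expects may differ):

```python
import json

def chat_sample(user: str, assistant: str) -> str:
    """One supervised training sample in chat format, serialized as a JSONL line."""
    return json.dumps(
        {"messages": [{"role": "user", "content": user},
                      {"role": "assistant", "content": assistant}]},
        ensure_ascii=False)
```

One line per sample means the scenario buckets above can be generated by separate scripts and simply concatenated into one training file.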
Cross-Selling and Pairing
The most powerful feature: recommending products that work together.
Training includes pairing patterns:
User: "I'm buying [Product A]"
Assistant: "Excellent choice! [Product A] pairs beautifully with [Product B] because...
You might also enjoy [Product C] — many customers who love [A] discover [C]."
The model learns natural cross-selling that feels helpful, not pushy.
Rich Product Context
The difference between a mediocre and excellent recommendation often comes down to how much context the model has about each product.
Minimal context (poor recommendations):
{"name": "Product X", "price": 29.99}
Rich context (excellent recommendations):
{
  "name": "Product X",
  "price": 29.99,
  "category": "Category A",
  "style": "Classic, elegant",
  "flavor_profile": "Rich, complex, with notes of...",
  "best_for": "Special occasions, gifts",
  "pairs_with": "Category B products",
  "description": "Award-winning product known for..."
}
Each product gets 60-90 tokens of context, giving the model enough information to make nuanced recommendations.
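Rendering that record into the prompt can be a one-liner per product. A sketch, assuming the field names from the rich-context example and a rough 4-characters-per-token budget (both assumptions, not the article's exact implementation):

```python
def product_context(product: dict, max_chars: int = 360) -> str:
    """Render one product as a compact context string.

    ~360 chars is roughly the 60-90 token budget at ~4 chars/token.
    Missing fields are simply skipped.
    """
    fields = ["name", "price", "category", "style", "flavor_profile",
              "best_for", "pairs_with", "description"]
    parts = [f"{f}: {product[f]}" for f in fields if f in product]
    return " | ".join(parts)[:max_chars]
```

A fixed field order keeps the context predictable across products, which helps the fine-tuned model learn where to look for each attribute.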
Performance Metrics
From production deployment:
| Metric | Value |
|---|---|
| Query to recommendation | 0.8-1.5 seconds |
| Product retrieval accuracy | 92% (relevant product in top 3) |
| Recommendation relevance | 88% (user satisfaction) |
| Cross-sell click-through | 15-25% |
| Products in response | 2-5 per query |
| RAG lookup time | < 1ms |
Edge Cases and Safety
A production recommender must handle:
- Out-of-stock items — Don't recommend unavailable products
- Price sensitivity — Never suggest items above stated budget
- Unknown products — Refuse to hallucinate products not in the catalog
- Competitor mentions — Redirect to your own alternatives
- Inappropriate requests — Gracefully decline and redirect
Each edge case needs explicit training samples. Even 10-20 samples per case dramatically improves handling.
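Training samples handle the conversational side, but hard constraints are safest enforced in code before the LLM ever sees the candidates. A sketch of such a pre-filter (the `in_stock` flag and `filter_candidates` helper are illustrative):

```python
def filter_candidates(products: list, budget: float = None) -> list:
    """Enforce hard constraints before the LLM sees candidates:
    never out of stock, never above the user's stated budget."""
    keep = []
    for p in products:
        if not p.get("in_stock", True):   # drop unavailable products
            continue
        if budget is not None and p.get("price", 0.0) > budget:
            continue                      # drop over-budget products
        keep.append(p)
    return keep
```

The model cannot recommend a product it never sees, so this filter turns "never suggest items above budget" from a soft behavior into a guarantee.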
Building Your Own
Step 1: Prepare Your Product Data
Export your catalog with rich attributes: name, price, category, description, and any domain-specific fields (flavor, material, size, use case).
Step 2: Generate Training Data
- Product Q&A pairs (automated from catalog)
- Recommendation scenarios (manually written)
- Comparison dialogues (manually written)
- Edge cases (manually written)
Target: 5,000-25,000 total samples.
Step 3: Fine-tune
Use Unsloth + LoRA for efficient training; expect about 5 hours on a single GPU.
Step 4: Build RAG Pipeline
BM25 search with weighted fields and category aliases.
Step 5: Deploy and Iterate
Log all interactions. Review weekly. Add training samples for cases the model handles poorly. Retrain monthly.
The result: an AI assistant that provides expert-level product recommendations, available 24/7, at near-zero marginal cost.
Want this built for your business? See how it works — we handle the full stack, from data prep to deployment.