Visual AI Comparison
Replace tedious manual overlays with a visual comparison that is ready in seconds. Our cascade alignment system achieves a 99%+ alignment success rate across scale variations, rotations, and varying scan quality.
The Overlay Pipeline
Five stages transform two document revisions into an intelligent visual comparison.
Fuzzy Matching
Intelligent pairing of sheets across revisions
Fuzzy Sheet Matching
Real-world documents use inconsistent naming. Our normalization algorithm handles revision prefixes, OCR errors, and naming variations automatically.
Document A (Old)
Document B (New)
Fuzzy Matching Rules
Revision Prefix
4-A-101 → A-101
OCR Error (O → 0)
MH1O-A → MH10-A
Case Matching
a-101 → A-101
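The three rules above can be sketched as a small normalization function. This is a minimal illustration, not the product's actual algorithm: the function names, the regular expressions, and the assumption that a revision prefix is a leading run of digits followed by a hyphen are all hypothetical.

```python
import re


def normalize_sheet_name(raw: str) -> str:
    """Normalize a sheet identifier so the same sheet pairs up across revisions.

    Assumed rules (illustrative only):
      - case matching:    a-101   -> A-101
      - revision prefix:  4-A-101 -> A-101  (leading digits + hyphen before a letter)
      - OCR error O -> 0: MH1O-A  -> MH10-A (letter O directly next to a digit)
    """
    name = raw.strip().upper()
    name = re.sub(r"^\d+-(?=[A-Z])", "", name)
    name = re.sub(r"(?<=\d)O|O(?=\d)", "0", name)
    return name


def match_sheets(old_names, new_names):
    """Pair sheets whose normalized names agree; returns (old, new) tuples."""
    new_by_key = {normalize_sheet_name(n): n for n in new_names}
    pairs = []
    for old in old_names:
        key = normalize_sheet_name(old)
        if key in new_by_key:
            pairs.append((old, new_by_key[key]))
    return pairs
```

Calling match_sheets(["4-A-101", "MH1O-A"], ["A-101", "MH10-A"]) pairs each old sheet with its new counterpart. A production matcher would likely need more rules than these three, plus a tie-break when two sheets normalize to the same key.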
Cascade Alignment Strategy
Four alignment methods are tried in sequence, each more robust than the last. The cascade stops as soon as one method's confidence exceeds 88%.
Scale-Invariant Features
Robust feature detection across scales and rotations
Accelerated KAZE
Fast nonlinear scale space for efficiency
Oriented FAST/BRIEF
Ultra-fast binary descriptor matching
Phase Correlation
Frequency-domain alignment with grid search
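The control flow of the cascade can be sketched independently of any particular feature detector. In the sketch below, each aligner is a hypothetical callable that returns a transform and a confidence between 0 and 1; the Alignment type, the cascade_align name, and the choice to fall back to the best attempt when no method clears the threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple

CONFIDENCE_THRESHOLD = 0.88  # from the text: stop once confidence exceeds 88%


@dataclass
class Alignment:
    method: str
    confidence: float  # 0.0 to 1.0
    transform: Any     # opaque here, e.g. a 3x3 homography in a real system


# Hypothetical interface: an aligner takes (old_image, new_image) and returns
# (transform, confidence), or None when it cannot produce an estimate.
Aligner = Callable[[Any, Any], Optional[Tuple[Any, float]]]


def cascade_align(
    old: Any,
    new: Any,
    aligners: Sequence[Tuple[str, Aligner]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Optional[Alignment]:
    """Try each aligner in order and stop at the first one above the threshold.

    Assumption: if no method clears the threshold, the best attempt so far is
    returned so the caller can decide whether to trigger a fallback.
    """
    best: Optional[Alignment] = None
    for name, aligner in aligners:
        outcome = aligner(old, new)
        if outcome is None:
            continue
        transform, confidence = outcome
        if best is None or confidence > best.confidence:
            best = Alignment(name, confidence, transform)
        if confidence > threshold:
            break  # good enough: skip the remaining methods
    return best
```

The aligners list would hold the four methods in the order shown above, from scale-invariant features through phase correlation. Because later methods run only when earlier ones fall short, the common case pays for a single alignment attempt.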
Drawing-Focused Fallback
When initial alignment confidence is low (<88%), the system applies drawing masks to focus feature detection on actual CAD content, avoiding misleading matches from title blocks and sheet numbers.
Change Detection
A pixel-level red/blue comparison highlights every difference between the aligned sheets, and connected-components analysis then groups nearby changes into meaningful regions.
Wall Removed
Room Added
Modified
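As a rough sketch of this stage, the code below diffs two already-aligned binary images and groups changed pixels into bounding boxes with a breadth-first connected-components pass. Images are plain nested lists (1 = ink) to keep the example dependency-free. The text does not say which color marks removals and which marks additions, so the masks are simply named removed and added; the gap parameter and function names are hypothetical.

```python
from collections import deque


def diff_masks(old, new):
    """Return (removed, added) binary grids from two aligned binary images."""
    removed = [[int(o and not n) for o, n in zip(ro, rn)] for ro, rn in zip(old, new)]
    added = [[int(n and not o) for o, n in zip(ro, rn)] for ro, rn in zip(old, new)]
    return removed, added


def group_regions(mask, gap=1):
    """Group changed pixels into (left, top, right, bottom) boxes, inclusive.

    Pixels within `gap` cells of each other (in x and y) join the same region;
    gap=1 is ordinary 8-connectivity, larger values merge nearby changes.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(max(0, cy - gap), min(height, cy + gap + 1)):
                    for nx in range(max(0, cx - gap), min(width, cx + gap + 1)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes
```

Each box becomes one candidate region such as "Wall Removed" or "Room Added". On full-resolution scans an image library's connected-components routine would be the practical choice; the pure-Python loop here only shows the grouping logic.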
Clip-Based AI Analysis
Instead of analyzing the full image, we extract focused clips around detected regions. This grounds the AI analysis and enables artifact filtering.
Load-bearing wall removed between rooms 101 and 102. Requires structural review.
New storage room added with dimensions 12' x 15'. Includes electrical outlet.
Minor drafting artifact from scan alignment. Not a real change.
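Clip extraction can be sketched as padding each detected region, clamping it to the image bounds, and cropping. The padding value, the function names, and the is_artifact field on a finding are all hypothetical; the text does not specify how the model's artifact verdict is represented.

```python
def clip_box(region, image_width, image_height, padding=32):
    """Expand a (left, top, right, bottom) region by `padding`, clamped to the image."""
    left, top, right, bottom = region
    return (
        max(0, left - padding),
        max(0, top - padding),
        min(image_width - 1, right + padding),
        min(image_height - 1, bottom + padding),
    )


def extract_clip(image, box):
    """Crop a nested-list image to an inclusive (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def drop_artifacts(findings):
    """Keep only findings the model did not flag as scan or alignment artifacts.

    Assumes each finding is a dict with an optional boolean "is_artifact" key.
    """
    return [f for f in findings if not f.get("is_artifact", False)]
```

The padding gives the model some surrounding context (adjacent walls, room labels) without sending the whole sheet, which is the grounding benefit described above.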
Two-Phase Architecture
The visual overlay returns within seconds while AI analysis runs asynchronously, so the user interface is never blocked waiting for insights.
Phase 1: Visual Overlay
~2-5 seconds
A Lambda function handles image alignment, generates the red/blue overlay, and extracts modification regions.
Phase 2: AI Analysis
~10-30 seconds (async)
A vision LLM analyzes the extracted clips, provides semantic descriptions, filters artifacts, and assesses severity.
Instant Feedback
Visual overlay ready in seconds
Non-Blocking
AI runs asynchronously
Progressive Enhancement
Insights arrive as they become ready
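The two-phase handoff can be illustrated with a small asyncio sketch: phase 1 is awaited and returned right away, while phase 2 is scheduled as a background task. This is an in-process stand-in only. In the architecture described above, phase 1 runs in a Lambda function and phase 2 would presumably be a separate asynchronous job; every function name and payload shape here is hypothetical.

```python
import asyncio


async def build_overlay(old, new):
    """Phase 1 stand-in: align, render the overlay, extract regions."""
    await asyncio.sleep(0)  # placeholder for the real alignment work
    return {"overlay": f"{old}-vs-{new}", "regions": ["r1", "r2"]}


async def analyze_clips(regions):
    """Phase 2 stand-in: per-clip vision-model analysis."""
    await asyncio.sleep(0.01)  # placeholder for model latency
    return [{"region": r, "description": "pending model output"} for r in regions]


async def compare(old, new):
    """Return the overlay at once, plus a task that resolves to the AI insights."""
    phase1 = await build_overlay(old, new)
    insights_task = asyncio.create_task(analyze_clips(phase1["regions"]))
    return phase1, insights_task


async def demo():
    overlay, insights_task = await compare("rev-A", "rev-B")
    overlay_ready_first = not insights_task.done()  # UI can render the overlay now
    insights = await insights_task                  # insights arrive later
    return overlay, overlay_ready_first, insights
```

Running asyncio.run(demo()) shows the overlay being available before the insights task has finished, which is the property that lets the interface render first and enrich the view afterwards.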
99%+
Alignment Success
~5s
Visual Overlay
4
Cascade Methods
88%
Confidence Threshold