Evaluation Criteria
Shared rubric for planner, generator, and evaluator.
Weighted Categories
Design Quality — 40%
Pass examples:
- Cohesive palette
- Intentional typography
- Strong visual hierarchy
Fail examples:
- Generic gradient hero
- Repetitive white-card layout
- No visual identity
Originality — 30%
Pass examples:
- Hard to mistake for generic AI output
- Memorable composition or interaction
Fail examples:
- Bootstrap-like default structure
- Obvious template feel
Technical Completion — 15%
Pass examples:
- Responsive layout works
- No obvious console errors
- Core interactions function
Fail examples:
- Mobile layout breaks
- Broken links
- Script errors
Functionality — 15%
Pass examples:
- Navigation is understandable
- CTA is easy to find
- Content supports the goal
Fail examples:
- User cannot tell what to do next
- Important flow is missing
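The four weights above combine into a single total. A minimal sketch of that arithmetic, assuming each category is scored on a 0 to 10 scale (the rubric does not state the scale; it is inferred from the thresholds). The names `WEIGHTS` and `weighted_total` are hypothetical and used only for illustration.

```python
# Weights copied from the rubric; they sum to 1.0.
WEIGHTS = {
    "design_quality": 0.40,
    "originality": 0.30,
    "technical_completion": 0.15,
    "functionality": 0.15,
}


def weighted_total(scores: dict[str, float]) -> float:
    """Return the weighted sum of per-category scores.

    Assumption: every score is on a 0-10 scale, so the total is too.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "design_quality": 8,
        "originality": 7,
        "technical_completion": 6,
        "functionality": 9,
    }
    # 0.40*8 + 0.30*7 + 0.15*6 + 0.15*9
    print(round(weighted_total(example), 2))
```

Because Design Quality and Originality together carry 70% of the weight, a strong technical score cannot compensate for a weak design score.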
Final Thresholds
PASS: total >= 7.0
CONDITIONAL PASS: 5.0 <= total < 7.0
FAIL: total < 5.0
FAIL automatically, regardless of total, if Design Quality or Originality is <= 4
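The thresholds can be read as an ordered set of checks, with the automatic-fail rule applied first. A minimal sketch under two assumptions: scores are on a 0 to 10 scale, and any total from 5.0 up to (but not including) 7.0 counts as a conditional pass. The function name `verdict` and its parameters are hypothetical.

```python
def verdict(total: float, design_quality: float, originality: float) -> str:
    """Map a weighted total and the two gating scores to a rubric outcome."""
    # The automatic-fail rule overrides the total, so check it first.
    if design_quality <= 4 or originality <= 4:
        return "FAIL"
    if total >= 7.0:
        return "PASS"
    # Assumption: totals between 6.9 and 7.0 fall into this band.
    if total >= 5.0:
        return "CONDITIONAL PASS"
    return "FAIL"


if __name__ == "__main__":
    print(verdict(7.55, design_quality=8, originality=7))
    print(verdict(8.0, design_quality=4, originality=10))
    print(verdict(6.2, design_quality=6, originality=7))
```

Checking the gate before the total keeps the override explicit: a submission with a high total but a Design Quality score of 4 still fails.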