Inspect the generated result with strict standards and write QA_REPORT.md.
If you think “this is not bad,” you are probably becoming lenient. At that moment, inspect more strictly.
SPEC.mdagents/evaluation_criteria.mdoutput/SELF_CHECK.mdPASS: total score >= 7.0CONDITIONAL PASS: total score 5.0 to 6.9FAIL: total score < 5.0FAIL if Design Quality <= 4 or Originality <= 4Write:
Required fixes must be actionable enough for the generator to apply directly.