Carryover Seeds Bulk Cleanup — Correctness Gate Strengthening + Infrastructure Context Injection
Sprint 143 — Carryover Seeds Bulk Cleanup (Plan B + Seed #4 Option A)
Context
Bulk cleanup of 9 carryover seeds from Sprint 142 (prompt optimization) and Sprint 141 (infrastructure debt resolution) using group-by-group PR split strategy.
The key decision point was seed #4 (structural defect where ai-analysis worker receives no problem information and calls LLM with empty context). Sprint 142 only added guards on the prompt side, and without the infrastructure fix, the guards would trigger meaningless fallbacks only → core work of this sprint.
Merge Scope (Plan B — User Approved)
| Seed | Location | Priority | Status |
|---|---|---|---|
| #1 | ai-analysis correctness weight 30→40% | P2 | ✅ PR #198 |
| #2 | ai-analysis worker context absence guard | P1 | ✅ PR #199 |
| #3 | ai-analysis claude_client token truncation guard | P2 | ✅ PR #199 |
| #4 | submission entity + getProblemInfo (Option A) | P0 (core) | ✅ PR #200 |
| #6 | Calendar useLocale provider defense | P2 | ✅ PR #201 |
| #8 | E2E full integration PR auto comment | P2 | ✅ PR #202 |
Sprint 144 Carryover (3 items)
- Seed #5: UAT — Programmers resubmission scoring pass confirmation (user direct)
- Seed #7: prometheus-rules / dashboard automatic verification CI (
promtool check rules+ grafana JSON cross-check) - Seed #9: UAT — English environment calendar + production Grafana CB dashboard consistency (user direct)
Decisions
Seed #4 — Option A vs B (User Decision: Option A Adopted)
Option A (root resolution, adopted): Add nullable problem_title (VARCHAR 255) / problem_description (TEXT) columns to Submission entity + DB migration + submission.service's create() calls problem service and stores result in entity.
- Advantage: Permanent resolution once done, no additional HTTP load (only 1 call at creation time)
- Disadvantage: TypeORM migration + multiple spec mock updates needed → Critic multiple rounds (Sprint 134~135 saga pattern)
Option B (immediate, not adopted): ai-analysis worker directly calls problem service /internal/{id} + caching
- Advantage: No migration needed
- Disadvantage: HTTP call on every analysis + caching needed, increases problem service load
PR Split Strategy (Reusing Sprint 141 Pattern)
5 PR split to distribute merge burden. Each PR uses new branch + Squash merge.
| PR | Branch | Files | Critic |
|---|---|---|---|
| #198 | feat/sprint-143-weights | 3 (+20 -18) | Not invoked |
| #199 | feat/sprint-143-context-token-guard | 4 (+206) | Not invoked |
| #200 | feat/sprint-143-submission-problem-context | 7 (+127 -1) | 3 rounds |
| #201 | feat/sprint-143-calendar-fallback | 1 (+11 -1) | Not invoked |
| #202 | feat/sprint-143-e2e-pr-comment | 1 (+27) | Not invoked |
Core Changes
PR #198 — correctness Weight 30% → 40% Strengthening (Seed #1)
ALGORITHM_WEIGHTS: correctness 30→40%, efficiency 25→20%, bestPractice 15→10%SQL_WEIGHTS: correctness 30→40%, efficiency 20→15%, bestPractice 20→15%- Bulk synchronization of prompt body (SYSTEM_PROMPT/SQL_SYSTEM_PROMPT) + SSOT constants + test assertions
test_claude_client.pytotalScore=0 recalculation TC 3 expected values updated (68→70, 64→66, 62→65)
PR #199 — Token + Context Guard (Seeds #2, #3)
claude_client.CODE_LENGTH_THRESHOLD = 50000constant +analyze_code()code length pre-validationworker._on_message()optimizedCode fallback when problem context absent- Reuses Sprint 142 self-verification meta pattern — double guard strengthens correctness
- 3 token guard TCs + 3 context guard TCs added
PR #200 — Submission Entity Expansion + getProblemInfo (Seed #4 Option A)
submission.entity.ts:problem_title(VARCHAR 255 nullable),problem_description(TEXT nullable)migrations/20260508000000-AddProblemContextColumns.ts:ALTER TABLE ... ADD COLUMN IF NOT EXISTS(Expand-Contract compatible)ProblemServiceClient.getProblemInfo()new method — protected by host single CB,_dispatch/_fallbackextendedsubmission.service.create()callscheckLateSubmission+getProblemInfoin parallel withPromise.all- ai-analysis worker uses new fields automatically included in
/internal/{id}response entity (0 worker code changes)
PR #201 — Calendar useLocale Provider Defense (Seed #6)
- Wrapped
useLocale()call in try-catch so that throw outside NextIntlClientProvider (Storybook/tests) falls back to ko - ESLint
react-hooks/rules-of-hooksdisable + intent comment (hook order is invariant since it's environment-dependent within the same tree)
PR #202 — E2E Full Integration PR Auto Comment (Seed #8)
- Added 2
actions/github-scriptsteps toe2e-testjob - On start: "E2E Integration Test in progress" + workflow run link
- On failure: "E2E Integration Test failed" + artifact download guidance
- Added
pull-requests: writepermission
Critic Verification (PR #200 — 3 Rounds)
Round 1 — 1 P1 + 1 P2 Caught
- P1:
getProblemInfonot defined inmockProblemServiceClientfactory in 3 specs (saga-orchestrator.service.spec.ts,ai-satisfaction.spec.ts,submission.service.spec.ts) → allservice.create()call tests throwTypeError - P2:
checkLateSubmission+getProblemInfocalled serially → 2× timeout (10 seconds) in incident scenarios
Resolution: Add getProblemInfo: jest.fn().mockResolvedValue({title:'', description:''}) to all 3 spec mocks + parallelize with Promise.all. Safe to parallelize since both immediately fallback if same host single CB is OPEN.
Round 2 — 1 P2 Caught
- P2:
problemTitle: problemTitle || null→ SQL NULL → ai-analysis workersubmission.get()returns None → prompt builder serializes asf"Description: {problem_description}"→ "Description: None" string sent to LLM
Resolution: Normalize with ?? '' (preserve empty string). Entity keeps nullable (migration compatible), all new rows stored as string.
Round 3 — Clean Pass ✅
"The changes appear consistent with the existing submission flow: the new columns are added via migration, populated on create with safe fallbacks, and the ProblemService client extension matches the current internal Problem Service contract. I did not identify a discrete regression or blocking issue in the diff relative to the base branch."
Verification
- All CI GREEN — Quality (submission/frontend/ai-analysis) + Test (all services) + Coverage Gate + E2E Programmers Full Flow
- 0 jest regressions — all create() tests pass with updated submission service spec
- 0 pytest regressions — 6 new ai-analysis TCs (3 token + 3 context) + 3 weight assertion updates
New Policies / Patterns
1. Bulk Update Pattern for Dependent Tests on Score Distribution Changes
When rebalancing weights (seed #1), updating only direct weight assertions in test_prompt.py is insufficient. The totalScore=0 recalculation TCs in test_claude_client.py also depend on weights, so they must be updated together. Detected via follow-up commit in Sprint 143 PR #198 → recommend pre-grepping all weighted average TCs when SSOT weights change going forward.
2. Mock Missing Pattern on saga + DB Schema Changes
When adding a new method to ProblemServiceClient, the same mock must be added to mockProblemServiceClient factories in all 3 specs. Same as Sprint 141's pattern of bulk updating 4 monitoring files on schema change — grep all mock factories when adding client methods is mandatory.
3. External Call Serialization Latency Review Obligation
When adding a new external service call, check whether sequential await with existing calls to the same host doubles the timeout. CB OPEN causes both to immediately fallback, so Promise.all parallelization is safe. Critic R1 P2 pattern.
4. SQL NULL → Other Language Serialization Regression Pattern
TypeScript null → PostgreSQL NULL → Python None → f-string "None". Nullable fields crossing language boundaries can reach user-exposed paths like prompts/logs → empty string normalization recommended (entity stays nullable for migration compatibility).
5. React Hooks try-catch Pattern (Provider Absence Defense)
Hooks like useLocale() that throw when provider is absent conflict with ESLint react-hooks/rules-of-hooks. Since the throw/non-throw behavior is the same every render within the same component tree, hook order is invariant. Resolved with ESLint disable + intent comment.
Post-Sprint Retrospective
What Went Well
- User-decision-first flow: Seed #4 Option A/B decision delegated to user in advance → explicit decision point separated in plan → no hesitation during execution
- Critic multiple round effectiveness reconfirmed: R1 (mock + latency), R2 (null serialization), R3 (clean) in PR #200 — same as Sprint 142 5-round pattern
- PR split strategy reuse: Sprint 141's 7 PR pattern applied with 5 PRs — right size
- Branch discipline ✅: 9 consecutive sprints compliant (since Sprint 134 violation) — all 5 PRs use new branches + Squash merge
Room for Improvement
- PR #200 lint/typecheck pre-check absent: The 3 spec mock omissions couldn't be pre-verified due to local Python 3.9 environment limitations, but TypeScript could have been → auto-mock search grep when adding new client methods should be mandated (Sprint 144 seed)
- User direct UAT dependency: Seeds #5 (Programmers resubmission) and #9 (English calendar + Grafana) both require user direct visual verification. Automation to be reviewed in Sprint 144+.
Sprint 144 Carryover Seeds
Sprint 143 New (found during this work)
- New seed (P3): Add TypeScript client method mock factory auto-detection lint or CI grep
- New seed (P3): Auto-detect dependent TCs on score system change (SSOT weight single-source + recalculation TC auto-consistency)
Sprint 142~141 Remaining
- Seed #5 (UAT): Actual Programmers resubmission → scoring pass confirmation (user direct)
- Seed #7 (P2): prometheus-rules / dashboard automatic verification CI —
promtool check rules+ grafana JSON cross-check - Seed #9 (UAT): English environment calendar + production Grafana CB dashboard consistency (user direct)