itsulu-blog-publisher/PHASE3_ROADMAP.md
Nicholas Riegel d122b773d4 feat: integrate Runboat E2E testing and performance tests into CI/CD pipeline
Updated .gitlab-ci.yml with complete Phase 3 pipeline stages:

New Stages Added:
- preview: Runboat API call to create ephemeral preview instance
- e2e: Playwright E2E tests against Runboat preview
- performance: Server-side performance benchmarks (latency, queries, tokens)

Pipeline Changes:
- runboat_preview job: Requests preview build, extracts URL, posts MR comment
- e2e_tests job: Runs 19 Playwright scenarios against preview URL
- performance_tests job: Runs 7 performance benchmark tests locally
- All jobs include artifacts (HTML reports, traces) for debugging

Job Dependencies:
- e2e_tests needs runboat_preview (waits for preview URL)
- performance_tests runs in parallel with build stage
- All new jobs only on merge_requests (not main/daily)

New Required CI/CD Variables:
- RUNBOAT_API_URL: Runboat API endpoint (secret)
- RUNBOAT_TOKEN: Bearer token for Runboat (secret)
- GITLAB_BOT_TOKEN: GitLab bot token for MR comments (secret)

Updated PHASE3_ROADMAP.md with:
- Runboat setup instructions
- CI/CD variable requirements and how to obtain
- Complete YAML snippets (already in .gitlab-ci.yml)
- Pipeline flow diagram
- Estimated total pipeline time: ~35 minutes

Non-blocking failures:
- runboat_preview: allow_failure=true (Runboat might be unavailable)
- e2e_tests: allow_failure=true (E2E informational, doesn't block merge)
- performance_tests: allow_failure=false (must pass)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-30 00:54:59 -04:00


# Phase 3: Runboat E2E Testing and Performance Benchmarks
**Status**: In Progress
**Start**: 2026-05-30
**Target**: E2E coverage + performance SLOs met
## Goals
### 1. E2E Test Coverage (10-30 scenarios)
Critical user journeys verified via Playwright:
- [ ] User generates blog post on-demand
- [ ] User schedules daily blog generation
- [ ] User views generation logs and retries failed attempts
- [ ] User edits social media copy before publication
- [ ] User views published post with correct SEO fields
- [ ] User receives notification email with correct content
- [ ] System recovers gracefully from LLM API errors
- [ ] Multiple users generate posts concurrently (collision handling)
### 2. Performance Benchmarks
Establish baseline metrics for:
- **Generation Latency**: Time from wizard click to post created
  - Target: < 30 seconds (including LLM API call)
  - Measure: P50, P95, P99
- **Token Efficiency**: Tokens used per blog post
  - Target: 800-1200 tokens for ~800-word post
  - Baseline: Record for cost optimization
- **Database Query Count**: N+1 detection
  - Target: < 50 queries per generation
  - Tool: assertQueryCount() on hot paths
- **Throughput**: Concurrent generations
  - Target: 5+ simultaneous posts without degradation
  - Stress test: 10 parallel schedule slots
- **Memory Usage**: Peak RSS during generation
  - Target: < 500 MB per Odoo process
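The peak-RSS target could be checked from inside the process being measured. The sketch below is one possible approach using the standard-library `resource` module (Unix only); the helper names `peak_rss_mb` and `assert_memory_within_budget` are hypothetical and not part of the addon.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def assert_memory_within_budget(limit_mb: float = 500.0) -> float:
    """Fail if the process has ever exceeded `limit_mb`; return the peak."""
    used = peak_rss_mb()
    assert used < limit_mb, f"Peak RSS {used:.0f} MiB, target < {limit_mb:.0f} MiB"
    return used
```

Because `ru_maxrss` is a high-water mark for the whole process lifetime, this only gives a meaningful number when called in the Odoo worker that ran the generation, not in the Playwright test process.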
### 3. Load Testing
Simulate production scenarios:
- [ ] 100 pending topics in queue
- [ ] 3 active schedule slots all triggering within 5 minutes
- [ ] 5 concurrent users generating posts
- [ ] Template DB priming time baseline
## Implementation Plan
### Layer 1: Runboat Setup & E2E Infrastructure
```bash
# 1. Create e2e/ directory structure
e2e/
├── conftest.py             # Session/auth fixtures, Runboat polling
├── test_generation.py      # On-demand generation workflow
├── test_scheduling.py      # Schedule slot execution
├── test_notifications.py   # Email and social copy
├── test_error_recovery.py  # API errors and retries
└── requirements.txt        # pytest, playwright
# 2. Set up conftest.py with:
#    - wait_for_odoo(url) polling
#    - auth_state fixture (admin login)
#    - page fixture (authenticated Playwright context)
#    - BASE_URL from env var or CI
# 3. Create .gitlab-ci.yml runboat stage
#    (uses RUNBOAT_API_URL, the variable name defined under CI/CD Variables):
runboat_preview:
  stage: preview
  script: |
    curl -X POST "$RUNBOAT_API_URL/builds" \
      -H "Authorization: Bearer $RUNBOAT_TOKEN" \
      -d "{\"repo\":\"$CI_PROJECT_PATH\",\"sha\":\"$CI_COMMIT_SHA\"}"
```
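The `wait_for_odoo(url)` polling helper planned for `conftest.py` might look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: probing `/web/login` as the readiness signal, the default timeouts, and the injectable `probe`/`sleep` parameters (added so the loop can be unit-tested without a network) are all choices made here.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if `url` answers with HTTP 200 within 5 seconds."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for_odoo(
    base_url: str,
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_probe,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the instance responds; return the number of attempts made.

    Raises TimeoutError if the instance is still down after `timeout_s`.
    """
    # Assumption: the login page only renders once Odoo has finished starting.
    url = base_url.rstrip("/") + "/web/login"
    deadline = time.monotonic() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if probe(url):
            return attempts
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{url} not ready after {timeout_s}s")
        sleep(interval_s)
```

A session-scoped fixture could call `wait_for_odoo(os.environ["BASE_URL"])` once before any test runs, which also absorbs the Runboat cold start.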
### Layer 2: E2E Test Scenarios (10-20 tests)
**Generation Workflow** (3 tests):
```python
def test_user_generates_blog_post_on_demand(page):
    # Navigate to wizard
    # Fill topic, select provider, set auto-publish
    # Click Generate
    # Assert blog.post created with title + body
    # Assert email sent to configured recipient
    ...  # placeholder until the steps above are implemented


def test_user_saves_post_as_draft_for_review(page):
    # Same as above but auto_publish=False
    # Assert post is not published
    ...  # placeholder


def test_generation_fails_gracefully_with_api_error(page):
    # Trigger with invalid API key
    # Assert error message displayed
    # Assert "Retry" button visible on log
    ...  # placeholder
```
**Scheduling Workflow** (2 tests):
```python
def test_user_configures_daily_schedule_slot(page):
    # Navigate to schedule slots
    # Create morning, afternoon, evening slots
    # Set LLM provider and model
    # Toggle auto-publish per slot
    # Save and verify all 3 slots active
    ...  # placeholder until the steps above are implemented


def test_user_monitors_generation_logs(page):
    # View all generation logs
    # Filter by state (success/error)
    # Click retry on failed log
    # Verify retry increments attempt counter
    ...  # placeholder
```
**Email & Social** (2 tests):
```python
def test_email_contains_post_title_and_social_copy(page):
    # Generate and publish post
    # Check generated email in outbox
    # Verify subject contains blog name + post title
    # Verify body contains social platforms (X, BlueSky, Mastodon, LinkedIn)
    ...  # placeholder until the steps above are implemented


def test_user_edits_social_copy_before_publishing(page):
    # Generate as draft
    # Edit social media copy for each platform
    # Save and publish
    # Verify email uses edited copy
    ...  # placeholder
```
**Error Recovery** (2 tests):
```python
def test_user_retries_failed_generation(page):
    # Trigger generation with bad API key
    # Log shows error state
    # Fix API key in Settings
    # Click Retry on log
    # Verify post created successfully
    ...  # placeholder until the steps above are implemented


def test_schedule_slot_continues_after_api_error(page):
    # Set invalid API key on schedule slot
    # Slot executes, fails, logs error
    # Fix API key
    # Wait for next slot time
    # Verify next generation succeeds
    ...  # placeholder
```
**Concurrency** (1-2 tests):
```python
def test_multiple_users_generate_posts_concurrently(page):
    # User1 generates on-demand
    # User2 generates on-demand simultaneously
    # Both posts created successfully
    # No database locks or conflicts
    ...  # placeholder until the steps above are implemented
```
### Layer 3: Performance Benchmarks
**Latency Profiling**:
```python
import time


def test_generation_latency_p50_under_30s(page):
    """Measure time from "Generate Now" click to blog.post created."""
    start = time.perf_counter()  # monotonic, unaffected by system clock changes
    # ... navigate and generate ...
    elapsed = time.perf_counter() - start
    assert elapsed < 30, f"Generation took {elapsed:.1f}s, target <30s"
    # Record metric: elapsed_seconds_p50
    # (a single run is one sample; a real P50 needs several recorded runs)
```
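Since the goals ask for P50, P95 and P99, the per-run timings need to be aggregated somewhere. The sketch below shows one way to do that with the nearest-rank method; the function names and the choice of nearest-rank (rather than interpolation) are assumptions made for illustration.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize_latency(samples: Sequence[float]) -> dict[str, float]:
    """Collapse raw timings (seconds) into the three tracked percentiles."""
    return {
        name: percentile(samples, pct)
        for name, pct in (("p50", 50), ("p95", 95), ("p99", 99))
    }
```

With only a handful of runs, P95 and P99 both collapse onto the slowest sample, so the tail figures become informative only once many generations have been recorded.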
**Query Count Assertion**:
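```python
from odoo.tests import TransactionCase


class TestGenerationQueryCount(TransactionCase):
    """Server-side test (not E2E): verify no N+1 query patterns."""

    def test_generation_uses_fewer_than_50_queries(self):
        # self.schedule: a schedule record, assumed to be created in setUp()
        # assertQueryCount fails if the block runs more queries than allowed
        with self.assertQueryCount(50):
            self.schedule.run_generation()
```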
**Stress Test** (not Playwright, server-side):
```python
def test_concurrent_schedule_slots_under_load():
    """3 slots × 5 iterations = 15 posts in rapid succession."""
    # Trigger all 3 schedule slots
    # Measure: peak memory, query count, token usage
    # Assert: all posts created, no failures
    ...  # placeholder until the steps above are implemented
```
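The fan-out part of such a stress test could be driven by a small thread-pool harness like the one sketched below. `run_concurrently` is a hypothetical helper; in a real Odoo server-side test each job would also need its own cursor and environment, since cursors should not be shared across threads, and that part is omitted here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_concurrently(
    job: Callable[[int], object], n_jobs: int, max_workers: int = 5
) -> dict:
    """Run job(0) .. job(n_jobs - 1) on a thread pool.

    Returns successful results, raised exceptions, and total wall time,
    so the caller can assert "all posts created, no failures".
    """
    results, failures = {}, {}
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {i: pool.submit(job, i) for i in range(n_jobs)}
        for i, future in futures.items():
            try:
                results[i] = future.result()
            except Exception as exc:  # record the failure, keep collecting
                failures[i] = exc
    return {
        "results": results,
        "failures": failures,
        "elapsed_s": time.perf_counter() - started,
    }
```

For the 3 slots × 5 iterations case, `n_jobs=15` with `max_workers=3` would approximate three slots firing repeatedly, and the test would assert that `failures` is empty.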
## Runboat Integration
### What is Runboat?
Runboat (by Acsone) provides:
- **Auto-deployed preview instances** of Odoo per CI commit
- **Live URL** for E2E testing (no local bootstrapping needed)
- **Fresh template DB** with addon pre-installed
- **5-minute auto-cleanup** after test run
### CI/CD Variables Required
Add these to **GitLab Project Settings → CI/CD Variables**:
| Variable | Type | Purpose | Example |
|----------|------|---------|---------|
| `RUNBOAT_API_URL` | Secret | Runboat API endpoint | `https://api.runboat.dev` |
| `RUNBOAT_TOKEN` | Secret | Bearer token for Runboat API | `rbk_xxx...` |
| `GITLAB_BOT_TOKEN` | Secret | Personal/bot token for MR comments | `glpat-xxx...` |
**How to obtain**:
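1. **RUNBOAT_API_URL & RUNBOAT_TOKEN**: Request from Acsone/infrastructure team
2. **GITLAB_BOT_TOKEN**: Create via **GitLab → Settings → Access Tokens**
   - Scopes: `api`, `read_api`, `read_repository`
   - Save as a Masked CI/CD variable. Mark it Protected only if merge request pipelines run on protected branches, because protected variables are not exposed to pipelines on other branches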
### CI/CD Integration (Already Added)
`.gitlab-ci.yml` now includes:
**Stage: preview**
```yaml
runboat_preview:
  stage: preview
  # NOTE: the script calls jq, so it must be available in this image;
  # curlimages/curl may not bundle it.
  image: curlimages/curl:latest
  script:
    # Request preview build from Runboat
    # (block scalar keeps the multi-line command and the ": " in the header valid YAML)
    - |
      RESP=$(curl -fsSL -X POST "${RUNBOAT_API_URL}/builds" \
        -H "Authorization: Bearer ${RUNBOAT_TOKEN}" \
        -d "{\"repo\":\"${CI_PROJECT_PATH}\",\"sha\":\"${CI_COMMIT_SHA}\"}")
    - BUILD_URL=$(echo "$RESP" | jq -r '.url')
    - echo "BUILD_URL=$BUILD_URL" >> build.env
    # Post comment to MR
    - |
      curl -X POST "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes" \
        -H "PRIVATE-TOKEN: ${GITLAB_BOT_TOKEN}" \
        -d "body=🚀 [Preview](${BUILD_URL}/odoo) ready"
  artifacts:
    reports:
      dotenv: build.env
```
**Stage: e2e**
```yaml
e2e_tests:
  stage: e2e
  image: mcr.microsoft.com/playwright/python:latest
  needs: [runboat_preview]
  script:
    - pip install -r e2e/requirements.txt
    # --output points pytest-playwright artifacts at the path collected below
    - pytest e2e/ --base-url=$BUILD_URL -v --tracing=retain-on-failure --output=e2e/traces
  artifacts:
    when: always
    paths:
      - e2e/traces/
    expire_in: 1 week
```
**Stage: test (performance)**
```yaml
performance_tests:
  stage: test
  image: $ODOO_IMAGE
  script:
    # Block scalar so the backslash line continuation reaches the shell intact
    - |
      pytest addons/itsulu_blog_publisher/tests/test_performance.py \
        -m performance --odoo-database=$POSTGRES_DB
```
### Pipeline Flow
```
Merge Request
    ↓
[lint] black, pylint-odoo (2 min)
    ↓
[test] unit + BDD + performance (10 min)
    ↓
[build] Docker image → registry (3 min)
    ↓
[preview] Runboat deploy (5 min)
    ↓
[e2e] Playwright against preview (15 min)
    ↓
Results → MR comment with preview URL
```
**Total pipeline time**: ~35 minutes (the sum of the stage estimates above)
- Unit/BDD/Performance tests run in parallel with the Docker build, so wall-clock time can be a few minutes shorter
- E2E tests run after the preview is ready
## Success Criteria
**Phase 3 Complete when:**
- [ ] 10-20 E2E scenarios passing (Runboat)
- [ ] Performance baseline established (latency, tokens, queries)
- [ ] Concurrent generation verified (5+ simultaneous posts)
- [ ] All E2E tests green on merge requests
- [ ] Runboat integration in CI/CD
- [ ] Performance metrics documented in README
- [ ] E2E flakiness below 2% (fraction of runs that fail without a code change)
## Performance SLO Targets
| Metric | Target | Rationale |
|---|---|---|
| Generation latency (P50) | < 30 seconds | User experience (wizard response time) |
| Generation latency (P99) | < 60 seconds | Outlier tolerance |
| Tokens per post | 800-1200 | Cost baseline for budget planning |
| Queries per generation | < 50 | N+1 detection and DB load |
| Concurrent posts | 5+ | Peak capacity without degradation |
| Email send latency | < 5 seconds | Notification responsiveness |
| Template DB prime time | < 60 seconds | CI/CD pipeline efficiency |
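One way to make these targets enforceable is to encode the table as data and compare recorded measurements against it. The sketch below does that; the metric key names and the `slo_violations` helper are hypothetical, while the thresholds are copied from the table.

```python
from typing import Mapping

# Exclusive upper bounds from the SLO table; key names are illustrative.
UPPER_LIMITS = {
    "latency_p50_s": 30,
    "latency_p99_s": 60,
    "queries_per_generation": 50,
    "email_send_latency_s": 5,
    "template_db_prime_s": 60,
}
TOKEN_RANGE = (800, 1200)   # inclusive range for tokens per post
MIN_CONCURRENT_POSTS = 5    # "5+" in the table


def slo_violations(measured: Mapping[str, float]) -> list[str]:
    """Return one message per violated SLO.

    Metrics missing from `measured` are skipped rather than treated as failures.
    """
    problems = []
    for name, limit in UPPER_LIMITS.items():
        if name in measured and not measured[name] < limit:
            problems.append(f"{name}={measured[name]} (target < {limit})")
    tokens = measured.get("tokens_per_post")
    if tokens is not None and not TOKEN_RANGE[0] <= tokens <= TOKEN_RANGE[1]:
        problems.append(
            f"tokens_per_post={tokens} (target {TOKEN_RANGE[0]}-{TOKEN_RANGE[1]})"
        )
    concurrent = measured.get("concurrent_posts")
    if concurrent is not None and concurrent < MIN_CONCURRENT_POSTS:
        problems.append(
            f"concurrent_posts={concurrent} (target {MIN_CONCURRENT_POSTS}+)"
        )
    return problems
```

A performance test could end with `assert not slo_violations(metrics)`, so the failure message lists every missed target at once.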
## Implementation Timeline
| Week | Task | Owner |
|---|---|---|
| W1 | Set up e2e/ directory, conftest.py, Runboat polling | Claude |
| W1 | Implement 3-5 core E2E scenarios (generation, scheduling) | Claude |
| W2 | Add error recovery and email scenarios | Claude |
| W2 | Set up performance measurement (latency, queries) | Claude |
| W3 | Stress testing and concurrency verification | Claude |
| W3 | Performance tuning if SLOs not met | Claude |
| W4 | Runboat CI/CD integration | Claude |
| W4 | Final verification and documentation | Claude |
## Known Constraints
### Runboat Limitations
- **Cold start**: First request may take 30-60s (instance startup)
- **Auto-cleanup**: Instance removed 5 min after last request
- **No persistent storage**: Data lost when instance cleaned up
- **Resource limits**: CPU/memory capped per deployment tier
### E2E Test Maintenance
- **Brittle selectors**: Avoid `.o_field_value` (auto-generated)
- **Timing issues**: Use `page.wait_for_*()` not `time.sleep()`
- **Flakiness**: Run 3× locally before merging
- **Timeout**: Set 30s for slow JS rendering
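The timing and timeout advice above can be folded into a small helper, sketched below. It assumes `page` is a Playwright sync-API `Page`; the helper name and the selectors in the usage note are hypothetical, not taken from the addon's views.

```python
def click_and_wait(
    page, click_selector: str, ready_selector: str, timeout_ms: int = 30_000
) -> None:
    """Click an element, then block until `ready_selector` is visible.

    Playwright's wait_for_selector polls internally and raises on timeout,
    so no time.sleep() is needed and slow JS rendering gets up to 30s.
    """
    page.click(click_selector, timeout=timeout_ms)
    page.wait_for_selector(ready_selector, state="visible", timeout=timeout_ms)
```

A test might call `click_and_wait(page, "button[name='action_generate']", ".o_form_view")`, preferring stable `name` attributes over auto-generated classes.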
## References
- [Runboat Documentation](https://docs.acsone.eu/runboat/)
- [Playwright Python API](https://playwright.dev/python/)
- [Odoo E2E Best Practices](https://github.com/OCA/server-tools/tree/17.0#e2e-testing)
---
**Next**: Set up e2e/ directory and implement core scenarios