| Field | Value |
|---|---|
| RFC ID | 002 |
| Title | ROSIE Engine: Functional Requirements for GxP Sync and Validation Tools |
| Version | 1.1.0 |
| Status | Draft |
| Focus | Tool qualification, release gates, sync logic, SoR interface |
1. Scope
This RFC defines the functional requirements for the ROSIE Engine, the automated software orchestrator responsible for extracting RFC-001 data, maintaining Dual-Ledger synchronization with a System of Record (SoR), and enforcing regulatory gates within the CI/CD pipeline.
1.1 ROSIE Engine Responsibilities
The engine is responsible for:
- Parsing repositories to extract `@gxp-` annotations (RFC-001)
- Building and validating the trace graph
- Computing deterministic manifest hashes
- Communicating with the SoR via the API contract (RFC-004)
- Enforcing release gates in CI/CD pipelines
- Generating evidence packages (RFC-003)
1.2 System of Record Responsibilities (Out of Scope)
The SoR is a separate system that implements RFC-004. The SoR handles:
- User authentication and authorization
- Approval workflows (sequential, parallel, role-based)
- Electronic signature capture (21 CFR Part 11 compliant)
- Audit trail storage and retention
- Notification and escalation
ROSIE defines the interface to the SoR, not the SoR implementation itself.
1.3 Implementation Flexibility
This specification is platform-agnostic. Compliant engines may be implemented as CLI tools, CI/CD plugins (GitHub Actions, GitLab CI, Jenkins, Azure DevOps, etc.), IDE extensions, or standalone services.
Key management, secret storage, and cryptographic implementations are environment-specific and outside this specification.
2. Sync Protocol (Dual-Ledger)
The engine must maintain state consistency between the Git repository (design intent) and the SoR (regulatory approval).
2.1 Ingest Logic (repo → SoR)
- Extraction: Scan the repository at every PR or commit to build a directed acyclic graph (DAG) of requirements and annotations
- Diff detection: Identify new, modified, or deleted `@gxp-` blocks and push these proposed changes to the SoR
- Versioning: Associate every sync event with a unique Git commit SHA
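
The ingest steps above can be sketched as follows. This is a minimal illustration, not the engine's actual parser: the `@gxp-id: <ID>` comment syntax is a placeholder (the real annotation grammar is defined in RFC-001), each block is reduced to the single line that declares it, and the names `extract_blocks`, `diff_blocks`, and `build_sync_event` are hypothetical.

```python
import hashlib
import re

# Hypothetical annotation syntax; the real grammar is defined in RFC-001.
GXP_ID = re.compile(r"@gxp-id:\s*(\S+)")

def extract_blocks(files: dict[str, str]) -> dict[str, str]:
    """Map each annotated ID to a SHA-256 digest of the line that declares it."""
    blocks = {}
    for path in sorted(files):
        for line in files[path].splitlines():
            match = GXP_ID.search(line)
            if match:
                digest = hashlib.sha256(line.strip().encode("utf-8")).hexdigest()
                blocks[match.group(1)] = digest
    return blocks

def diff_blocks(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify IDs as new, modified, or deleted between two extractions."""
    shared = current.keys() & previous.keys()
    return {
        "new": sorted(current.keys() - previous.keys()),
        "modified": sorted(k for k in shared if current[k] != previous[k]),
        "deleted": sorted(previous.keys() - current.keys()),
    }

def build_sync_event(commit_sha: str, changes: dict[str, list[str]]) -> dict:
    """Tie the proposed changes to the commit they were extracted from."""
    return {"commit_sha": commit_sha, "changes": changes}
```

A real engine would hash the whole annotated block and its linked requirement rather than one line, but the new/modified/deleted classification would work the same way.
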
2.2 Reflect Logic (SoR → repo)
- Metadata injection: Pull approval statuses, timestamps, and e-signatures into the YAML front matter of corresponding `.md` files
- Non-destructive write: Update only designated metadata fields, never altering requirement narrative text or implementation code
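
One way to honor the non-destructive write rule is to edit the front matter line by line against an allow-list and pass the body through untouched. The sketch below assumes flat `key: value` front matter delimited by `---` lines; the field names in `WRITABLE_FIELDS` and the function name `reflect_metadata` are illustrative, not taken from RFC-001.

```python
# Hypothetical allow-list of metadata fields the engine may write.
WRITABLE_FIELDS = {"status", "approved_at", "signature"}

def reflect_metadata(document: str, updates: dict[str, str]) -> str:
    """Update allow-listed front matter keys; leave the body untouched."""
    rejected = set(updates) - WRITABLE_FIELDS
    if rejected:
        raise ValueError(f"fields not writable by the engine: {sorted(rejected)}")
    if not document.startswith("---\n"):
        raise ValueError("document has no YAML front matter")
    front, sep, body = document[4:].partition("\n---\n")
    if not sep:
        raise ValueError("front matter is not terminated")
    pending = dict(updates)
    lines = []
    for line in front.split("\n"):
        key = line.split(":", 1)[0].strip()
        # Only top-level keys are rewritten; indented lines are left alone.
        if key in pending and not line.startswith((" ", "\t")):
            lines.append(f"{key}: {pending.pop(key)}")
        else:
            lines.append(line)
    # Keys not yet present are appended in a stable order.
    lines.extend(f"{key}: {value}" for key, value in sorted(pending.items()))
    return "---\n" + "\n".join(lines) + "\n---\n" + body
```

Because the body is never parsed or re-serialized, the requirement narrative comes back byte for byte.
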
2.3 Audit Trail Requirements
Every transaction performed by the engine must be logged in an immutable audit trail within the SoR, containing:
| Field | Description |
|---|---|
| `user_id` | User or `agent_id` for automated tasks |
| `timestamp` | UTC timestamp |
| `action_type` | e.g., `INGEST`, `REFLECT`, `HASH_VERIFY` |
| `payload_hash` | SHA-256 of the exchanged data |
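
As an illustration of the record shape, the sketch below builds one audit entry with the four fields from the table. It assumes the exchanged data is JSON-serializable and hashes a canonical encoding (sorted keys, compact separators), which is a choice of this sketch rather than something the RFC specifies. The function name `build_audit_record` is hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

def build_audit_record(actor_id: str, action_type: str, payload: dict) -> dict:
    """Build one audit entry; the SoR is responsible for storing it immutably."""
    # Canonical JSON so that equal payloads always produce the same hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return {
        "user_id": actor_id,  # a user ID, or an agent_id for automated tasks
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action_type": action_type,  # e.g. INGEST, REFLECT, HASH_VERIFY
        "payload_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }
```
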
3. Data Integrity (the Integrity Guard)
3.1 The Manifest Hash
The engine must compute a cumulative SHA-256 hash of the entire validation state.
- Input: Concatenation of all `@gxp-id`, `@gxp-traces`, and the text content of linked requirements
- Storage: The manifest hash is stored in the SoR as the canonical fingerprint of the validated version
- Validation: Any change to code or specs during the release cycle results in a hash mismatch, automatically revoking Approved status
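
A minimal sketch of a deterministic manifest hash follows. The RFC names the inputs but not their ordering or delimiters, so this sketch assumes entries sorted by ID, traces sorted within each entry, and length-prefixed fields so that adjacent values cannot run together. The entry keys (`id`, `traces`, `text`) and the function names are illustrative.

```python
import hashlib

def _feed(digest, value: str) -> None:
    """Hash a length-prefixed field so ("ab", "c") differs from ("a", "bc")."""
    data = value.encode("utf-8")
    digest.update(len(data).to_bytes(8, "big"))
    digest.update(data)

def manifest_hash(entries: list[dict]) -> str:
    """Compute one SHA-256 over all IDs, traces, and requirement text."""
    digest = hashlib.sha256()
    for entry in sorted(entries, key=lambda e: e["id"]):
        _feed(digest, entry["id"])
        traces = sorted(entry["traces"])
        # The trace count is hashed too, so traces cannot be confused with text.
        digest.update(len(traces).to_bytes(4, "big"))
        for trace in traces:
            _feed(digest, trace)
        _feed(digest, entry["text"])
    return digest.hexdigest()
```

Sorting makes the result independent of scan order, while any edit to an ID, a trace, or requirement text changes the digest and would surface as the hash mismatch described above.
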
3.2 Pre-commit Enforcement
To prevent shadow documentation (manual edits to approved metadata), the engine provides a client-side hook:
- Mechanism: Compare local metadata blocks against the last known sync state from the SoR
- Enforcement: Reject commits if `status: Approved` is found in a modified block without a valid SoR-signed sync token
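
The enforcement rule can be sketched as a check over the blocks a commit modifies. The RFC does not define the sync token format, so this sketch substitutes an HMAC-SHA256 over the block ID and a hash of the block text as a stand-in for the SoR's signature; a real deployment would more likely verify an asymmetric signature so that the client never holds a signing key. The block fields (`id`, `text`, `token`) are hypothetical.

```python
import hashlib
import hmac

def sync_token(key: bytes, block_id: str, block_text: str) -> str:
    """Stand-in for the token the SoR would issue for an approved block."""
    text_hash = hashlib.sha256(block_text.encode("utf-8")).hexdigest()
    message = f"{block_id}\n{text_hash}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def check_commit(key: bytes, modified_blocks: list[dict]) -> list[str]:
    """Return IDs of modified blocks that claim Approved without a valid token."""
    violations = []
    for block in modified_blocks:
        if "status: Approved" not in block["text"]:
            continue
        expected = sync_token(key, block["id"], block["text"])
        supplied = block.get("token", "")
        if not hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8")):
            violations.append(block["id"])
    return violations
```

A pre-commit hook built on this would exit non-zero whenever `check_commit` returns a non-empty list.
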
4. AI Agent Protocol
The engine uses an AI subsystem to enhance auditability and accelerate legacy retrofitting.
4.1 Semantic Consistency Check
- Logic: Compute vector embeddings for requirement text and the implemented code block
- Alerting: If cosine similarity falls below a configurable threshold (default: 0.75), flag the trace for manual review
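
The threshold check itself is small. The sketch below computes cosine similarity, cos(a, b) = (a · b) / (‖a‖ ‖b‖), in plain Python and flags a trace when the score is below the threshold. How the embeddings are produced is out of scope here, and the function names are illustrative.

```python
import math

DEFAULT_THRESHOLD = 0.75  # default from section 4.1; configurable

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector carries no direction; treat it as dissimilar.
    return dot / norm if norm else 0.0

def needs_manual_review(requirement_vec: list[float], code_vec: list[float],
                        threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Flag the trace when similarity falls below the threshold."""
    return cosine_similarity(requirement_vec, code_vec) < threshold
```

For example, the vectors `[1.0, 0.0]` and `[1.0, 1.0]` have a similarity of about 0.707, which falls below the default and would be flagged.
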
4.2 Shadow Mode Retrofitting
- Restricted access: AI is forbidden from writing directly to `.md` or source files during initial discovery
- Proposal file: AI generates `gxp-proposal.json` containing recommended tags and links
- Acceptance: A human developer must apply the proposal, at which point the engine performs the injection and logs the action as AI-assisted traceability
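
The shadow-mode flow can be sketched in two steps: the AI side only serializes suggestions, and injection records are produced only when a named human applies them. This RFC does not define the schema of `gxp-proposal.json`, so the `proposals`, `file`, and `tag` keys below are placeholders, as are the function names.

```python
import json

def build_proposal(suggestions: list[dict]) -> str:
    """Serialize AI suggestions; nothing is written to .md or source files here."""
    return json.dumps({"proposals": suggestions}, indent=2, sort_keys=True)

def apply_proposal(proposal_json: str, applied_by: str) -> list[dict]:
    """Turn an accepted proposal into injection records attributed to a human."""
    if not applied_by:
        raise PermissionError("a human developer must apply the proposal")
    proposal = json.loads(proposal_json)
    return [
        {
            "file": item["file"],
            "tag": item["tag"],
            "applied_by": applied_by,
            "log_action": "AI-assisted traceability",
            "provenance": "ai-inferred",
        }
        for item in proposal["proposals"]
    ]
```

Keeping generation and application in separate functions mirrors the restriction above: the AI-facing step has no write path to the repository at all.
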
5. Automated Release Gates (Hard-Gate)
The engine acts as a blocking gate in the CI/CD pipeline (e.g., GitHub Actions, Jenkins).
5.1 Gate Logic (RRT Protocol)
The engine evaluates the following expression before allowing a build to proceed to a regulated environment:
G = (S_all AND H_match AND T_pass)
Where:
| Variable | Description |
|---|---|
| `S_all` | All identified IDs in the PR have Approved status in the SoR |
| `H_match` | The current manifest hash matches the SoR-signed hash |
| `T_pass` | All verification nodes (OQ/PQ) associated with the change have passed execution |
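
The gate expression translates directly into code. The sketch below evaluates each term separately so that a failed gate can report which condition blocked the release. The `GateInputs` container and its field names are assumptions of this sketch; in a real engine the statuses and the signed hash would come from the SoR via RFC-004.

```python
import hmac
from dataclasses import dataclass

@dataclass(frozen=True)
class GateInputs:
    statuses: dict       # ID -> status reported by the SoR
    local_hash: str      # manifest hash computed from the repository
    sor_signed_hash: str # manifest hash recorded in the SoR
    test_results: dict   # verification node (OQ/PQ) -> passed (True/False)

def evaluate_gate(inputs: GateInputs) -> dict:
    """Evaluate G = S_all AND H_match AND T_pass and report each term."""
    # Note: all() is vacuously true for empty inputs; a fail-closed engine
    # may prefer to reject a PR with no IDs or no verification nodes.
    s_all = all(status == "Approved" for status in inputs.statuses.values())
    h_match = hmac.compare_digest(inputs.local_hash, inputs.sor_signed_hash)
    t_pass = all(inputs.test_results.values())
    return {"S_all": s_all, "H_match": h_match, "T_pass": t_pass,
            "G": s_all and h_match and t_pass}
```
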
5.2 Release Readiness Token (RRT)
If G is true, the engine generates a short-lived, cryptographically signed RRT. Deployment scripts must verify this token before deploying artifacts.
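
To make "short-lived, cryptographically signed" concrete, the sketch below issues and verifies a token with an expiry claim. The RFC leaves the token format and signature scheme open, so everything here is an assumption: the `body.signature` layout, the claim names, the 300-second default lifetime, and the use of HMAC-SHA256 from the standard library in place of whatever scheme a deployment actually selects.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

def issue_rrt(key: bytes, commit_sha: str, manifest: str,
              ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Issue a signed token binding a commit and manifest hash to an expiry."""
    issued = time.time() if now is None else now
    claims = {"sha": commit_sha, "manifest": manifest, "exp": issued + ttl_seconds}
    raw = json.dumps(claims, sort_keys=True).encode("utf-8")
    body = base64.urlsafe_b64encode(raw).decode("ascii")
    signature = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"

def verify_rrt(key: bytes, token: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and unexpired, else None."""
    body, sep, signature = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # The signature is checked before the body is decoded or trusted.
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return claims if current < claims["exp"] else None
```

A deployment script would call `verify_rrt` and also compare the `sha` and `manifest` claims against the artifact it is about to deploy.
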
6. Tool Qualification (TQ) and GAMP 5 Compliance
To be fit for purpose in a GxP environment, the ROSIE Engine must satisfy qualification requirements.
6.1 Operational Qualification (OQ) for the Engine
- Extraction accuracy: Test against a golden repository to ensure 100% accuracy in tag extraction across all supported languages
- Failure modes: Verify that the engine correctly fails the Hard-Gate when signatures are missing or hashes are tampered with
6.2 AI Reliability (the Non-Deterministic Guard)
- Source tagging: Any trace or metadata generated by AI must be explicitly tagged with `provenance: ai-inferred`
- Human-in-the-loop (HITL): AI-inferred traces cannot satisfy `S_all` until promoted to `provenance: human-verified` by a qualified user
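
The two rules above can be sketched as a promotion step and a predicate used when computing `S_all`. The trace fields (`status`, `provenance`, `verified_by`) and the notion of a `qualified_users` set are illustrative; in practice the SoR, which owns authorization, would decide who is qualified.

```python
def promote(trace: dict, verified_by: str, qualified_users: set) -> dict:
    """Return a copy of the trace promoted to human-verified provenance."""
    if verified_by not in qualified_users:
        raise PermissionError(f"{verified_by} is not qualified to verify traces")
    return {**trace, "provenance": "human-verified", "verified_by": verified_by}

def counts_toward_s_all(trace: dict) -> bool:
    """An Approved trace counts only if it is not still AI-inferred."""
    return trace.get("status") == "Approved" and trace.get("provenance") != "ai-inferred"
```
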
Appendix: Sequence Diagram (Hard-Gate Handshake)
┌─────────────┐     ┌──────────────┐     ┌─────────────────┐   ┌─────────────────────┐
│    CI/CD    │     │ ROSIE Engine │     │    System of    │   │      Regulated      │
│  Pipeline   │     │              │     │     Record      │   │     Environment     │
└──────┬──────┘     └──────┬───────┘     └────────┬────────┘   └──────────┬──────────┘
       │                   │                      │                       │
       │ Request Release   │                      │                       │
       │ (SHA, Hash)       │                      │                       │
       │──────────────────>│                      │                       │
       │                   │                      │                       │
       │                   │ Validate Approvals   │                       │
       │                   │ and Signatures       │                       │
       │                   │─────────────────────>│                       │
       │                   │                      │                       │
       │                   │ Validation Result    │                       │
       │                   │<─────────────────────│                       │
       │                   │                      │                       │
       │            ┌──────┴──────┐               │                       │
       │            │ Validation  │               │                       │
       │            │  Success?   │               │                       │
       │            └──────┬──────┘               │                       │
       │                   │                      │                       │
       │    ┌──────────────┼──────────────┐       │                       │
       │    │ YES          │ NO           │       │                       │
       │    │              │              │       │                       │
       │    │ Generate     │ Error:       │       │                       │
       │    │ Signed RRT   │ Missing      │       │                       │
       │    │              │ Signatures   │       │                       │
       │    │              │ or Hash      │       │                       │
       │    │              │ Drift        │       │                       │
       │<───┴──────────────┴──────────────┘       │                       │
       │                                          │                       │
       │ Deploy Artifact + RRT                    │                       │
       │─────────────────────────────────────────────────────────────────>│
       │                                          │                       │
       │                                          │          Verify RRT   │
       │                                          │          Authenticity │
       │                                          │                       │
Related RFCs
- RFC-001: Data Standard — Data structures processed by the engine
- RFC-003: Evidence Standard — Evidence capture triggered by the engine
- RFC-004: API Interface — API endpoints called by the engine
- RFC-005: TQ Baseline — Qualification requirements for the engine