
Field	Value
RFC ID	002
Title	ROSIE Engine: Functional Requirements for GxP Sync and Validation Tools
Version	1.1.0
Status	Draft
Focus	Tool qualification, release gates, sync logic, SoR interface

1. Scope

This RFC defines the functional requirements for the ROSIE Engine, the automated software orchestrator responsible for extracting RFC-001 data, maintaining Dual-Ledger synchronization with a System of Record (SoR), and enforcing regulatory gates within the CI/CD pipeline.

1.1 ROSIE Engine Responsibilities

The engine is responsible for:

Parsing repositories to extract @gxp- annotations (RFC-001)
Building and validating the trace graph
Computing deterministic manifest hashes
Communicating with the SoR via the API contract (RFC-004)
Enforcing release gates in CI/CD pipelines
Generating evidence packages (RFC-003)

1.2 System of Record Responsibilities (Out of Scope)

The SoR is a separate system that implements RFC-004. The SoR handles:

User authentication and authorization
Approval workflows (sequential, parallel, role-based)
Electronic signature capture (21 CFR Part 11 compliant)
Audit trail storage and retention
Notification and escalation

ROSIE defines the interface to the SoR, not the SoR implementation itself.

1.3 Implementation Flexibility

This specification is platform-agnostic. Compliant engines may be implemented as CLI tools, CI/CD plugins (GitHub Actions, GitLab CI, Jenkins, Azure DevOps, etc.), IDE extensions, or standalone services.

Key management, secret storage, and cryptographic implementations are environment-specific and outside this specification.

2. Sync Protocol (Dual-Ledger)

The engine must maintain state consistency between the Git repository (design intent) and the SoR (regulatory approval).

2.1 Ingest Logic (repo → SoR)

Extraction: On every pull request or commit, scan the repository and build a directed acyclic graph (DAG) of requirements and annotations
Diff detection: Identify new, modified, or deleted @gxp- blocks and push the proposed changes to the SoR
Versioning: Associate every sync event with the specific Git commit SHA it was generated from
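
The extraction and diff-detection steps above can be sketched as follows. This is a minimal illustration, not the normative algorithm: the annotation shape (`@gxp-<field>: <value>` inside a comment) is an assumption, since RFC-001 defines the actual grammar, and `extract_blocks` and `diff_blocks` are hypothetical names.

```python
import re

# Assumed annotation shape: "@gxp-<field>: <value>" somewhere in a line.
# RFC-001 defines the real grammar; this pattern is illustrative only.
ANNOTATION = re.compile(r"@gxp-([a-z-]+):\s*(.+?)\s*$")


def extract_blocks(source: str) -> dict[str, dict[str, str]]:
    """Group @gxp- annotations into blocks keyed by their @gxp-id value."""
    blocks: dict[str, dict[str, str]] = {}
    current = None
    for line in source.splitlines():
        match = ANNOTATION.search(line)
        if not match:
            continue
        field, value = match.groups()
        if field == "id":
            current = blocks.setdefault(value, {})
        elif current is not None:
            current[field] = value
    return blocks


def diff_blocks(old: dict, new: dict) -> dict[str, list[str]]:
    """Classify block IDs as new, modified, or deleted between two scans."""
    return {
        "new": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


before = extract_blocks("# @gxp-id: REQ-001\n# @gxp-traces: URS-010\n# @gxp-id: REQ-002\n")
after = extract_blocks("# @gxp-id: REQ-001\n# @gxp-traces: URS-011\n# @gxp-id: REQ-003\n")
changes = diff_blocks(before, after)
```

In this example `changes` reports REQ-003 as new, REQ-002 as deleted, and REQ-001 as modified. A real engine would attach the triggering commit SHA to the change set before pushing it to the SoR.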

2.2 Reflect Logic (SoR → repo)

Metadata injection: Pull approval statuses, timestamps, and e-signatures into the YAML front matter of corresponding .md files
Non-destructive write: Update only designated metadata fields, never altering requirement narrative text or implementation code
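
A non-destructive write could be approached as in the sketch below, which rewrites only an allow-list of front-matter keys and returns the rest of the file byte for byte. The field names in `WRITABLE_FIELDS` and the function name `reflect_metadata` are assumptions for illustration, and the sketch handles only flat `key: value` front matter rather than full YAML.

```python
# Fields the engine may overwrite; these names are illustrative assumptions.
WRITABLE_FIELDS = {"status", "approved_at", "signature_id"}


def reflect_metadata(document: str, updates: dict[str, str]) -> str:
    """Rewrite only allow-listed front-matter fields; the body is left untouched."""
    blocked = set(updates) - WRITABLE_FIELDS
    if blocked:
        raise ValueError(f"refusing to write non-metadata fields: {sorted(blocked)}")

    head, sep, rest = document.partition("\n---\n")
    if not head.startswith("---") or not sep:
        raise ValueError("document has no YAML front matter")

    lines = head.split("\n")[1:]  # drop the opening '---'
    seen = set()
    for i, line in enumerate(lines):
        key = line.split(":", 1)[0].strip()
        # Only top-level keys are considered; indented (nested) lines are skipped.
        if key in updates and not line.startswith((" ", "\t")):
            lines[i] = f"{key}: {updates[key]}"
            seen.add(key)
    # Append allow-listed fields that were not present yet.
    lines += [f"{k}: {v}" for k, v in updates.items() if k not in seen]
    return "\n".join(["---", *lines]) + sep + rest


doc = "---\ngxp-id: REQ-001\nstatus: Draft\n---\nThe system shall log every login.\n"
updated = reflect_metadata(doc, {"status": "Approved", "signature_id": "sig-42"})
```

Rejecting unknown keys up front, rather than silently ignoring them, makes an attempted write to narrative fields visible as an error.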

2.3 Audit Trail Requirements

Every transaction performed by the engine must be logged in an immutable audit trail within the SoR, containing:

Field	Description
`user_id`	ID of the acting user, or the `agent_id` for automated tasks
`timestamp`	UTC timestamp
`action_type`	e.g., `INGEST`, `REFLECT`, `HASH_VERIFY`
`payload_hash`	SHA-256 of the exchanged data
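
As an illustration of the record shape in the table above, the following sketch builds one audit entry. The ISO 8601 timestamp format, the canonical-JSON serialization of the payload, and the names `AuditRecord` and `make_audit_record` are assumptions; the frozen dataclass only prevents in-process mutation, while immutability of the trail itself remains the SoR's responsibility.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    user_id: str        # user ID, or an agent_id for automated tasks
    timestamp: str      # UTC, ISO 8601 (format is an assumption)
    action_type: str    # e.g. INGEST, REFLECT, HASH_VERIFY
    payload_hash: str   # SHA-256 hex digest of the exchanged data


def make_audit_record(user_id: str, action_type: str, payload: dict) -> AuditRecord:
    # Canonical JSON (sorted keys, fixed separators) so equal payloads hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return AuditRecord(
        user_id=user_id,
        timestamp=datetime.now(timezone.utc).isoformat(),
        action_type=action_type,
        payload_hash=hashlib.sha256(canonical).hexdigest(),
    )


record = make_audit_record("agent:rosie-ci", "INGEST", {"commit": "abc123", "changed": ["REQ-001"]})
```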

3. Data Integrity (the Integrity Guard)

3.1 The Manifest Hash

The engine must compute a cumulative SHA-256 hash of the entire validation state.

Input: Concatenation, in a deterministic order, of all @gxp-id values, all @gxp-traces values, and the text content of linked requirements
Storage: The manifest hash is stored in the SoR as the canonical fingerprint of the validated version
Validation: Any change to code or specs during the release cycle results in a hash mismatch, automatically revoking Approved status
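
One way to make the manifest hash deterministic is sketched below. The entry layout (`id`, `traces`, `text`), the sort order, and the length-prefixed encoding are assumptions chosen for the example; this specification only requires a SHA-256 over the concatenated inputs.

```python
import hashlib


def manifest_hash(entries: list[dict]) -> str:
    """SHA-256 over all entries, sorted by ID so scan order does not matter."""
    digest = hashlib.sha256()
    for entry in sorted(entries, key=lambda e: e["id"]):
        fields = [entry["id"], ",".join(sorted(entry["traces"])), entry["text"]]
        for field in fields:
            data = field.encode("utf-8")
            # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


entries = [
    {"id": "REQ-002", "traces": ["URS-011"], "text": "Passwords shall expire."},
    {"id": "REQ-001", "traces": ["URS-010"], "text": "Logins shall be logged."},
]
baseline = manifest_hash(entries)
```

Reordering `entries` leaves `baseline` unchanged, while editing any requirement text, ID, or trace produces a different digest, which is the mismatch that revokes Approved status.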

3.2 Pre-commit Enforcement

To prevent shadow documentation (manual edits to approved metadata), the engine provides a client-side hook:

Mechanism: Compare local metadata blocks against the last known sync state from the SoR
Enforcement: Reject commits if status: Approved is found in a modified block without a valid SoR-signed sync token
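
The hook's decision logic might look like the sketch below. The token scheme is an assumption: HMAC-SHA256 stands in for whatever signature the SoR actually issues, and a real deployment would more likely use an asymmetric signature so that the client never holds the signing key. `sign_block` and `commit_allowed` are hypothetical names.

```python
import hashlib
import hmac


def sign_block(block: str, key: bytes) -> str:
    """Stand-in for the SoR's signature over a metadata block (HMAC-SHA256 here)."""
    return hmac.new(key, block.encode("utf-8"), hashlib.sha256).hexdigest()


def commit_allowed(block: str, last_synced_block: str, sync_token, key: bytes) -> bool:
    """Return False if a modified block claims Approved without a valid sync token."""
    if block == last_synced_block:
        return True   # unchanged since the last sync
    if "status: Approved" not in block:
        return True   # modified, but not claiming approval
    if sync_token is None:
        return False  # approval claimed with no token at all
    expected = sign_block(block, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sync_token.encode("utf-8"), expected.encode("utf-8"))


demo_key = b"demo-key-not-for-production"
synced = "gxp-id: REQ-001\nstatus: Draft"
edited = "gxp-id: REQ-001\nstatus: Approved"
```

With these values, `commit_allowed(edited, synced, None, demo_key)` rejects the commit, while passing `sign_block(edited, demo_key)` as the token accepts it.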

4. AI Agent Protocol

The engine uses an AI subsystem to enhance auditability and accelerate legacy retrofitting.

4.1 Semantic Consistency Check

Logic: Compute vector embeddings for the requirement text and for the code block that implements it
Alerting: If cosine similarity falls below a configurable threshold (default: 0.75), flag the trace for manual review
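
The threshold check can be illustrated with plain Python, assuming the embeddings have already been computed by some model (the choice of model is not specified here). `needs_manual_review` is a hypothetical name, and treating an all-zero vector as zero similarity is an assumption.

```python
import math

DEFAULT_THRESHOLD = 0.75  # default from section 4.1


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero embedding counts as "no similarity"
    return dot / (norm_a * norm_b)


def needs_manual_review(req_vec, code_vec, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Flag the trace when similarity falls below the configured threshold."""
    return cosine_similarity(req_vec, code_vec) < threshold
```

For example, orthogonal vectors such as `[1, 0]` and `[0, 1]` have similarity 0 and are flagged, whereas parallel vectors such as `[1, 2]` and `[2, 4]` are not.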

4.2 Shadow Mode Retrofitting

Restricted access: AI is forbidden from writing directly to .md or source files during initial discovery
Proposal file: AI generates gxp-proposal.json containing recommended tags and links
Acceptance: A human developer must apply the proposal, at which point the engine performs the injection and logs the action as AI-assisted traceability
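
The proposal-then-acceptance flow could be modeled as below. The JSON layout of gxp-proposal.json, the `AI_ASSISTED_TRACEABILITY` label, and both function names are assumptions made for the example; the only behavior taken from this section is that the AI step produces a proposal and that injection happens only after a human accepts it.

```python
import json


def build_proposal(suggestions: list[dict]) -> str:
    """Serialize AI suggestions; nothing is written to .md or source files here."""
    return json.dumps(
        {"provenance": "ai-inferred", "suggestions": suggestions},
        indent=2,
        sort_keys=True,
    )


def accept_proposal(proposal_json: str, accepted_by: str):
    """Return the tags to inject plus a log entry; requires a named human reviewer."""
    if not accepted_by:
        raise PermissionError("a human developer must accept the proposal")
    proposal = json.loads(proposal_json)
    log_entry = {
        "action_type": "AI_ASSISTED_TRACEABILITY",  # hypothetical action label
        "user_id": accepted_by,
        "count": len(proposal["suggestions"]),
    }
    return proposal["suggestions"], log_entry


proposal_text = build_proposal([{"file": "auth.py", "tag": "@gxp-id: REQ-001"}])
tags, entry = accept_proposal(proposal_text, accepted_by="jdoe")
```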

5. Automated Release Gates (Hard-Gate)

The engine acts as a blocking gate in the CI/CD pipeline (e.g., GitHub Actions, Jenkins).

5.1 Gate Logic (RRT Protocol)

The engine evaluates the following expression before allowing a build to proceed to a regulated environment:

G = (S_all AND H_match AND T_pass)

Where:

Variable	Description
`S_all`	All @gxp-id entries identified in the PR have Approved status in the SoR
`H_match`	The current manifest hash matches the SoR-signed hash
`T_pass`	All verification nodes (OQ/PQ) associated with the change have passed execution
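
The gate expression translates directly into code. In the sketch below, the `GateInputs` container and its field names are assumptions; only the three conditions and their conjunction come from this section.

```python
from dataclasses import dataclass


@dataclass
class GateInputs:
    statuses: dict       # @gxp-id -> status reported by the SoR
    current_hash: str    # manifest hash computed from the repository
    signed_hash: str     # manifest hash signed by the SoR
    test_results: dict   # verification node (OQ/PQ) -> True if it passed


def evaluate_gate(inputs: GateInputs) -> dict:
    """Evaluate G = S_all AND H_match AND T_pass and report each term."""
    s_all = all(status == "Approved" for status in inputs.statuses.values())
    h_match = inputs.current_hash == inputs.signed_hash
    t_pass = all(inputs.test_results.values())
    return {"S_all": s_all, "H_match": h_match, "T_pass": t_pass,
            "G": s_all and h_match and t_pass}


result = evaluate_gate(GateInputs(
    statuses={"REQ-001": "Approved", "REQ-002": "In Review"},
    current_hash="abc",
    signed_hash="abc",
    test_results={"OQ-001": True},
))
```

Here `result["G"]` is false because REQ-002 is not yet Approved. Note that `all()` is vacuously true for empty inputs, so an engine that should fail closed when no IDs or tests are found would need an explicit check.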

5.2 Release Readiness Token (RRT)

If G is true, the engine generates a short-lived, cryptographically signed RRT. Deployment scripts must verify this token before deploying artifacts.
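
A short-lived signed token can be sketched as follows. Everything about the format is an assumption: the payload fields, the 300-second default lifetime, the `body.signature` layout, and the use of HMAC-SHA256 in place of whatever signature scheme and key management an actual deployment chooses.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_rrt(commit_sha: str, manifest_hash: str, key: bytes, ttl_seconds: int = 300, now=None) -> str:
    """Create a signed token that expires ttl_seconds after issue."""
    issued = int(time.time() if now is None else now)
    payload = json.dumps(
        {"sha": commit_sha, "hash": manifest_hash, "exp": issued + ttl_seconds},
        sort_keys=True,
    ).encode("utf-8")
    body = base64.urlsafe_b64encode(payload).decode("ascii")
    signature = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_rrt(token: str, key: bytes, now=None):
    """Return the payload if the signature is valid and the token is unexpired, else None."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    return payload if current < payload["exp"] else None


rrt_key = b"demo-key-not-for-production"
token = issue_rrt("abc123", "deadbeef", rrt_key, ttl_seconds=60, now=1000)
```

A deployment script would call `verify_rrt` and refuse to deploy on `None`. The signature is checked before the payload is decoded, so a tampered token is never parsed.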

6. Tool Qualification (TQ) and GAMP 5 Compliance

To be fit for purpose in a GxP environment, the ROSIE Engine must satisfy qualification requirements.

6.1 Operational Qualification (OQ) for the Engine

Extraction accuracy: Test against a golden repository to ensure 100% accuracy in tag extraction across all supported languages
Failure modes: Verify that the engine correctly fails the Hard-Gate when signatures are missing or hashes are tampered with

6.2 AI Reliability (the Non-Deterministic Guard)

Source tagging: Any trace or metadata generated by AI must be explicitly tagged with provenance: ai-inferred
Human-in-the-loop (HITL): AI-inferred traces cannot satisfy S_all until promoted to provenance: human-verified by a qualified user
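
The provenance rule can be expressed as a filter applied before S_all is evaluated. The trace dictionary layout, the `verified_by` field, and the function names are assumptions; the two provenance values are the ones named above. The sketch also assumes that a trace with no provenance tag was authored by a human.

```python
def counts_toward_s_all(trace: dict) -> bool:
    """A trace satisfies S_all only if it is Approved and not still AI-inferred."""
    return trace.get("status") == "Approved" and trace.get("provenance") != "ai-inferred"


def promote(trace: dict, verified_by: str) -> dict:
    """Return a copy of the trace promoted to human-verified by a qualified user."""
    if not verified_by:
        raise PermissionError("promotion requires a qualified user")
    return {**trace, "provenance": "human-verified", "verified_by": verified_by}


inferred = {"id": "REQ-001", "status": "Approved", "provenance": "ai-inferred"}
verified = promote(inferred, verified_by="qa.lead")
```

Before promotion `counts_toward_s_all(inferred)` is false even though the status is Approved; after promotion the copy passes, and the original record is left unmodified.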

Appendix: Sequence Diagram (Hard-Gate Handshake)

┌─────────────┐     ┌──────────────┐     ┌─────────────────┐     ┌─────────────────────┐
│ CI/CD       │     │ ROSIE Engine │     │ System of       │     │ Regulated           │
│ Pipeline    │     │              │     │ Record          │     │ Environment         │
└──────┬──────┘     └──────┬───────┘     └────────┬────────┘     └──────────┬──────────┘
       │                   │                      │                         │
       │ Request Release   │                      │                         │
       │ (SHA, Hash)       │                      │                         │
       │──────────────────>│                      │                         │
       │                   │                      │                         │
       │                   │ Validate Approvals   │                         │
       │                   │ and Signatures       │                         │
       │                   │─────────────────────>│                         │
       │                   │                      │                         │
       │                   │   Validation Result  │                         │
       │                   │<─────────────────────│                         │
       │                   │                      │                         │
       │            ┌──────┴──────┐               │                         │
       │            │ Validation  │               │                         │
       │            │ Success?    │               │                         │
       │            └──────┬──────┘               │                         │
       │                   │                      │                         │
       │    ┌──────────────┼──────────────┐      │                         │
       │    │ YES          │          NO  │      │                         │
       │    │              │              │      │                         │
       │    │  Generate    │   Error:     │      │                         │
       │    │  Signed RRT  │   Missing    │      │                         │
       │    │              │   Signatures │      │                         │
       │    │              │   or Hash    │      │                         │
       │    │              │   Drift      │      │                         │
       │<───┴──────────────┴──────────────┘      │                         │
       │                                          │                         │
       │ Deploy Artifact + RRT                    │                         │
       │─────────────────────────────────────────────────────────────────>│
       │                                          │                         │
       │                                          │     Verify RRT          │
       │                                          │     Authenticity        │
       │                                          │                         │

Related RFCs

RFC-001 (Data Standard): Data structures processed by the engine
RFC-003 (Evidence Standard): Evidence capture triggered by the engine
RFC-004 (API Interface): API endpoints called by the engine
RFC-005 (TQ Baseline): Qualification requirements for the engine