Deep Dive

Data Structure Methodology

Shifting from document summarization to structured data mapping (Pillars I-III).

The core of the architecture is the conversion of the RFP from a text document into a structured dataset. This approach addresses the problem of "compound constraints" (sentences containing multiple, distinct rules) by atomizing the RFP into immutable data points (Pillar I) and mapping them to a defined Reference Ontology (Pillar II).

1.1 Deep Dive: Atomization Logic

The Atomization Engine parses the source PDF to identify constraints rather than summarizing text. It detects sentence boundaries, list items, and conditional clauses, then splits the text at those points into individual requirement units ("atoms").

Processing Example: Key Personnel Clause
Input (Raw RFP Text)

"The Contractor shall provide a dedicated Project Manager who possesses a current PMP certification and has at least five (5) years of experience leading Agile software development teams, and who must be available to report on-site within ten (10) days of contract award."

Atom ID   Subject     Action        Constraint         Class
R-104.a   Contractor  Provide       Project Manager    Mandatory
R-104.b   PM          Possess       PMP (Current)      Mandatory
R-104.c   PM          Exp           5+ Yrs (Agile)     Mandatory
R-104.d   PM          Availability  On-site (10 days)  Compliance
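
The atoms produced for this clause can be held as plain immutable records. The following is a minimal sketch rather than the engine itself: the `Atom` class, its field names, and the helper functions are hypothetical, chosen to mirror the table columns, and the atoms are entered by hand instead of being parsed from the PDF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an atom cannot be modified after creation
class Atom:
    atom_id: str     # e.g. "R-104.a"
    subject: str
    action: str
    constraint: str
    req_class: str   # "Mandatory" or "Compliance" in this example


# Hand-entered atoms for the Key Personnel clause (values taken from the table).
ATOMS = [
    Atom("R-104.a", "Contractor", "Provide", "Project Manager", "Mandatory"),
    Atom("R-104.b", "PM", "Possess", "PMP (Current)", "Mandatory"),
    Atom("R-104.c", "PM", "Exp", "5+ Yrs (Agile)", "Mandatory"),
    Atom("R-104.d", "PM", "Availability", "On-site (10 days)", "Compliance"),
]


def parent_id(atom: Atom) -> str:
    """Return the clause ID shared by sibling atoms ("R-104.a" -> "R-104")."""
    return atom.atom_id.rsplit(".", 1)[0]


def ids_by_class(atoms, req_class: str):
    """List the IDs of all atoms carrying the given class label."""
    return [a.atom_id for a in atoms if a.req_class == req_class]


print(parent_id(ATOMS[0]))
print(ids_by_class(ATOMS, "Mandatory"))
```

Keeping the clause ID recoverable from each atom ID means the four atoms can always be traced back to the single source sentence they were split from.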

1.2 The Anchor Mechanism

Verbatim Anchoring is used to validate the normalized data. Atoms are normalized for vector search (e.g., converting "five (5) years" to the integer 5), but the system retains the original raw text as an immutable "Anchor." This lets the system use semantic search for retrieval while checking final compliance against the raw legal text.

Anchoring in Action

ATOM_ID: R-104.c
NORMALIZED_VAL: min_experience = 5 (integer)
ANCHOR_TXT: "at least five (5) years of experience"
SOURCE_LOC: Pg 42, Ln 14, Offset 1024:1080
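
A record like the one above pairs a normalized value with its untouched source text. The sketch below illustrates the idea under stated assumptions: `AnchoredValue`, `normalize_years`, and `anchor_is_verbatim` are hypothetical names, and the regular expression only handles the "(N) years" pattern seen in this example, not general numeric phrasing.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchoredValue:
    atom_id: str
    normalized_val: int   # value used for search and comparison
    anchor_txt: str       # verbatim source text, never modified


def normalize_years(atom_id: str, anchor_txt: str) -> AnchoredValue:
    """Extract the parenthesized integer, e.g. "five (5) years" -> 5."""
    match = re.search(r"\((\d+)\)\s+years", anchor_txt)
    if match is None:
        raise ValueError(f"no '(N) years' pattern in anchor: {anchor_txt!r}")
    return AnchoredValue(atom_id, int(match.group(1)), anchor_txt)


def anchor_is_verbatim(value: AnchoredValue, source_text: str) -> bool:
    """Compliance check: the anchor must appear unchanged in the source."""
    return value.anchor_txt in source_text


clause = (
    "The Contractor shall provide a dedicated Project Manager who possesses "
    "a current PMP certification and has at least five (5) years of "
    "experience leading Agile software development teams"
)
record = normalize_years("R-104.c", "at least five (5) years of experience")
print(record.normalized_val, anchor_is_verbatim(record, clause))
```

The normalized integer serves retrieval and comparison, while the substring check confirms that the anchor still matches the raw clause text exactly.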

1.3 The Reference Ontology

To support cross-bid normalization, every atom is mapped to a controlled 3-level hierarchy: Domain > Function > Attribute. This enforces consistent tagging across years and agencies, replacing free-form labels with a shared vocabulary.

Level 1: Domain
Technical, Evaluation, Commercial, Administrative

Level 2: Function
Personnel, Security, Hosting, Migration

Level 3: Attribute
Certification, Encryption, Uptime, Penalty

Example Mapping: "Project Manager PMP" → Technical > Personnel > Certification
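
A controlled vocabulary of this kind can be enforced by rejecting any tag that is not in the hierarchy. The following is a minimal sketch, assuming the terms listed above are the whole vocabulary (the real ontology is likely larger). Because the source does not state which Functions belong under which Domain, each level is checked independently. `ONTOLOGY`, `LEVELS`, and `parse_path` are hypothetical names.

```python
# Assumed vocabulary: only the terms listed in this section.
ONTOLOGY = {
    "Domain": {"Technical", "Evaluation", "Commercial", "Administrative"},
    "Function": {"Personnel", "Security", "Hosting", "Migration"},
    "Attribute": {"Certification", "Encryption", "Uptime", "Penalty"},
}
LEVELS = ("Domain", "Function", "Attribute")


def parse_path(path: str) -> tuple:
    """Validate a "Domain > Function > Attribute" tag and return its parts."""
    parts = tuple(p.strip() for p in path.split(">"))
    if len(parts) != len(LEVELS):
        raise ValueError(f"expected {len(LEVELS)} levels, got {len(parts)}")
    for level, term in zip(LEVELS, parts):
        if term not in ONTOLOGY[level]:
            raise ValueError(f"{term!r} is not a known {level}")
    return parts


print(parse_path("Technical > Personnel > Certification"))
```

A free-form label such as "Technical > Staffing > Certification" raises `ValueError` here, which is how the shared vocabulary keeps tags consistent across bids.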

1.4 The Lineage Model

The datastore behaves as an append-only ledger. Requirements are never overwritten; instead, each change creates a new version that is linked to its predecessor, preserving the full audit trail.

Status States

Active: Current binding requirement
Superseded: Replaced by a newer version
Deleted: Explicitly removed by issuer
Clarified: Active, with Q&A metadata

Lineage Graph Example

VERSION 1.0 (RFP Release)
    "Reports due monthly."
    STATUS: SUPERSEDED

VERSION 1.1 (Amendment 01)
    "Reports due weekly."
    STATUS: ACTIVE

CLARIFICATION (Q&A #45)
    "Weekly means every Friday by 5PM EST."
    LINKED: CLARIFIES V1.1
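
The lineage example can be reproduced with a small append-only structure. This is a sketch under stated assumptions, not the actual datastore: `Entry`, `Ledger`, and the link names are hypothetical. Status is derived from later entries rather than stored, so no existing entry is ever edited. To match the example, a clarified version still reports ACTIVE, and its Q&A text is returned separately.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    ACTIVE = "Active"
    SUPERSEDED = "Superseded"
    DELETED = "Deleted"
    CLARIFIED = "Clarified"  # listed in the source; not derived in this sketch


@dataclass(frozen=True)
class Entry:
    entry_id: str
    source: str
    text: str
    link: Optional[str] = None    # "supersedes", "deletes", or "clarifies"
    target: Optional[str] = None  # entry_id that the link points at


class Ledger:
    """Append-only store: entries are added, never edited or removed."""

    def __init__(self):
        self._entries = []

    def append(self, entry: Entry) -> None:
        self._entries.append(entry)

    def _links_to(self, entry_id: str, link: str):
        return [e for e in self._entries
                if e.target == entry_id and e.link == link]

    def status_of(self, entry_id: str) -> Status:
        """Derive a version's status from the entries that point at it."""
        if self._links_to(entry_id, "deletes"):
            return Status.DELETED
        if self._links_to(entry_id, "supersedes"):
            return Status.SUPERSEDED
        return Status.ACTIVE

    def clarifications_of(self, entry_id: str):
        """Return the Q&A texts linked to a version."""
        return [e.text for e in self._links_to(entry_id, "clarifies")]


ledger = Ledger()
ledger.append(Entry("V1.0", "RFP Release", "Reports due monthly."))
ledger.append(Entry("V1.1", "Amendment 01", "Reports due weekly.",
                    link="supersedes", target="V1.0"))
ledger.append(Entry("Q&A #45", "Q&A", "Weekly means every Friday by 5PM EST.",
                    link="clarifies", target="V1.1"))

print(ledger.status_of("V1.0"), ledger.status_of("V1.1"))
print(ledger.clarifications_of("V1.1"))
```

Deriving status from links is one way to honor the append-only rule: superseding V1.0 requires only a new entry, and the original text remains available for audit.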