Data Structure Methodology
Shifting from document summarization to structured data mapping (Pillars I-III).
The core of the architecture is the conversion of the RFP from a text document into a structured dataset. This approach addresses the issue of "compound constraints" (sentences containing multiple, distinct rules) by atomizing the RFP into immutable data points (Pillar I) and mapping them to a defined Reference Ontology (Pillar II).
1.1 Deep Dive: Atomization Logic
The Atomization Engine parses the source PDF to identify constraints rather than summarizing text. It detects sentence boundaries, list items, and conditional clauses, then splits the text at those points into individual requirement units (atoms). For example, the following compound requirement:
"The Contractor shall provide a dedicated Project Manager who possesses a current PMP certification and has at least five (5) years of experience leading Agile software development teams, and who must be available to report on-site within ten (10) days of contract award."
is split into four atoms:
| Atom ID | Subject | Action | Constraint | Class |
|---|---|---|---|---|
| R-104.a | Contractor | Provide | Project Manager | Mandatory |
| R-104.b | PM | Possess | PMP (Current) | Mandatory |
| R-104.c | PM | Exp | 5+ Yrs (Agile) | Mandatory |
| R-104.d | PM | Availability | On-site (10 days) | Compliance |
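The atom table can be modelled as a list of immutable records. The sketch below is illustrative only: the `Atom` class, its field names, and the `build_atoms` helper are assumptions, and the rows are entered by hand rather than produced by the Atomization Engine, whose parsing logic is not specified here.

```python
from dataclasses import dataclass
from string import ascii_lowercase


@dataclass(frozen=True)  # frozen=True makes each atom immutable after creation
class Atom:
    atom_id: str
    subject: str
    action: str
    constraint: str
    req_class: str


def build_atoms(parent_id: str, rows: list[tuple[str, str, str, str]]) -> list[Atom]:
    """Assign suffixed IDs (parent.a, parent.b, ...) to already-split rows."""
    if len(rows) > len(ascii_lowercase):
        raise ValueError("too many atoms for single-letter suffixes")
    return [
        Atom(f"{parent_id}.{suffix}", *row)
        for suffix, row in zip(ascii_lowercase, rows)
    ]


# Rows copied from the table above (hand-split, not machine-parsed).
R104_ROWS = [
    ("Contractor", "Provide", "Project Manager", "Mandatory"),
    ("PM", "Possess", "PMP (Current)", "Mandatory"),
    ("PM", "Exp", "5+ Yrs (Agile)", "Mandatory"),
    ("PM", "Availability", "On-site (10 days)", "Compliance"),
]

atoms = build_atoms("R-104", R104_ROWS)
for atom in atoms:
    print(atom.atom_id, atom.subject, atom.action, atom.constraint, atom.req_class, sep=" | ")
```

Freezing the dataclass means any attempt to modify an atom in place raises an error, which matches the "immutable data points" wording of Pillar I.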
1.2 The Anchor Mechanism
Verbatim Anchoring is used to validate the normalized data. While atoms are normalized for vector search (e.g., converting "five (5) years" to the integer 5), the system retains the original raw text as an immutable "Anchor." This lets the system use semantic search for retrieval while checking final compliance against the raw legal text.
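A minimal sketch of the anchoring idea follows, assuming a simple regular expression is enough to find durations written as "word (digits) unit". The `AnchoredValue` class, the `DURATION` pattern, and both helper functions are hypothetical names for illustration; the actual normalization rules are not described in this section.

```python
import re
from dataclasses import dataclass

SOURCE = (
    "The Contractor shall provide a dedicated Project Manager who possesses "
    "a current PMP certification and has at least five (5) years of experience "
    "leading Agile software development teams, and who must be available to "
    "report on-site within ten (10) days of contract award."
)


@dataclass(frozen=True)
class AnchoredValue:
    anchor: str  # verbatim span copied from the source, never modified
    value: int   # normalized number used for search and comparison
    unit: str    # normalized singular unit, e.g. "year" or "day"


# Assumed pattern: matches phrases such as "five (5) years" or "ten (10) days".
DURATION = re.compile(r"\b[A-Za-z-]+ \((\d+)\) (years?|days?)\b")


def extract_durations(source: str) -> list[AnchoredValue]:
    """Normalize each duration while keeping its raw text as the anchor."""
    return [
        AnchoredValue(
            anchor=match.group(0),
            value=int(match.group(1)),
            unit=match.group(2).rstrip("s"),
        )
        for match in DURATION.finditer(source)
    ]


def anchor_is_verbatim(item: AnchoredValue, source: str) -> bool:
    """Final check: the anchor must appear unchanged in the source text."""
    return item.anchor in source


for item in extract_durations(SOURCE):
    print(item.value, item.unit, repr(item.anchor), anchor_is_verbatim(item, SOURCE))
```

The normalized `value` and `unit` fields are what a search index would consume, while the `anchor` string is the only field consulted when a reviewer needs the exact contractual wording.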
[Figure: Anchoring in Action]
1.3 The Reference Ontology
To support cross-bid normalization, every atom is mapped to a controlled 3-level hierarchy: Domain > Function > Attribute. This enforces consistent tagging across years and agencies, replacing free-form labels with a shared vocabulary.
Domain (Level 1), e.g. Evaluation, Commercial, Administrative
Function (Level 2), e.g. Security, Hosting, Migration
Attribute (Level 3), e.g. Encryption, Uptime, Penalty
Example Mapping: "Project Manager PMP" → Technical > Personnel > Certification
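One way to enforce a controlled vocabulary is to validate every tag against a nested lookup before it is stored. In the sketch below, only the path Technical > Personnel > Certification comes from the text; the other entries in `ONTOLOGY` are assumed placements for illustration, and `validate_path` is a hypothetical helper.

```python
# Domain -> Function -> set of allowed Attributes.
ONTOLOGY: dict[str, dict[str, set[str]]] = {
    "Technical": {
        "Personnel": {"Certification"},  # from the example mapping in the text
        "Security": {"Encryption"},      # assumed placement, illustration only
        "Hosting": {"Uptime"},           # assumed placement, illustration only
    },
}


def validate_path(path: str) -> tuple[str, str, str]:
    """Parse 'Domain > Function > Attribute' and reject unknown labels."""
    parts = [part.strip() for part in path.split(">")]
    if len(parts) != 3:
        raise ValueError(f"expected exactly 3 levels, got {len(parts)}: {path!r}")
    domain, function, attribute = parts
    if attribute not in ONTOLOGY.get(domain, {}).get(function, set()):
        raise ValueError(f"not in controlled vocabulary: {path!r}")
    return domain, function, attribute


print(validate_path("Technical > Personnel > Certification"))

try:
    validate_path("Technical > Staff > PMP")  # free-form label, rejected
except ValueError as err:
    print("rejected:", err)
```

Rejecting unknown labels at write time is what keeps tags comparable across years and agencies; a free-form label never reaches the datastore.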
1.4 The Lineage Model
The datastore behaves as an append-only ledger. Requirements are never overwritten; each change creates a new version that is linked to the existing record, preserving the full audit trail.
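The append-only behaviour can be sketched with an in-memory list, assuming each version stores a pointer to the version it supersedes. The `Ledger` and `RequirementVersion` classes are hypothetical, and the amended "15 days" value is invented purely to show a second version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequirementVersion:
    atom_id: str
    version: int
    text: str
    previous_version: Optional[int]  # link to the superseded version, if any


class Ledger:
    """In-memory append-only store: entries are added, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[RequirementVersion] = []

    def latest(self, atom_id: str) -> Optional[RequirementVersion]:
        for entry in reversed(self._entries):
            if entry.atom_id == atom_id:
                return entry
        return None

    def append(self, atom_id: str, text: str) -> RequirementVersion:
        prior = self.latest(atom_id)
        entry = RequirementVersion(
            atom_id=atom_id,
            version=1 if prior is None else prior.version + 1,
            text=text,
            previous_version=None if prior is None else prior.version,
        )
        self._entries.append(entry)
        return entry

    def history(self, atom_id: str) -> list[RequirementVersion]:
        return [entry for entry in self._entries if entry.atom_id == atom_id]


ledger = Ledger()
ledger.append("R-104.d", "On-site (10 days)")
ledger.append("R-104.d", "On-site (15 days)")  # hypothetical amendment

for entry in ledger.history("R-104.d"):
    print(entry.version, entry.text, entry.previous_version)
```

Because the first version is still present after the amendment, an auditor can reconstruct what the requirement said at any point in time.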