Extract Requirements Overview
Saphira’s Extract Requirements feature uses advanced OCR, NLP, and Vision Language Models (VLMs) to automatically extract requirements, components, and technical specifications from your source materials with high accuracy.
Getting Started
From the Dashboard, navigate to the Seek Guidance section and click Extract Requirements.
This opens the extraction method selection dialog where you choose how to extract requirements from your source material.
Choose Extraction Method
Saphira provides two extraction methods optimized for different source materials:
Extract from Document
Upload PDF, Word, or other documents to extract requirements and text content.
Best for:
- Technical specifications
- Requirements documents
- Standards documents
- Engineering reports
Extract from Image
Upload architecture diagrams, screenshots, or images to extract requirements and structure.
Best for:
- System architecture diagrams
- Block diagrams and flowcharts
- Screenshots of requirements
- Schematic drawings
Extract from Document
Upload text-based documents (PDF, Word, etc.) to extract structured requirements.
Supported Document Types
PDF Documents
Extract from PDF files including:
- Technical specifications and requirements documents
- Standards documents (ISO, IEC, UL standards)
- Engineering reports and design documents
- Regulatory compliance documentation
- Page-level extraction with section preservation
- Table and figure extraction
- Cross-reference detection and linking
- Hierarchical structure preservation
Word Documents
Extract from Microsoft Word documents:
- Structured requirements with headings
- Tables of requirements
- Embedded images and diagrams
- Track changes and comments preserved
Spreadsheets
Extract from Excel/CSV files:
- Requirements tables with ID, description, parent
- Bill of Materials (BOM) data
- Test matrices and traceability tables
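As an illustration of the kind of table this covers, the sketch below reads a small requirements CSV and groups child requirement IDs under their parents. The column names (`ID`, `Description`, `Parent`), the sample rows, and the `build_children_index` helper are assumptions made for this example; they are not a format or API that Saphira defines.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data; the column names are an assumption for this sketch.
SAMPLE_CSV = """ID,Description,Parent
REQ-SAFETY-001,The system shall detect overtemperature.,
REQ-SAFETY-002,The sensor shall report temperature every 500 ms.,REQ-SAFETY-001
REQ-SAFETY-003,The controller shall shut down above the limit.,REQ-SAFETY-001
"""


def build_children_index(csv_text):
    """Map each parent ID to the list of its child requirement IDs."""
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        parent = row["Parent"].strip()
        if parent:  # rows with an empty Parent are treated as top-level
            children[parent].append(row["ID"].strip())
    return dict(children)


index = build_children_index(SAMPLE_CSV)
print(index)
```

A table with one row per requirement and an explicit parent column like this makes the parent-child relationships unambiguous before you upload it.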
Document Extraction Workflow
1
Select Extract from Document
From the extraction method dialog, click Extract from Document
2
Upload Your Document
Drag and drop or browse to select your PDF, Word, or other document file
3
AI Extracts Requirements
Saphira analyzes the document and extracts:
- Requirement IDs and descriptions
- Values, units, and logical operators
- Parent-child relationships
- Cross-references
4
Classify & Validate
Review extracted requirements:
- AI suggests requirement classification
- INCOSE lint checks quality
- Edit inline to refine
5
Import to Project
Import validated requirements to your project database with full traceability
Extracted Fields
For each requirement, Saphira extracts:
| Field | Description | Example |
|---|---|---|
| ID | Unique requirement identifier | REQ-SAFETY-001 |
| Requirement | Full requirement text | “The system shall…” |
| Parent | Parent requirement or category | Safety |
| Logical Operator | Comparison operators | >=, <=, within |
| Unit | Measurement unit | ms, °C, Gauss |
| Value | Numeric threshold | 500, 0.20-0.55 |
| Requirement Type | Numeric or Textual | Numeric |
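To make the fields concrete, here is a minimal sketch of how one extracted row could be modeled and used downstream. The `ExtractedRequirement` class and the `satisfies` helper are hypothetical names chosen for this example, not part of Saphira, and the range parsing assumes non-negative bounds written as `low-high`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedRequirement:
    """Hypothetical record mirroring the fields in the table above."""
    id: str
    requirement: str
    parent: Optional[str]
    logical_operator: Optional[str]  # ">=", "<=", or "within"
    unit: Optional[str]
    value: Optional[str]             # "500" or a range such as "0.20-0.55"
    requirement_type: str            # "Numeric" or "Textual"


def satisfies(req, measured):
    """Check a measured number against a numeric requirement."""
    if req.requirement_type != "Numeric" or req.value is None:
        raise ValueError("only numeric requirements can be checked")
    if req.logical_operator == "within":
        # Assumes a non-negative "low-high" range.
        low, high = (float(part) for part in req.value.split("-"))
        return low <= measured <= high
    threshold = float(req.value)
    if req.logical_operator == ">=":
        return measured >= threshold
    if req.logical_operator == "<=":
        return measured <= threshold
    raise ValueError(f"unsupported operator: {req.logical_operator}")


# Made-up sample rows for illustration.
latency = ExtractedRequirement(
    "REQ-SAFETY-001", "The system shall respond in at most 500 ms.",
    "Safety", "<=", "ms", "500", "Numeric")
flux = ExtractedRequirement(
    "REQ-SAFETY-002", "The field strength shall stay within 0.20-0.55 Gauss.",
    "Safety", "within", "Gauss", "0.20-0.55", "Numeric")

print(satisfies(latency, 420.0))
print(satisfies(flux, 0.60))
```

Keeping the operator, value, and unit as separate fields is what makes this kind of automated check possible; a requirement stored only as free text would need to be parsed again.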
Extract from Image
Upload visual content like architecture diagrams and screenshots to extract requirements and system structure.
Supported Image Types
Architecture Diagrams
Extract from system architecture diagrams:
- Component names and their roles (ECU, function, subsystem, system, process)
- Connections and data flows between components
- System boundaries and interfaces
- Hierarchical structure
- Detects if an image is an architecture diagram (with confidence scoring)
- Extracts every labeled element
- Identifies relationships between components
Block Diagrams & Flowcharts
Extract from visual diagrams:
- Process steps and decision points
- Data flow arrows and connections
- Functional blocks and modules
- State transitions
Screenshots
Extract from captured screenshots:
- Requirements from legacy systems
- Tables and structured data
- Form fields and values
- Any visible text content
Schematic Drawings
Extract from electrical/electronic schematics:
- Part numbers and component symbols
- Component values and ratings
- Connection information
- Reference designators
Image Extraction Workflow
1
Select Extract from Image
From the extraction method dialog, click Extract from Image
2
Upload Your Image
Upload architecture diagrams, screenshots, or images (PNG, JPG, PDF)
3
VLM Analyzes Image
Saphira’s Vision Language Model:
- Detects document type (diagram, chart, table, screenshot)
- Extracts all visible text
- Identifies technical components and relationships
- Generates searchable keywords
4
Review Extracted Content
Review the structured extraction:
- Components with roles and descriptions
- Requirements derived from visual content
- Hierarchical relationships
5
Import to Project
Import extracted requirements and components to your project
Architecture Diagram Processing
When you upload an architecture diagram, Saphira performs the following steps:
- Diagram Detection: Uses VLM to identify the image type with confidence scoring
- Component Enumeration: Extracts every labeled element:
  - Component names from diagram labels
  - Role classification (ECU, function, subsystem, system, process)
  - Brief descriptions of purpose and connections
- Relationship Mapping: Identifies connections and data flows
- Context Integration: Combines with item definition for complete system understanding
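The result of these steps can be pictured as a small graph of components and connections. The sketch below is one possible way to represent it, assuming hypothetical `Component` and `DiagramExtraction` classes and made-up sample data; only the five role names come from this page. It also shows a simple review check: finding connections that point at a name with no extracted component.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Role names as listed on this page.
ROLES = {"ECU", "function", "subsystem", "system", "process"}


@dataclass
class Component:
    """Hypothetical record for one labeled diagram element."""
    name: str
    role: str
    description: str = ""

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")


@dataclass
class DiagramExtraction:
    """Hypothetical container for one analyzed diagram."""
    confidence: float  # detection confidence, assumed to lie in [0, 1]
    components: List[Component] = field(default_factory=list)
    connections: List[Tuple[str, str]] = field(default_factory=list)

    def unresolved_connections(self):
        """Return connections whose source or target was not extracted."""
        known = {c.name for c in self.components}
        return [(s, t) for s, t in self.connections
                if s not in known or t not in known]


# Made-up sample extraction for illustration.
extraction = DiagramExtraction(
    confidence=0.93,
    components=[
        Component("Brake ECU", "ECU", "Controls brake actuation"),
        Component("Wheel Speed Sensing", "function"),
    ],
    connections=[
        ("Wheel Speed Sensing", "Brake ECU"),
        ("Brake ECU", "Gateway"),
    ],
)
print(extraction.unresolved_connections())
```

An unresolved connection usually means a label was unreadable or missed, which is worth fixing during review before importing.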
Extraction Accuracy
>90% Accuracy Target
Saphira uses a multi-pass extraction approach to achieve high accuracy:
Pass 1: Initial Extraction
- Document/image structure analysis
- Primary content extraction
- Initial requirement identification
- Component detection from diagrams
Pass 2: Refinement
- Cross-validation of extracted content
- Relationship verification
- Missing field detection
- Confidence scoring
Pass 3: Validation
- Format normalization
- ID standardization
- Traceability link verification
- Quality assurance checks
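This page does not describe how ID standardization works internally. As a rough sketch of the idea, the hypothetical `standardize_id` helper below rewrites loosely formatted IDs into an upper-case, hyphen-separated form with a zero-padded number, matching the `REQ-SAFETY-001` style used in the examples here. The separator set and the three-digit width are assumptions.

```python
import re


def standardize_id(raw, width=3):
    """Normalize an ID such as 'req_safety 1' to 'REQ-SAFETY-001'."""
    # Split on whitespace, underscores, hyphens, dots, and slashes.
    parts = [p for p in re.split(r"[\s_\-./]+", raw.strip()) if p]
    if not parts or not parts[-1].isdigit():
        raise ValueError(f"ID has no trailing number: {raw!r}")
    *prefix, number = parts
    return "-".join([p.upper() for p in prefix] + [number.zfill(width)])


for raw in ("req_safety 1", "Req.Safety.12", "REQ-SAFETY-001"):
    print(raw, "->", standardize_id(raw))
```

IDs that already follow the target style pass through unchanged, so running the step twice is harmless.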
Accuracy by Source Type
| Source Type | Accuracy Target | Key Features |
|---|---|---|
| PDF Requirements | >90% | Hierarchical extraction, ID preservation |
| Architecture Diagrams | >85% | VLM component detection, relationship mapping |
| BOMs/Spreadsheets | >95% | Column auto-detection, format normalization |
| Screenshots | >85% | OCR text extraction, table detection |
| Schematics | >80% | Part number extraction, symbol recognition |
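This page does not define how accuracy is measured. If you want to compare your own results against the targets above, one reasonable approach is field-level accuracy against a hand-verified sample, sketched below. The `field_accuracy` and `meets_target` helpers and the sample data are assumptions for illustration; only the target values are taken from the table.

```python
# Targets from the table above, read as strict lower bounds.
TARGETS = {
    "PDF Requirements": 0.90,
    "Architecture Diagrams": 0.85,
    "BOMs/Spreadsheets": 0.95,
    "Screenshots": 0.85,
    "Schematics": 0.80,
}


def field_accuracy(extracted, reference):
    """Fraction of reference fields that the extraction reproduced exactly."""
    total = 0
    correct = 0
    for req_id, expected in reference.items():
        actual = extracted.get(req_id, {})
        for field_name, expected_value in expected.items():
            total += 1
            if actual.get(field_name) == expected_value:
                correct += 1
    return correct / total if total else 0.0


def meets_target(source_type, accuracy):
    """True if the measured accuracy exceeds the target for the source type."""
    return accuracy > TARGETS[source_type]


# Made-up hand-verified sample and extraction result.
reference = {
    "REQ-SAFETY-001": {"operator": "<=", "value": "500", "unit": "ms"},
    "REQ-SAFETY-002": {"operator": "within", "value": "0.20-0.55", "unit": "Gauss"},
}
extracted = {
    "REQ-SAFETY-001": {"operator": "<=", "value": "500", "unit": "ms"},
    "REQ-SAFETY-002": {"operator": "within", "value": "0.20-0.55", "unit": "G"},
}

accuracy = field_accuracy(extracted, reference)
print(f"{accuracy:.1%}", meets_target("PDF Requirements", accuracy))
```

Here five of six fields match, so the sample falls short of the PDF target; a larger verified sample gives a more meaningful estimate.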
Maximizing Accuracy
To get the best extraction results:
- Use High-Resolution Images: Clear, high-resolution images yield better VLM analysis
- Well-Formatted Documents: Documents with consistent formatting extract more accurately
- Clear Labels: Diagrams with readable labels improve component detection
- Review and Edit: Use the inline editor to refine extracted content
Post-Extraction Workflow
After extraction, requirements flow through:
Classify
AI automatically suggests requirement classification level:
- Stakeholder
- System
- Subsystem
- Component
- Hardware/Software
INCOSE Lint
Quality check against INCOSE requirements engineering principles:
- Atomicity (single capability per requirement)
- Verifiability (testable with single procedure)
- Clarity (no ambiguous terms)
- Suggested rewrites for improvement
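To give a feel for what such checks look for, here is a deliberately simple, rule-based sketch. It is not Saphira's lint and not the INCOSE rule set: the `lint_requirement` function is hypothetical, the ambiguous-term list is a short illustrative sample, and the atomicity test is a crude heuristic based on counting "shall" and spotting conjunctions. Verifiability is not covered because it cannot be judged from wording alone.

```python
import re

# Short illustrative sample, not an official INCOSE list.
AMBIGUOUS_TERMS = ("appropriate", "adequate", "as needed",
                   "user-friendly", "fast", "sufficient")


def lint_requirement(text):
    """Return a list of findings for one requirement statement."""
    findings = []
    lowered = text.lower()
    # Atomicity heuristic: exactly one "shall" and no conjunctions.
    if len(re.findall(r"\bshall\b", lowered)) != 1:
        findings.append("atomicity: expected exactly one 'shall'")
    if re.search(r"\b(and|or)\b", lowered):
        findings.append("atomicity: conjunction may join multiple capabilities")
    # Clarity heuristic: flag terms that cannot be tested objectively.
    for term in AMBIGUOUS_TERMS:
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            findings.append(f"clarity: ambiguous term '{term}'")
    return findings


print(lint_requirement("The pump shall start within 500 ms."))
for finding in lint_requirement("The system shall be fast and user-friendly."):
    print(finding)
```

The first statement passes cleanly, while the second is flagged once for the conjunction and once for each vague term, which is the kind of requirement a suggested rewrite would split and quantify.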
Suggest Children
AI suggests child requirements for decomposition:
- Hierarchical breakdown
- Component allocation
- Traceability maintained
Push to RM
Import to project requirements database:
- Full traceability to source document
- Version control
- Change tracking
Integration with Safety Workflows
Extracted requirements and components integrate directly with Saphira’s safety analysis:
- FMEA: Extracted components become focus elements for failure mode analysis
- HARA: Requirements populate hazard sources and control measures
- TARA: Interface definitions feed cybersecurity asset identification
- Gap Analysis: Extracted specifications enable automated standards compliance checking
- Safety Case: Requirements become evidence in GSN safety arguments

