Extract Requirements Overview
Saphira’s Extract Requirements feature uses advanced OCR, NLP, and Vision Language Models (VLMs) to automatically extract requirements, components, and technical specifications from your source materials with high accuracy.
Getting Started
From the Home Page, navigate to the Seek Guidance section and click Extract Requirements. This opens the Choose Extraction Method dialog, where you select how to extract requirements from your source material.
Choose Extraction Method
Saphira provides two extraction methods optimized for different source materials:
Extract from Document
Upload PDF documents to extract requirements and text content.
Best for:
- Technical specifications
- Requirements documents
- Standards documents
- Engineering reports
Extract from Image
Upload architecture diagrams, screenshots, or images to extract requirements and structure.
Best for:
- System architecture diagrams
- Block diagrams and flowcharts
- Screenshots of requirements
- Schematic drawings
Extract from Document
Upload PDF documents to extract structured requirements.
Supported Document Types
PDF Documents
Extract from PDF files including:
- Technical specifications and requirements documents
- Standards documents (ISO, IEC, UL standards)
- Engineering reports and design documents
- Regulatory compliance documentation
Capabilities:
- Page-level extraction with section preservation
- Table and figure extraction
- Cross-reference detection and linking
- Hierarchical structure preservation
Document Extraction Workflow
AI Extracts Requirements
Saphira analyzes the document and extracts:
- Requirement IDs and descriptions
- Values, units, and logical operators
- Parent-child relationships
- Cross-references
Review Extracted Requirements
Review the extracted requirements:
- Requirement IDs and descriptions
- Values, units, and logical operators
- Parent-child relationships
- Cross-references
Extracted Fields
For each requirement, Saphira extracts:
| Field | Description | Example |
|---|---|---|
| ID | Unique requirement identifier | REQ-SAFETY-001 |
| Requirement | Full requirement text | “The system shall…” |
| Parent | Parent requirement or category | Safety |
| Logical Operator | Comparison operators | >=, <=, within |
| Unit | Measurement unit | ms, °C, Gauss |
| Value | Numeric threshold | 500, 0.20-0.55 |
| Requirement Type | Numeric or Textual | Numeric |
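As a rough illustration of how these fields fit together, the sketch below models one extracted requirement as a Python record and checks a measured value against it. The class name `ExtractedRequirement`, the helpers `parse_value` and `satisfies`, and the example requirement text are all hypothetical; they are not part of Saphira's API or export format, and the range syntax is assumed from the `0.20-0.55` example in the table.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ExtractedRequirement:
    """Hypothetical record with one attribute per field in the table above."""
    id: str
    requirement: str
    parent: Optional[str] = None
    logical_operator: Optional[str] = None  # ">=", "<=", or "within"
    unit: Optional[str] = None               # e.g. "ms"
    value: Optional[str] = None              # e.g. "500" or "0.20-0.55"
    requirement_type: str = "Textual"        # "Numeric" or "Textual"


def parse_value(value: str) -> Tuple[float, float]:
    """Return (low, high); a single number yields the same bound twice.

    Assumes plain decimals; scientific notation is not handled.
    """
    low, sep, high = value.partition("-")
    if sep and low:  # a leading "-" is a sign, not a range separator
        return float(low), float(high)
    number = float(value)
    return number, number


def satisfies(req: ExtractedRequirement, measured: float) -> bool:
    """Check a measured value against a numeric requirement."""
    if req.requirement_type != "Numeric" or req.value is None:
        raise ValueError(f"{req.id} is not a numeric requirement")
    low, high = parse_value(req.value)
    if req.logical_operator == ">=":
        return measured >= low
    if req.logical_operator == "<=":
        return measured <= high
    if req.logical_operator == "within":
        return low <= measured <= high
    raise ValueError(f"Unsupported operator: {req.logical_operator!r}")


req = ExtractedRequirement(
    id="REQ-SAFETY-001",
    requirement="Illustrative requirement text",
    parent="Safety",
    logical_operator="<=",
    unit="ms",
    value="500",
    requirement_type="Numeric",
)
print(satisfies(req, 450.0))  # True
```

Keeping Value as a string and parsing it on demand lets the same field hold both single thresholds and ranges, which matches the two example forms shown in the table.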
Extract from Image
Upload visual content like architecture diagrams and screenshots to extract requirements and system structure.
Supported Image Types
Architecture Diagrams
Extract from system architecture diagrams:
- Component names and their roles (ECU, function, subsystem, system, process)
- Connections and data flows between components
- System boundaries and interfaces
- Hierarchical structure
Capabilities:
- Detects whether an image is an architecture diagram (with confidence scoring)
- Extracts every labeled element
- Identifies relationships between components
Block Diagrams & Flowcharts
Extract from visual diagrams:
- Process steps and decision points
- Data flow arrows and connections
- Functional blocks and modules
- State transitions
Screenshots
Extract from captured screenshots:
- Requirements from legacy systems
- Tables and structured data
- Form fields and values
- Any visible text content
Schematic Drawings
Extract from electrical/electronic schematics:
- Part numbers and component symbols
- Component values and ratings
- Connection information
- Reference designators
Image Extraction Workflow
VLM Analyzes Image
Saphira’s Vision Language Model:
- Detects document type (diagram, chart, table, screenshot)
- Extracts all visible text
- Identifies technical components and relationships
- Generates searchable keywords
Review Extracted Content
Review the structured extraction:
- Components with roles and descriptions
- Requirements derived from visual content
- Hierarchical relationships
Architecture Diagram Processing
When you upload an architecture diagram, Saphira:
- Diagram Detection: Uses the VLM to identify the image type with confidence scoring
- Component Enumeration: Extracts every labeled element:
  - Component names from diagram labels
  - Role classification (ECU, function, subsystem, system, process)
  - Brief descriptions of purpose and connections
- Relationship Mapping: Identifies connections and data flows
- Context Integration: Combines with item definition for complete system understanding
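One way to picture the output of component enumeration and relationship mapping is as a small graph of named components and directed connections. The sketch below is only an assumed representation: the `Component` and `Connection` classes, the `build_adjacency` helper, and the example component names are hypothetical, and only the five role names come from the list above.

```python
from dataclasses import dataclass
from typing import Dict, List

# Role names taken from the role classification list above.
ROLES = {"ECU", "function", "subsystem", "system", "process"}


@dataclass(frozen=True)
class Component:
    """Hypothetical record for one labeled element of a diagram."""
    name: str
    role: str
    description: str = ""

    def __post_init__(self) -> None:
        if self.role not in ROLES:
            raise ValueError(f"Unknown role {self.role!r} for {self.name}")


@dataclass(frozen=True)
class Connection:
    """Hypothetical directed data flow between two components."""
    source: str
    target: str
    label: str = ""


def build_adjacency(
    components: List[Component], connections: List[Connection]
) -> Dict[str, List[str]]:
    """Map each component name to the names it sends data to."""
    graph: Dict[str, List[str]] = {c.name: [] for c in components}
    for conn in connections:
        if conn.source not in graph or conn.target not in graph:
            raise ValueError(f"Connection references unknown component: {conn}")
        graph[conn.source].append(conn.target)
    return graph


components = [
    Component("Brake ECU", "ECU", "Illustrative controller"),
    Component("Wheel Speed Sensing", "function"),
]
connections = [Connection("Wheel Speed Sensing", "Brake ECU", "wheel speed")]
print(build_adjacency(components, connections))
# {'Brake ECU': [], 'Wheel Speed Sensing': ['Brake ECU']}
```

Rejecting connections that point at unknown components is a cheap consistency check worth running when reviewing an extracted diagram, since a dangling connection usually means a label was misread.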
Extraction Accuracy
>90% Accuracy Target
Saphira uses a multi-pass extraction approach to achieve high accuracy:
Pass 1: Initial Extraction
- Document/image structure analysis
- Primary content extraction
- Initial requirement identification
- Component detection from diagrams
Pass 2: Refinement
- Cross-validation of extracted content
- Relationship verification
- Missing field detection
- Confidence scoring
Pass 3: Validation
- Format normalization
- ID standardization
- Traceability link verification
- Quality assurance checks
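To make two of these steps concrete, the sketch below shows what missing field detection (Pass 2) and ID standardization (Pass 3) could look like on plain dictionaries. It is not Saphira's implementation: the function names, the choice of required fields, and the `PREFIX-CATEGORY-NNN` ID convention (inferred from the `REQ-SAFETY-001` example) are assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional

# Assumption: an extracted record is unusable without these two fields.
REQUIRED_FIELDS = ("id", "requirement")


def find_missing_fields(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the required fields that are absent or blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


def standardize_id(raw_id: str, width: int = 3) -> str:
    """Normalize separators, case, and zero padding of a requirement ID.

    Assumed convention: upper-case parts joined by "-", with purely
    numeric parts zero-padded to `width` digits.
    """
    parts = [p for p in re.split(r"[\s_\-]+", raw_id.strip()) if p]
    normalized = [p.zfill(width) if p.isdigit() else p.upper() for p in parts]
    return "-".join(normalized)


record = {"id": "req_safety 1", "requirement": " "}
print(find_missing_fields(record))   # ['requirement']
print(standardize_id(record["id"]))  # REQ-SAFETY-001
```

Running the missing-field check before normalization keeps incomplete records visible for review instead of silently tidying their IDs.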
Accuracy by Source Type
| Source Type | Accuracy Target | Key Features |
|---|---|---|
| PDF Requirements | >90% | Hierarchical extraction, ID preservation |
| Architecture Diagrams | >85% | VLM component detection, relationship mapping |
| Screenshots | >85% | OCR text extraction, table detection |
| Schematics | >80% | Part number extraction, symbol recognition |
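If you want to spot-check an extraction against these targets, one simple approach is to review a sample and compare the accepted fraction with the target for that source type. The sketch below assumes accuracy means the share of reviewed items accepted without edits, which the table does not define; the `meets_target` helper and the sample counts are hypothetical.

```python
# Targets copied from the table above, expressed as fractions.
TARGETS = {
    "PDF Requirements": 0.90,
    "Architecture Diagrams": 0.85,
    "Screenshots": 0.85,
    "Schematics": 0.80,
}


def meets_target(source_type: str, accepted: int, reviewed: int) -> bool:
    """Return True when the accepted fraction exceeds the target.

    The comparison is strict because the targets are stated as ">90%", etc.
    """
    if reviewed <= 0:
        raise ValueError("reviewed must be a positive count")
    return accepted / reviewed > TARGETS[source_type]


# Illustrative counts from a hypothetical manual review.
print(meets_target("PDF Requirements", accepted=92, reviewed=100))  # True
print(meets_target("Schematics", accepted=80, reviewed=100))        # False
```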
Maximizing Accuracy
To get the best extraction results:
- Use High-Resolution Images: Clear, high-resolution images yield better VLM analysis
- Well-Formatted Documents: Documents with consistent formatting extract more accurately
- Clear Labels: Diagrams with readable labels improve component detection
- Review and Edit: Use the inline editor to refine extracted content
Integration with Safety Workflows
Extracted requirements and components integrate directly with Saphira’s safety analysis:
- FMEA: Extracted components become focus elements for failure mode analysis
- HARA: Requirements populate hazard sources and control measures
- TARA: Interface definitions feed cybersecurity asset identification
- Gap Analysis: Extracted specifications enable automated standards compliance checking
- Safety Case: Requirements become evidence in GSN safety arguments
Next Steps After Extraction
After extracting requirements from documents or images, use the Write & Check Requirements workflow to:
- Classify requirements at the appropriate level (Stakeholder, System, Subsystem, Component, HW/SW)
- Validate quality with INCOSE linting
- Establish traceability links to safety analyses and other requirements
- Import to your project database with full traceability
Related Features
Write & Check Requirements
Classify, validate, and refine extracted requirements with AI-assisted quality checking
Import from Jama
Pull requirements from Jama, clean with AI, and sync back
Components Tab
Manage extracted components and BOM data
Standards Chatbot
Ask questions about extracted requirements against standards

