
Extract Requirements Overview

Saphira’s Extract Requirements feature uses advanced OCR, NLP, and Vision Language Models (VLMs) to automatically extract requirements, components, and technical specifications from your source materials with high accuracy.

Getting Started

From the Home Page, go to the Seek Guidance section and click Extract Requirements. This opens the extraction method selection dialog, where you choose how to extract requirements from your source material.

Choose Extraction Method

Saphira provides two extraction methods optimized for different source materials:

Extract from Document

Upload PDF documents to extract requirements and text content.
Best for:
  • Technical specifications
  • Requirements documents
  • Standards documents
  • Engineering reports

Extract from Image

Upload architecture diagrams, screenshots, or images to extract requirements and structure.
Best for:
  • System architecture diagrams
  • Block diagrams and flowcharts
  • Screenshots of requirements
  • Schematic drawings

Extract from Document

Upload PDF documents to extract structured requirements.

Supported Document Types

Extract from PDF files including:
  • Technical specifications and requirements documents
  • Standards documents (ISO, IEC, UL standards)
  • Engineering reports and design documents
  • Regulatory compliance documentation
Features:
  • Page-level extraction with section preservation
  • Table and figure extraction
  • Cross-reference detection and linking
  • Hierarchical structure preservation

Document Extraction Workflow

1

Select Extract from Document

From the extraction method dialog, click Extract from Document
2

Upload Your Document

Drag and drop or browse to select your PDF file
3

AI Extracts Requirements

Saphira analyzes the document and extracts:
  • Requirement IDs and descriptions
  • Values, units, and logical operators
  • Parent-child relationships
  • Cross-references
4

Review Extracted Requirements

Review the extracted requirements:
  • Requirement IDs and descriptions
  • Values, units, and logical operators
  • Parent-child relationships
  • Cross-references
5

Refine & Import

After extraction, use Write & Check Requirements to classify, validate, and import to your project database with full traceability

Extracted Fields

For each requirement, Saphira extracts:
Field | Description | Example
ID | Unique requirement identifier | REQ-SAFETY-001
Requirement | Full requirement text | “The system shall…”
Parent | Parent requirement or category | Safety
Logical Operator | Comparison operators | >=, <=, within
Unit | Measurement unit | ms, °C, Gauss
Value | Numeric threshold | 500, 0.20-0.55
Requirement Type | Numeric or Textual | Numeric
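
If you copy extracted requirements into your own scripts, it can help to model one record explicitly. The sketch below mirrors the fields listed above as a Python dataclass. The class name `ExtractedRequirement`, the attribute names, and the `is_complete` helper are hypothetical and are not Saphira's API or export format. The rule that a numeric requirement needs an operator and a value is an assumption of this sketch, and the requirement text is a made-up example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedRequirement:
    """Hypothetical record mirroring the extracted fields (not Saphira's schema)."""
    id: str                                  # e.g. "REQ-SAFETY-001"
    requirement: str                         # full requirement text
    parent: Optional[str] = None             # parent requirement or category
    logical_operator: Optional[str] = None   # e.g. ">=", "<=", "within"
    unit: Optional[str] = None               # e.g. "ms"
    value: Optional[str] = None              # text, so ranges like "0.20-0.55" fit
    requirement_type: str = "Textual"        # "Numeric" or "Textual"

    def is_complete(self) -> bool:
        """Assumed rule: numeric requirements also need an operator and a value."""
        has_basics = bool(self.id and self.requirement)
        if self.requirement_type != "Numeric":
            return has_basics
        return has_basics and bool(self.logical_operator and self.value)


# Made-up example record for illustration only.
example = ExtractedRequirement(
    id="REQ-SAFETY-001",
    requirement="The system shall respond within 500 ms.",
    parent="Safety",
    logical_operator="<=",
    unit="ms",
    value="500",
    requirement_type="Numeric",
)
print(example.is_complete())  # True
```

Keeping `value` as text rather than a number is a deliberate choice here, since the example column includes both single thresholds and ranges.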

Extract from Image

Upload visual content like architecture diagrams and screenshots to extract requirements and system structure.

Supported Image Types

Extract from system architecture diagrams:
  • Component names and their roles (ECU, function, subsystem, system, process)
  • Connections and data flows between components
  • System boundaries and interfaces
  • Hierarchical structure
The VLM automatically:
  • Detects if an image is an architecture diagram (with confidence scoring)
  • Extracts every labeled element
  • Identifies relationships between components
Extract from visual diagrams:
  • Process steps and decision points
  • Data flow arrows and connections
  • Functional blocks and modules
  • State transitions
Extract from captured screenshots:
  • Requirements from legacy systems
  • Tables and structured data
  • Form fields and values
  • Any visible text content
Extract from electrical/electronic schematics:
  • Part numbers and component symbols
  • Component values and ratings
  • Connection information
  • Reference designators

Image Extraction Workflow

1

Select Extract from Image

From the extraction method dialog, click Extract from Image
2

Upload Your Image

Upload architecture diagrams, screenshots, or images (PNG, JPG, PDF)
3

VLM Analyzes Image

Saphira’s Vision Language Model:
  • Detects document type (diagram, chart, table, screenshot)
  • Extracts all visible text
  • Identifies technical components and relationships
  • Generates searchable keywords
4

Review Extracted Content

Review the structured extraction:
  • Components with roles and descriptions
  • Requirements derived from visual content
  • Hierarchical relationships
5

Refine & Import

After extraction, use Write & Check Requirements to classify, validate, and import to your project database with full traceability

Architecture Diagram Processing

When you upload an architecture diagram, Saphira:
  1. Diagram Detection: Uses VLM to identify the image type with confidence scoring
  2. Component Enumeration: Extracts every labeled element:
    • Component names from diagram labels
    • Role classification (ECU, function, subsystem, system, process)
    • Brief descriptions of purpose and connections
  3. Relationship Mapping: Identifies connections and data flows
  4. Context Integration: Combines with item definition for complete system understanding
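
One way to picture the output of the steps above is as a small graph: components carry a role, and connections are directed data flows between them. The following is a minimal sketch under that assumption. `Role`, `Component`, `DiagramExtraction`, and their methods are hypothetical names, the confidence range of 0 to 1 is assumed, and the component names are invented; none of this reflects Saphira's internal data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class Role(Enum):
    """Role labels named in the component enumeration step."""
    ECU = "ECU"
    FUNCTION = "function"
    SUBSYSTEM = "subsystem"
    SYSTEM = "system"
    PROCESS = "process"


@dataclass
class Component:
    name: str
    role: Role
    description: str = ""


@dataclass
class DiagramExtraction:
    """Hypothetical container for one analyzed diagram."""
    is_architecture_diagram: bool
    confidence: float  # detection confidence, assumed to lie in [0, 1]
    components: Dict[str, Component] = field(default_factory=dict)
    connections: List[Tuple[str, str]] = field(default_factory=list)

    def add_component(self, component: Component) -> None:
        self.components[component.name] = component

    def connect(self, source: str, target: str) -> None:
        """Record a directed data flow between two known components."""
        for name in (source, target):
            if name not in self.components:
                raise ValueError(f"unknown component: {name}")
        self.connections.append((source, target))

    def downstream_of(self, name: str) -> List[str]:
        """Names of components that receive data from `name`."""
        return [dst for src, dst in self.connections if src == name]


# Made-up example content, not taken from a real diagram.
diagram = DiagramExtraction(is_architecture_diagram=True, confidence=0.93)
diagram.add_component(Component("Wheel Speed Sensor", Role.FUNCTION))
diagram.add_component(Component("Brake ECU", Role.ECU, "Controls brake pressure"))
diagram.connect("Wheel Speed Sensor", "Brake ECU")
print(diagram.downstream_of("Wheel Speed Sensor"))  # ['Brake ECU']
```

Rejecting connections to unknown components mirrors the idea that every relationship should refer back to a labeled element found during enumeration.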

Extraction Accuracy

>90% Accuracy Target

Saphira uses a multi-pass extraction approach to achieve high accuracy:
  • Document/image structure analysis
  • Primary content extraction
  • Initial requirement identification
  • Component detection from diagrams
  • Cross-validation of extracted content
  • Relationship verification
  • Missing field detection
  • Confidence scoring
  • Format normalization
  • ID standardization
  • Traceability link verification
  • Quality assurance checks
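
To make two of the checks above concrete, here is a minimal sketch of missing field detection and ID standardization applied to plain dictionaries. It is an illustration only: the function names, the set of required fields, and the ID convention (uppercase, hyphen-separated, trailing number zero-padded to three digits) are assumptions, not a description of how Saphira implements these passes.

```python
import re
from typing import Dict, List

# Assumed minimum for a usable record; the real rule set is not documented here.
REQUIRED_FIELDS = ("id", "requirement")


def standardize_id(raw_id: str, width: int = 3) -> str:
    """Uppercase the ID, join its parts with hyphens, zero-pad a trailing number."""
    parts = [p for p in re.split(r"[\s_\-]+", raw_id.strip().upper()) if p]
    if parts and parts[-1].isdigit():
        parts[-1] = parts[-1].zfill(width)
    return "-".join(parts)


def missing_fields(record: Dict[str, str]) -> List[str]:
    """Return the required field names that are absent or blank."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(record.get(name) or "").strip()
    ]


print(standardize_id("req_safety 1"))                          # REQ-SAFETY-001
print(missing_fields({"id": "REQ-1", "requirement": "  "}))    # ['requirement']
```

An ID that already follows the assumed convention, such as REQ-SAFETY-001, passes through `standardize_id` unchanged.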

Accuracy by Source Type

Source Type | Accuracy Target | Key Features
PDF Requirements | >90% | Hierarchical extraction, ID preservation
Architecture Diagrams | >85% | VLM component detection, relationship mapping
Screenshots | >85% | OCR text extraction, table detection
Schematics | >80% | Part number extraction, symbol recognition
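
If you want to compare your own review results against these targets, a simple spot check is enough. The sketch below assumes that accuracy means the share of reviewed items accepted without correction, which this page does not define. The `TARGETS` mapping copies the figures above, while `spot_check` and the sample counts are hypothetical.

```python
from typing import Tuple

# Accuracy targets from the table above, expressed as fractions.
TARGETS = {
    "PDF Requirements": 0.90,
    "Architecture Diagrams": 0.85,
    "Screenshots": 0.85,
    "Schematics": 0.80,
}


def spot_check(source_type: str, correct: int, reviewed: int) -> Tuple[float, bool]:
    """Return (observed accuracy, whether it exceeds the target for this source)."""
    if reviewed <= 0:
        raise ValueError("reviewed must be a positive count")
    accuracy = correct / reviewed
    return accuracy, accuracy > TARGETS[source_type]


# Made-up sample: 42 of 50 reviewed schematic items needed no correction.
accuracy, meets_target = spot_check("Schematics", correct=42, reviewed=50)
print(f"{accuracy:.0%}", meets_target)  # 84% True
```

The comparison is strict because the targets are stated as "greater than", so a result exactly on the threshold does not count as meeting it.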

Maximizing Accuracy

To get the best extraction results:
  1. Use High-Resolution Images: Clear, high-resolution images yield better VLM analysis
  2. Well-Formatted Documents: Documents with consistent formatting extract more accurately
  3. Clear Labels: Diagrams with readable labels improve component detection
  4. Review and Edit: Use the inline editor to refine extracted content

Integration with Safety Workflows

Extracted requirements and components integrate directly with Saphira’s safety analysis:
  • FMEA: Extracted components become focus elements for failure mode analysis
  • HARA: Requirements populate hazard sources and control measures
  • TARA: Interface definitions feed cybersecurity asset identification
  • Gap Analysis: Extracted specifications enable automated standards compliance checking
  • Safety Case: Requirements become evidence in GSN safety arguments

Next Steps After Extraction

After extracting requirements from documents or images, use the Write & Check Requirements workflow to:
  • Classify requirements at the appropriate level (Stakeholder, System, Subsystem, Component, HW/SW)
  • Validate quality with INCOSE linting
  • Establish traceability links to safety analyses and other requirements
  • Import to your project database with full traceability