Extract Requirements

Extract Requirements Overview

Saphira’s Extract Requirements feature uses advanced OCR, NLP, and Vision Language Models (VLMs) to automatically extract requirements, components, and technical specifications from your source materials with high accuracy.

Getting Started

From the Home Page, navigate to the Seek Guidance section and click Extract Requirements.

This opens the extraction method selection dialog where you choose how to extract requirements from your source material.

Choose Extraction Method

Saphira provides two extraction methods optimized for different source materials:

Extract from Document

Upload PDF documents to extract requirements and text content. Best for:

Technical specifications
Requirements documents
Standards documents
Engineering reports

Extract from Image

Upload architecture diagrams, screenshots, or images to extract requirements and structure. Best for:

System architecture diagrams
Block diagrams and flowcharts
Screenshots of requirements
Schematic drawings

Extract from Document

Upload PDF documents to extract structured requirements.

Supported Document Types

PDF Documents

Extract from PDF files including:

Technical specifications and requirements documents
Standards documents (ISO, IEC, UL standards)
Engineering reports and design documents
Regulatory compliance documentation

Features:

Page-level extraction with section preservation
Table and figure extraction
Cross-reference detection and linking
Hierarchical structure preservation

Document Extraction Workflow

Select Extract from Document

From the extraction method dialog, click Extract from Document

Upload Your Document

Drag and drop or browse to select your PDF file

AI Extracts Requirements

Saphira analyzes the document and extracts:

Requirement IDs and descriptions
Values, units, and logical operators
Parent-child relationships
Cross-references

Review Extracted Requirements

Review the extracted requirements:

Requirement IDs and descriptions
Values, units, and logical operators
Parent-child relationships
Cross-references

Refine & Import

After extraction, use Write & Check Requirements to classify, validate, and import to your project database with full traceability
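
The parent-child relationships reviewed in the workflow above can be pictured as a tree built from flat records. The following is a minimal sketch, not Saphira's implementation: the record shape, the `build_hierarchy` helper, and every ID other than `REQ-SAFETY-001` are hypothetical, and it assumes the parent is given as a requirement ID rather than a category name.

```python
from collections import defaultdict

# Hypothetical flat records, shaped like rows of extracted requirements.
records = [
    {"id": "REQ-SAFETY-001", "parent": None},
    {"id": "REQ-SAFETY-001.1", "parent": "REQ-SAFETY-001"},
    {"id": "REQ-SAFETY-001.2", "parent": "REQ-SAFETY-001"},
    {"id": "REQ-SAFETY-002", "parent": "REQ-MISSING"},
]


def build_hierarchy(records):
    """Group child IDs under their parents and report unresolved parents."""
    known = {r["id"] for r in records}
    children = defaultdict(list)
    roots, orphans = [], []
    for r in records:
        parent = r["parent"]
        if parent is None:
            roots.append(r["id"])            # top-level requirement
        elif parent in known:
            children[parent].append(r["id"])  # resolved parent-child link
        else:
            orphans.append(r["id"])          # parent was never extracted
    return roots, dict(children), orphans


roots, children, orphans = build_hierarchy(records)
```

Orphans are the records worth checking first during review, since a parent that does not resolve usually points to a missed requirement or a mistyped reference.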

Extracted Fields

For each requirement, Saphira extracts:

Field	Description	Example
ID	Unique requirement identifier	`REQ-SAFETY-001`
Requirement	Full requirement text	“The system shall…”
Parent	Parent requirement or category	`Safety`
Logical Operator	Comparison operators	`>=`, `<=`, `within`
Unit	Measurement unit	`ms`, `°C`, `Gauss`
Value	Numeric threshold	`500`, `0.20-0.55`
Requirement Type	Numeric or Textual	`Numeric`
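
To make the field table concrete, here is one way the extracted fields could be modeled and used downstream. This is a sketch under stated assumptions: the `ExtractedRequirement` class, the `is_satisfied` helper, and the sample requirement texts and IDs are hypothetical and do not describe Saphira's export format or API. It also assumes range values are written as `low-high` with non-negative bounds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedRequirement:
    """Hypothetical container mirroring the fields in the table above."""
    id: str
    requirement: str
    parent: Optional[str]
    logical_operator: Optional[str]
    unit: Optional[str]
    value: Optional[str]
    requirement_type: str  # "Numeric" or "Textual"


def is_satisfied(req, measured):
    """Check a measured number against a Numeric requirement."""
    if req.requirement_type != "Numeric" or req.value is None:
        raise ValueError("only Numeric requirements with a value can be checked")
    if req.logical_operator == ">=":
        return measured >= float(req.value)
    if req.logical_operator == "<=":
        return measured <= float(req.value)
    if req.logical_operator == "within":
        # Assumes a "low-high" range with non-negative bounds, e.g. "0.20-0.55".
        low, high = (float(part) for part in req.value.split("-"))
        return low <= measured <= high
    raise ValueError(f"unsupported operator: {req.logical_operator}")


# Hypothetical example records; the requirement texts are invented for illustration.
latency = ExtractedRequirement(
    id="REQ-SAFETY-001",
    requirement="The response time shall be at most 500 ms.",
    parent="Safety",
    logical_operator="<=",
    unit="ms",
    value="500",
    requirement_type="Numeric",
)
field_strength = ExtractedRequirement(
    id="REQ-SAFETY-002",
    requirement="The field strength shall be within 0.20-0.55 Gauss.",
    parent="Safety",
    logical_operator="within",
    unit="Gauss",
    value="0.20-0.55",
    requirement_type="Numeric",
)
```

Keeping the value as text, as in the table's `0.20-0.55` example, lets a single field hold both thresholds and ranges; the operator decides how it is parsed.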

Extract from Image

Upload visual content like architecture diagrams and screenshots to extract requirements and system structure.

Supported Image Types

Architecture Diagrams

Extract from system architecture diagrams:

Component names and their roles (ECU, function, subsystem, system, process)
Connections and data flows between components
System boundaries and interfaces
Hierarchical structure

The VLM automatically:

Detects if an image is an architecture diagram (with confidence scoring)
Extracts every labeled element
Identifies relationships between components

Block Diagrams & Flowcharts

Extract from visual diagrams:

Process steps and decision points
Data flow arrows and connections
Functional blocks and modules
State transitions

Screenshots

Extract from captured screenshots:

Requirements from legacy systems
Tables and structured data
Form fields and values
Any visible text content

Schematic Drawings

Extract from electrical/electronic schematics:

Part numbers and component symbols
Component values and ratings
Connection information
Reference designators

Image Extraction Workflow

Select Extract from Image

From the extraction method dialog, click Extract from Image

Upload Your Image

Upload architecture diagrams, screenshots, or images (PNG, JPG, PDF)

VLM Analyzes Image

Saphira’s Vision Language Model:

Detects document type (diagram, chart, table, screenshot)
Extracts all visible text
Identifies technical components and relationships
Generates searchable keywords

Review Extracted Content

Review the structured extraction:

Components with roles and descriptions
Requirements derived from visual content
Hierarchical relationships

Refine & Import

After extraction, use Write & Check Requirements to classify, validate, and import to your project database with full traceability

Architecture Diagram Processing

When you upload an architecture diagram, Saphira performs the following steps:

1. Diagram Detection: Uses the VLM to identify the image type with confidence scoring
2. Component Enumeration: Extracts every labeled element:
   - Component names from diagram labels
   - Role classification (ECU, function, subsystem, system, process)
   - Brief descriptions of purpose and connections
3. Relationship Mapping: Identifies connections and data flows
4. Context Integration: Combines the extracted structure with the item definition for complete system understanding
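
The output of the steps above can be thought of as a small graph of components and connections. The sketch below is illustrative only: the `Component` and `DiagramExtraction` classes, the confidence value, and the brake-system component names are hypothetical, and only the five role names come from this page.

```python
from dataclasses import dataclass, field

# Role names listed on this page for architecture diagram components.
ROLES = {"ECU", "function", "subsystem", "system", "process"}


@dataclass
class Component:
    name: str
    role: str
    description: str = ""


@dataclass
class DiagramExtraction:
    """Hypothetical result of processing one architecture diagram."""
    is_architecture_diagram: bool
    confidence: float
    components: dict = field(default_factory=dict)
    connections: list = field(default_factory=list)  # (source, target, label)

    def add_component(self, component):
        if component.role not in ROLES:
            raise ValueError(f"unknown role: {component.role}")
        self.components[component.name] = component

    def connect(self, source, target, label=""):
        # Only allow connections between components that were enumerated.
        for name in (source, target):
            if name not in self.components:
                raise KeyError(f"unknown component: {name}")
        self.connections.append((source, target, label))

    def neighbors(self, name):
        """Names of components directly connected to `name`, in either direction."""
        outgoing = {t for s, t, _ in self.connections if s == name}
        incoming = {s for s, t, _ in self.connections if t == name}
        return sorted(outgoing | incoming)


# Invented example diagram content.
diagram = DiagramExtraction(is_architecture_diagram=True, confidence=0.93)
diagram.add_component(Component("Brake ECU", "ECU", "Controls brake actuation"))
diagram.add_component(Component("Wheel Speed Sensing", "function"))
diagram.add_component(Component("Braking Subsystem", "subsystem"))
diagram.connect("Wheel Speed Sensing", "Brake ECU", "wheel speed")
diagram.connect("Brake ECU", "Braking Subsystem", "actuation command")
```

Rejecting connections to unknown components is a deliberate choice in this sketch: it surfaces labels that were missed during enumeration before the relationships are reviewed.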

Extraction Accuracy

>90% Accuracy Target

Saphira uses a multi-pass extraction approach to achieve high accuracy:

Pass 1: Initial Extraction

Document/image structure analysis
Primary content extraction
Initial requirement identification
Component detection from diagrams

Pass 2: Refinement

Cross-validation of extracted content
Relationship verification
Missing field detection
Confidence scoring

Pass 3: Validation

Format normalization
ID standardization
Traceability link verification
Quality assurance checks
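
The three passes can be pictured as a chain of small functions, each refining the records produced by the previous one. This is a toy sketch and not Saphira's pipeline: the function names, the line format `<id>: <text>`, the list of required fields, and the confidence formula are all assumptions chosen for illustration.

```python
import re

# Hypothetical: fields every record is expected to carry after extraction.
REQUIRED_FIELDS = ("id", "requirement", "parent")

# Assumed input shape: "<id>: <text>", where the ID looks like req_safety_1.
LINE_PATTERN = re.compile(
    r"^\s*(req[\s_-]+[a-z]+[\s_-]+\d+)\s*:\s*(.+)$", re.IGNORECASE
)


def pass1_extract(lines):
    """Pass 1: initial requirement identification from raw text lines."""
    records = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            records.append(
                {"id": match.group(1), "requirement": match.group(2).strip(), "parent": None}
            )
    return records


def pass2_refine(records):
    """Pass 2: missing field detection plus a naive confidence score."""
    for record in records:
        missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
        record["missing"] = missing
        record["confidence"] = 1.0 - len(missing) / len(REQUIRED_FIELDS)
    return records


def pass3_validate(records):
    """Pass 3: ID standardization to REQ-<CATEGORY>-<3-digit number>."""
    for record in records:
        prefix, category, number = re.split(r"[\s_-]+", record["id"].strip())
        record["id"] = f"{prefix.upper()}-{category.upper()}-{int(number):03d}"
    return records


def run_pipeline(lines):
    return pass3_validate(pass2_refine(pass1_extract(lines)))


# Invented input lines; the middle one is page residue and should be skipped.
lines = [
    "req_safety_1: The pump shall stop within 500 ms.",
    "Page 3 of 12",
    "REQ-safety-02: The alarm shall sound.",
]
records = run_pipeline(lines)
```

Running the passes in this order means normalization only touches records that already passed identification and field checks, which keeps each pass simple to reason about.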

Accuracy by Source Type

Source Type	Accuracy Target	Key Features
PDF Requirements	>90%	Hierarchical extraction, ID preservation
Architecture Diagrams	>85%	VLM component detection, relationship mapping
Screenshots	>85%	OCR text extraction, table detection
Schematics	>80%	Part number extraction, symbol recognition
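
This page does not state how the accuracy targets are measured. One plausible way to check results on your own material is to compare extracted fields against a manually reviewed sample, as in the sketch below. The `field_accuracy` helper, the field-level matching rule, and the sample data are assumptions, not Saphira's metric.

```python
def field_accuracy(extracted, reviewed, fields):
    """Share of (record, field) pairs where extraction matches the reviewed value."""
    total = 0
    correct = 0
    for req_id, truth in reviewed.items():
        candidate = extracted.get(req_id, {})  # a missing record counts as all wrong
        for name in fields:
            total += 1
            if candidate.get(name) == truth.get(name):
                correct += 1
    return correct / total if total else 0.0


# Hypothetical reviewed sample (ground truth) and extraction output.
reviewed = {
    "REQ-SAFETY-001": {"value": "500", "unit": "ms", "operator": "<="},
    "REQ-SAFETY-002": {"value": "0.20-0.55", "unit": "Gauss", "operator": "within"},
}
extracted = {
    "REQ-SAFETY-001": {"value": "500", "unit": "ms", "operator": "<="},
    "REQ-SAFETY-002": {"value": "0.20-0.55", "unit": "Gauss", "operator": ">="},
}

accuracy = field_accuracy(extracted, reviewed, ("value", "unit", "operator"))
meets_pdf_target = accuracy > 0.90  # compare against the PDF Requirements target
```

With a sample this small the number is only indicative; a larger reviewed set gives a more stable estimate.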

Maximizing Accuracy

To get the best extraction results:

Use High-Resolution Images: Clear, high-resolution images yield better VLM analysis
Use Well-Formatted Documents: Documents with consistent formatting are extracted more accurately
Use Clear Labels: Diagrams with readable labels improve component detection
Review and Edit: Use the inline editor to refine extracted content

Integration with Safety Workflows

Extracted requirements and components integrate directly with Saphira’s safety analysis:

FMEA: Extracted components become focus elements for failure mode analysis
HARA: Requirements populate hazard sources and control measures
TARA: Interface definitions feed cybersecurity asset identification
Gap Analysis: Extracted specifications enable automated standards compliance checking
Safety Case: Requirements become evidence in GSN safety arguments

Next Steps After Extraction

After extracting requirements from documents or images, use the Write & Check Requirements workflow to:

Classify requirements at the appropriate level (Stakeholder, System, Subsystem, Component, HW/SW)
Validate quality with INCOSE linting
Establish traceability links to safety analyses and other requirements
Import to your project database with full traceability
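
To give a feel for what requirement linting involves, the sketch below flags a few common wording problems. It is loosely inspired by the kind of rules found in INCOSE requirement-writing guidance, but the rule set, the `VAGUE_TERMS` list, and the `lint_requirement` function are hypothetical and far smaller than what a real linter, including Saphira's, would apply.

```python
import re

# Hypothetical, deliberately short list of vague terms to flag.
VAGUE_TERMS = ("adequate", "as appropriate", "user-friendly", "sufficient", "fast")


def lint_requirement(text):
    """Return a list of findings for one requirement statement."""
    findings = []
    lowered = text.lower()
    # A binding requirement is conventionally phrased with "shall".
    if not re.search(r"\bshall\b", lowered):
        findings.append("missing 'shall'")
    # Vague terms cannot be verified objectively.
    for term in VAGUE_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", lowered):
            findings.append(f"vague term: {term}")
    # "and/or" leaves the required combination ambiguous.
    if re.search(r"\band/or\b", lowered):
        findings.append("ambiguous 'and/or'")
    return findings


# Invented example statements.
clean = lint_requirement("The system shall respond within 500 ms.")
flagged = lint_requirement("The UI should be user-friendly and fast.")
```

A statement with no findings is not necessarily a good requirement; checks like these only catch surface wording issues and complement, rather than replace, the review step.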

Write & Check Requirements

Classify, validate, and refine extracted requirements with AI-assisted quality checking

Import from Jama

Pull requirements from Jama, clean with AI, and sync back

Components Tab

Manage extracted components and BOM data

Standards Chatbot

Ask questions about extracted requirements against standards
