Technical Whitepaper

From Black Box to Glass Box

A Clinical Auditing System for complying with the EU AI Act: Using PyTorch and Captum to verify physiological fidelity in generative neuroimaging.

Abstract

As deep learning models like NeuroBOLT bridge the gap between EEG and fMRI, their "black box" nature remains a barrier to clinical adoption. Regulatory standards (FDA/EU AI Act) now demand that medical AI be transparent and explainable. This paper presents a "Glass Box" auditing framework built entirely in PyTorch. By implementing an Audit Layer using Integrated Gradients, we demonstrate how to map 4D volumetric outputs back to 1D EEG frequency bands, enabling clinicians to verify physiological reasoning before patient care decisions are made.

01. The "Black Box" Liability

State-of-the-art generative models demonstrate mathematical excellence in EEG-to-fMRI synthesis, often achieving high Structural Similarity Indices (SSIM). However, in a clinical setting, accuracy is not enough.

With the EU AI Act classifying medical AI systems as "High-Risk," hospitals face a binary choice: deploy transparent, auditable systems or do not deploy at all. A "Black Box" model that outputs a statistically plausible but physiologically hallucinated brain scan represents a catastrophic failure mode.

The Compliance Challenge

"High-risk AI systems shall be designed and developed in such a way as to ensure that their operation is sufficiently transparent to enable users to interpret the system's output and use it appropriately."
- EU AI Act, Article 13

02. The Clinical Auditing System

We introduce a post-hoc interpretability pipeline—the Audit Layer—that wraps the core generative model. This system does not just predict; it explains.

Physiological Verification

Identifying whether predictions are driven by valid neural oscillations (Alpha/Delta waves) or by noise artifacts (muscle movement, electrode drift).

Interpretable Reasoning

Mapping high-dimensional 4D fMRI voxels back to specific 1D EEG time-series segments, creating a "Reverse Neuro-Attribution" map.

03. Technical Implementation (PyTorch & Captum)

We leverage the Captum library for model interpretability. Specifically, we employ Integrated Gradients, an axiomatic attribution method that assigns an importance score to each input feature (EEG sample point) by approximating the integral of gradients.
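In practice, the integral in Integrated Gradients is approximated by a Riemann sum of gradients evaluated at inputs interpolated between a baseline and the actual input. The following NumPy sketch (independent of Captum, with all names illustrative) applies the midpoint-rule approximation to a toy model F(x) = Σ xᵢ², whose gradient is known in closed form, and checks the completeness axiom: attributions sum to F(input) − F(baseline).

```python
import numpy as np

# Toy differentiable "model": F(x) = sum(x_i^2), with a closed-form gradient,
# so the Riemann-sum approximation used by Integrated Gradients is easy to verify.
def model(x):
    return np.sum(x ** 2)

def grad_model(x):
    return 2 * x

def integrated_gradients(x, baseline, n_steps=50):
    """Approximate IG_i = (x_i - b_i) * integral_0^1 dF/dx_i(b + a*(x - b)) da."""
    delta = x - baseline
    # Midpoint-rule Riemann sum over n_steps points between baseline and input
    alphas = (np.arange(n_steps) + 0.5) / n_steps
    grads = np.stack([grad_model(baseline + a * delta) for a in alphas])
    return delta * grads.mean(axis=0)

x = np.array([1.0, -2.0, 3.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(x, baseline)

# Completeness axiom: attributions sum to F(x) - F(baseline)
print(np.allclose(attr.sum(), model(x) - model(baseline)))  # True
```

With a zero baseline and F(x) = Σ xᵢ², each attribution reduces to xᵢ², which is why the axiomatic guarantee holds exactly here; for a deep network, the quality of the approximation depends on `n_steps`.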

The Audit Layer Code Structure

The following PyTorch snippet demonstrates how we inspect a specific voxel's activation (e.g., in the Amygdala) and trace its cause back to the EEG input.

from captum.attr import IntegratedGradients

# Assumed (x, y, z) dimensions of the generated fMRI volume
VOLUME_SHAPE = (64, 64, 48)

def flatten_coords(coords, shape=VOLUME_SHAPE):
    """Map (x, y, z) voxel coordinates to a flat index in the model output."""
    x, y, z = coords
    return (x * shape[1] + y) * shape[2] + z

def audit_voxel_prediction(model, eeg_input, target_voxel_coords):
    """
    Backpropagates from a specific fMRI voxel to find the driving EEG features.
    """
    ig = IntegratedGradients(model)

    # Define the target: the intensity of one specific voxel (x, y, z)
    target_idx = flatten_coords(target_voxel_coords)

    # Calculate attribution scores
    attributions, delta = ig.attribute(
        inputs=eeg_input,
        target=target_idx,
        n_steps=50,
        return_convergence_delta=True,
    )
    return attributions  # Importance map over the EEG waveform
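To see the voxel-to-EEG backpropagation end to end without a trained NeuroBOLT model, the sketch below substitutes a single linear layer for the generator (purely an illustrative assumption) and traces one voxel's gradient back to the input by hand. For a linear map, the resulting saliency is exactly the corresponding weight row, which makes the mechanism easy to verify.

```python
import torch

# Minimal stand-in for the generative model: maps a 1D EEG window
# (batch, samples) to a flattened fMRI volume. The real network is far
# larger; this toy linear layer only illustrates how a single voxel's
# gradient flows back to the EEG input.
torch.manual_seed(0)
n_eeg, n_voxels = 256, 4 * 4 * 4
toy_model = torch.nn.Linear(n_eeg, n_voxels)

eeg_input = torch.randn(1, n_eeg, requires_grad=True)
volume = toy_model(eeg_input)  # shape (1, 64): flattened 4x4x4 volume

# Pick one voxel (x=1, y=2, z=3 in the 4x4x4 grid) and backpropagate its
# intensity alone; the input gradient is a crude per-sample saliency map.
x, y, z = 1, 2, 3
target_idx = (x * 4 + y) * 4 + z
volume[0, target_idx].backward()

saliency = eeg_input.grad[0]  # importance of each EEG sample for that voxel
print(saliency.shape)  # torch.Size([256])
```

Integrated Gradients refines this raw gradient by averaging it along the baseline-to-input path, which is what the Captum call above performs.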

Cross-Modal Reasoning: The Reality Check

Once attributions are computed, we perform spectral analysis on the "important" EEG segments. A compliant prediction must obey neurovascular coupling laws.

  • Pass Case: High attribution in the 0.5-4 Hz (Delta) band correlates with Thalamic activation (sleep maintenance).
  • Fail Case: High attribution in the >40 Hz (Gamma) band, or at sudden voltage spikes, correlates with deep-brain activation. This is likely EMG (muscle) artifact or noise.
Example alert raised by the Audit Layer: "Hemodynamic Lag Violation Detected (Response < 2 s)"
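The pass/fail logic above can be sketched as a simple band-power comparison on the high-attribution EEG segments. The sampling rate, band edges, and decision rule below are illustrative assumptions rather than calibrated clinical thresholds, and on real data Welch's method would be preferable to a raw FFT:

```python
import numpy as np

FS = 256  # assumed EEG sampling rate in Hz

def band_power(segment, fs, low, high):
    """Mean power of a 1D EEG segment within [low, high) Hz via the FFT."""
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    power = np.abs(np.fft.rfft(segment)) ** 2
    mask = (freqs >= low) & (freqs < high)
    return power[mask].mean()

def physiology_check(segment, fs=FS):
    """Flag segments whose attribution mass sits in EMG-range frequencies."""
    delta = band_power(segment, fs, 0.5, 4.0)     # sleep-related oscillation
    gamma = band_power(segment, fs, 40.0, 100.0)  # typical EMG contamination
    return "PASS" if delta > gamma else "FAIL: likely EMG/noise artifact"

t = np.arange(2 * FS) / FS
slow_wave = np.sin(2 * np.pi * 2.0 * t)   # 2 Hz Delta oscillation
muscle = np.sin(2 * np.pi * 60.0 * t)     # 60 Hz EMG-like artifact
print(physiology_check(slow_wave))  # PASS
print(physiology_check(muscle))     # FAIL: likely EMG/noise artifact
```

A production audit would run this check only over the EEG samples that received high attribution scores, so that the verdict reflects what the model actually relied on rather than the whole recording.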

Conclusion: Compliance-by-Code

By wrapping generative models in this "Glass Box" safety layer, we satisfy the "human oversight" and transparency requirements of modern healthcare regulation. This turns PyTorch from a research sandbox into a viable vehicle for clinical deployment, ensuring that when the AI speaks, it speaks based on physiology, not noise.