Explainable Multimodal Models for Critical Infrastructure
Published:
Critical infrastructure operators increasingly rely on multimodal perception systems that fuse imagery, acoustic signatures, and telemetry feeds. Explainability research, however, has lagged behind the architectural complexity of these systems. I propose a governance framework that combines modality-specific rationales with a global semantic narrative aligned to operator workflows. The pipeline begins with disentangled encoders whose latent spaces are regularised to preserve modality provenance. During inference, each encoder emits a sparse explanation graph that ties salient observations back to physical phenomena, for example corrosion cues in thermal imagery or harmonic anomalies in vibration spectra.
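
To make the idea of per-modality explanation graphs more concrete, here is a minimal sketch of one way such a structure might look. It is not the framework's actual implementation: the names `Edge`, `ExplanationGraph`, `sparsify`, and `global_narrative`, the top-k sparsification rule, the salience scale, and all sample observations are assumptions chosen for illustration. Each graph keeps its modality tag so that provenance survives when the graphs are merged into a single narrative.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """Links one salient observation to a physical phenomenon (hypothetical schema)."""
    observation: str
    phenomenon: str
    weight: float  # assumed salience score in [0, 1]


@dataclass
class ExplanationGraph:
    """Explanation emitted by one encoder; `modality` records provenance."""
    modality: str
    edges: list[Edge] = field(default_factory=list)


def sparsify(graph: ExplanationGraph, top_k: int) -> ExplanationGraph:
    """Keep only the top_k highest-weight edges (one simple notion of sparsity)."""
    kept = sorted(graph.edges, key=lambda e: e.weight, reverse=True)[:top_k]
    return ExplanationGraph(graph.modality, kept)


def global_narrative(graphs: list[ExplanationGraph]) -> list[str]:
    """Group evidence by phenomenon, tagging each item with its source modality."""
    by_phenomenon: dict[str, list[tuple[str, str, float]]] = {}
    for graph in graphs:
        for edge in graph.edges:
            by_phenomenon.setdefault(edge.phenomenon, []).append(
                (graph.modality, edge.observation, edge.weight)
            )
    lines = []
    for phenomenon, evidence in sorted(by_phenomenon.items()):
        parts = [f"{obs} [{mod}, {w:.2f}]" for mod, obs, w in evidence]
        lines.append(f"{phenomenon}: " + "; ".join(parts))
    return lines


# Illustrative data only; observations and scores are invented for the example.
thermal = ExplanationGraph("thermal", [
    Edge("hot spot near weld seam", "corrosion", 0.9),
    Edge("uniform background gradient", "ambient heating", 0.1),
])
vibration = ExplanationGraph("vibration", [
    Edge("second-harmonic peak", "harmonic anomaly", 0.8),
    Edge("broadband noise floor", "sensor noise", 0.2),
])

narrative = global_narrative([sparsify(thermal, 1), sparsify(vibration, 1)])
for line in narrative:
    print(line)
```

Grouping by phenomenon rather than by modality is a design choice in this sketch: it lets an operator read one line per suspected physical cause while still seeing which sensor stream supplied each piece of evidence.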
