Driver vigilance estimation is an important task for transportation safety. Wearable and portable brain-computer interface devices provide a powerful means for real-time monitoring of the vigilance level of drivers to help with avoiding distracted or impaired driving. In this paper, we propose a novel multimodal architecture for in-vehicle vigilance estimation from Electroencephalogram and Electrooculogram. To enable the system to focus on the most salient parts of the learned multimodal representations, we propose an architecture composed of a capsule attention mechanism following a deep Long Short-Term Memory (LSTM) network. Our model learns hierarchical dependencies in the data through the LSTM and capsule feature representation layers. To better explore the discriminative ability of the learned representations, we study the effect of the proposed capsule attention mechanism including the number of dynamic routing iterations as well as other parameters. Experiments show the robustness of our method by outperforming other solutions and baseline techniques, setting a new state-of-the-art. We then provide an analysis on different frequency bands and brain regions to evaluate their suitability for driver vigilance estimation. Lastly, an analysis on the role of capsule attention, multimodality, and robustness to noise is performed, highlighting the advantages of our approach.
Sign-in or become an IEEE member to discover the full contents of the paper.