Bronchoscopy enables many minimally invasive chest procedures for diseases such as lung cancer and asthma. Guided by the bronchoscope’s video stream, a physician can navigate the complex three-dimensional (3-D) airway tree to collect tissue samples or administer a disease treatment. Unfortunately, physicians currently discard procedural video because of the overwhelming amount of data generated. Hence, they must rely on memory and anecdotal snapshots to document a procedure. We propose a robust automatic method for summarizing an endobronchial video stream. Inspired by the multimedia concept of the video summary and by research in other endoscopy domains, our method consists of three main steps: 1) shot segmentation, 2) motion analysis, and 3) keyframe selection. Overall, the method derives a true hierarchical decomposition, consisting of a shot set and constituent keyframe set, for a given procedural video. No other method to our knowledge gives such a structured summary for the raw, unscripted, unedited videos arising in endoscopy. Results show that our method more efficiently covers the observed endobronchial regions than other keyframe-selection approaches and is robust to parameter variations. Over a wide range of video sequences, our method required on average only 6.5% of available video frames to achieve a video coverage = 92.7%. We also demonstrate how the derived video summary facilitates direct fusion with a patient’s 3-D chest computed-tomography scan in a system under development, thereby enabling efficient video browsing and retrieval through the complex airway tree.