Clinical documents are rich free-text data sources containing valuable medication and symptom information, and they have great potential to improve health care. In this paper, we build an integrated system for extracting medication names and symptom names from clinical notes. We then apply nonnegative matrix factorization (NMF) and multi-view NMF to group the clinical notes into meaningful clusters based on sample-feature matrices. Our experimental results show that multi-view NMF is a preferable method for clinical document clustering. Moreover, we find that clustering clinical documents on the extracted medication and symptom names outperforms clustering on words alone.
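To make the clustering step concrete, the following is a minimal sketch of single-view NMF clustering on a sample-feature matrix. It is not the system described in the paper: the multiplicative update rule, the toy matrix, and the names `nmf` and `cluster_labels` are assumptions chosen for illustration, and multi-view NMF is not covered. Each row stands for a clinical note and each column for a hypothetical extracted term count.

```python
import numpy as np

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate nonnegative X (samples x features) as W @ H.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective; this is an illustrative choice, not the paper's solver.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps  # sample-to-cluster weights
    H = rng.random((k, m)) + eps  # cluster-to-feature weights
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_labels(W):
    """Assign each sample to the component with the largest weight."""
    return np.argmax(W, axis=1)

# Toy sample-feature matrix (hypothetical counts): notes 0-1 share the
# first two terms, notes 2-3 share the last two.
X = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 2.0, 6.0],
])

W, H = nmf(X, k=2)
labels = cluster_labels(W)
print(labels)
```

In this sketch the rows of `W` play the role of soft cluster memberships, so taking the largest entry per row yields a hard assignment; which integer each group receives depends on the random initialization.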