An unsupervised pattern extraction system and method for extracting incident and root cause patterns from various kinds of machine data such as system-level metric values, system call traces, and semi-structured or free form text log data and performing holistic root cause analysis for distributed systems. The system utilizing Natural Language Processing and machine learning techniques to extract incident and root cause information from received incident reports and other system data. The system consists of both real time data collection (104) and analytics functions (200). The previously reported incident data is used to discover and apply remediation techniques to utilize prior remediation efforts to automatically classify and correct incidents. The system may then annotate a remediation data file with the technique applied. The system will utilize prior known remediation techniques for identified categories to predict and prevent future issues.