Ozcan Demirbas, Arcelik A.S. - Arctic S.A.
The project describes a practical effort to reconcile three free-text note columns (Call Reason, Technician Note, Description) written in 10 different languages across 161,000 data rows. The presentation then focuses on assigning each of these 161,000 rows to its respective Failure Mode. Using text mining techniques, we first remove unneeded content such as punctuation, 2-character words, and numbers; we then reduce the remaining words to their stems with Stemmer and create topic groups with Topic Extractor. For machine learning, we divided the data into learning and test data. The first part is fed into the Tree Ensemble Learner as learning data, and the second part is fed into the Tree Ensemble Predictor as test data. We reached an accuracy of 75% in predicting the Failure Modes correctly.
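The cleaning and splitting steps described above can be illustrated with a minimal Python sketch. It is not the workflow used in the project: the function names (`naive_stem`, `clean_note`, `split_rows`), the suffix list, the 30% test fraction, and the reading of "2-character words" as "words of two characters or fewer" are all assumptions made for illustration. The crude suffix stripper only stands in for a real, language-specific stemmer, and topic extraction and the tree ensemble classifier are not reproduced here.

```python
import random
import re

# Hypothetical suffix list: a crude stand-in for a real, language-specific stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_note(text):
    """Lowercase, drop punctuation and digits, drop short words, then stem."""
    # Replace every run of non-letters (punctuation, digits, underscores,
    # whitespace) with a single space. \W is Unicode-aware, so accented
    # letters in non-English notes are kept.
    letters_only = re.sub(r"[\W\d_]+", " ", text.lower())
    # Assumption: "2-character words" means words of two characters or fewer.
    return [naive_stem(w) for w in letters_only.split() if len(w) > 2]


def split_rows(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it into learning and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

For example, `clean_note("Door is NOT closing, error E21!!")` returns `['door', 'not', 'clos', 'error']`: the punctuation, the digits, and the short word "is" are removed, and "closing" is reduced to a stem. The learning part returned by `split_rows` would then feed a classifier, and the test part would be used to measure accuracy.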