Unsupervised machine learning to classify language dimensions to constitute the linguistic complexity of mathematical word problems
D. Bednorz, M. Kleine

International Electronic Journal of Mathematics Education

The study examines the language dimensions of mathematical word problems and classifies word problems according to these dimensions using unsupervised machine learning (ML) techniques. Previous research suggests that language dimensions matter for mathematical word problems because they influence the linguistic complexity of the problems. Depending on that complexity, students can face language obstacles when solving mathematical word problems. Much research in mathematics education analyzes linguistic complexity on the basis of theoretically derived language dimensions. To date, however, it has been unclear what empirical relationships exist between the linguistic features of mathematical word problems. To address this issue, we used unsupervised ML techniques to reveal latent linguistic structures among 17 linguistic features of 342 mathematical word problems and to classify the problems. The models showed that three- and five-dimensional linguistic structures have the highest explanatory power. Additionally, we consider a four-dimensional solution. Word problems from the three-dimensional solution can be classified into two groups, and those from the three- and five-dimensional solutions into three groups. The findings revealed latent linguistic structures and groups that could have implications for the linguistic complexity of mathematical word problems and that differ from the theoretically derived language dimensions. The results therefore point to new design principles for interventions and materials for language education in mathematics learning and teaching.
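The two-step idea in the abstract (find latent dimensions, then group the problems) can be illustrated with a minimal sketch. The abstract does not name the specific techniques, so principal component analysis and k-means are used here only as common stand-ins, and the data are synthetic. Only the sizes (342 problems, 17 features, three dimensions, two groups) come from the abstract; every name, the block layout of the features, and the noise levels are assumptions made for illustration.

```python
import random

N_PROBLEMS, N_FEATURES, N_DIMS = 342, 17, 3
rng = random.Random(0)

# Synthetic stand-in data (NOT the study's data): three latent dimensions
# drive blocks of features; two hidden groups differ on the first dimension.
BLOCK = [0] * 6 + [1] * 6 + [2] * 5  # assumed latent dimension behind each feature
true_group = [i % 2 for i in range(N_PROBLEMS)]
data = []
for g in true_group:
    latent = [rng.gauss(0, 1) for _ in range(N_DIMS)]
    latent[0] += 3.0 if g else -3.0
    data.append([latent[BLOCK[j]] + rng.gauss(0, 0.5) for j in range(N_FEATURES)])

def standardize(rows):
    """Z-score every feature (column)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def correlation(z):
    """Correlation matrix of already standardized rows."""
    cols = list(zip(*z))
    return [[sum(a * b for a, b in zip(ci, cj)) / len(z) for cj in cols] for ci in cols]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def leading_components(corr, k, iters=300):
    """Top-k eigenpairs via power iteration with deflation."""
    a = [row[:] for row in corr]
    values, vectors = [], []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _row in a]
        for _step in range(iters):
            w = matvec(a, v)
            norm = sum(x * x for x in w) ** 0.5
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(a, v)))
        values.append(lam)
        vectors.append(v)
        # Deflate: subtract the component just found before the next pass.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(len(v))] for i in range(len(v))]
    return values, vectors

def dist2(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans(points, centers, iters=100):
    """Lloyd's algorithm from the given starting centers."""
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: dist2(p, centers[c])) for p in points]
        updated = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append([sum(col) / len(members) for col in zip(*members)] if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels

z = standardize(data)
values, vectors = leading_components(correlation(z), N_DIMS)
explained = [lam / N_FEATURES for lam in values]  # share of total variance
scores = [[sum(x * y for x, y in zip(row, v)) for v in vectors] for row in z]

# Start the two centers at the extremes of the first component (an assumption
# made to keep this sketch deterministic).
first = [s[0] for s in scores]
init = [scores[first.index(min(first))], scores[first.index(max(first))]]
labels = kmeans(scores, init)

match = sum(lab == g for lab, g in zip(labels, true_group)) / N_PROBLEMS
agreement = max(match, 1 - match)  # cluster labels are arbitrary
print("explained variance per dimension:", [round(e, 3) for e in explained])
print("agreement with the simulated groups:", round(agreement, 3))
```

With real data, the `data` list would hold one row of measured linguistic features per word problem. The number of dimensions and groups would then be chosen by comparing solutions, as the abstract describes for the three-, four-, and five-dimensional models.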