A machine learning approach for the analysis of demographic features of the socio-economic deprivation at sub-municipal level
DOI:
https://doi.org/10.71014/sieds.v80i3.540
Keywords:
Machine learning, K-means cluster, Forest analysis, Deprivation, Socioeconomic Inequalities
Abstract
This study applies machine learning methods to analyse the social and demographic features of territories associated with different levels of deprivation, as measured by the Socio-Economic Deprivation Index (SED-I). The SED-I was developed by the Italian National Institute of Statistics (Istat) to measure socio-economic and educational deprivation at the sub-municipal level. We exploit the availability of socio-economic indicators and demographic context variables at the Sub-Municipal Area (SMA) level to inform policies aimed at counteracting the economic, social, and educational deprivation measured by the SED-I. In the context of socio-economic deprivation, territories may be characterised by a variety of interrelated variables. The analysis uses indicators of education level, employment, household structure, demographic composition (population density, foreign subpopulations, and age) and social conditions. Principal Component Analysis (PCA) is employed to summarise the information contained in this large number of correlated indicators. This unsupervised learning method allows us to analyse the relationships between these variables and the SED-I, highlighting the differences between two distinct socio-economic environments, Trieste and Cagliari. Furthermore, a K-means cluster analysis is employed to identify population groups with a similar composition in terms of deprivation and demographic features, with the Elbow and Silhouette methods used to choose the optimal number of clusters. The analysis focuses on data for Trieste and Cagliari, to provide a robust application across different socio-economic contexts.
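The pipeline outlined in the abstract (PCA on correlated indicators, then K-means with the number of clusters chosen by the Elbow and Silhouette methods) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic stand-ins for an SMA-by-indicator table, the six indicator columns are hypothetical, the use of scikit-learn is an assumption, and clustering on the first two principal component scores (rather than on the original indicators) is an illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an SMA-by-indicator table (NOT real Istat data):
# 120 hypothetical areas, 6 hypothetical indicators, three latent groups.
rng = np.random.default_rng(0)
centers = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [6.0, 6.0, 0.0, 0.0, 3.0, 3.0],
    [0.0, 6.0, 6.0, 6.0, 0.0, 3.0],
])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 6)) for c in centers])

# Standardise the indicators so that none dominates because of its scale,
# then summarise them with the first two principal components.
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=2, random_state=0)
scores = pca.fit_transform(X_std)

# Fit K-means for a range of k and record the two diagnostics:
# inertia (within-cluster sum of squares, plotted for the Elbow method)
# and the mean silhouette coefficient (higher is better).
inertias, silhouettes = {}, {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scores)
    inertias[k] = km.inertia_
    silhouettes[k] = silhouette_score(scores, km.labels_)

# Pick the k with the highest mean silhouette; the inertia curve would
# normally be inspected visually for an elbow near the same value.
best_k = max(silhouettes, key=silhouettes.get)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(scores)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Inertia by k:", inertias)
print("Silhouette by k:", silhouettes)
print("Selected number of clusters:", best_k)
```

With real data, each city (for example Trieste and Cagliari) would be processed separately or with a city indicator, and the PCA loadings in `pca.components_` would be inspected to see which indicators move together with the SED-I.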
License
Copyright (c) 2026 Matilde Bonelli, Giancarlo Carbonetti, Elena Grimaccia, Debora Tronu

This work is licensed under a Creative Commons Attribution 4.0 International License.

