Restoring Continuity: An Aggregative, Open-Source Methodology for Harmonizing Disparate Census Tracts Across Time

Authors

P. H. R. Mendonça, F. Kon

DOI:

https://doi.org/10.5753/jisa.2026.6795

Keywords:

Census, Spatial data, Data Science, Urban Computing

Abstract

Urban analysis often relies on census or census-like survey data, but redesigns of census tract layers frequently break historical comparability, hampering longitudinal studies at the local scale. In this article, we propose an automated methodology that makes census tracts comparable by constructing a graph from spatial join operations and non-spatial ID comparisons. We adopt an aggregative heuristic to minimize the impact of the matching process on analytical possibilities. To validate the methodology, we apply it to match the most recent Brazilian censuses for the São Paulo, Vitória, and Recife metropolitan regions, as well as for the Brazilian Federal District. We also test our implementation on New York City and Buenos Aires. We then evaluate the results against available gold-standard data and alternative metrics, and conclude with a discussion of potential improvements. Our implementation produces comparability files efficiently and has the potential to enable studies that would otherwise be impractical.
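The graph-based matching summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that tracts from two censuses become nodes of a bipartite graph, that an edge is added when two tracts share an ID or overlap spatially, and that each connected component is taken as one aggregated comparable unit. Axis-aligned rectangles stand in for real tract polygons, and the names `build_edges`, `comparable_units`, and the `min_share` threshold are hypothetical.

```python
from collections import defaultdict

def overlap_area(a, b):
    """Intersection area of two rectangles given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)

def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])

def build_edges(old, new, min_share=0.05):
    """Link an old and a new tract when their IDs match (non-spatial test)
    or when their overlap covers at least min_share of the smaller tract
    (a stand-in for a spatial join). min_share is an assumed threshold."""
    edges = []
    for old_id, old_box in old.items():
        for new_id, new_box in new.items():
            share = overlap_area(old_box, new_box) / min(area(old_box), area(new_box))
            if old_id == new_id or share >= min_share:
                edges.append((("old", old_id), ("new", new_id)))
    return edges

def comparable_units(old, new, min_share=0.05):
    """Aggregate tracts into connected components using union-find."""
    parent = {("old", i): ("old", i) for i in old}
    parent.update({("new", i): ("new", i) for i in new})

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in build_edges(old, new, min_share):
        parent[find(a)] = find(b)

    groups = defaultdict(lambda: {"old": [], "new": []})
    for layer, tract_id in parent:
        groups[find((layer, tract_id))][layer].append(tract_id)
    return sorted((sorted(g["old"]), sorted(g["new"])) for g in groups.values())

# Hypothetical toy layers: tract B is split into B1 and B2 in the newer census.
OLD = {"A": (0, 0, 2, 2), "B": (2, 0, 4, 2), "C": (4, 0, 6, 2)}
NEW = {"A": (0, 0, 2, 2), "B1": (2, 0, 3, 2), "B2": (3, 0, 4, 2), "C": (4, 0, 6, 2)}

units = comparable_units(OLD, NEW)
for old_ids, new_ids in units:
    print(old_ids, "<->", new_ids)
```

In this toy case, the unchanged tracts A and C each form their own unit, while the old tract B is grouped with the new tracts B1 and B2, so variables for B1 and B2 would be summed before comparison with B. With real data, the overlap test would be done by a GIS library on actual polygons, and the threshold would need tuning to avoid chaining neighbors together through sliver overlaps.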

Published

2026-04-25

How to Cite

Mendonça, P. H. R., & Kon, F. (2026). Restoring Continuity: An Aggregative, Open-Source Methodology for Harmonizing Disparate Census Tracts Across Time. Journal of Internet Services and Applications, 17(1), 136–151. https://doi.org/10.5753/jisa.2026.6795

Issue

Vol. 17 No. 1 (2026)

Section

Research article