LAGOON: Achieving bounded individual fairness through classification frequency equalization

Authors

DOI:

https://doi.org/10.5753/jbcs.2024.3468

Keywords:

Fairness, Classification, Utility

Abstract

One of the main concerns about using machine learning models for classification is algorithmic discrimination. Several works propose different definitions of fairness to avoid or mitigate unfair classifications against minorities. Achieving algorithmic fairness entails modifying the training data, the model's operation, or its outputs; a fair algorithm may therefore change the original classification. In general, fairness means not discriminating against a person or a group. In a utopian setting, a system would classify every person and minority as privileged, which may decrease the utility of the classification. We define λ-fairness, a relaxation of individual fairness designed to achieve fairness while maintaining utility through configurable parameters. We also propose a post-processing method that achieves fairness in machine learning models by generalizing model outputs into frequencies and equalizing them. We apply this flexible approach in LAGOON, an algorithm that achieves λ-fairness through frequency equalization. In our experiments, we evaluate the quality of our approach on three benchmarks from different contexts and compare the results against two baselines that aim to achieve fairness while minimizing utility loss.
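The abstract describes post-processing by frequency equalization without detailing the algorithm. As an illustration only (this is not LAGOON itself, whose definition is in the full paper), the sketch below shows the general post-processing idea of equalizing positive-classification frequencies across groups by thresholding model scores per group; the function name `equalize_positive_rates` and the `target_rate` parameter are hypothetical choices for this example.

```python
import numpy as np

def equalize_positive_rates(scores, groups, target_rate):
    """Post-process classifier scores so that each group's
    positive-classification frequency matches target_rate.

    A generic rate-equalization sketch, not the paper's LAGOON algorithm.
    """
    preds = np.zeros(len(scores), dtype=int)
    for g in np.unique(groups):
        mask = groups == g
        g_scores = scores[mask]
        # Per-group threshold at the (1 - target_rate) quantile, so that
        # roughly a target_rate fraction of each group is classified positive.
        thresh = np.quantile(g_scores, 1.0 - target_rate)
        preds[mask] = (g_scores > thresh).astype(int)
    return preds
```

Equalizing frequencies rather than raw scores trades some utility (scores near the threshold may flip) for a bounded, configurable notion of fairness, which is the tension the λ parameter in the paper is meant to control.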

References

Almuallim, H., Kaneda, S., and Akiba, Y. (2002). Development and applications of decision trees. In Expert Systems, pages 53-77. Elsevier. DOI: 10.1016/B978-012443880-4/50047-8.

Argawal, P. (2021). How is automated credit decisioning transforming digital lending. Available online [link]. Last accessed: July 29, 2022.

Blockeel, H., Devos, L., Frénay, B., Nanfack, G., and Nijssen, S. (2023). Decision trees: from efficient prediction to responsible AI. Frontiers in Artificial Intelligence, 6. DOI: 10.3389/frai.2023.1124553.

Calders, T. and Verwer, S. (2010). Three naive Bayes approaches for discrimination-free classification. Data mining and knowledge discovery, 21:277-292. DOI: 10.1007/s10618-010-0190-x.

Census Bureau (2023). United States Census Bureau. Available online [link]. Last accessed: July 29, 2022.

Chawla, N. V. (2010). Data mining for imbalanced datasets: An overview. Data mining and knowledge discovery handbook, pages 875-886. DOI: 10.1007/978-0-387-09823-4_45.

Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big data, 5(2):153-163. DOI: 10.48550/arXiv.1703.00056.

Costa, V. G. and Pedreira, C. E. (2023). Recent advances in decision trees: An updated survey. Artificial Intelligence Review, 56(5):4765-4800. DOI: 10.1007/s10462-022-10275-5.

Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. IEEE transactions on information theory, 13(1):21-27. DOI: 10.1109/TIT.1967.1053964.

Dash, R. (2021). Designing next-generation credit-decisioning models. Available online [link]. Last accessed: July 29, 2022.

Dua, D., Graff, C., et al. (2017). UCI Machine Learning Repository. Available online [link].

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. (2012). Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference, pages 214-226. DOI: 10.1145/2090236.2090255.

Galhotra, S., Shanmugam, K., Sattigeri, P., and Varshney, K. R. (2021). Interventional fairness with indirect knowledge of unobserved protected attributes. Entropy, 23(12):1571. DOI: 10.3390/e23121571.

Hardt, M., Price, E., and Srebro, N. (2016). Equality of opportunity in supervised learning. Advances in neural information processing systems, 29. DOI: 10.48550/arXiv.1610.02413.

Kamishima, T., Akaho, S., Asoh, H., and Sakuma, J. (2012). Fairness-aware classifier with prejudice remover regularizer. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2012, Bristol, UK, September 24-28, 2012. Proceedings, Part II 23, pages 35-50. Springer. DOI: 10.1007/978-3-642-33486-3_3.

Kearns, M. and Roth, A. (2019). The ethical algorithm: The science of socially aware algorithm design. Oxford University Press. Book.

Khan, A., Baharudin, B., Lee, L. H., and Khan, K. (2010). A review of machine learning algorithms for text-documents classification. Journal of advances in information technology, 1(1):4-20. Available online [link].

Kim, K., Ohn, I., Kim, S., and Kim, Y. (2022). Slide: A surrogate fairness constraint to ensure fairness consistency. Neural Networks, 154:441-454. DOI: 10.1016/j.neunet.2022.07.027.

Kusner, M. J., Loftus, J., Russell, C., and Silva, R. (2017). Counterfactual fairness. Advances in neural information processing systems, 30. DOI: 10.48550/arXiv.1703.06856.

Lahoti, P., Gummadi, K. P., and Weikum, G. (2019a). iFair: Learning individually fair data representations for algorithmic decision making. In 2019 IEEE 35th International Conference on Data Engineering (ICDE), pages 1334-1345. IEEE. DOI: 10.1109/ICDE.2019.00121.

Lahoti, P., Gummadi, K. P., and Weikum, G. (2019b). Operationalizing individual fairness with pairwise fair representations. arXiv preprint arXiv:1907.01439. DOI: 10.14778/3372716.3372723.

Larson, J., Roswell, M., and Atlidakis, V. (2016). COMPAS. Available online [link]. Last accessed: July 29, 2022.

Law, E. U. (2016). GDPR. Available online [link]. Last accessed: May 08, 2023.

Lohia, P. K., Ramamurthy, K. N., Bhide, M., Saha, D., Varshney, K. R., and Puri, R. (2019). Bias mitigation post-processing for individual and group fairness. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2847-2851. IEEE. DOI: 10.48550/arXiv.1812.06135.

Luong, B. T., Ruggieri, S., and Turini, F. (2011). k-NN as an implementation of situation testing for discrimination discovery and prevention. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 502-510. DOI: 10.1145/2020408.2020488.

Mashat, A. F., Fouad, M. M., Philip, S. Y., and Gharib, T. F. (2012). A decision tree classification model for university admission system. International Journal of Advanced Computer Science and Applications, 3(10). Available online [link].

Mutalemwa, P., Kisoka, W., Nyingo, V., Barongo, V., and Malecela, M. (2008). Manifestations and reduction strategies of stigma and discrimination on people living with hiv/aids in tanzania. Tanzania journal of health research, 10(4). DOI: 10.4314/thrb.v10i4.45077.

Pappada, R. and Pauli, F. (2022). Discrimination in machine learning algorithms. arXiv preprint arXiv:2207.00108. DOI: 10.48550/arXiv.2207.00108.

Pedreshi, D., Ruggieri, S., and Turini, F. (2008). Discrimination-aware data mining. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 560-568. DOI: 10.1145/1401890.1401959.

Pitoura, E., Stefanidis, K., and Koutrika, G. (2021). Fairness in rankings and recommenders: Models, methods and research directions. In 2021 IEEE 37th International Conference on Data Engineering (ICDE), pages 2358-2361. IEEE. DOI: 10.1109/ICDE51399.2021.00265.

Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., and Weinberger, K. Q. (2017). On fairness and calibration. Advances in neural information processing systems, 30. Available online [link].

Ramentol, E., Olsson, T., and Barua, S. (2021). Machine learning models for industrial applications. In AI and Learning Systems-Industrial Applications and Future Directions. IntechOpen. DOI: 10.5772/intechopen.93043.

Ramos Salas, X., Alberga, A., Cameron, E., Estey, L., Forhan, M., Kirk, S., Russell-Mayhew, S., and Sharma, A. (2017). Addressing weight bias and discrimination: moving beyond raising awareness to creating change. Obesity Reviews, 18(11):1323-1335. DOI: 10.1111/obr.12592.

Salimi, B., Rodriguez, L., Howe, B., and Suciu, D. (2019). Interventional fairness: Causal database repair for algorithmic fairness. In Proceedings of the 2019 International Conference on Management of Data, pages 793-810. DOI: 10.1145/3299869.3319901.

Shen, A., Tong, R., and Deng, Y. (2007). Application of classification models on credit card fraud detection. In 2007 International conference on service systems and service management, pages 1-4. IEEE. DOI: 10.1109/ICSSSM.2007.4280163.

Shen, X., Tseng, G. C., Zhang, X., and Wong, W. H. (2003). On ψ-learning. Journal of the American Statistical Association, 98(463):724-734. DOI: 10.1198/016214503000000639.

Shih, M., Young, M. J., and Bucher, A. (2013). Working to reduce the effects of discrimination: Identity management strategies in organizations. American Psychologist, 68(3):145. DOI: 10.1037/a0032250.

UK Government (2013). Equality act 2010: Chapter 1 protected characteristics. Available online [link]. Last accessed: October 19, 2023.

Yona, G. and Rothblum, G. (2018). Probably approximately metric-fair learning. In International conference on machine learning, pages 5680-5688. PMLR. DOI: 10.48550/arXiv.1803.03242.

Zafar, M. B., Valera, I., Rodriguez, M. G., and Gummadi, K. P. (2017). Fairness constraints: Mechanisms for fair classification. In Artificial intelligence and statistics, pages 962-970. PMLR. Available online [link].

Zemel, R., Wu, Y., Swersky, K., Pitassi, T., and Dwork, C. (2013). Learning fair representations. In International conference on machine learning, pages 325-333. PMLR. Available online [link].

Zhang, J., Xie, Y., Wu, Q., and Xia, Y. (2019). Medical image classification using synergic deep learning. Medical image analysis, 54:10-19. DOI: 10.1016/j.media.2019.02.010.

Zliobaite, I. (2015). A survey on measuring indirect discrimination in machine learning. arXiv preprint arXiv:1511.00148. DOI: 10.48550/arXiv.1511.00148.

Published

2024-08-27

How to Cite

Silva, M., Chaves, I., & Machado, J. (2024). LAGOON: Achieving bounded individual fairness through classification frequency equalization. Journal of the Brazilian Computer Society, 30(1), 238–251. https://doi.org/10.5753/jbcs.2024.3468

Issue

Section

Articles