Private Reverse Top-k Algorithms Applied on Public Data of COVID-19 in the State of Ceará
DOI:
https://doi.org/10.5753/jidm.2021.1941
Keywords:
COVID-19, differentially private reverse top-k approaches, utility
Abstract
In this article, we propose a differentially private reverse top-k query. Our strategy retrieves the least frequent data items matching a search criterion while providing a strong privacy guarantee for the individuals who contributed personal data to the original database. We apply our strategy to public COVID-19 data from the State of Ceará using two different queries. Our experimental results show that, when a suitable privacy budget is chosen, the result of the proposed private top-k query is highly similar to that of a conventional top-k query. The approach thus provides useful results for researchers, while the properties of differential privacy keep the probability of re-identifying individuals low.
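To make the idea concrete, the following is a minimal illustrative sketch, not the mechanism proposed in the article (the abstract does not specify it). It assumes a histogram of counts over a fixed, publicly known set of categories, assumes each individual contributes to exactly one category (so adding or removing one person changes one count by 1), and swaps in the standard Laplace mechanism with scale 1/epsilon before selecting the k smallest noisy counts. The function names, the tie-breaking rule, and the Jaccard similarity used to compare private and exact results are all hypothetical choices made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def reverse_top_k(counts, k):
    """Return the k categories with the smallest counts (exact, non-private).

    Ties are broken by category name so the result is deterministic.
    """
    ranked = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return [category for category, _ in ranked[:k]]


def private_reverse_top_k(counts, k, epsilon, rng):
    """Illustrative private variant: perturb every count, then select.

    Assumption: each individual affects exactly one count by 1, so the
    histogram has L1 sensitivity 1 and Laplace noise of scale 1/epsilon
    on every count gives epsilon-differential privacy. Selecting the k
    smallest noisy counts is post-processing and costs no extra budget.
    """
    scale = 1.0 / epsilon
    noisy = {c: n + laplace_noise(scale, rng) for c, n in counts.items()}
    return reverse_top_k(noisy, k)


def jaccard(a, b):
    """Set overlap between two result lists, in [0, 1]."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)
```

A smaller epsilon adds more noise, so the private result tends to drift further from the exact one; comparing the two with a set-overlap measure such as `jaccard` is one simple way to observe how the chosen budget affects utility.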