A Re-Identification Risk-Based Anonymization Framework for Data Analytics Platforms

Fiore S.
2018-01-01

Abstract

Preserving individual privacy is one of the major issues in the context of Big Data, since handling huge volumes of data may contribute to the disclosure of sensitive or personally identifiable information. In fact, even when data is anonymized, there is a risk of re-identification through privacy attacks. This paper presents a re-identification risk-based anonymization framework for big data analytics platforms. The framework is based on anonymization policies and applies anonymization techniques and models in two stages: during the ETL process and before exporting the statistical results of data analytics. The second stage evaluates the re-identification risk of the data and increases the anonymity level when this is necessary to reduce the risk. Although the framework is generic, the implementation reported in this work was integrated into Ophidia as a case study. Privacy attacks were performed to check the effectiveness of the re-identification process. Results are promising, showing a low probability of re-identification in two different scenarios.
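The second stage described in the abstract (evaluate the re-identification risk, then raise the anonymity level if needed) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from Ophidia: the risk measure (one over the size of the smallest group of records sharing the same quasi-identifier values), the generalization step (bucketing a hypothetical `age` attribute), and all function and field names.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Worst-case risk: 1 / size of the smallest group of records
    that share the same quasi-identifier values (assumed metric)."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1.0 / min(classes.values())


def generalize_age(record, width):
    """Replace an exact age with the lower bound of a bucket of the given width."""
    out = dict(record)
    out["age"] = (record["age"] // width) * width
    return out


def anonymize_until_safe(records, quasi_identifiers, max_risk,
                         widths=(5, 10, 20, 50)):
    """Coarsen the hypothetical 'age' attribute step by step until the
    risk is at most max_risk, or until the candidate widths run out."""
    current = records
    risk = reidentification_risk(current, quasi_identifiers)
    for width in widths:
        if risk <= max_risk:
            break
        # Always generalize from the original records, not from the
        # previous attempt, so bucket boundaries stay well defined.
        current = [generalize_age(r, width) for r in records]
        risk = reidentification_risk(current, quasi_identifiers)
    return current, risk


# Hypothetical data: every record is unique before generalization.
records = [
    {"age": 23, "zip": "130"},
    {"age": 27, "zip": "130"},
    {"age": 31, "zip": "130"},
    {"age": 38, "zip": "130"},
]
released, risk = anonymize_until_safe(records, ["age", "zip"], max_risk=0.5)
```

In this toy run the initial risk is 1.0, and the loop stops at the first bucket width for which every record shares its quasi-identifier values with at least one other record. If no candidate width reaches the threshold, the function returns the coarsest attempt together with its risk, so a caller would still need to check the returned value before exporting results.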
2018
Proceedings - 2018 14th European Dependable Computing Conference, EDCC 2018
USA
IEEE
978-1-5386-8060-5
Silva, H.; Basso, T.; Moraes, R.; Elia, D.; Fiore, S.
A Re-Identification Risk-Based Anonymization Framework for Data Analytics Platforms / Silva, H.; Basso, T.; Moraes, R.; Elia, D.; Fiore, S. - (2018), pp. 101-106. (Paper presented at the 14th European Dependable Computing Conference, EDCC 2018, held in Romania in 2018) [10.1109/EDCC.2018.00026].

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/292861