Finding Interesting Correlations with Conditional Heavy Hitters.


Mirylenka, Katsiaryna; Palpanas, Themistoklis; Cormode, G.; Srivastava, D.

2013-01-01

Abstract

The notion of heavy hitters (items that make up a large fraction of the population) has been successfully used in a variety of applications across sensor and RFID monitoring, network data analysis, event mining, and more. Yet this notion often fails to capture the semantics we desire when we observe data in the form of correlated pairs. Here, we are interested in items that are conditionally frequent, that is, items that are frequent within the context of their parent item. In this work, we introduce and formalize the notion of Conditional Heavy Hitters to identify such items, with applications in network monitoring and Markov chain modeling. We introduce several streaming algorithms that allow us to find conditional heavy hitters efficiently, and provide analytical results. Different algorithms are successful for different input characteristics. We perform experimental evaluations to demonstrate the efficacy of our methods, and to study which algorithms are most suited for different types of data.
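To make the idea of a conditionally frequent item concrete, the following is a minimal Python sketch, not one of the paper's streaming algorithms. It counts every (parent, child) pair exactly, keeps all counts in memory, and reports the pairs whose share of the parent's occurrences reaches a threshold. The function name `conditional_heavy_hitters`, the threshold `phi`, and the `min_parent_count` filter are assumptions introduced here for illustration; the abstract does not give the formal definition.

```python
from collections import Counter


def conditional_heavy_hitters(pairs, phi, min_parent_count=1):
    """Exact, non-streaming illustration (hypothetical helper, not from the paper).

    pairs: iterable of (parent, child) tuples.
    phi: assumed threshold on count(parent, child) / count(parent).
    min_parent_count: assumed filter that skips rarely seen parents.
    Returns a dict mapping each qualifying pair to its conditional frequency.
    """
    pairs = list(pairs)
    parent_counts = Counter(parent for parent, _ in pairs)
    pair_counts = Counter(pairs)

    result = {}
    for (parent, child), n in pair_counts.items():
        if parent_counts[parent] < min_parent_count:
            continue
        conditional = n / parent_counts[parent]
        if conditional >= phi:
            result[(parent, child)] = conditional
    return result


# Toy input: nine observed pairs over three parents.
stream = (
    [("a", "x")] * 3
    + [("a", "y")]
    + [("b", "z")] * 2
    + [("c", "x"), ("c", "y"), ("c", "z")]
)
hits = conditional_heavy_hitters(stream, phi=0.7, min_parent_count=2)
print(hits)
```

In this toy input the pair `("b", "z")` accounts for only 2 of the 9 observations, so a plain frequency threshold could easily miss it, yet it occurs every time its parent `"b"` appears. That gap between overall frequency and frequency within the parent's context is what the abstract describes; doing this over a stream without storing every count is the problem the paper's algorithms address.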


Date of publication: 2013
Proceedings title: Proceedings of the 29th IEEE International Conference on Data Engineering
Book author/s: AA. VV. (various authors)
Place of publication: Washington
Publisher: IEEE
Scopus identifier: 2-s2.0-84885103403
WOS identifier: WOS:000324994600007
All authors: Mirylenka, Katsiaryna; Palpanas, Themistoklis; Cormode, G.; Srivastava, D.
Publication type: 04.1 Paper in Proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11572/95226
