A new approach to optimal embedding of time series

Perinelli, Alessio

doi:10.15168/11572_280754

The analysis of signals stemming from a physical system is crucial for the experimental investigation of the dynamics that drives the system. The field of time series analysis comprises a wide variety of techniques developed to characterize signals and, ultimately, to provide insights into the phenomena that govern the temporal evolution of the generating system. A renowned example in this field is spectral analysis: using Fourier or Laplace transforms to bring time-domain signals into the more convenient frequency space makes it possible to disclose the key features of linear systems.

A more complex scenario arises when nonlinearity enters a system's dynamics. Nonlinear coupling between a system's degrees of freedom brings about interesting dynamical regimes, such as self-sustained periodic (though anharmonic) oscillations ("limit cycles"), or quasi-periodic evolutions that exhibit sharp spectral lines while lacking strict periodicity ("limit tori"). Among the consequences of nonlinearity, the onset of chaos is definitely the most fascinating. Chaos is a dynamical regime characterized by unpredictability and lack of periodicity, despite being generated by deterministic laws. Signals generated by chaotic dynamical systems appear irregular: the corresponding spectra are broad and flat, prediction of future values is challenging, and evolutions within the systems' state spaces converge to strange attractors with noninteger dimensionality. Because of these properties, chaotic signals can be mistakenly classified as noise if linear techniques such as spectral analysis are used. The identification and characterization of chaos require the assessment of dynamical invariants that quantify the complex features of a chaotic system's evolution.
For example, Lyapunov exponents provide a marker of unpredictability, whereas the estimation of attractor dimensions highlights the unconventional geometry of a chaotic system's state space.

Nonlinear time series analysis techniques act directly within the state space of the system under investigation. Experimentally, however, full access to a system's state space is not always available. Often, only a scalar signal stemming from the dynamical system can be recorded, thus providing, upon sampling, a scalar sequence. Nevertheless, by virtue of a fundamental theorem by Takens, it is possible to reconstruct a proxy of the original state space evolution out of a single scalar sequence. This reconstruction is carried out by means of the so-called embedding procedure: m-dimensional vectors are built by picking elements of the scalar sequence that are separated from one another by a lag L. However, besides posing some necessary conditions on the integer embedding parameters m and L, Takens' theorem does not provide any clue on how to choose them correctly. Although many optimal embedding criteria have been proposed, a general answer to the problem is still lacking. Indeed, conventional methods for optimal embedding suffer from several drawbacks, the most relevant being the need for a subjective evaluation of the outcomes of the applied algorithms.

Tackling the issue of optimally selecting embedding parameters is the core topic of this thesis. In particular, I will discuss a novel approach that was pursued by our research group and that led to the development of a new method for the identification of suitable embedding parameters. Unlike most conventional approaches, which seek a single optimal value for m and L to embed an input sequence, our approach provides a set of embedding choices that are equivalently suitable to reconstruct the dynamics.
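
The embedding procedure described above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function name `delay_embed`, the use of NumPy, and the convention that vector i starts at sample i are assumptions made for illustration only.

```python
import numpy as np


def delay_embed(s, m, L):
    """Return the delay-embedding vectors of the scalar sequence s.

    Row i is (s[i], s[i + L], ..., s[i + (m - 1) * L]).
    (Illustrative helper, not the thesis implementation.)
    """
    s = np.asarray(s, dtype=float)
    window = (m - 1) * L          # embedding window, in sampling steps
    n_vectors = len(s) - window   # number of complete vectors
    if m < 1 or L < 1 or n_vectors < 1:
        raise ValueError("need m >= 1, L >= 1 and len(s) > (m - 1) * L")
    # Column j holds the sequence shifted by j * L samples.
    return np.column_stack([s[j * L : j * L + n_vectors] for j in range(m)])


s = np.arange(10.0)
vectors = delay_embed(s, m=3, L=2)
print(vectors.shape)   # (6, 3)
print(vectors[0])      # [0. 2. 4.]
```

Each vector spans (m-1)L sampling steps, so a sequence of N samples yields N - (m-1)L vectors; here 10 samples with m = 3 and L = 2 give 6 vectors.
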
The suitability of each embedding choice (m, L) is assessed by relying on statistical testing, thus providing a criterion that does not require a subjective evaluation of outcomes. The starting point of our method is a set of embedding-dependent correlation integrals, i.e. cumulative distributions of embedding vector distances, built out of an input scalar sequence. In the case of Gaussian white noise, an analytical expression for correlation integrals is available and, by exploiting this expression, a gauge transformation of distances is introduced to provide a more convenient representation of correlation integrals. Under this new gauge, it is possible to test, in a computationally undemanding way, whether an input sequence is compatible with Gaussian white noise and, subsequently, whether the sequence is compatible with the hypothesis of an underlying chaotic system. These two statistical tests allow ruling out embedding choices that are unsuitable to reconstruct the dynamics. The estimation of correlation dimension, carried out by means of a newly devised estimator, makes up the third stage of the method: sets of embedding choices that provide uniform estimates of this dynamical invariant are deemed suitable to embed the sequence.

The method was successfully applied to synthetic and experimental sequences, providing new insight into the longstanding issue of optimal embedding. For example, the relevance of the embedding window (m-1)L, i.e. the time span covered by each embedding vector, is naturally highlighted by our approach. In addition, our method provides some information on the adequacy of the sampling period used to record the input sequence.

The method correctly distinguishes a chaotic sequence from surrogate ones generated out of it and having the same power spectrum. The technique of surrogate generation, which I also addressed during my Ph.D. work to develop new dedicated algorithms and to analyze brain signals, makes it possible to estimate significance levels in situations where standard analytical algorithms are inapplicable. The ability of the novel embedding approach to tell an original sequence apart from its surrogates shows that it can distinguish signals beyond their spectral (or autocorrelation) similarities.

One of the possible applications of the new approach concerns another longstanding issue, namely that of distinguishing noise from chaos. To this purpose, complementary information is provided by analyzing the asymptotic (long-time) behaviour of the so-called time-dependent divergence exponent. This embedding-dependent metric is commonly used to estimate the maximum Lyapunov exponent out of a scalar sequence by processing its short-time, linearly growing region. However, insights into the kind of source generating the sequence can be extracted from the usually overlooked asymptotic behaviour of the divergence exponent. Moreover, in the case of chaotic sources, this analysis also provides a precise estimate of the system's correlation dimension. Besides describing the results concerning the discrimination of chaotic systems from noise sources, I will also discuss the possibility of using the related correlation dimension estimates to improve the third stage of the method introduced above for the identification of suitable embedding parameters.

The discovery of chaos as a possible dynamical regime for nonlinear systems led to the search for chaotic behaviour in experimental recordings. In some fields, this search gave plenty of positive results: for example, chaotic dynamics was successfully identified and tamed in electronic circuits and laser-based optical setups. These two families of experimental chaotic systems eventually became versatile tools to study chaos and its possible applications.
On the other hand, chaotic behaviour is also looked for in climate science, biology, neuroscience, and even economics. In these fields, nonlinearity is widespread: many smaller units interact nonlinearly, yielding a collective motion that can be described by means of a few nonlinearly coupled effective degrees of freedom. The corresponding recorded signals exhibit, in many cases, an irregular and complex evolution. A possible underlying chaotic evolution, as opposed to a stochastic one, would be of interest both to reveal the presence of determinism and to predict the system's future states. While some claims concerning the existence of chaos in these fields have been made, most results are debated or inconclusive. Nonstationarity, low signal-to-noise ratio, external perturbations and poor reproducibility are just a few of the issues that hinder the search for chaos in natural systems. In the final part of this work, I will briefly discuss the problem of chasing chaos in experimental recordings by considering two example sequences, the first one generated by an electronic circuit and the second one corresponding to recordings of brain activity.

The present thesis is organized as follows. The core concepts of time series analysis, including the key features of chaotic dynamics, are presented in Chapter 1. A brief review of the search for chaos in experimental systems is also provided, and the difficulties concerning this quest in some research fields are highlighted. Chapter 2 describes the embedding procedure and the issue of optimally choosing the related parameters. Existing methods to carry out the embedding choice are then reviewed and their limitations are pointed out. In addition, two embedding-dependent nonlinear techniques that are ordinarily used to characterize chaos, namely the estimation of correlation dimension by means of correlation integrals and the assessment of the maximum Lyapunov exponent, are presented.
The new approach for the identification of suitable embedding parameters, which makes up the core topic of the present thesis, is the subject of Chapters 3 and 4. While Chapter 3 contains the theoretical outline of the approach, as well as its implementation details, Chapter 4 discusses the application of the approach to benchmark synthetic and experimental sequences, thus illustrating its strengths and its limitations. The study of the asymptotic behaviour of the time-dependent divergence exponent is presented in Chapter 5. The alternative estimator of correlation dimension, which relies on this asymptotic metric, is discussed as a possible improvement to the approach described in Chapters 3 and 4. The search for chaos in experimental data is discussed in Chapter 6 by means of two examples of real-world recordings. Concluding remarks are finally drawn in Chapter 7.
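
Two ingredients named in the abstract, the correlation integral (the cumulative distribution of embedding vector distances) and surrogate sequences that share the power spectrum of the original, can be sketched in Python as follows. This is only an illustration under stated assumptions: the Euclidean distance, the function names, and plain Fourier phase randomization are choices made here, and the thesis may well use a different metric and its own dedicated surrogate algorithms.

```python
import numpy as np


def delay_embed(s, m, L):
    """Rows are the vectors (s[i], s[i + L], ..., s[i + (m - 1) * L])."""
    s = np.asarray(s, dtype=float)
    n_vectors = len(s) - (m - 1) * L
    if n_vectors < 2:
        raise ValueError("sequence too short for this choice of m and L")
    return np.column_stack([s[j * L : j * L + n_vectors] for j in range(m)])


def correlation_integral(s, m, L, radii):
    """Fraction of distinct embedding-vector pairs at distance <= each radius."""
    x = delay_embed(s, m, L)
    # All pairwise Euclidean distances (memory grows as the square of len(x)).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep each distinct pair once: upper triangle without the diagonal.
    rows, cols = np.triu_indices(len(x), k=1)
    pair_dist = np.sort(dist[rows, cols])
    # Empirical cumulative distribution of distances, evaluated at the radii.
    return np.searchsorted(pair_dist, radii, side="right") / len(pair_dist)


def phase_randomized_surrogate(s, rng):
    """Randomize the Fourier phases while keeping every Fourier amplitude."""
    s = np.asarray(s, dtype=float)
    spectrum = np.fft.rfft(s)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    phases[0] = 0.0          # leave the zero-frequency (mean) term untouched
    if len(s) % 2 == 0:
        phases[-1] = 0.0     # the Nyquist term must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=len(s))


rng = np.random.default_rng(0)
noise = rng.standard_normal(400)           # Gaussian white noise test input
radii = np.linspace(0.5, 6.0, 12)
c_noise = correlation_integral(noise, m=3, L=1, radii=radii)
surrogate = phase_randomized_surrogate(noise, rng)
same_spectrum = np.allclose(np.abs(np.fft.rfft(noise)),
                            np.abs(np.fft.rfft(surrogate)))
print(same_spectrum)
```

The surrogate has the same Fourier amplitudes as the input by construction, so any test that separates the two sequences must rely on structure beyond the power spectrum.
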

A new approach to optimal embedding of time series / Perinelli, Alessio. - (2020 Nov 20), pp. 1-126. [10.15168/11572_280754]

2020-11-20



Defended on: 20 Nov 2020
Cycle: XXXIII
Academic year: 2019-2020
Department: Physics (29/10/12-)
Doctoral programme: Physics
Unitn internal supervisor: Ricci, Leonardo
Bi-nationally supervised doctoral thesis: no
DOI: https://dx.doi.org/10.15168/11572_280754
Language: English
Reference SSD (valid until 24/06/2024): Sector FIS/02 - Theoretical Physics, Mathematical Models and Methods
Appears in categories: 08.1 Doctoral Thesis

Files in this item:

File	Size	Format
phD_thesis_Perinelli_Alessio.pdf (open access; description: doctoral thesis manuscript; type: Doctoral Thesis; licence: Creative Commons)	8.27 MB	Adobe PDF