*3.2.3 Statistics of the periodogram*

**Frequency spacing:** The frequency spacing, Δ*f*, has only general guidelines. Too fine a frequency spacing leads to unnecessarily long computation times, which add up quickly for large data sets. Too coarse a frequency spacing risks missing narrow peaks in the periodogram, which would fall between adjacent grid points. However, it is controversial whether the points of such a frequency grid can be treated as independent when applying statistical significance tests to the periodogram ordinates (testing for true periodicities).

*Real Perspective of Fourier Transforms and Current Developments in Superconductivity*

An evenly sampled time series represents a pointwise product of the original continuous signal with a sequence of Dirac delta functions (a *Dirac comb*) at the sampling times. The Nyquist limit is a direct consequence of the symmetry in this Dirac comb window. Beyond this limit, the spectrum becomes a periodic repetition of itself, which is why the periodogram is unique only between the limits $\pm f_{\mathrm{Nyquist}}$. The appearance of power in the spectrum beyond the Nyquist limit is called *aliasing*, since these peaks are not real but "aliases" of the real power inside the Nyquist interval.

For unevenly sampled time series:

• The structure in the observing window can lead to *partial aliasing* in the periodogram;

• The non-structured spacing of observations also gives rise to *non-structured peaks* in the window transform;

• The maximum frequency limit *might or might not exist*, and if it exists, it tends to be *far higher* than for the evenly sampled case.

For irregularly sampled time series, a *periodic pattern in the gaps between observation times* can produce a peak in the periodogram that suggests a periodicity. Consider, for example, the daily pattern of measurements in astronomy: an observation at time $t_0$ is likely to be followed by another observation only at a time $t_0 + np$ (*p* is an integer number of days, and *n* is an integer). This pattern can therefore generate a peak at the frequency $1/p$ in the periodogram.

We find in the literature some proposals for the maximum frequency (Nyquist-like) limit for irregular sampling [15, 21–26]. These estimates are easy to calculate and reduce to the Nyquist frequency limit in the evenly sampled case:

• Arithmetic average of the time intervals.

• Harmonic average of the time intervals.

• Median of the time intervals.

• Minimum of the time intervals.

There are also, in the literature, some Nyquist-like limits based on not-so-simple statistics of the time intervals [15]:

• Greatest common divisor (gcd) of the time intervals. We should consider the sampling times as integer numbers (which can always be done by re-scaling the time values).

• Frequency limit due to *time windowing*, when observations are not pointwise instantaneous but instead consist of *short-time* ($\delta t$) *integrations* of the original signal.
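As a minimal sketch, the interval-based estimates listed in this section (arithmetic mean, harmonic mean, median, minimum, and gcd of the time intervals) can be computed directly from the sampling times. The function name `nyquist_like_limits`, the `time_unit` argument, and the convention $f = 1/(2\,\Delta t)$, chosen here so that every estimate reduces to the usual Nyquist frequency for even sampling, are assumptions made for this illustration and are not taken from the cited works.

```python
import numpy as np

def nyquist_like_limits(t, time_unit=None):
    """Nyquist-like frequency limits from simple statistics of the time intervals.

    Assumption of this sketch: each limit is 1 / (2 * interval statistic).
    """
    t = np.sort(np.asarray(t, dtype=float))
    dt = np.diff(t)
    dt = dt[dt > 0]  # ignore repeated time stamps
    limits = {
        "arithmetic_mean": 0.5 / dt.mean(),
        "harmonic_mean": 0.5 / (len(dt) / np.sum(1.0 / dt)),
        "median": 0.5 / np.median(dt),
        "minimum": 0.5 / dt.min(),
    }
    if time_unit is not None:
        # gcd estimate: assumes every interval is an integer multiple of
        # time_unit, i.e. the times have been re-scaled to integers.
        steps = np.rint(dt / time_unit).astype(np.int64)
        limits["gcd"] = 0.5 / (np.gcd.reduce(steps) * time_unit)
    return limits

# Even sampling: all estimates coincide with the usual Nyquist frequency.
even = nyquist_like_limits(np.arange(10) * 0.5, time_unit=0.5)

# Uneven sampling on a 0.5-unit clock: the gcd limit exceeds the others.
uneven = nyquist_like_limits([0.0, 1.5, 2.5, 6.0], time_unit=0.5)
print(even)
print(uneven)
```

In the uneven example the intervals are 1.5, 1.0, and 3.5 time units, whose gcd is 0.5, so the gcd-based limit is twice the limit based on the minimum interval.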

The classical periodogram has a fundamental statistical property for evenly sampled time series: when the signal consists solely of pure Gaussian noise, the values of the periodogram are exponentially distributed. For irregularly sampled time series, this property no longer holds.

Scargle's generalized form of the periodogram restores that statistical simplicity in the irregularly sampled case: for time series consisting solely of pure Gaussian noise, the ordinates of the unnormalized periodogram are exponentially distributed.
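The exponential distribution of the ordinates can be checked numerically in the evenly sampled case. The sketch below is only an illustration under stated assumptions: it uses the classical periodogram at the Fourier frequencies, divides it by the known noise variance so that the expected distribution is a unit exponential, and uses arbitrary simulation sizes. It does not implement Scargle's generalized form.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_points = 2000, 256
sigma = 1.5  # standard deviation of the simulated noise (arbitrary choice)

# Each row is one evenly sampled series of pure Gaussian noise.
x = rng.normal(0.0, sigma, size=(n_trials, n_points))

# Classical periodogram at the Fourier frequencies, divided by the noise variance.
power = np.abs(np.fft.rfft(x, axis=1)) ** 2 / (n_points * sigma**2)

# Drop the zero and Nyquist frequencies, whose ordinates follow a different law.
z = power[:, 1:-1].ravel()

mean_z = z.mean()                 # a unit exponential has mean 1
tail_fraction = np.mean(z > 3.0)  # a unit exponential has P(z > 3) = exp(-3)
print(mean_z, tail_fraction, np.exp(-3.0))
```

With this scaling, the sample mean of the ordinates should be close to 1 and the fraction of ordinates above 3 close to $e^{-3}$.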

This statistical property is used to test whether a periodogram ordinate represents a "true" periodicity. The standard procedure, known as the *Fisher criterion*, is to assume that the maximum ordinate of the periodogram represents a true periodicity and to test this value against all the other ordinates, which are assumed to arise from the background noise.

Scargle defined a *False Alarm Probability (FAP)* that, based on the assumed distribution of Gaussian noise, measures the probability that a time series without any signal would produce, through stochastic fluctuations alone, an ordinate of the observed magnitude in the periodogram. Following Scargle [15], the *detection threshold*, $z_0$, is the power level above which a claim that a peak is due to a real signal would be wrong only a small fraction $p_0$ (the FAP) of the time:

$$z_0 = -\ln\left[1 - \left(1 - p_0\right)^{1/N}\right], \tag{10}$$

where $p_0$ (the FAP) is a small number, and *N* is the number of *independent* frequencies tested.
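Eq. (10) is straightforward to evaluate. The following is a minimal sketch: the function names are hypothetical, and the second function is simply Eq. (10) solved for $p_0$. The `expm1` and `log1p` calls are a numerical-precision choice for small $p_0$ and large *N*; they do not change the formula.

```python
import math

def detection_threshold(p0, n_independent):
    """Eq. (10): z0 = -ln[1 - (1 - p0)**(1/N)]."""
    # 1 - (1 - p0)**(1/N) written as -expm1(log1p(-p0) / N) to limit round-off.
    return -math.log(-math.expm1(math.log1p(-p0) / n_independent))

def false_alarm_probability(z, n_independent):
    """Eq. (10) solved for p0: p0 = 1 - (1 - exp(-z))**N."""
    return -math.expm1(n_independent * math.log1p(-math.exp(-z)))

# Threshold for a 1% false alarm probability with 100 independent frequencies.
z0 = detection_threshold(0.01, 100)
print(z0)                                # about 9.2
print(false_alarm_probability(z0, 100))  # recovers 0.01 up to round-off
```

Note that the result depends directly on *N*, the assumed number of independent frequencies, which the text identifies as the uncertain ingredient of this test.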

It is worth noting that this statistical analysis answers the question: "What is the probability that a time series without any periodic component would produce a peak of that magnitude in the periodogram?" It *does not* answer the more direct and far more physically significant question: *"What is the probability that this periodogram feature comes from a periodic phenomenon?"*

The ability to analytically quantify the relationship between peak height and statistical significance of a feature in the periodogram has been one of the main reasons for the widespread use of the Lomb-Scargle periodogram [10–13, 23–25]. However, the independence of the tested frequencies remains an open issue.

Data quality (and quantity) is generally reflected in the peak height relative to the background noise, which gives the peak *significance*, as discussed above. Neither the number of points in a time series nor the signal-to-noise ratio affects the determination of the peak frequency or its *precision*. The uncertainty in the frequency value of a peak is related to the peak width, usually defined in Fourier analysis as the peak half-width. For this reason, in periodogram analysis, Gaussian error bars should be avoided as a way to report uncertainties in frequency determinations.
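One possible way to read a peak half-width off a periodogram evaluated on a frequency grid is sketched below. It rests on assumptions not stated in the text: "half-width" is taken to mean the half-width at half maximum, the highest peak is the one of interest, and no interpolation is done between grid points, so the result is only accurate to about one grid step. The function name `peak_half_width` is hypothetical.

```python
import numpy as np

def peak_half_width(freqs, power):
    """Frequency and half-width at half maximum of the highest peak on a grid."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    i_max = int(np.argmax(power))
    half = 0.5 * power[i_max]

    # Walk outwards from the maximum until the power drops to half the peak value.
    left = i_max
    while left > 0 and power[left] > half:
        left -= 1
    right = i_max
    while right < len(power) - 1 and power[right] > half:
        right += 1

    return freqs[i_max], 0.5 * (freqs[right] - freqs[left])

# Synthetic Gaussian-shaped peak, used only to exercise the function.
freqs = np.linspace(0.0, 1.0, 10001)
power = np.exp(-0.5 * ((freqs - 0.3) / 0.01) ** 2)
f_peak, half_width = peak_half_width(freqs, power)
print(f_peak, half_width)
```

For the synthetic peak used here, the analytic half-width at half maximum is $0.01\sqrt{2\ln 2}$, which the grid estimate should match to within one grid step.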

• The maximum frequency (Nyquist-like) allowed can be much higher than in the evenly sampled case. We can define a Nyquist-like frequency related to the inverse of the minimum time interval, or even of the gcd of the time intervals [8].

• The signal-to-noise (S/N) ratio does not depend on the periodogram normalization factor. In other words, $P(\omega_a)/P(\omega_b)$ is independent of the chosen normalization factor (there are several different options in the literature).

• There is a low false-negative probability for the periodogram when a real signal is present in the time series. Thus, if we correctly guess a frequency $\omega_0$ present in the time series, we have a very low probability of obtaining a small value for $P(\omega_0)$. The problem is that we also obtain high periodogram ordinates for frequencies other than the true ones, which makes the true frequencies difficult to identify if we do not know them beforehand.

In the LST periodogram, a freely normalized version of the Lomb-Scargle periodogram is initially defined over the broadest possible range of frequencies. The frequency set is chosen to include all wavelengths about which the time series could contain information. The next step is to choose the minimum frequency: zero, the frequency grid spacing $\delta f$, the inverse of the total length of the time series, or some value in between. We can choose the maximum frequency as the highest frequency about which we believe there is any information in the time series, such as some Nyquist limit known *a priori*, up to the limit of the inverse of the gcd of the time intervals. Above this limit, we probably have to deal with some folding in the periodogram. The frequency grid spacing $\delta f$ has to be chosen so that the calculation remains computationally feasible.

*Periodogram Analysis under the Popper-Bayes Approach*

*DOI: http://dx.doi.org/10.5772/intechopen.93162*

*4.1.1 Normalizing by the bandwidth total content*

We define

$$P_{LST}(\omega) = K \cdot P_{LS}(\omega), \tag{11}$$

where the normalizing constant $K$ is set as $K = \left(\sum_{\omega} P_{LS}(\omega)\,\Delta\omega\right)^{-1}$. This value of the constant $K$ is such that the function $P_{LST}$ is normalized to a total area under the curve equal to 1 over the whole set of frequencies $\omega_i$. This area represents the total power in the bandwidth and reflects the total variance of the time series.

Note that the S/N ratio, as well as the ratio between the ordinates at any two distinct frequencies, $P(\omega_a)/P(\omega_b)$, does not depend on the total power of the time series.

This procedure is equivalent to a stretching of the data series variable $X(t)$ in the time domain and also has the property of making the total power of the various periodograms comparable.

**4.2 Periodogram analysis by combination of information**

The two main ideas of the Lomb-Scargle-Tarantola periodogram are

1. Smoothing the periodogram.

2. Stacking independent periodogram estimates.

Since it was first proposed, the periodogram has been recognized as a highly noisy statistic, even for data with little noise. Smoothing the periodogram is not a new idea. There are several
