**Functional Data Analysis for Biomechanics**

Elizabeth Crane, David Childers, Geoffrey Gerstner and Edward Rothman *University of Michigan U.S.A*

### **1. Introduction**

The application of nonlinear tools and advanced statistical methods is becoming more prevalent in biomechanical analyses. In a traditional biomechanics laboratory with motion analysis equipment, large amounts of kinematic data can be collected relatively easily. However, a significant gap exists between all the data that are collected and the data that are actually analyzed. Because movements occur over a period of time, whether seconds or minutes, each movement is represented by a continuous series of kinematic data (e.g., 60 or 120 observations per second). Using standard analytic methods, the continuous data associated with each movement are often reduced to a single discrete number. This reduction to a single summary value, such as a peak flexion, extension, or range of motion, excludes potentially valuable information. Reducing a curve representing hip motion during gait to a single range of motion value, for example, precludes the analysis of the entire movement pattern or the timing of the movement. A handful of investigations have recognized this limitation and have begun using functions to maintain the shape and timing of the movement in the analysis.

The primary purpose of this chapter is to introduce an emerging collection of statistical methods called Functional Data Analysis (FDA). FDA is distinct from traditional analytic methods because how data changes continuously over time can be assessed. Therefore, information in continuous signals can be retained, such as changes in joint angles or in landmark positions during a movement task. FDA can be used for both exploratory and hypothesis driven analyses with traditional multivariate statistical methods that have been modified for functional predictor and response variables. Although representing motion data as a set of functions is not new to biomechanics analyses (Chester & Wrigley, 2008; Deluzio & Astephen, 2007; Landry et al., 2007; Lee et al., 2009; Sadeghi et al., 2002; 2000), statistical methods developed specifically for analyzing these functions have not been available. More recently, FDA methods have been used within biomechanics to study mastication (Crane et al., 2010), back pain (Page et al., 2006), as well as age, gender, and speed effects on walking (Røislien et al., 2009). Given the interest in and need for treating motion data as functions, it is important that methods for analyzing a set a functions using emerging statistical methods are brought to the attention of those in the biomechanics community.

Although several excellent references exist for Functional Data Analysis (Ramsay, 2000; Ramsay et al., 2009; Ramsay & Silverman, 2002; 2005) there are important issues for biomechanists to be aware of when implementing this set of statistical tools. Therefore, the aims of this chapter are to provide an overview of the steps associated with FDA, to focus on

Fig. 2. A single chewing cycle. Dots indicate the recorded observations while the line

Section 2.2 for one possible solution to this issue). Second, wild fluctuations in the data can be

Functional Data Analysis for Biomechanics 79

Fig. 3. Functions representing displacement, velocity, and acceleration for a single chewing

Functions can be smoothed by minimizing the number of basis functions and using regression analysis or by using a roughness penalty approach. Regression splines use a least squares

represents the function that was fit to the observed data.

amplified as a result of the basis system selected.

cycle.

**2.2 Smoothing**

issues related to aligning curves from cyclical data, to discuss specific statistical techniques, and finally to provide an example of how to implement the techniques introduced in the chapter.

### **2. Overview of basic FDA procedures**

There are four primary procedures in FDA: transforming data into functions, smoothing functions, registering functions, and analysis. The purpose of this section is to provide a conceptual introduction to each of these procedures. Open source tools are available for performing the procedures described in this chapter using R statistical software (R Development Core Team, 2010) along with the freely available *fda* library (Ramsay et al., 2010).

### **2.1 Transforming data into functions**

The first step in FDA is transforming the captured time series data into functions. For cyclical behaviors such as locomotion or mastication, it is common to begin by dividing a long sequence of motion data captured during walking or chewing into individual cycles (Fig. 1).

Each cycle is then transformed into a function. Therefore, if one cycle was represented by *n* observations, by transforming those data points into a function, the behavior can now be considered a single functional observation (Fig. 2).

In Functional Data Analysis, functions are represented as a linear combination of a set of basis functions (see Equation 1).

$$f(t) = \sum\_{k=1}^{K} c\_k \phi\_k(t) \tag{1}$$

One of the first decisions that must be made is the type of basis system to use to represent the data. Typical selections are Fourier and spline basis systems, but others such as polynomials or wavelets can also be used. A Fourier basis system is typically used for periodic data, while splines (in particular b-splines) are used for non-periodic data. For *K* basis functions, there will be *K* corresponding coefficients. As the number of basis functions used to model the behavior increases, complexity also increases. While increasing the number of basis functions can produce an excellent fit to the data, the complexity of the function increases computational time for certain procedures and can increase the analytical complexity and interpretation.

An important advantage of transforming discrete data into functions is the ease with which derivatives can be computed and assessed, a common task in biomechanical analyses (Fig. 3). Two points should be given consideration when transforming data into functions if derivatives are going to be used. First, end point error can be amplified in derivatives (see 2 Will-be-set-by-IN-TECH

issues related to aligning curves from cyclical data, to discuss specific statistical techniques, and finally to provide an example of how to implement the techniques introduced in the

There are four primary procedures in FDA: transforming data into functions, smoothing functions, registering functions, and analysis. The purpose of this section is to provide a conceptual introduction to each of these procedures. Open source tools are available for performing the procedures described in this chapter using R statistical software (R Development Core Team, 2010) along with the freely available *fda* library (Ramsay et al., 2010).

The first step in FDA is transforming the captured time series data into functions. For cyclical behaviors such as locomotion or mastication, it is common to begin by dividing a long sequence of motion data captured during walking or chewing into individual cycles (Fig. 1).

Fig. 1. Displacement data representing a time series of chewing kinematics captured using an

Each cycle is then transformed into a function. Therefore, if one cycle was represented by *n* observations, by transforming those data points into a function, the behavior can now be

In Functional Data Analysis, functions are represented as a linear combination of a set of basis

*K* ∑ *k*=1

One of the first decisions that must be made is the type of basis system to use to represent the data. Typical selections are Fourier and spline basis systems, but others such as polynomials or wavelets can also be used. A Fourier basis system is typically used for periodic data, while splines (in particular b-splines) are used for non-periodic data. For *K* basis functions, there will be *K* corresponding coefficients. As the number of basis functions used to model the behavior increases, complexity also increases. While increasing the number of basis functions can produce an excellent fit to the data, the complexity of the function increases computational time for certain procedures and can increase the analytical complexity and interpretation. An important advantage of transforming discrete data into functions is the ease with which derivatives can be computed and assessed, a common task in biomechanical analyses (Fig. 3). Two points should be given consideration when transforming data into functions if derivatives are going to be used. First, end point error can be amplified in derivatives (see

*ckφk*(*t*) (1)

*f*(*t*) =

chapter.

**2. Overview of basic FDA procedures**

**2.1 Transforming data into functions**

optoelectronic motion capture system.

functions (see Equation 1).

considered a single functional observation (Fig. 2).

Fig. 2. A single chewing cycle. Dots indicate the recorded observations while the line represents the function that was fit to the observed data.

Section 2.2 for one possible solution to this issue). Second, wild fluctuations in the data can be amplified as a result of the basis system selected.

Fig. 3. Functions representing displacement, velocity, and acceleration for a single chewing cycle.

### **2.2 Smoothing**

Functions can be smoothed by minimizing the number of basis functions and using regression analysis or by using a roughness penalty approach. Regression splines use a least squares

biologically relevant points in the curve (Fig. 5). While this may seem obvious, it is important to understand the possible sources of noise introduced by the measurement technique as well as how the data smoothing method modified the curve to reduce the noise. We note that different techniques can produce different results that may affect interpretation of the behavior and analysis. This is illustrated below with a chewing cycle. Within a cycle, there are four biologically relevant phases and, therefore, three transition points between phases. The four phases, in order shown in Figure 5, are fast close (FC), slow close (SC), slow open (SO), and fast open (FO). When two different smoothing techniques are applied to the same curve prior to identifying the location of the transition points between these phases, it is clear how

Functional Data Analysis for Biomechanics 81

Fig. 5. From left to right are the FC-SC, maximum close, and SO-FO transitions. Dashed lines represent the location of transitions identified after regression spline smoothing and solid lines represent the location of transitions identified after roughness penalty smoothing.

An important curve that many biomechanists generate is a mean curve to represent a specific behavior. It is well known that when a curve contains both phase and amplitude variation, they can not be easily compared and an average curve does not accurately represent the true behavior (Fig. 6). Therefore, the purpose of curve alignment is to reduce phase variability while preserving the curves shape and amplitude. A common method for aligning curves used in biomechanics applications is a linear time normalization procedure. An analogous procedure exists in Function Data Analysis called registration. Curves can be registered using a continuous method or by using landmarks. Both approaches use time warping functions, which align curves to minimize phase variation. Two curves are perfectly aligned if they differ only in amplitude. If *f*<sup>1</sup> and *f*<sup>2</sup> are perfectly aligned, then the plot of *f*1(*t*) versus *f*2(*t*) will yield a straight line. Continuous registration uses an iterative numerical approach to choose time warping functions that maximize the proportionality of all the functional observations. The algorithm begins by aligning each individual curve in the set of curves to the group average curve. Once the new set of aligned curves is generated, a new average curve is

**2.3 Registering functions - Curve alignment**

the smoothing technique can substantially change the location of these points.

estimation process. The advantage of regression splines is their simplicity. However, the use of a large number *K* of basis functions relative to the number of data points in the curve tends to overfit the data. Conversely, using a small number of regression splines risks losing localized functional features. Further, regression splines can produce end-point error in derivative estimates. Thus, regression splines can work well when simplicity is important, but have limitations for more complex cases.

An alternative to using regression splines to smooth the data is an approach that smoothes the data using a roughness penalty. With this approach it is common to have one basis function for each observation in the curve. Because this curve may contain noise, which can be especially amplified in derivatives, a measure of function complexity is penalized to impose smoothness. Unlike regression splines, the roughness penalty approach uses a large number of basis functions without overfitting or sigularity problems. Figure 4 illustrates the same curve fit with each type of smoothing.

Fig. 4. The left plot overlays raw data (dashed line) from a chewing cycle with a function (solid line) fit to the data using a regression spline. The right plot overlays the same raw data (dashed line) with a function (solid line) fit to the data using a roughness penalty approach.

For further discussion and computational details related to smoothing see Ramsay et al. (2009). Briefly, the roughness of a curve *f* is typically measured as the integrated squared second derivative (Equation 2).

$$\text{ROUGH}(f) = \int [D^2 f(t)]^2 \mathbf{d}t \tag{2}$$

In Equation 2, [*D*<sup>2</sup> *f*(*t*)]<sup>2</sup> is a measure of roughness of function *f* at value time *t*. The idea is that when a function contains noise, the noise is amplified in the second derivative. Therefore, the square of the second derivative will be large where noise is present in the curve. To create the smooth curve, a multiple of the roughness penalty is applied to the error sum of squares and a smoothing parameter *λ* is used to specify the degree of penalty. As *λ* approaches 0, the fit of the function to the observations improves.

Whether data are filtered using more traditional methods, such as digital filters, or are smoothed using regression splines or a roughness penalty approach, it is important to note that each of these smoothing techniques has the potential to impact the location of specific and 4 Will-be-set-by-IN-TECH

estimation process. The advantage of regression splines is their simplicity. However, the use of a large number *K* of basis functions relative to the number of data points in the curve tends to overfit the data. Conversely, using a small number of regression splines risks losing localized functional features. Further, regression splines can produce end-point error in derivative estimates. Thus, regression splines can work well when simplicity is important, but have

An alternative to using regression splines to smooth the data is an approach that smoothes the data using a roughness penalty. With this approach it is common to have one basis function for each observation in the curve. Because this curve may contain noise, which can be especially amplified in derivatives, a measure of function complexity is penalized to impose smoothness. Unlike regression splines, the roughness penalty approach uses a large number of basis functions without overfitting or sigularity problems. Figure 4 illustrates the same

Fig. 4. The left plot overlays raw data (dashed line) from a chewing cycle with a function (solid line) fit to the data using a regression spline. The right plot overlays the same raw data (dashed line) with a function (solid line) fit to the data using a roughness penalty approach. For further discussion and computational details related to smoothing see Ramsay et al. (2009). Briefly, the roughness of a curve *f* is typically measured as the integrated squared

In Equation 2, [*D*<sup>2</sup> *f*(*t*)]<sup>2</sup> is a measure of roughness of function *f* at value time *t*. The idea is that when a function contains noise, the noise is amplified in the second derivative. Therefore, the square of the second derivative will be large where noise is present in the curve. To create the smooth curve, a multiple of the roughness penalty is applied to the error sum of squares and a smoothing parameter *λ* is used to specify the degree of penalty. As *λ* approaches 0, the

Whether data are filtered using more traditional methods, such as digital filters, or are smoothed using regression splines or a roughness penalty approach, it is important to note that each of these smoothing techniques has the potential to impact the location of specific and

[*D*<sup>2</sup> *f*(*t*)]2d*t* (2)

ROUGH(*f*) =

limitations for more complex cases.

curve fit with each type of smoothing.

second derivative (Equation 2).

fit of the function to the observations improves.

biologically relevant points in the curve (Fig. 5). While this may seem obvious, it is important to understand the possible sources of noise introduced by the measurement technique as well as how the data smoothing method modified the curve to reduce the noise. We note that different techniques can produce different results that may affect interpretation of the behavior and analysis. This is illustrated below with a chewing cycle. Within a cycle, there are four biologically relevant phases and, therefore, three transition points between phases. The four phases, in order shown in Figure 5, are fast close (FC), slow close (SC), slow open (SO), and fast open (FO). When two different smoothing techniques are applied to the same curve prior to identifying the location of the transition points between these phases, it is clear how the smoothing technique can substantially change the location of these points.

Fig. 5. From left to right are the FC-SC, maximum close, and SO-FO transitions. Dashed lines represent the location of transitions identified after regression spline smoothing and solid lines represent the location of transitions identified after roughness penalty smoothing.

### **2.3 Registering functions - Curve alignment**

An important curve that many biomechanists generate is a mean curve to represent a specific behavior. It is well known that when a curve contains both phase and amplitude variation, they can not be easily compared and an average curve does not accurately represent the true behavior (Fig. 6). Therefore, the purpose of curve alignment is to reduce phase variability while preserving the curves shape and amplitude. A common method for aligning curves used in biomechanics applications is a linear time normalization procedure. An analogous procedure exists in Function Data Analysis called registration. Curves can be registered using a continuous method or by using landmarks. Both approaches use time warping functions, which align curves to minimize phase variation. Two curves are perfectly aligned if they differ only in amplitude. If *f*<sup>1</sup> and *f*<sup>2</sup> are perfectly aligned, then the plot of *f*1(*t*) versus *f*2(*t*) will yield a straight line. Continuous registration uses an iterative numerical approach to choose time warping functions that maximize the proportionality of all the functional observations. The algorithm begins by aligning each individual curve in the set of curves to the group average curve. Once the new set of aligned curves is generated, a new average curve is

Fig. 7. This figure shows the warping functions associated with each of the registered curves.

Functional Data Analysis for Biomechanics 83

An important advantage of treating the behavior as a function rather than *n* discrete data points is that the data do not have to be reduced to a single number to be analyzed. For example, it is common to identify peak flexion or extension or range of motion for analysis using traditional multivariate statistical methods. However, reducing the data into one or two descriptive statistics, such as an average, maxima, or minima, eliminates a large amount of valuable information that cannot be summarized with one number. In traditional multivariate statistics, high correlation among predictor variables causes unstable results. Because the points of a curve in a time series are highly correlated, putting all of the points representing the curve into a multivariate analysis could result in severe collinearity. By using FDA and representing each curve as a function, it is possible to use a functional analogue of traditional

methods without the problem of collinearity. An example is provided in Section 4.

One advantage of FDA is that position variables, as well as their time derivatives, are all possibilities that can be considered for use in registration. However, there are characteristics associated with curves that improve registration. The presence or absence of these characteristics can suggest whether position or derivative variables should be used. The main consideration is that curves with characteristic features such as maxima, minima, or inflection points result in a set of registered curves with less phase variability compared to registering a set of curves without one or more of these features, particularly during

The issue of having a peak or inflection point that can be used for registration purposes is an important one. Take, for example, position data representing a chewing cycle. The chewing cycle can be defined two different ways, from maximum open to maximum open

**3. Important considerations for curve registration**

**3.1 What makes a curve suitable for registration?**

**2.4 Statistical tests**

continuous registration.

calculated and the set of aligned curves are aligned again to the new average curve. See Ramsay & Silverman (2005) for further details.

Landmark registration is in many ways a simpler form of registering a set of curves. A landmark is defined as a point that is identifiable in every curve. This may be a minima, maxima, or zero crossing. Landmark registration aligns all specified landmarks by transforming time for each curve so that the landmarks occur in the same location. A common procedure for landmark registration is to first identify the location of the landmark of interest for each individual curve. Next, an average of these landmarks is calculated. Finally, time is transformed for each curve such that the identified landmark occurs at the average location. Although landmark registration is computationally less intensive than continuous registration, there is an upfront data processing cost to identifying the location of the landmark(s) for each individual curve.

Fig. 6. A set of five curves illustrating the effect of curve alignment. Without alignment the average curve does not provide an accurate representation of the behavior.

Whether continuous registration or landmark registration is used, the outcomes include a new set of aligned curves as well as time warping functions. The time warping functions contain information about how phase was adjusted. A unique time warping function is associated with each individual curve. Figure 7 shows the warping functions associated with each of the registered curves.

6 Will-be-set-by-IN-TECH

calculated and the set of aligned curves are aligned again to the new average curve. See

Landmark registration is in many ways a simpler form of registering a set of curves. A landmark is defined as a point that is identifiable in every curve. This may be a minima, maxima, or zero crossing. Landmark registration aligns all specified landmarks by transforming time for each curve so that the landmarks occur in the same location. A common procedure for landmark registration is to first identify the location of the landmark of interest for each individual curve. Next, an average of these landmarks is calculated. Finally, time is transformed for each curve such that the identified landmark occurs at the average location. Although landmark registration is computationally less intensive than continuous registration, there is an upfront data processing cost to identifying the location

Fig. 6. A set of five curves illustrating the effect of curve alignment. Without alignment the

Whether continuous registration or landmark registration is used, the outcomes include a new set of aligned curves as well as time warping functions. The time warping functions contain information about how phase was adjusted. A unique time warping function is associated with each individual curve. Figure 7 shows the warping functions associated with each of the

average curve does not provide an accurate representation of the behavior.

registered curves.

Ramsay & Silverman (2005) for further details.

of the landmark(s) for each individual curve.

Fig. 7. This figure shows the warping functions associated with each of the registered curves.

### **2.4 Statistical tests**

An important advantage of treating the behavior as a function rather than *n* discrete data points is that the data do not have to be reduced to a single number to be analyzed. For example, it is common to identify peak flexion or extension or range of motion for analysis using traditional multivariate statistical methods. However, reducing the data into one or two descriptive statistics, such as an average, maxima, or minima, eliminates a large amount of valuable information that cannot be summarized with one number. In traditional multivariate statistics, high correlation among predictor variables causes unstable results. Because the points of a curve in a time series are highly correlated, putting all of the points representing the curve into a multivariate analysis could result in severe collinearity. By using FDA and representing each curve as a function, it is possible to use a functional analogue of traditional methods without the problem of collinearity. An example is provided in Section 4.

### **3. Important considerations for curve registration**

### **3.1 What makes a curve suitable for registration?**

One advantage of FDA is that position variables, as well as their time derivatives, are all possibilities that can be considered for use in registration. However, there are characteristics associated with curves that improve registration. The presence or absence of these characteristics can suggest whether position or derivative variables should be used. The main consideration is that curves with characteristic features such as maxima, minima, or inflection points result in a set of registered curves with less phase variability compared to registering a set of curves without one or more of these features, particularly during continuous registration.

The issue of having a peak or inflection point that can be used for registration purposes is an important one. Take, for example, position data representing a chewing cycle. The chewing cycle can be defined two different ways, from maximum open to maximum open

unusually long phase at the end of the cycle where the teeth are together (i.e., closed gape). Figure 9 illustrates a raw, unaltered cycle, the corresponding time-normalized cycle, and the cycle that has been padded at the end using Page et al. (2006) method. The problem that occurs is that the phase at the end is artificial, since it has only been added to extend the duration of the cycle. The consequence to this solution is that statistical analyses, such as functional principal component analysis or functional analysis of variance, will provide inaccurate results because data were inappropriately added to cycles, thereby changing the behavioral pattern of the movement. Therefore, the outcome of these types of typical FDA analyses will provide inaccurate results. Thus, although completing raw data may be suitable for discrete tasks, the interpretation of the behavior will be affected and incorrect if this

Functional Data Analysis for Biomechanics 85

Fig. 9. Illustration of the outcome of creating a common time interval by completing the raw

As an alternative, padding cycles by evenly distributing missing values could be a potential solution. However, Figure 10 illustrates that the problem remains with the missing value

Fig. 10. Illustration of the outcome of creating a common time interval by padding the raw cycles by evenly distributing missing values. The new curve does not provide an accurate

Because a common time interval is required for FDA analyses, it is important to assess whether registration procedures are necessary after standard time normalization has been applied to the data. Crane et al. (2010) found that registration after time normalization may not always be necessary. They concluded that when obtaining an average curve was the purpose of registration, the additional alignment did not affect the interpretation of the behavior

data with a suitable number of pairs. The new curve does not provide an accurate

method is applied to cyclical data.

representation of the behavior.

representation of the behavior.

method.

or maximum close to maximum close (Fig. 8). Defining the cycle from maximum close to maximum close produces a curve without characteristic features needed for registration. It is important to consider that position data may not always have characteristic features that are needed for registration, yet using position data is sometimes important for a particular analysis. The absence of a characteristic feature can be ameliorated in cyclical position data by two possible methods. First, one can consider dividing cycles using a different start and end event that may adjust the cycle to include a clearly defined maxima, minima, or zero crossing within the cycle. Second, a derivate of the position curves can be estimated and these derivative curves can be aligned. Once the derivative curves are aligned, the resulting time transformation information associated with each curve can be used to register the position curves. To illustrate the difference the selection of start and end events can make in the shape of the curve, Figure 8 is an example of a cycle that begins and ends with the teeth together (i.e., maximum close). Maximum opening represents the characteristics feature that can be used for registration. Therefore, a clearly defined minima is present in the displacement data. This curve can be compared to a chewing cycle that begins and ends with maximum open (see Fig. 2)

Fig. 8. Example chewing cycle define from maximum close to maximum close.

### **3.2 Is time normalization necessary?**

As previously discussed, when curve alignment is required prior to an analysis, there are three registration methods from which to select: linear time normalization, continuous registration, and landmark registration. However, an issue occurs when working with cyclical data that require time normalization prior to continuous or landmark registration. That is, a prerequisite of the FDA registration procedures is that all curves have the same time interval. Three methods can be considered for achieving this prerequisite when cycles have variable durations.

The first method is a procedure used by Page et al. (2006) for creating a common time interval across all cycles. Their procedure recommends completing the raw data with a suitable number of pairs so that the trial with the longest duration becomes the duration for all trials. While this will work for single movements, such as the sit-to-stand movement presented in the Page et al. (2006) analysis, it does not work for cyclical data. When cycles are padded at the end to create a common time interval, artifacts are created in the data that create two problems precluding the generalization of results. First, artifacts in cyclical data change the interpretation of the actual behavior. For example, when individual chewing cycles are extended so that each cycle has the same duration, chewing is characterized as having an 8 Will-be-set-by-IN-TECH

or maximum close to maximum close (Fig. 8). Defining the cycle from maximum close to maximum close produces a curve without characteristic features needed for registration. It is important to consider that position data may not always have characteristic features that are needed for registration, yet using position data is sometimes important for a particular analysis. The absence of a characteristic feature can be ameliorated in cyclical position data by two possible methods. First, one can consider dividing cycles using a different start and end event that may adjust the cycle to include a clearly defined maxima, minima, or zero crossing within the cycle. Second, a derivate of the position curves can be estimated and these derivative curves can be aligned. Once the derivative curves are aligned, the resulting time transformation information associated with each curve can be used to register the position curves. To illustrate the difference the selection of start and end events can make in the shape of the curve, Figure 8 is an example of a cycle that begins and ends with the teeth together (i.e., maximum close). Maximum opening represents the characteristics feature that can be used for registration. Therefore, a clearly defined minima is present in the displacement data. This curve can be compared to a chewing cycle that begins and ends with maximum open (see

Fig. 8. Example chewing cycle define from maximum close to maximum close.

As previously discussed, when curve alignment is required prior to an analysis, there are three registration methods from which to select: linear time normalization, continuous registration, and landmark registration. However, an issue occurs when working with cyclical data that require time normalization prior to continuous or landmark registration. That is, a prerequisite of the FDA registration procedures is that all curves have the same time interval. Three methods can be considered for achieving this prerequisite when cycles have variable

The first method is a procedure used by Page et al. (2006) for creating a common time interval across all cycles. Their procedure recommends completing the raw data with a suitable number of pairs so that the trial with the longest duration becomes the duration for all trials. While this will work for single movements, such as the sit-to-stand movement presented in the Page et al. (2006) analysis, it does not work for cyclical data. When cycles are padded at the end to create a common time interval, artifacts are created in the data that create two problems precluding the generalization of results. First, artifacts in cyclical data change the interpretation of the actual behavior. For example, when individual chewing cycles are extended so that each cycle has the same duration, chewing is characterized as having an

**3.2 Is time normalization necessary?**

Fig. 2)

durations.

unusually long phase at the end of the cycle where the teeth are together (i.e., closed gape). Figure 9 illustrates a raw, unaltered cycle, the corresponding time-normalized cycle, and the cycle that has been padded at the end using Page et al. (2006) method. The problem that occurs is that the phase at the end is artificial, since it has only been added to extend the duration of the cycle. The consequence to this solution is that statistical analyses, such as functional principal component analysis or functional analysis of variance, will provide inaccurate results because data were inappropriately added to cycles, thereby changing the behavioral pattern of the movement. Therefore, the outcome of these types of typical FDA analyses will provide inaccurate results. Thus, although completing raw data may be suitable for discrete tasks, the interpretation of the behavior will be affected and incorrect if this method is applied to cyclical data.

Fig. 9. Illustration of the outcome of creating a common time interval by completing the raw data with a suitable number of pairs. The new curve does not provide an accurate representation of the behavior.

As an alternative, padding cycles by evenly distributing missing values could be a potential solution. However, Figure 10 illustrates that the problem remains with the missing value method.

Fig. 10. Illustration of the outcome of creating a common time interval by padding the raw cycles by evenly distributing missing values. The new curve does not provide an accurate representation of the behavior.

Because a common time interval is required for FDA analyses, it is important to assess whether registration procedures are necessary after standard time normalization has been applied to the data. Crane et al. (2010) found that registration after time normalization may not always be necessary. They concluded that when obtaining an average curve was the purpose of registration, the additional alignment did not affect the interpretation of the behavior

Fig. 11. Example of knee angle data from a gait cycle.

• *fi*(*tj*) denotes the knee angle for trial *i* at time *tj* • *β<sup>j</sup>* is the effect of knee angle at time *tj* on *Y*

to reliably estimate a model of this complexity.

causing instability in estimated effects.

Table 1. *Y* versus peak knee flexion

• *i* = 1, ... , 60 indexes the trial

*β*ˆ

*β*ˆ

Estimate Std. Error t p-val

<sup>0</sup> 0.55 .55 1.00 0.33

Functional Data Analysis for Biomechanics 87

<sup>1</sup> -0.01 0.01 -0.34 0.73

100 ∑ *j*=1

This multiple regression model uses each angle measurement rather than just the angle at

1. With 101 parameters (including the intercept) and only 60 trials, there is not enough data

2. There would likely be a strong correlation between knee angles at nearby time points,

The functional nature of knee angle is apparent in Figure 11. By considering the data as a sample of observations from a smooth underlying curve, we will be able to use all of the information without resorting to the problematic multiple regression model. Before fitting a

*β<sup>j</sup> fi*(*tj*) + *�<sup>i</sup>* (4)

*Yi* = *β*<sup>0</sup> +

• *tj* (j = 1, ... ,100) indexes the times during the cycle when gait is measured

peak flexion. However, there are serious problems with this proposed model.

enough to warrant the additional data manipulation. So when might registration be useful? When specific functional data analyses, such as those discussed in Section 4, will be used, Crane et al. (2010) found that decreases in inter-subject variability within homogenous groups indicate registration may be justified. Because of this, when functional data analyses are used, registration improves the likelihood of finding significant differences between two or more populations, if they indeed exist.

### **4. Specific functional data analyses**

This section discusses how the linear model, one of the classic methods of data analysis, can be extended to functional biomechanical data. We illustrate the ideas in this section using a subset of data collected as part of a study by Gross et al. (In Press) that contains knee angles during a gait cycle and height normalized walking speed for 60 trials.

### **4.1 Classical linear model**

In the classical linear model, a response variable *Y* is modeled as a linear combination of predictor variables *X* and a random distrubance term *�*.

$$Y = \beta\_0 + \sum\_{j=1}^{p} \beta\_j X\_j + \epsilon \tag{3}$$

We can think of *β<sup>j</sup>* as the effect of *Xj* on the expected value of *Y*. Common goals of a linear model include


### **4.2 Functional linear model example**

Figure 11 displays the knee angle throughout a gait cycle for one of the 60 trials. Time is normalized so that the start and end of the gait cycle are 0 and 1, respectively. Notice that peak flexion occurs at time ≈ 0.7.

Suppose we wish to investigate whether peak flexion (denoted *X*) can predict a scalar outcome such as walking speed (normalized velocity) *Y*. The method of least-squares can be used to fit a model of the form *Y*ˆ = *β*ˆ <sup>0</sup> + *β*ˆ <sup>1</sup>*X*, where


The results of the fitted linear model on the *n* = 60 trials are presented in Table 1. Note that there is no significant linear relationship between peak knee flexion and *Y* (*p* = 0.73)

The analysis above, however, only selects a single feature of the gait cycle (peak flexion) as a predictor of *Y*. An entire gait cycle contains much more information than just the maximum, and perhaps angles other than the absolute maximum can predict *Y*. With 100 time-normalized angle measurements on each of 60 trials, we *could* make use of all the information in a multiple regression by fitting a model of the form

10 Will-be-set-by-IN-TECH

enough to warrant the additional data manipulation. So when might registration be useful? When specific functional data analyses, such as those discussed in Section 4, will be used, Crane et al. (2010) found that decreases in inter-subject variability within homogenous groups indicate registration may be justified. Because of this, when functional data analyses are used, registration improves the likelihood of finding significant differences between two or more

This section discusses how the linear model, one of the classic methods of data analysis, can be extended to functional biomechanical data. We illustrate the ideas in this section using a subset of data collected as part of a study by Gross et al. (In Press) that contains knee angles

In the classical linear model, a response variable *Y* is modeled as a linear combination of

*p* ∑ *j*=1

We can think of *β<sup>j</sup>* as the effect of *Xj* on the expected value of *Y*. Common goals of a linear

Figure 11 displays the knee angle throughout a gait cycle for one of the 60 trials. Time is normalized so that the start and end of the gait cycle are 0 and 1, respectively. Notice that

Suppose we wish to investigate whether peak flexion (denoted *X*) can predict a scalar outcome such as walking speed (normalized velocity) *Y*. The method of least-squares can be used to fit

The results of the fitted linear model on the *n* = 60 trials are presented in Table 1. Note that

The analysis above, however, only selects a single feature of the gait cycle (peak flexion) as a predictor of *Y*. An entire gait cycle contains much more information than just the maximum, and perhaps angles other than the absolute maximum can predict *Y*. With 100 time-normalized angle measurements on each of 60 trials, we *could* make use of all the

there is no significant linear relationship between peak knee flexion and *Y* (*p* = 0.73)

*βjXj* + *�* (3)

*Y* = *β*<sup>0</sup> +

during a gait cycle and height normalized walking speed for 60 trials.

predictor variables *X* and a random distrubance term *�*.

<sup>0</sup> + *β*ˆ

<sup>1</sup>*X*, where

information in a multiple regression by fitting a model of the form

populations, if they indeed exist.

**4.1 Classical linear model**

model include

1. Estimating the *β<sup>j</sup>*

2. Creating confidence intervals for *β<sup>j</sup>* 3. Testing the hypothesis *H*<sup>0</sup> : *β<sup>j</sup>* = 0

**4.2 Functional linear model example**

peak flexion occurs at time ≈ 0.7.

a model of the form *Y*ˆ = *β*ˆ

2. *X* is peak flexion

3. *β*ˆ

4. *β*ˆ

1. *Y*ˆ is predicted value of *Y*

<sup>0</sup> is the intercept, and

<sup>1</sup> is the estimated effect of angle on *Y*

**4. Specific functional data analyses**

Fig. 11. Example of knee angle data from a gait cycle.


Table 1. *Y* versus peak knee flexion

$$Y\_i = \beta\_0 + \sum\_{j=1}^{100} \beta\_j f\_i(t\_j) + \varepsilon\_i \tag{4}$$


This multiple regression model uses each angle measurement rather than just the angle at peak flexion. However, there are serious problems with this proposed model.


The functional nature of knee angle is apparent in Figure 11. By considering the data as a sample of observations from a smooth underlying curve, we will be able to use all of the information without resorting to the problematic multiple regression model. Before fitting a

Fig. 13. Regression coefficient associated with the functional linear model. Dashed lines represent the 95% confidence interval. When this interval does not include zero there is a

This section illustrates how the functional linear model reveals a more complete picture of a relationship than analysis of features (such as maxima). Our example involves a scalar outcome, but methodology exists for functional outcomes as well. Various other classic statistical methods also have functional analogues, including Analysis of Variance, Canonical

Functional Data Analysis for Biomechanics 89

The following code can be implemented in R (R Development Core Team, 2010) , an open source statistical software package. Once R is installed, the fda library (Ramsay et al., 2010) as well as the CraneLib will need to be installed. The fda library contains the functions that perform the procedures discussed in this chapter. The CraneLib contains the example data

mybasis <- create.bspline.basis(rangeval=c(0, 1), nbasis=NULL,

significant relationship between knee angle and *Y*.

**4.3 Other functional data analyses**

**5. Implementation**

library(fda) library(CraneLib) matplot(dataset)

Correlations, and Principal Components.

needed to run the following example.

knots=c(0,.2,.4,.6,.8,1)

norder=4, breaks=knots,

functional linear model, we must first fit a smooth curve to the data in Figure 11. We do this using b-spline basis functions. To avoid overfitting, a roughness penalty is imposed on the curvature of the fitted function. The smooth fits for 3 trials are displayed in Figure 12.

Fig. 12. Functions fit to three trials.

After estimating knee angle versus time functions for all 60 trials, we can replace the discrete angle measurements *f*(*t*1), ..., *f*(*t*100) with a continuous angle function *f*(*t*). With this continuous function, we can write the form of the **functional** linear model:

$$y\_i = \beta\_0 + \int \beta(t) f\_i(t) dt + \varepsilon\_i \tag{5}$$

In the functional linear model, *β*(*t*) is a coefficient function whose values capture the effect of knee angle at time *t* on walking speed. The details of how the functional linear model is estimated are beyond the scope of this text (see Ramsay et al. (2009) for more details). The results, however, describe the power of the functional linear model. Figure 13 shows how knee angle affects *Y* at various points in the gait cycle. Recall that the results from the ordinary linear regression of *Y* versus maximum angle showed no relationship. This is consistent with Figure 13 which shows *β*(*t*) close to 0 near the times of peak flexion (typically occurring between 0.71 and 0.73). Both the functional model and the ordinary linear model show no substantial relationship between *Y* and knee angle in this range of the gait cycle. The functional linear model does indicate that knee angle between .31 and .46, .55 and .64, as well as .85 and .95 seem to have an association with *Y*. When the confidence bands for the coefficient function are entirely above 0, this indicates that individuals with greater extension at these stages tend to also have an increased gait velocity *Y*. When the confidence bands for the coefficient are entirely below 0, this indicates that individuals with greater flexion at these stages tend to have a decreased velocity.

12 Will-be-set-by-IN-TECH

functional linear model, we must first fit a smooth curve to the data in Figure 11. We do this using b-spline basis functions. To avoid overfitting, a roughness penalty is imposed on the

After estimating knee angle versus time functions for all 60 trials, we can replace the discrete angle measurements *f*(*t*1), ..., *f*(*t*100) with a continuous angle function *f*(*t*). With

In the functional linear model, *β*(*t*) is a coefficient function whose values capture the effect of knee angle at time *t* on walking speed. The details of how the functional linear model is estimated are beyond the scope of this text (see Ramsay et al. (2009) for more details). The results, however, describe the power of the functional linear model. Figure 13 shows how knee angle affects *Y* at various points in the gait cycle. Recall that the results from the ordinary linear regression of *Y* versus maximum angle showed no relationship. This is consistent with Figure 13 which shows *β*(*t*) close to 0 near the times of peak flexion (typically occurring between 0.71 and 0.73). Both the functional model and the ordinary linear model show no substantial relationship between *Y* and knee angle in this range of the gait cycle. The functional linear model does indicate that knee angle between .31 and .46, .55 and .64, as well as .85 and .95 seem to have an association with *Y*. When the confidence bands for the coefficient function are entirely above 0, this indicates that individuals with greater extension at these stages tend to also have an increased gait velocity *Y*. When the confidence bands for the coefficient are entirely below 0, this indicates that individuals with greater flexion at these

*β*(*t*)*fi*(*t*)*dt* + *�<sup>i</sup>* (5)

this continuous function, we can write the form of the **functional** linear model:

*yi* = *β*<sup>0</sup> +

curvature of the fitted function. The smooth fits for 3 trials are displayed in Figure 12.

Fig. 12. Functions fit to three trials.

stages tend to have a decreased velocity.

Fig. 13. Regression coefficient associated with the functional linear model. Dashed lines represent the 95% confidence interval. When this interval does not include zero there is a significant relationship between knee angle and *Y*.

### **4.3 Other functional data analyses**

This section illustrates how the functional linear model reveals a more complete picture of a relationship than analysis of features (such as maxima). Our example involves a scalar outcome, but methodology exists for functional outcomes as well. Various other classic statistical methods also have functional analogues, including Analysis of Variance, Canonical Correlations, and Principal Components.

### **5. Implementation**

The following code can be implemented in R (R Development Core Team, 2010) , an open source statistical software package. Once R is installed, the fda library (Ramsay et al., 2010) as well as the CraneLib will need to be installed. The fda library contains the functions that perform the procedures discussed in this chapter. The CraneLib contains the example data needed to run the following example.

```
library(fda)
library(CraneLib)
matplot(dataset)
knots=c(0,.2,.4,.6,.8,1)
mybasis <- create.bspline.basis(rangeval=c(0, 1), nbasis=NULL,
    norder=4, breaks=knots,
```
plot(reglist\$warpfd)

lines(regMean, col=2)

plot(origMean)

**6. Conclusions**

sample code in R.

**7. References**

origMean <- mean.fd(myfd)

regMean <- mean.fd(reglist\$regfd)

doi:10.1016/j.jbiomech.2010.04.024.

doi:10.1016/j.humov.2011.05.001.

*Journal of Biomechanics* 39(13): 2526–2534.

URL: *http://www.R-project.org/*

*Statistical Association* 95(449): 9.

*Gait & Posture* 25(1): 86–93.

42(14): 2226–2230.

Functional Data Analysis is an important analytical method that can be used for exploratory and hypothesis driven analyses. A primary advantage to FDA is the ability to assess continuous data that change over time without having to reduce the signal into discrete variables. However, there are some important issues for biomechanists to be aware of when implementing this set of statistical tools. Therefore, this chapter provided an overview of the steps associated with FDA, focused on issues related to aligning curves from cyclical data, discussed specific statistical analytical techniques, and finally provided an example using

Functional Data Analysis for Biomechanics 91

Chester, V. L. & Wrigley, A. T. (2008). The identification of age-related differences in kinetic gait

Crane, E. A., Cassidy, R. B., Rothman, E. D. & Gerstner, G. E. (2010). Effect of

Deluzio, K. J. & Astephen, J. L. (2007). Biomechanical features of gait waveform data

Gross, M. M., Crane, E. A. & Fredrickson, B. (In Press). Effort-shape and kinematic

Landry, S. C., McKean, K. A., Hubley-Kozey, C. L., Stanish, W. D. & Deluzio, K. J. (2007). Knee

Lee, M., Roan, M. & Smith, B. (2009). An application of principal component analysis for lower

Page, A., Ayala, G., LeÛn, M. T., Peydro, M. F. & Prat, J. M. (2006). Normalizing temporal

R Development Core Team (2010). *R: A Language and Environment for Statistical Computing*, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.

Ramsay, J. O. (2000). Functional components of variaion in handwriting, *Journal of the American*

fast walking speed, *Journal of Biomechanics* 40(8): 1754–1761.

parameters using principal component analysis, *Clinical Biomechanics* 23(2): 212–220.

registration on cyclical kinematic data, *Journal of Biomechanics* 43(12): 2444–2447.

associated with knee osteoarthritis: An application of principal component analysis,

assessment of bodily expression of emotion during gait, *Human Movement Science* .

biomechanics of moderate oa patients measured during gait at a self-selected and

body kinematics between loaded and unloaded walking, *Journal of Biomechanics*

patterns to analyze sit-to-stand movements by using registration of functional data,

```
dropind=NULL, quadvals=NULL, values=NULL, names="bspl")
myfd <- Data2fd(dataset,argvals=seq(0, 1, len = 101),
    basisobj=mybasis)
plot(myfd)
plot(myfd[1]) # plots the first function
plot(myfd[1:3]) # plots the first three functions
samples <- seq(0,1, length=101)
par(mfrow=c(1,1), pty="m")
plotfit.fd(dataset, samples, myfd)
lambda <- 1e-12
norder <- 6
samples <- seq(0,1, length=101)
nbasis <- length(samples) + norder-2
mybasis <- create.bspline.basis(c(0,1), nbasis, norder, samples)
myfdPar <- fdPar(mybasis, 4, lambda)
myfd <- smooth.basis(samples, dataset, myfdPar)$fd
plot(myfd)
plot(myfd[1]) # plots the first function
plot(myfd[1:3]) # plots the first three functions
par(mfrow=c(1,1), pty="m")
plotfit.fd(dataset, samples, myfd)
FirstDeriv <- deriv.fd(myfd, 1)
plot(FirstDeriv)
DerivMat <- eval.fd(samples, FirstDeriv)
DerivMat[,1]
lambda <- 1
nbasis <- myfd$basis$nbasis
ntrials <- dim(dataset)[2]
y0fd <- mean.fd(myfd)
yfd <- myfd
y0vec <- eval.fd(samples, y0fd)
yvec <- eval.fd(samples, yfd)
coef0 <- matrix(0, nrow=nbasis, ncol=ntrials)
Wfd0 <- fd(coef0, mybasis)
WfdPar <- fdPar(Wfd0, 2, lambda)
reglist <- register.fd(y0fd, yfd, WfdPar, iterlim = 10, dbglev = 1)
names(reglist)
plot(reglist$regfd)
```

```
plot(reglist$warpfd)
origMean <- mean.fd(myfd)
regMean <- mean.fd(reglist$regfd)
plot(origMean)
lines(regMean, col=2)
```
### **6. Conclusions**

14 Will-be-set-by-IN-TECH

mybasis <- create.bspline.basis(c(0,1), nbasis, norder, samples)

dropind=NULL, quadvals=NULL, values=NULL, names="bspl") myfd <- Data2fd(dataset,argvals=seq(0, 1, len = 101),

basisobj=mybasis)

samples <- seq(0,1, length=101) par(mfrow=c(1,1), pty="m")

samples <- seq(0,1, length=101) nbasis <- length(samples) + norder-2

par(mfrow=c(1,1), pty="m")

nbasis <- myfd\$basis\$nbasis ntrials <- dim(dataset)[2] y0fd <- mean.fd(myfd)

Wfd0 <- fd(coef0, mybasis)

y0vec <- eval.fd(samples, y0fd) yvec <- eval.fd(samples, yfd)

WfdPar <- fdPar(Wfd0, 2, lambda)

myfdPar <- fdPar(mybasis, 4, lambda)

plotfit.fd(dataset, samples, myfd)

FirstDeriv <- deriv.fd(myfd, 1)

plot(myfd[1]) # plots the first function

DerivMat <- eval.fd(samples, FirstDeriv)

coef0 <- matrix(0, nrow=nbasis, ncol=ntrials)

reglist <- register.fd(y0fd, yfd, WfdPar, iterlim = 10, dbglev = 1)

plotfit.fd(dataset, samples, myfd)

plot(myfd[1]) # plots the first function

plot(myfd[1:3]) # plots the first three functions

myfd <- smooth.basis(samples, dataset, myfdPar)\$fd

plot(myfd[1:3]) # plots the first three functions

plot(myfd)

lambda <- 1e-12 norder <- 6

plot(myfd)

plot(FirstDeriv)

DerivMat[,1]

lambda <- 1

yfd <- myfd

names(reglist) plot(reglist\$regfd) Functional Data Analysis is an important analytical method that can be used for exploratory and hypothesis driven analyses. A primary advantage to FDA is the ability to assess continuous data that change over time without having to reduce the signal into discrete variables. However, there are some important issues for biomechanists to be aware of when implementing this set of statistical tools. Therefore, this chapter provided an overview of the steps associated with FDA, focused on issues related to aligning curves from cyclical data, discussed specific statistical analytical techniques, and finally provided an example using sample code in R.

### **7. References**


**1. Introduction**

computer biomechanics.

movements, but is not sufficient to answer questions like:

or the interior of automobiles).

**5** 

 *Germany* 

**Biomechanical Computer Models** 

In the past decade computer models have become very popular in the field of biomechanics due to exponentially increasing computer power. Biomechanical computer models can roughly be subdivided into two groups: multi-body models and numerical models. The theoretical aspects of both modelling strategies will be introduced. However, the focus of this chapter lies on demonstrating the power and versatility of computer models in the field of biomechanics by presenting sophisticated finite element models of human body parts. Special attention is paid to explain the setup of individual models using medical scan data. In order to reach the goal of individualising the model a chain of tools including medical imaging, image acquisition and processing, mesh generation, material modelling and finite element simulation –possibly on parallel computer architectures- becomes necessary. The basic concepts of these tools are described and application results are presented. The chapter ends with a short outlook into the future of

The field of biomechanics suffers from one very severe restriction; in general it is not possible for ethical reasons to measure forces and pressure inside the human body. Thus, typical measurement technology in biomechanics works on the interface between body and environment. Force platforms dynamically quantify reaction forces when a person is walking or running across the sensor, electromyography (EMG) monitors action potentials of contracting muscles with electrodes attached to the human skin. The information provided by the measurement technology is very important to investigate the mechanics of

• How can we optimise movements (e.g. in sports or rehabilitation) in order to minimise the loads on the joints? How can we better understand mechanisms of injury and thus

• How do we have to design the equipment to optimally suit the patient's or the athlete's requirements in terms of mechanical behaviour? These questions are related to the

• Ineffective and destructive loading has to be minimised (e.g. by damping through

• Body protection has to be maximised (e.g. through better design of cyclists' helmets

medical or technical equipment. This optimisation process has three aspects:

innovative cushions in the shoe sole or improved orthotic devices).

improve prevention? These questions are related to the human body.

K. Engel1, R. Herpers2 and U. Hartmann3

*2Bonn-Rhine-Sieg University of Applied Sciences* 

*1German Sport University Cologne* 

*3University of Applied Sciences Koblenz* 


URL: *http://CRAN.R-project.org/package=fda*

