Chaos Theory and Transportation Systems:
An Instructive Example
By
Christopher Frazier, Graduate Student Researcher, The University of Texas at Austin.
6.9 E. Cockrell Jr. Hall, Austin, TX 78712-1076, stanforth@mail.utexas.edu
Kara M. Kockelman, Clare Boothe Luce Assistant Professor of Civil Engineering
The University of Texas at Austin, 6.9 E. Cockrell Jr. Hall, Austin, TX 78712-1076
kkockelm@mail.utexas.edu, Phone: 512-471-0210, FAX: 512-475-8744
(Corresponding Author)
To be Presented at the 83rd Annual Meeting of the Transportation Research Board
January 2004, Washington D.C.
ABSTRACT
Chaos theory is used to analyze highly complex systems and thus may be useful for
transportation applications. In this paper, a series of analyses find and exploit chaos are outlined,
including time delays and embedding dimensions, Fourier power series, the correlation
dimension, the largest Lyapunov exponent, and predictions. As an example, traffic flow data is
analyzed and found to be chaotic, though it is shown that this could be the result of high-
frequency noise. When used with a low-pass filter, predictions based on chaos theory are shown
to have greater predictive power than a nonlinear least-squares method.
INTRODUCTION
Transportation systems are complex entities. Their current state and future evolution
depend greatly on myriad properties of interacting, often highly variable physical and human
elements. Proper representation of all dynamics in a model is complex: while certain
relationships can be developed through analysis, incorporation of immeasurable quantities, such
as laws and social codes, creates further complications. Faced with such challenges, chaotic data
analysis appears a promising option for many transportation applications. A branch of nonlinear
analysis, chaos theory is used to describe non-repeating systems that are too complex for
traditional techniques1. Importantly, chaos theory allows one to distinguish between random,
probabilistic, and deterministic systems.
On both theoretical and practical levels, there are three major benefits of chaos theory for
system analysis:
• It applies to highly nonlinear systems.
• It naturally accounts for all important system dynamics.
• It uncovers system information and relationships without having to uncover the laws or
equations of the underlying dynamics.
While it will not provide one with equations to develop a traditional model, it extracts
valuable information involving all system dynamics for general application. Chaos theory
seems naturally applicable to transportation systems, yet little convincing real-world evidence
exists. To this end, this paper illustrates such an application, in the context of traffic flow data.
This traffic example serves simply as a context for the framework, tools, and details of system
investigation that this paper provides.
LITERATURE REVIEW
A large amount of chaos theory literature has been published in the last decade. The
"discoverer" of chaos, Lorenz (1993) provides helpful, entry-level discussion. Casdagli et al.
(1992) present an effective introduction to chaos theory and analysis techniques. Hilborn (2001)
offers a broad yet detailed examination of chaos, while Argyris et al. (1994) provide more
mathematically complex coverage. Abarbanel (1996) offers an excellent presentation of chaotic
data analysis techniques.
In the context of transportation, much less has been published. Prigogine and Herman
(1971) modeled traffic integrating statistical mechanics with individual choice behavior and
showed it to exhibit a high degree of complexity. Disbro and Frame (1990) demonstrate how the
theoretically derived, Gazis, Herman and Rothery (GHR) traffic model (Gazis, et. al., 1961) is
highly chaotic, even when applied to small (eight-car) systems. Van Zuylen et al. (1999)
discussed the implications of human behavior, chaos and unpredictability for urban and
transportation planning and forecasting. Safanov et al. (2002) showed that chaotic behavior in
traffic can due to the delays in human reaction. Nair et al. (2001) analyzed traffic flow data and
characterized it as chaotic; their work relates most closely to the application pursued here and is
elaborated under the Empirical Analysis section. And Weidlich (2000) demonstrated how
random-utility-based models of relatively simple social behaviors produced chaotic behavior.
This work offers a detailed application of chaos to traffic flow data. And, in order to apply chaos
tools, one must first understand chaos theory.
CHAOS THEORY AND DATA ANALYSIS
Introduction to Chaos
Essentially, chaos is a nonlinear behavior that exists between the realms of periodic and
random. At first glance, some chaotic systems may appear to regular and periodic, whereas
others will appear strictly random; in both cases closer examination topples these assumptions.
Strictly speaking, chaotic systems are deterministic and, the exact system state can be written:
X(t) = (x(t),x(t −τ ),x(t − τ
2 ),...,x(t − (k − τ
)
1 )
(1)
where t is a scalar index for the data series and τ is the interval of observations.2 Let F:
k
k
ℜ → ℜ be the nonlinear function governing the system; then, the future state of the system at
any time t+τ can be ascertained. However, as no real-world system is likely to be completely
deterministic, a (relatively small) probabilistic component, p(t), with mean zero is added to
account for random effects (Lu and Smith, 1991):
x(t+τ) = F(X(t)) + p(t).
(2)
The state of the system, X(t), is critical to knowing the progression of a system, and even
a small change in it will radically alter the manner in which the system evolves. Thus, after a
short interval, the system effectively becomes unpredictable. This effect is known as sensitivity
to initial conditions and is a hallmark of chaotic systems.
Given a set of initial conditions, a dissipative3, nonrandom system eventually will settle
onto a path called an attractor. Depending on the nature of the system, this attractor may be a
point, a closed path, or, in the case of a chaotic system, a more complicated object. The attractor
is the geometric representation of Eq. 1 for some time t large enough to allow dissipation of
transient effects accompanying system initiation. For Eq. 2’s k-dimensional chaotic system, the
attractor will be a non-intersecting, bounded path with infinite length, yet able to be encapsulated
in a finite, k-dimensional volume. This attractor repeats its own geometric patterns infinitely.4
Knowing all this, assuming, or even hypothesizing, that a transportation system is chaotic
invites questions. For example, is human behavior ultimately deterministic? Lorenz (1993)
believes that such an assumption is most likely misguided. Yet the goal is to model human
behavior, not reproduce it. In transportation systems, legal and social constraints may bound
behavior, allowing modeler to more accurately predict human actions and system evolution. In
addition, Eq. 2’s random component allows one to incorporate variations in human behavior,
under the guise of probability.5 Underlying all of this is the basic assumption that transportation
systems are complex, and that the evolution of the systems is guided by non-random elements:
there is order contained within them and chaos theory may potentially be used to model this.
Embedding Dimensions
A key component of chaotic data analysis is Takens’ (1981) embedding theorem. Given
Eq. 1’s dynamical system, a scalar data measurement, s(t) = h(x(t)): ℜk → ℜ , dependent on the
system’s complete dynamics can be reconstructed into a d-dimensional vector series:
y(t) = [s(t), s(t + T ), s(t + 2T ),..., s(t + (d − )
1 T )]
(3)
where integer m > 0 and T = τ
m is called the (time) lag. Takens' theorem says that, if d is large
enough, the vector series y(t) reproduces many of the important dynamical characteristics of the
original series, x(t) (Abarbanel 1996). Thus, one does not need the original vector series in order
to analyze many of the system properties of the data series. A scalar function s(t) describing the
system is all that is necessary.
To apply this theorem effectively, good choices for the lag T and embedding dimension d
are needed. Choice of T should provide low correlation between adjacent elements in the
embedded vector (so that the original data series is not mimicked) without being too long. One
can use the first minimum of the average mutual information function:
Pr(s(t), s(t + T ))
I (τ ) =
∑ Pr(s(t),s(t +T))log
.
(4)
2
s t P s t
T
s (n),s (n+τ )
Pr( ( )) ( ( + ))
which measures the average amount of information (bits) shared by two measurements
(Abarbanel 1996). For a data series, the probabilities come from a histogram of the data points
and data point pairs.
To understand how to select a good embedding dimension, d, it is helpful to understand
what is happening geometrically. As d increases in the reconstruction of a data series, the
chaotic attractor "unfolds". When completely unfolded, a sequential path from one d-
dimensional point to the next will never cross itself. If d is too small, various paths of the
projected attractor will cross.
The method of false nearest-neighbors (for example, see Arbanel (1996)) recognizes that,
where the attractor’s paths appear to cross, two neighboring points actually will be far away in
the sequential order of the true embedded data series. Using this idea, Cao (1997) proposed a
method for determining a good embedding dimension. Let
E(d + )
1
E (
1 d ) =
(5)
E(d )
with
NN
N −dT −1
1
y
t
y
t
d + ( )
( )
1
− d+1
E(d ) =
∑
(6)
N −
NN
dT
t =0
y (t)
y
t
d
−
( )
d
and
y (t)
NN
− y (t) = max s(t + jT ) − s NN (t + jT) ,
(7)
d
d
0≤ j≤d 1
−
where N is the length of the original data series, the subscript d refers to the embedding
dimension, and the superscript NN means the nearest neighbor to the other vector as defined by
the metric of Eq. 7. As d gets large enough, E1(d) will tend to one. The appropriate embedding
dimension is given by the value of d where E1(d) stops changing. Practically, this means
choosing the d where E1(d) begins to "level out", near unity.
Cao (1997) also proposed a related method for determining whether or not the original
data series is random. Using a different metric,
*
E (d + )
1
E
(
2 d ) =
,
(8)
*
E (d )
where
N −dT −1
1
*
E (d ) =
∑ s(t + dT) − NN
s
(t + dT ) ,
(9)
N − dT t=0
a random signal will have an E2(d) that is close to unity for all values of d, whereas a chaotic
signal will have an E2(d) that is less than one for small values of d.
Determining the Presence of Chaos
In analyzing data, an important step is determining the presence of chaotic behavior. As
with all methods outlined in this paper, there is some degree of interpretation involved, and so a
positive result (for chaos) in one of these tests is no guarantee that chaos exists. What these tests
will do is distinguish clearly periodic and/or random data from chaos. An example of the
potentially misleading aspects of these tests is discussed in the example at the end of this paper.
One method to determine the presence of chaos uses the following Fourier power
spectrum6:
1 N
P(ω)
s(t)
∑−
−
=
i
π N t
e
ω
(10)
2
( (2 / ) )2
1
N
t =0
For periodic data, the power spectrum will spike at frequencies that characterize the system, and
lie close to zero for all others. For chaotic data, the spectrum will be broadband and have a
broad peak (Figure 1). Truly random noise should have a constant valued power spectrum; but
in practice, it can be difficult to distinguish very noisy data from truly chaotic behavior using this
method.
8
10
f(x) = sin(4x) + s in(x/2) - c o s (x)
x(k+ 1) = 3.77x(k)(1 - x(k))
6
10
6
10
4
.
)
]
4
.
)
]
q 10
q 10
e
e
(fr
(
fr
[
P
[
P
g
2
g 10
2
o
L 10
Lo
0
10
0
10
-2
10
0
2
4
6
8
10
0
500
1000
1500
2000
Frequenc y
Frequency
Figure 1. Fourier power spectra for periodic (left) and chaotic (right) logistic functions.
A more common technique to determine the presence of chaotic behavior is the largest
Lyapunov exponent, which measures the divergence of nearby trajectories. As the system
evolves, the sum of a series of attractor point values (in each dimension) will converge or
diverge. Lyapunov exponents measure the rate of convergence/divergence in each dimension,
and a chaotic system will exhibit trajectory divergence in at least one dimension. Thus, if the
largest Lyapunov exponent exceeds zero, then the system is chaotic. To determine the largest
Lyapunov exponent, the following function is used:7
1
1 N
s(t + ∆t) − s'(t + ∆t)
λ
ln
,
(11)
max =
∑−
N t
s t
s t
t
( )
'( )
∆ =0
−
where the primed and unprimed data points are distinct but close (i.e., within 10 or so time steps,
[Hilborn 2001]). As ∆t increases, the value of the exponent will, theoretically, converge to its
true value. In practice, however, due to a finite data sets and noise, the exponent can only be
determined within a range of values.
Another indicator is a fractal dimension, which will be non-integer for chaotic systems.
There are many types of fractal dimensions (Abarbanel 1996), but the most useful is the so-
called correlation dimension. The following discussion is based on Hilborn (2001). If one takes
a sphere of radius R centered around a particular data point in embedded space, then the average
number of data points contained in the sphere, not including the one at the center of the sphere, is
given by:
( N −dT − )
1 ( N −dT − )
1
1
C(R) =
∑ ∑Θ(R − y(i)− y( j))
(12)
N (N − )
1
i=0
j =0, j ≠i
where s(i) is the center of the sphere and Θ(x) is the Heavyside indicator function:
if
0
x < 0
Θ(x) =
.
(13)
if
1
x ≥ 0
The correlation dimension assumes that as R approaches zero, the rate at which C(R) changes is
given by the relation:
DC
C(R) = lim kR
(14)
R→0
or, solving for DC:
log C(R)
D = lim
.
(15)
C
R→0
log R
Since the data set will not be continuous, R cannot get too close to zero (there will be no points
in the sphere). To handle this situation, one plots C(R) versus R and selects the apparently linear
portion of the graph. The slope of this portion will approximate DC. If DC is integer, then the
attractor is a “normal” geometric object, like a point (DC = 1) or a surface (DC = 2); if it is non-
integer, then the attractor is “strange” and the system exhibits chaos.
Prediction
As discussed previously, though chaos is fundamentally deterministic, it is unpredictable
beyond short intervals. This section covers how short this interval will theoretically be and how
accurate predictions can be made within that period. The method follows those discussed in
Abarbanel (1997), Casdagli et al. (1997), and McNames (1998).
The (approximate) period limit on accurate predictions of a chaotic system is a function
of the largest Lyapunov exponent (Abarbanel 1997):
1
∆t
=
.
(16)
max
λmax
To be chaotic, the largest Lyapunov exponent must exceed zero. If it exceeds one, the maximum
length of an accurate prediction is less than the data series sampling frequency. Thus, only for
systems with Lyapunov exponents between zero and one are chaotic predictions of any practical
use. If the exponent is much less than one, long, accurate predictions are possible.
For prediction, one starts with the unfolded attractor in d-dimensional space and time lag
T. Given an initial vector y(t0), one selects the k nearest trajectories (not points) on the attractor,
and then the k nearest points to y(t0), one on each trajectory. An average of these trajectories is
used to find the next point on the predicted trajectory, y(t0 + dT).8 The predicted point is then set
as the new starting vector and the process is repeated.
This predictive technique offers the best model possible under the context of chaos
theory. Since truly chaotic systems are infinitely complex and extremely sensitive to changes in
system conditions, it is doubtful that the function F(x(t)) from Eq. 2, or even an approximation of
it, could be extracted from data. This problem is compounded by the abstraction of the data
through the use of embedding dimensions. Thus, searching for F(x(t)) is most likely a futile
task. Fortunately, chaos theory rigorously exploits the fact that information is contained in the
data and can be used for (short term) predictions.
EMPIRICAL ANALYSIS
Thus far, this paper has described key features of chaotic systems, the computations most
helpful in assessing the presence of chaos, and methods for predictive chaos applications. This
section describes an application to traffic counts, and illustrates how chaotic data analysis may
mislead.
Similar to Nair et al.’s application (2001), we apply chaotic data analysis techniques to
traffic flow measurements. The data were downloaded from the Freeway Performance
Measurement Project (PeMS) run by the University of California, Berkeley. The data come from
inductive-loop detectors embedded a section of Interstate 80 near Sacramento, California. The
30-second vehicle counts have been summed across all lanes and aggregated to five-minute
counts (in order to smooth higher-frequency fluctuations). The data was collected over a five-
week interval from noon on April 7, 2003, noon to noon on May 12, 2003. All calculations were
performed on the TSTool add-on program for Matlab.
Figure 2 shows the total time series. Nair et al. (2001) noted that weekend and weekday
traffic flow patterns are of a different flavor and recommended they be analyzed separately. The
different flow patterns in our data seem to suggest this, but to be certain, we apply Cao’s method
(Eq. 8) for determining whether or not a data series is random. Eq. 4 suggests a time delay of 35,
and Figure 3 plots the results of Cao’s method. E2(d) remains around one for all values of d,
suggesting that this is random data. However, from the plot of the data we see that this data
exhibits substantial periodicity (on the order of both a day and a week), so we follow Nair et al.’s
(2001) conclusion and focus only on a workweek, April 14 through 19.
4 0 0
3 5 0
3 0 0
2 5 0
w
o
Fl 200
1 5 0
1 0 0
5 0
0 0
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 0 0 0
7 0 0 0
8 0 0 0
9 0 0 0
1 0 0 0 0
T im e (x 5 m in. )
Figure 2. Time series of 5-minute traffic counts on Interstate 80 (noon on 4/7/2003 to noon on
5/12/2003).
1
0.8
) 0.6
(
d
E2
0.4
0.2
0
2
4
6
8
10
12
14
16
18
20
Dimension (d)
Figure 3. Plot of E2(d) for five-week traffic flow data
These five days of count data are plotted in Figure 4. For each day, there is a tight
morning peak and a broader evening peak, corresponding to the morning and evening commutes.
A strong periodic structure remains, and this will be important for the analysis.
Using the five days of weekday count data, one can first check for chaotic structure.
Figures 5, 6, and 7 show the Fourier power spectrum, largest Lyapunov exponent, and
correlation dimension, respectively. The broadband power spectrum and the positive Lyapunov
exponent are characteristic of chaotic data. The linear part of the plot in Figure 7 has a slope of
approximately 5.73, indicating the presence of chaos.
4 0 0
3 5 0
3 0 0
2 5 0
ow
Fl
2 0 0
1 5 0
1 0 0
5 0
0
2 0 0
4 0 0
6 0 0
8 0 0
1 0 0 0
1 2 0 0
1 4 0 0
T i m e ( x 5 m i n )
Figure 4. Time series of 5-minute traffic counts on Interstate 80 (midnight on 4/14/2003 through
11:55 pm on 4/18/2003)
4
10
2
10
.
)]
q
(fre
[
P
0
g 10
o
L
-2
10
0
0.1
0.2
0.3
0.4
0.5
Frequency
Figure 5. Fourier power spectrum for weekday traffic flow data.
7
6
5
4
LLE 3
2
1
0
200
400
600
800
1000
Separation Time
Figure 6. Largest Lyapunov exponent for weekday traffic flow data
(Common values lie between 5 and 6.5, indicating chaos.)
-6
-8
-10
-12
(
r
)
C
ln -14
-16
-18
-204
4.5
5
5.5
6
ln r
Figure 7. Correlation dimension for weekday traffic flow data
(Linear portion, approximated by dotted line, has slope of about 5.73, indicating chaos.)
4.5
4.4
4.3
4.2
Bit 4.1
4
3.9
3.8
0
10
20
30
40
50
60
Time Delay
Figure 8. Mutual information function for weekday traffic flow data.
To facilitate computation of the embedding dimension parameters, Figure 8 shows the
mutual information function. Though the first minimum occurs at a time delay of 3 (or 15
minutes), 12 (or 1 hour) is selected here, as it is more indicative of what the “true” first minimum
of this function would be, if it was smooth.
Cao’s methods (Eqs. 5-9) for determining d and for determining whether data are random
are plotted in Figure 9. The plot of E1(d) seems to level out at around 8, so this is selected this
as the data set’s embedding dimension. E2(d) is less than one for small d, suggesting that the
series is not random.
Finally, the last two days of the week are predicted based on data from the first three
days. Based on Eq. 16, the Lyapunov exponent suggests that the maximum window for accurate
predictions is about 55 seconds, which is less than the sampling period of 5 minutes. Despite
this, the predictions, shown in Figure 10, follow the actual data fairly accurately. Figure 11
provides a close view of the first few predictions, and the predictions appear much less accurate
on the scale of the small oscillations.
1
0.9
0.8
0.8
0.7
0.6
0.6
)
)
(
d
(
d
0.5
E1
E2 0.4
0.4
0.3
0.2
0.2
0.1
0
2
4
6
8
10
12
14
16
18
20
2
4
6
8
10
12
14
16
18
20
Dimension (d)
Dimension (d)
Figure 9. Plots of E1(d) (left) and E2(d) (right) for weekday traffic counts.
4 0 0
A c t u a l F l o w
P r e d i c t e d F l o w
3 5 0
3 0 0
2 5 0
w
o
Fl
2 0 0
1 5 0
1 0 0
5 0
9 0 0
9 5 0
1 0 0 0
1 0 5 0
1 1 0 0
1 1 5 0
1 2 0 0
1 2 5 0
1 3 0 0
1 3 5 0
1 4 0 0
T i m e ( x 5 m i n )
Figure 10. Comparison of actual and predicted Thursday and Friday traffic counts.
1 2 0
A c t u a l F l o w
1 1 0
P r e d i c t e d F l o w
1 0 0
9 0
8 0
7 0
ow
Fl
6 0
5 0
4 0
3 0
2 0
8 7 0
8 8 0
8 9 0
9 0 0
9 1 0
9 2 0
9 3 0
9 4 0
T i m e ( x 5 m i n )
Figure 11. Close-up of actual and predicted counts.
Add New Comment