**Forecasting of Medium-term Rainfall Using Artificial Neural Networks: Case Studies from Eastern Australia**

DOI: 10.5772/intechopen.72619

John Abbot and Jennifer Marohasy

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/intechopen.72619

#### **Abstract**

The advent of machine learning, of which artificial neural networks (ANNs) are a component, has provided an opportunity for improved rainfall forecasts, which are of value for water infrastructure management, agriculture, mining and other industries. In this chapter, ANNs are shown to provide more skillful monthly rainfall forecasts for locations in southeastern Queensland, Australia, for lead-times of 3–12 months. The skill of the forecasts from the ANNs is highest when the models are individually optimized for each month, and when longer-duration series are used as input. The ANN technique is applicable where temperature and rainfall data extend back at least 50 years. Such datasets exist for much of Europe and North America, though a review of the available literature indicates that most research into the application of ANNs has focused on China, India and Australia.

**Keywords:** rainfall, forecast, monthly, neural network, Australia

## **1. Introduction**

Until relatively recently, simple statistical models were used by meteorological agencies around the world to forecast seasonal and monthly rainfall. Typically, these models use relationships between large-scale climate indices, such as the Southern Oscillation Index, and rainfall at some future time, generally with a small number of input variables, perhaps only one or two. For example, until May 2013 the Australian Bureau of Meteorology (BOM) generated seasonal rainfall forecasts from a relatively simple statistical scheme that used an El Niño Southern Oscillation (ENSO) index as its primary predictor [1, 2].

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

These traditional statistical models are limited in the number of input variables that can be effectively combined, whereas advances in machine learning have now significantly expanded this potential [3]. Machine learning has close relationships with artificial intelligence, pattern recognition and data mining. Data mining focuses on the discovery of previously unknown properties embedded in data [4], whereas machine learning focuses on prediction based on known properties learned from exposure to data sets during a process known as "training". A principal objective of the learning process is to construct a model that can generalize from experience [5]. Subsequently, the performance of the trained model can be tested using data not utilized in the training set. Performance of the model in testing gives confidence, but not certainty, that the model would provide reliable forecasts if deployed operationally.
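The idea of testing on data withheld from training can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rainfall series is synthetic, the helper names (`chronological_split`, `mean_absolute_error`) are hypothetical, and the "model" is a trivial long-term-mean baseline rather than an ANN.

```python
import numpy as np

def chronological_split(series, test_fraction=0.2):
    """Split a time-ordered series into training and test parts without shuffling."""
    n_test = max(1, int(round(len(series) * test_fraction)))
    return series[:-n_test], series[-n_test:]

def mean_absolute_error(observed, forecast):
    """Average absolute difference between observed and forecast values."""
    observed = np.asarray(observed, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(observed - forecast)))

# Synthetic placeholder for 20 years of monthly rainfall totals (mm); not real data.
rng = np.random.default_rng(0)
rainfall = rng.gamma(shape=2.0, scale=40.0, size=240)

# The most recent 25% of months are withheld and never seen during "training".
train, test = chronological_split(rainfall, test_fraction=0.25)

# Stand-in for a trained model: always forecast the training-period mean.
baseline_forecast = np.full(test.shape, train.mean())
test_mae = mean_absolute_error(test, baseline_forecast)
print(f"{len(train)} training months, {len(test)} test months, test MAE = {test_mae:.1f} mm")
```

Splitting chronologically, rather than at random, keeps the test period strictly later than the training period, which is closer to how a forecast would be used operationally.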

## **2. Data, method, and a first study**

Monthly data were obtained from the Bureau's Climate Data Online. Data were downloaded for all 62 sites in south-eastern Queensland that are considered in this chapter. The sites were chosen on the basis of their geographic spread and the quality of the data: that is, the desirability of long data series with few missing values.

South-eastern Queensland is defined very broadly in this chapter, and does not correspond with the administrative region of South East Queensland.

This chapter is a review of various studies undertaken since 2012 focused on this general area, with specific information on data and methodology in the published technical papers that are referenced [11–21]. However, in this first section, the method used in an early study [17] is provided in more detail, by way of background into how an ANN can be practically deployed to generate a rainfall forecast.

Many neural network applications incorporate multilayer perceptrons (MLPs) as fundamental processing elements (PEs) trained with a standard backpropagation algorithm. These neural networks can perform well in solving static problems but are limited in solving temporal problems, ones where the previous value of the input affects the current output. Recurrent networks, such as Jordan networks, extend the basic MLP architecture by also including context units, PEs that remember past activity. In the Jordan network, the output of the network is copied to the context unit.

In addition, the context units are locally recurrent, that is, they feed back onto themselves. The local recurrence decreases the values by a multiplicative time constant (τ) as they are fed back. This constant determines the memory depth, that is, how long a given value fed to the context unit will be "remembered". The context unit acts as a simple lowpass filter, creating an output (*y*(*n*)), calculated as a weighted average value of some of its more recent past inputs. In the case of the Jordan context unit, the output is obtained by summing the past values multiplied by the scalar $\tau^{n}$, where:

$$y(n) = \sum x(n)\,\tau^{n} \tag{1}$$

Genetic optimization provides an efficient way of selecting those inputs that are significant in determining target rainfall and eliminating those inputs with very low information content. Essentially, genetic optimization enables elimination of inputs that carry mainly noise rather than useful signal, so that the number of inputs considered in the optimized model might typically be reduced from over 40 to fewer than 10.

In the early study [17], forecasts were made for Lowood and two other sites within the Brisbane catchments (upstream of the Wivenhoe dam) using the Jordan network with optimization. The initial forecasts were made using only lagged input parameters, that is, any variable measured in the past. The initial focus was on:

**i.** Benchmarking output from the ANN against POAMA (the operational model used by the Bureau);

**ii.** Evaluating the impact of 1-, 2- and 3-month lags, and accordingly the extent to which a rainfall forecast potentially loses skill moving from forecasting 1, 2 and 3 months in advance;
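The memory behaviour of the locally recurrent context unit described in this section can be illustrated with a short Python sketch. It assumes one common reading of Eq. (1), namely the recursion y(n) = x(n) + τ·y(n−1), which unrolls to a sum of past inputs weighted by successive powers of τ. The function names are hypothetical, and this is not the implementation used in the study [17].

```python
import numpy as np

def context_unit(inputs, tau):
    """Leaky context unit (assumed form): y(n) = x(n) + tau * y(n-1), with y(-1) = 0."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("tau must lie in [0, 1) for the memory to decay")
    outputs = np.empty(len(inputs), dtype=float)
    state = 0.0
    for n, x in enumerate(inputs):
        state = x + tau * state  # older inputs are scaled by tau once more at each step
        outputs[n] = state
    return outputs

def context_unit_unrolled(inputs, tau):
    """The same quantity as an explicit sum: y(n) = sum over k of tau**k * x(n-k)."""
    x = np.asarray(inputs, dtype=float)
    return np.array([
        sum(tau ** k * x[n - k] for k in range(n + 1))
        for n in range(len(x))
    ])

# Feed a single unit pulse and watch how long it is "remembered".
impulse = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
response_short = context_unit(impulse, tau=0.2)  # shallow memory: decays quickly
response_long = context_unit(impulse, tau=0.8)   # deeper memory: decays slowly
print(response_short)
print(response_long)
```

With a larger τ the pulse persists over more time steps, which is the sense in which τ sets the memory depth of the unit.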

Artificial neural networks (ANNs), a form of machine learning, provide several important advantages over simple statistical models. ANNs can accommodate non-linear relationships and test multiple inputs, which is particularly important when the influence of climate indices may vary geographically and temporally in poorly understood ways [6].

Rather than progressing from simple statistical models to artificial neural networks, however, meteorological agencies around the world now tend to rely almost exclusively on general circulation modeling for rainfall forecasting. These are physical models that attempt to simulate real-world oceanic and atmospheric circulation patterns. For example, the Australian Bureau of Meteorology uses the Predictive Ocean Atmosphere Model for Australia (POAMA) as its operational model for forecasting daily rainfall, and also as the basis of its monthly and seasonal forecasts.

The skill of the forecasts from ANNs can be compared with POAMA (and other general circulation models) through a comparison of root mean square errors (RMSE), mean absolute errors (MAE) and correlation coefficients; such comparisons are a focus of this chapter. RMSE is commonly applied to compare skill between different rainfall forecasting models, as it gives a simple, transparent, quantitative measure of the difference between model output and target, and is easily understood across disciplines [7]. RMSE is more sensitive than MAE to the occasional large error (the squaring process gives proportionately higher weight to large errors), and is therefore arguably more useful when skill at forecasting floods is particularly relevant.
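The difference in sensitivity between RMSE and MAE can be seen in a small Python sketch. The rainfall values below are invented for illustration only (they are not observations from any site in this chapter), and the function names are hypothetical. The two forecasts have the same MAE, but one concentrates all of its error in a single very wet month.

```python
import numpy as np

def rmse(observed, forecast):
    """Root mean square error: squaring gives large errors proportionately more weight."""
    err = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(observed, forecast):
    """Mean absolute error: every millimetre of error counts equally."""
    err = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(err)))

def pearson_r(observed, forecast):
    """Pearson correlation coefficient between observed and forecast series."""
    return float(np.corrcoef(observed, forecast)[0, 1])

# Invented monthly totals (mm); the fourth month stands in for a flood month.
observed = np.array([50.0, 80.0, 20.0, 400.0, 60.0, 30.0])

# Forecast A is wrong by 20 mm every month.
forecast_a = observed + np.array([20.0, -20.0, 20.0, -20.0, 20.0, -20.0])
# Forecast B is exact except for a 120 mm miss in the wet month.
forecast_b = observed + np.array([0.0, 0.0, 0.0, -120.0, 0.0, 0.0])

for name, fc in (("A", forecast_a), ("B", forecast_b)):
    print(name, "MAE:", mae(observed, fc), "RMSE:", round(rmse(observed, fc), 2),
          "r:", round(pearson_r(observed, fc), 3))
```

Both forecasts have an MAE of 20 mm, but the RMSE of forecast B is more than double that of forecast A, because its error is concentrated in one extreme month.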

Given the importance of skillful monthly rainfall forecasts for most inhabited regions, for activities as diverse as crop harvesting, mine scheduling, and dam management, it is surprising that there are so few comparative studies and so little discussion about how advances in machine learning might aid medium-term rainfall forecasting – including the forecasting of flood events.

Queensland is a state in the north-east of Australia, facing the South Pacific. Brisbane is the fast-growing capital of Queensland, and is located in the south-east. Brisbane has a long history of flooding and the Wivenhoe dam was built specifically for flood mitigation following a flood in 1974.

The flooding of Brisbane in January 2011 has been termed a "dam release flood" [8], with the sudden release of water from the Wivenhoe storage a principal cause of flooding. The extent of the rainfall in 2010/2011 was *not* unprecedented relative to historical records that extend back to 1864, but the heavy rainfall was not forecast [9]. Because the heavy rain in December 2010 and January 2011 followed a long period of drought, and was not forecast, the Wivenhoe dam was not properly managed for flood mitigation, the purpose for which it was originally built. Because of the extent of the economic losses, the flood event has led to a class action lawsuit against the dam operator Seqwater and the State of Queensland, with the trial expected to commence in late 2017 [10].
