Machine learning and bioclimatic mapping and prediction

Project status: In Progress
Project Leader(s): Dr Sue Worner, Lincoln University
Team Member(s): Dr Mike Joy, Dr Takayoshi Ikeda, Gwénaël Leday, Joel Pitt
[Figure: Models such as this can predict the future establishment and spread of invasive species.]

Bioclimatic mapping of pest species identifies risks under current and projected climates using a range of statistical, artificial neural network (ANN), process-based and machine learning models.

Predicting the future establishment and spread of invasive species is an integral part of pest risk analysis. Increasingly, different classes of models are used to integrate the high-dimensional array of climate and biotic information required to gain greater predictive precision.

Machine learning is a subfield of artificial intelligence and refers to computer programs and algorithms that allow computers to learn from data and identify hidden patterns within it. The objective of such programs is most often knowledge discovery by automatically extracting patterns and rules from existing data.

Machine learning programs include well-known artificial neural network algorithms as well as a wide range of emerging statistical modelling and computational techniques. Such models are used for risk assessments and bioclimatic mapping when detailed data on the relationship between a species and its environment are lacking.

This project involves large-scale empirical comparisons of models that predict species' potential establishment from climate and other variables. We are currently using eight supervised classical and machine learning methods, three meta-model (ensemble) methods and nine performance metrics to analyse global invasive species distribution and climate data.
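As a rough illustration of how performance metrics summarise a presence/absence prediction, the sketch below derives several scores from one confusion matrix. The page does not name the nine metrics used in the project, so the four shown here (accuracy, sensitivity, specificity and Cohen's kappa) are assumptions chosen only for illustration, and the observed and predicted labels are made up.

```python
def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for presence (1) / absence (0) labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    return tp, tn, fp, fn


def performance_metrics(observed, predicted):
    """Return illustrative metrics; assumes both classes occur in `observed`."""
    tp, tn, fp, fn = confusion_counts(observed, predicted)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # proportion of presences predicted correctly
    specificity = tn / (tn + fp)   # proportion of absences predicted correctly
    # Agreement expected by chance, used by Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical labels: 1 = species established, 0 = not established.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
scores = performance_metrics(observed, predicted)
```

Reporting several metrics side by side is useful because a model can score well on one (for example accuracy) while performing poorly on another (for example sensitivity) when presences are rare.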

The current models are:
Discriminant analysis (Linear and Quadratic)
Logistic regression
Naive Bayes
Decision Tree
Conditional Tree
K-nearest neighbours
Support vector machines
Artificial neural networks
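
To make one of the listed methods concrete, here is a minimal sketch of a k-nearest neighbours classifier applied to presence/absence data. It is not the project's implementation: the site values, the two climate variables and the choice of k = 3 are hypothetical, and in practice the variables would normally be standardised before distances are computed.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical sites: (mean annual temperature in deg C, annual rainfall in mm / 100).
sites = [(18.0, 9.0), (17.5, 10.0), (19.0, 8.5),
         (8.0, 14.0), (7.5, 15.0), (9.0, 13.0)]
established = [1, 1, 1, 0, 0, 0]  # 1 = species established, 0 = absent

warm_site = knn_predict(sites, established, (18.5, 9.5))
cool_site = knn_predict(sites, established, (8.5, 14.5))
```

Here the warm query site is classed as suitable for establishment and the cool one as unsuitable, because each takes the label of its three climatically closest training sites.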

The ensemble methods are:
Boosting
Bagging
Random forests
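
The idea behind bagging, the simplest of these ensemble methods, can be sketched as follows: fit the same base model to many bootstrap resamples of the training data and let the fitted models vote. The base model here is a one-variable threshold rule (a decision stump), and the temperatures, labels, number of models and seed are all hypothetical choices made for illustration rather than details from the project.

```python
import random
from collections import Counter


def stump_predict(x, threshold, label_above):
    """Predict `label_above` at or above the threshold, the other label below it."""
    return label_above if x >= threshold else 1 - label_above


def fit_stump(xs, ys):
    """Pick the (threshold, label_above) pair with the fewest training errors."""
    best = None
    for threshold in sorted(set(xs)):
        for label_above in (0, 1):
            errors = sum(
                stump_predict(x, threshold, label_above) != y
                for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, label_above)
    return best[1], best[2]


def bagged_predict(xs, ys, query, n_models=25, seed=0):
    """Majority vote of stumps, each fitted to a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    votes = Counter()
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        threshold, label_above = fit_stump([xs[i] for i in idx],
                                           [ys[i] for i in idx])
        votes[stump_predict(query, threshold, label_above)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical mean annual temperatures (deg C) and establishment records.
temps = [6.0, 7.0, 8.0, 9.0, 16.0, 17.0, 18.0, 19.0]
established = [0, 0, 0, 0, 1, 1, 1, 1]

hot_site = bagged_predict(temps, established, 20.0)
cold_site = bagged_predict(temps, established, 5.0)
```

Random forests extend this scheme by also randomising the variables each tree may use, and boosting instead fits models in sequence with more weight on previously misclassified records; neither is shown here.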

All models are tested by bootstrapping, a method that:

1) permits measurement of the overall uncertainty of prediction for bioclimatic mapping
2) allows data quality to be assessed
3) indicates which models work best
4) can identify specific target locations for which there is a high level of uncertainty
5) guides further research by generating hypotheses
6) identifies significant variables and their values suitable for modelling habitat suitability.
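
A minimal sketch of how bootstrap testing can yield the uncertainty measures in points 1 and 4 is given below. Each round refits a model on a resample drawn with replacement, scores it on the records left out of that resample, and records its prediction for a target location. The nearest-centroid classifier, the site data, the number of rounds and the seed are assumptions made for illustration, not the project's actual procedure.

```python
import random
from math import dist
from statistics import mean, stdev


def fit_centroids(points, labels):
    """Return the mean climate vector of each class present in the sample."""
    centroids = {}
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        centroids[label] = tuple(mean(axis) for axis in zip(*members))
    return centroids


def predict(centroids, point):
    """Assign the label of the closest class centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


def bootstrap_evaluate(points, labels, target, n_rounds=200, seed=1):
    """Collect out-of-bag accuracies and target-site predictions over resamples."""
    rng = random.Random(seed)
    n = len(points)
    accuracies, target_votes = [], []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        chosen = set(idx)
        out_of_bag = [i for i in range(n) if i not in chosen]
        if not out_of_bag:  # nothing left to test on in this round
            continue
        model = fit_centroids([points[i] for i in idx], [labels[i] for i in idx])
        hits = sum(predict(model, points[i]) == labels[i] for i in out_of_bag)
        accuracies.append(hits / len(out_of_bag))
        target_votes.append(predict(model, target))
    return accuracies, target_votes


# Hypothetical sites: (mean annual temperature in deg C, annual rainfall in mm / 100).
points = [(18.0, 9.0), (17.5, 10.0), (19.0, 8.5), (16.5, 9.5),
          (8.0, 14.0), (7.5, 15.0), (9.0, 13.0), (8.5, 14.5)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = species established, 0 = absent

accuracies, votes = bootstrap_evaluate(points, labels, target=(18.0, 9.5))
mean_accuracy = mean(accuracies)       # overall predictive performance
accuracy_spread = stdev(accuracies)    # uncertainty of that performance
target_support = sum(votes) / len(votes)  # share of rounds predicting establishment
```

A target location whose support sits near 0.5 is one where the resampled models disagree, which is one way to flag the high-uncertainty locations mentioned in point 4. Running the same loop for each candidate model gives comparable accuracy distributions for point 3.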
