Standard Presentation Australian Marine Sciences Association 2026 Conference

Better models from imperfect data: a workflow for spatial and habitat bias correction in broad-scale marine habitat mapping that uses combined disparate datasets (140275)

Ben Radford 1, Jac Monk 2, Brooke Gibbons 3, Sharyn Hickey 4, Tim Langlois 3
  1. Australian Institute of Marine Science, Crawley, WA, Australia
  2. School of Life and Environmental Science, Deakin University, Warrnambool, Victoria, Australia
  3. School of Biological Sciences, The University of Western Australia, Crawley, Western Australia, Australia
  4. School of Agriculture Geography and Environmental Sciences, University of Western Australia, Crawley, WA, Australia

Australia's coastline spans over 44 degrees of latitude, encompassing habitats from tropical coral reefs and mangroves to temperate seagrass meadows and kelp forests. To assess habitat extent and condition at national scales, and to predict impacts from coastal development, resource extraction, and climate-driven range shifts, there are increasing efforts to combine existing survey data from disparate projects into national datasets for broad-scale habitat modelling.

However, the underlying projects typically differ in objectives, spatial extents, and durations, and may employ habitat-stratified, opportunistic, or ad hoc sampling designs. When combined, these datasets often produce highly unbalanced and biased representations of habitat distribution, and these biases propagate into erroneous model outputs.

To identify and correct for these biases, we present a five-stage analytical workflow: (1) diagnosing spatial bias and autocorrelation in combined datasets using variogram analysis and Getis-Ord Gi* statistics; (2) identifying ecologically relevant multiscale environmental covariates from remote sensing products; (3) generating pseudo-habitat strata from these covariates using k-means clustering; (4) applying Generalized Random Tessellation Stratified (GRTS) sampling within pseudo-habitat strata to produce spatially and environmentally balanced modelling inputs; and (5) evaluating model outputs for spatial accuracy using spatial kernelling and geographically weighted regression.
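As a rough illustration of stages (3) and (4), the sketch below clusters synthetic covariates into pseudo-habitat strata with a plain k-means and then draws an equal number of samples from each stratum. It is not the authors' implementation: GRTS is replaced here by simple random sampling within strata (which balances across strata but does not enforce spatial spread within them), and the covariates (depth, rugosity), sample sizes, and function names are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            for p in points
        ]
        # Recompute each centre as the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre
        if new_centres == centres:  # assignments have stopped changing
            break
        centres = new_centres
    return labels


def balanced_subsample(labels, n_per_stratum, seed=0):
    """Draw up to n_per_stratum record indices at random from each stratum.

    Stand-in for GRTS: balances across strata only, with no spatial
    balancing inside a stratum.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for i, lab in enumerate(labels):
        by_stratum[lab].append(i)
    chosen = []
    for lab in sorted(by_stratum):
        idx = by_stratum[lab]
        chosen.extend(rng.sample(idx, min(n_per_stratum, len(idx))))
    return sorted(chosen)


# Synthetic, deliberately unbalanced "combined dataset":
# 200 shallow records and only 20 deep records, as (depth_m, rugosity).
rng = random.Random(42)
shallow = [(rng.gauss(10, 1), rng.gauss(0.2, 0.05)) for _ in range(200)]
deep = [(rng.gauss(60, 1), rng.gauss(0.8, 0.05)) for _ in range(20)]
covariates = shallow + deep

labels = kmeans(covariates, k=2)
keep = balanced_subsample(labels, n_per_stratum=20)

print("records per stratum before:", dict(Counter(labels)))
print("records per stratum after: ", dict(Counter(labels[i] for i in keep)))
```

In practice the covariates would usually be standardised before clustering so that no single variable dominates the distance, and the within-stratum draw would be replaced by a GRTS implementation operating on site coordinates.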

We demonstrate this workflow using benthic habitat data derived from BRUVS imagery collated through the GlobalArchive national database. Through case studies, we show how the workflow identifies bias in combined datasets, systematically reduces it to produce more robust and generalisable habitat models, and quantifies spatial uncertainty, enabling managers and practitioners to understand where model predictions are reliable, where data gaps exist, and where targeted additional sampling would most improve predictions.