Modelling Gross Primary Productivity Using Sentinel-2 and Topographical Variables with Machine Learning in Golden Gate Highlands National Park

Morena Mapuru, Sifiso Xulu, Katlego Mashiane, Mahlatse Kganyago

Abstract


Gross Primary Productivity (GPP) is a key indicator of ecosystem function and carbon sequestration. Although MODIS-based GPP products are widely used, their coarse spatial resolution limits their ability to capture fine-scale carbon dynamics. This study aims to develop the first high-resolution GPP prediction model for Golden Gate Highlands National Park (GGHNP), using a downscaled MODIS GPP product as the response variable. Sentinel-2 spectral bands, vegetation indices, and topographical variables were used as predictors in two machine learning algorithms: Random Forest (RF) and Support Vector Machine (SVM). The study (a) explores the relationship between GPP and the predictor variables, (b) evaluates RF and SVM performance, and (c) identifies important predictors through variable importance analysis. Correlation analysis showed strong positive relationships between GPP and vegetation indices such as the Chlorophyll Red Edge Index (ClRed Edge), Modified Soil Adjusted Vegetation Index (MSAVI), Normalized Difference Red Edge Index (NDRE), and Green NDVI (GNDVI), whereas the blue and green spectral bands were negatively correlated with GPP. Model evaluation showed that RF performed best when topographical variables were combined with spectral bands and vegetation indices. SVM outperformed RF when using only spectral bands, only vegetation indices, or their combination. Both models performed poorly when using only topographical variables. Key predictors identified through variable importance analysis included the blue and green bands, Enhanced Vegetation Index (EVI), ClRed Edge, MSAVI, and NDRE. These results highlight the value of red-edge and near-infrared data in modelling vegetation productivity. This study demonstrates the potential of combining high-resolution remote sensing data with machine learning to estimate GPP and provides the first localized model for GGHNP. The approach offers a scalable tool for improving ecological monitoring and understanding carbon dynamics in mountainous environments.
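
For readers unfamiliar with the vegetation indices named above, the following is a minimal sketch of how they are commonly computed from surface reflectance. It is not taken from the study: the abstract does not state which formulations or Sentinel-2 bands were used, so the standard published forms are assumed here (MSAVI in its closed-form variant, EVI with its usual coefficients), and the function and argument names are illustrative. A typical Sentinel-2 mapping would be B2 (blue), B3 (green), B4 (red), B5 (red edge), and B8 (near-infrared), which is likewise an assumption.

```python
# Illustrative only: standard index formulas, not the authors' code.
# Inputs are surface reflectances scaled to 0-1. Only arithmetic
# operators are used, so the functions accept plain floats (one pixel)
# or NumPy arrays (a whole raster band) without modification.


def ndre(nir, red_edge):
    """Normalized Difference Red Edge Index."""
    return (nir - red_edge) / (nir + red_edge)


def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return (nir - green) / (nir + green)


def ci_red_edge(nir, red_edge):
    """Chlorophyll Red Edge Index (ClRed Edge in the abstract)."""
    return nir / red_edge - 1.0


def msavi(nir, red):
    """Modified Soil Adjusted Vegetation Index (closed-form variant, assumed)."""
    a = 2.0 * nir + 1.0
    return (a - (a * a - 8.0 * (nir - red)) ** 0.5) / 2.0


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the usual coefficients (assumed)."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


# Hypothetical reflectances for a single vegetated pixel.
pixel = {"blue": 0.05, "green": 0.08, "red": 0.10, "red_edge": 0.20, "nir": 0.40}

indices = {
    "NDRE": ndre(pixel["nir"], pixel["red_edge"]),
    "GNDVI": gndvi(pixel["nir"], pixel["green"]),
    "ClRedEdge": ci_red_edge(pixel["nir"], pixel["red_edge"]),
    "MSAVI": msavi(pixel["nir"], pixel["red"]),
    "EVI": evi(pixel["nir"], pixel["red"], pixel["blue"]),
}

for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

In a workflow like the one described, index layers of this kind would be stacked with the spectral bands and topographical variables to form the predictor set for the RF and SVM models.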

Keywords: Gross Primary Productivity, Remote Sensing, Machine Learning, Sentinel-2, Vegetation Indices

