Plant stress: what is it and how to detect it. Part 2
Physical [empirical] models
Many processes have been modelled biologically, chemically, and physically to determine different indicators of plant stress. Photosynthesis is modelled chemically with the C4 photosynthesis model, stomatal conductance is estimated with the Ball-Berry model (4), and the list goes on. In the Ball-Berry model, g is the stomatal conductance for CO2 diffusion, An is the net leaf CO2 assimilation rate, Ds and cs are the vapor pressure deficit and the CO2 concentration at the leaf surface, respectively, Γ is the CO2 compensation point, g0 is the value of g at the light compensation point, and a1 and D0 are empirical coefficients.
g = g0 + a1·An / ((cs − Γ)(1 + Ds/D0)) (4)
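To make the roles of the coefficients concrete, here is a minimal Python sketch of equation (4). It assumes the Leuning form of the model, in which (cs − Γ) appears in the denominator so that the result has the units of a conductance. The function name, the units, and all numeric values are illustrative assumptions, not measurements from any study.

```python
def stomatal_conductance(a_n, c_s, d_s, gamma, g0, a1, d0):
    """Stomatal conductance g from the Leuning form of the Ball-Berry model.

    Assumed units (illustrative): a_n in umol m-2 s-1, c_s and gamma in
    umol mol-1, d_s and d0 in kPa, g and g0 in mol m-2 s-1.
    """
    if c_s <= gamma:
        raise ValueError("c_s must exceed the CO2 compensation point gamma")
    return g0 + a1 * a_n / ((c_s - gamma) * (1.0 + d_s / d0))


# Hypothetical inputs chosen only to show the call; they are not field data.
g = stomatal_conductance(
    a_n=15.0, c_s=350.0, d_s=1.5, gamma=50.0, g0=0.01, a1=10.0, d0=1.5
)
print(round(g, 4))  # 0.26
```

Note that g0, a1, and D0 still have to be fitted per species and site, which is exactly the practical obstacle with such models.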
The problem with these models is apparent from the outset: there is no way to determine all the empirical values they need in a non-disruptive manner. They also do not scale, and they do not necessarily generalize to large areas.
The large amounts of data gathered from satellite imagery and UAVs inspire research at the junction of agricultural practice and machine learning. In-field measurements, disease records, and the results of chemical and geological tests are usually used as labels, and numerous efforts have been made to model some value of interest.
Works on plant stress modelling and prediction are numerous and have investigated drought, soil salinity, nutrient content, heavy metal content, and other stresses. Hyperspectral imagery is the most common input; sometimes spectral reflectance indices or several stacked bands are used, and regular RGB imagery is the rarest. A range of models, from classical machine learning to deep learning algorithms, has been deployed and has produced notable results given the non-triviality of the task.
A widely used procedure is described in the paper Relationship Between Field Measurement of Soil Moisture in the Effective Depth of Sugarcane Root Zone and Extracted Indices from Spectral Reflectance of Optical/Thermal Bands of Multispectral Satellite Images.
The actual soil moisture content was determined at five points in each of eight fields using the gravimetric method, resulting in forty ground truth data points. As input data, several spectral indices, including the Normalized Difference Vegetation Index (NDVI) and the Temperature Vegetation Dryness Index (TVDI), were constructed from Landsat 8 satellite imagery. Partial least squares regression (PLSR) was used as the model. The study found TVDI to be the most accurate for soil moisture estimation, with a coefficient of determination of 0.63.
An example of a visualized fitted model is presented in figure 2.
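The general shape of this procedure (build a spectral index, fit it to ground truth, report the coefficient of determination) can be sketched in a few lines of Python. This is only an illustration: it uses NDVI and an ordinary least squares line in place of the TVDI and PLSR used in the study, and the reflectance and moisture values are made up.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 for an all-zero pixel."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up per-point reflectances and gravimetric soil moisture (%).
nir = [0.45, 0.50, 0.38, 0.55, 0.42]
red = [0.10, 0.08, 0.15, 0.06, 0.12]
moisture = [22.0, 27.5, 15.0, 31.0, 19.5]

ndvi_values = [ndvi(n, r) for n, r in zip(nir, red)]
slope, intercept = fit_line(ndvi_values, moisture)
predicted = [slope * v + intercept for v in ndvi_values]
r2 = r_squared(moisture, predicted)
print(f"R^2 on the toy data: {r2:.2f}")
```

With real data the fit should be evaluated on held-out points, since an R² computed on the training points overstates how well the index predicts soil moisture.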
Also worth noting is the paper Neural Network Approach to Water-Stressed Crops Detection Using Multispectral WorldView-2 Satellite Imagery, which was among the first to apply a Multi-Layer Perceptron to classify soil under severe water stress, beating previous approaches such as Logistic Regression. In contrast to previous methods, it generalized across different soils and crops.
The researchers in the paper Comparison between artificial neural network and partial least squares for online visible and near-infrared spectroscopy measurement of soil organic carbon, pH, and clay content did extensive work comparing the commonly used partial least squares regression (PLSR) model with an artificial neural network (ANN) for modelling soil organic carbon content, clay content, and soil pH at once. As input, the researchers used spectral data ranging from 371 to 2150 nm. The difficulty with pH is that it depends on the compounds present in the soil and cannot be determined directly from the spectrum. Even so, the research found that the ANN performed better in estimating organic carbon content, clay content, and pH levels.
The issue of water stress assessment is reduced to classification in the paper Modelling Water Stress in a Shiraz Vineyard Using Hyperspectral Imaging and Machine Learning. The researchers measured in-field stem water potential and, since a more negative potential indicates greater stress, classified areas with values above -0.7 MPa (that is, less negative) as non-stressed and areas with more negative values as stressed. A Random Forest (RF) model and an Extreme Gradient Boosting (XGBoost) model were trained on hyperspectral imagery. RF produced slightly better test accuracy (81.7% versus 80.0%). Another interesting result is the wavelength importance shown in figure 3: at several wavelengths, both models found a significant signal.
Even though imagery and sensor data are abundant, labeled data is the limiting factor for developing supervised models. Sadly, there is no easy way to apply unsupervised algorithms to such applications.
One unsupervised approach is applied at the preprocessing stage, when selecting the most informative wavebands. Principal Component Analysis (PCA) is widely used to filter out similar bands that carry little or no new information. In the paper Early Visual Detection of Wheat Stripe Rust Using Visible/Near-Infrared Hyperspectral Imaging, researchers used PCA and the successive projections algorithm (SPA) to reduce a hyperspectral signal to just 8 key wavelengths in the first case and 12 in the second. Three experiments were conducted, training a feed-forward neural network on each subset of selected wavelengths and on the full 256-wavelength spectrum. As the researchers concluded, training a model on the full spectrum did not score better in early plant rust detection.
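A rough sketch of PCA-based band selection is given below, assuming NumPy is available. Ranking bands by the magnitude of their loadings on the leading components, weighted by explained variance, is one common heuristic and not necessarily the procedure used in the paper. The function name and the synthetic spectra are hypothetical.

```python
import numpy as np


def pca_band_ranking(spectra, n_components=2):
    """Rank bands by their weighted loadings on the leading principal components.

    spectra: array of shape (n_samples, n_bands).
    Returns (explained variance ratios, band indices from most to least informative).
    """
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal axes; singular values give their variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2
    explained = variance / variance.sum()
    loadings = np.abs(vt[:n_components])           # (n_components, n_bands)
    weights = explained[:n_components, None]       # weight axes by variance share
    scores = (loadings * weights).sum(axis=0)
    ranking = np.argsort(scores)[::-1]
    return explained[:n_components], ranking


# Synthetic spectra: 20 bands of small noise, with bands 5 and 12 carrying
# a shared signal (a stand-in for informative wavelengths).
rng = np.random.default_rng(0)
n_samples, n_bands = 50, 20
signal = rng.normal(size=n_samples)
spectra = rng.normal(scale=0.01, size=(n_samples, n_bands))
spectra[:, 5] += signal
spectra[:, 12] += 0.5 * signal

explained, ranking = pca_band_ranking(spectra)
print("Top bands:", ranking[:2])
```

A model can then be trained on the top-ranked bands only and compared against one trained on the full spectrum, mirroring the experiment described above.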
Another technique applied to plant stress detection is described in the paper Inferring Grassland Drought Stress with Unsupervised Learning from Airborne Hyperspectral VNIR Imagery. The authors used Simplex Volume Maximisation, a variant of matrix factorization, to identify soils with healthy and stressed water profiles using different water and vegetation indices, including NDVI, the Photochemical Reflectance Index (PRI), the Modified Chlorophyll Absorption and Reflectance Index (MCARI), and many others. Figure 4 presents an example of their results along with the corresponding regular RGB images captured by UAV.
Clustering techniques are used to distinguish between the usual field canopy cover and plants that exhibit anomalies. In Unsupervised classification of saturated areas using a time series of remotely sensed images, researchers used the Automatic Classification of Time Series (ACTS) algorithm to cluster NDWI images within each landcover type into two or three clusters, identifying areas of waterlogging, drought, and land with no anomalies reasonably well. As a preprocessing step, NDVI was used to distinguish seven landcover types before clustering each of them.
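The idea of splitting the pixels of one landcover type into a few clusters by their index value can be illustrated with a plain one-dimensional k-means. This is a deliberately simplified stand-in for ACTS, which works on time series; the NDWI values below are made up, and reading the three clusters as dry, normal, and waterlogged is an assumption.

```python
def kmeans_1d(values, k=3, iterations=50):
    """Cluster scalar values with k-means; returns (labels, centroids).

    Centroids start at evenly spaced quantiles, so label 0 is the lowest
    cluster and label k - 1 the highest.
    """
    ordered = sorted(values)
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: abs(v - centroids[j])) for v in values
        ]
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[j]
            )
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Made-up NDWI values for pixels of a single landcover type.
ndwi_pixels = [-0.42, -0.38, -0.40, 0.02, 0.05, -0.01, 0.35, 0.41, 0.38]
labels, centroids = kmeans_1d(ndwi_pixels, k=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Clustering within each landcover type separately matters because the same index value can be normal for one cover type and anomalous for another.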
Available data and products
As mentioned before, numerous studies on the detection and forecasting of plant stresses have been performed on small scales, that is, on fields of a few square kilometers or less. Unfortunately, almost none of these datasets end up being published and freely available.
Some monitoring station data on surface soil temperature and soil moisture can be acquired from the National Soil Moisture Network or the GEO Department of TU Wien. These are great resources for mapping water and temperature extremes as stressors, shared generously with researchers and the public.
Several other sources provided by NASA are also worth checking out, although they are not strictly on-ground measurements. For example, the ECOSTRESS program provides evapotranspiration data derived from thermal infrared measurements. The problem is not so much that these are derived measures as that the resolution and the revisit period are limited.
ISRIC (World Soil Information) released the SoilGrids250m platform, which combines numerous sources on soil classification, nitrogen content, and soil organic carbon.
There is no shortage of easily available multispectral data. However, high-resolution imagery is harder to obtain. The best freely available imagery is provided by the Sentinel-2 mission, which gives access to a resolution of 10 m per pixel. Still, even this is far from ideal when computing statistics for an individual area. As can be seen in figure 5, even at the best resolution of Sentinel-2 imagery, individual fields are crudely pixelated.
Plant stress detection is of growing interest in precision agriculture applications. There are numerous stressors to which crops can be subjected. Accurate laboratory tests provide valuable information about the status of plants and soils and their resilience to harsh conditions, although such data is scarce and hard to collect in time for early detection.
A great deal of research has been put into investigating methods for stress detection using remote sensing techniques, including fluorescence, hyperspectral, multispectral, and thermal imaging. Specific spectral indices were developed to aid the extraction of the signal needed to estimate the value of concern; these indices combine data from several reflectance wavelengths.
Machine learning techniques are used to investigate whether spectral imagery data can be accurately mapped to the results of laboratory tests on soils and plants. Variants of linear regression are the most common technique, but deep neural networks are starting to catch up. The latter are at least comparable and often perform better.
Despite all the advances made, researchers around the globe still struggle to shorten the time between the detection of anomalies and the actions that address them.
Publicly available data is still limited and hard to collect and process; only recently have platforms that ease this process and aggregate the needed information started to emerge.
Writer and Editor — Valentyna Fihurska