Deep Learning for Plant Stress Phenotyping: Trends and Future Perspectives

Biology, Plant science, Plant biology, Stress (linguistics), Computational biology, Plant, Linguistics, Philosophy

Authors

Asheesh K. Singh,Baskar Ganapathysubramanian,Soumik Sarkar,Arti Singh

Source

Journal: Trends in Plant Science [Elsevier BV]
Date: 2018-08-10 Volume (issue): 23 (10): 883-898 Citations: 511

Links

cell.com iastate.edu nih.gov doi.org

Identifiers

DOI: 10.1016/j.tplants.2018.07.004

Abstract

Review of DL techniques applied to plant stress (biotic and abiotic) phenotyping to drive transformational changes in agricultural sciences. Comparative assessment of DL strategies across a wide range of plant species for plant stress identification, classification, quantification, and prediction (ICQP), specifically focusing on digital image-based phenotyping. Best practices, future avenues, and potential applications of DL techniques in plant sciences with a focus on plant stress phenotyping, including deployment of DL tools, image data fusion at multiple scales to enable accurate and reliable plant stress ICQP, and use of novel strategies to circumvent the need for accurately labeled data for training the DL tools.

Deep learning (DL), a subset of machine learning approaches, has emerged as a versatile tool to assimilate large amounts of heterogeneous data and provide reliable predictions of complex and uncertain phenomena. These tools are increasingly being used by the plant science community to make sense of the large datasets now regularly collected via high-throughput phenotyping and genotyping. We review recent work where DL principles have been utilized for digital image-based plant stress phenotyping. We provide a comparative assessment of DL tools against other existing techniques, with respect to decision accuracy, data size requirement, and applicability in various scenarios. Finally, we outline several avenues of research leveraging current and future DL tools in plant science.

Recently, we reported on the potential and possibilities of utilizing machine learning (ML) for high-throughput stress phenotyping in plants [1: Singh A. et al. Machine learning for high-throughput stress phenotyping in plants. Trends Plant Sci. 2016; 21: 110-124]. With the rapidly increasing sophistication, capability, and miniaturization of imaging sensors, the plant science community is facing a data deluge of plant images under various environments and under various stresses (biotic and abiotic). This ability to perform high-throughput phenotyping has resulted in increasing interest in automated approaches to extract features (i.e., symptoms and organs) of physiological interest from these large datasets with the intent of identifying and quantifying plant stresses. We complement our earlier review by focusing specifically on a very promising and rapidly advancing subset of ML tools in this work: deep learning (DL). This is especially important and topical due to the remarkable advances in DL tools that have transformed several disciplines, including consumer analytics, autonomous vehicles, automated medical diagnostics, and automated financial management. This is also an area that is quickly becoming the workhorse strategy for most ML applications (Figure 1 compares ML papers with DL papers over a 10-year period from 2008 to 2018).
Our goal in this review is to provide a comprehensive overview, infer trends, and identify outstanding problems that the plant science community could pursue as we integrate DL concepts into our domain. We limit our review to DL applications that primarily utilize image data. This is motivated by the fact that digital imaging is relatively cheap; can be deployed in a scalable manner; can be easily integrated with manual, ground, and aerial platforms; and requires potentially the least technical expertise to deploy with off-the-shelf components for high-throughput plant phenotyping. This has resulted in a veritable explosion of research activities using digital imaging for plant applications.

ML (and hence DL) concepts can be deployed on four broad categories of problems in plant stress phenotyping [1]. These categories form part of the so-called 'ICQP' paradigm, with the acronym representing the four categories: (i) identification, (ii) classification, (iii) quantification, and (iv) prediction. These four categories naturally fall into a continuum of feature extraction where increasingly more information is inferred from a given image. Identification refers to detection of specific stress, that is, simply identifying which stress is being exhibited, for example, sudden death syndrome in soybean or rust in wheat. Classification is the next step, where ML is used to classify the image on the basis of stress symptoms and signatures. Here, the goal is to place the visual data (leaf, plant, canopy, or row) into a distinct stress class (e.g., low-, medium-, or high-stress categories). Quantification involves a more quantitative characterization of stress, such as incidence and severity.
Disease incidence is defined as the rate of new cases of the disease, which is reported as the number of cases occurring within a period of time or at any time instant (generally at the time of maximum disease expression). In plant pathology, a common way to describe disease incidence is the percentage of diseased leaves on a single plant or the number of diseased plants out of the total number of plants in a field or plot [2: Bock C. et al. Plant disease severity estimated visually, by digital photography and image analysis, and by hyperspectral imaging. Crit. Rev. Plant Sci. 2010; 29: 59-107]. Disease severity is a more detailed quantification measure and is reported as the area of plant tissue affected by the disease (commonly presented as a percentage) on a leaf or on the entire plant canopy [2]. The last category is prediction of plant stress ahead of time, before visible stress symptoms appear. This has substantial implications for the timely and cost-effective control of stress and is one of the key drivers of precision and prescriptive agriculture.

In plant stress phenotyping, a useful plant stress assessment strategy must satisfy the following requirements. (i) Reliability: the degree to which measurements of the same diseased individuals obtained under different conditions produce similar results [3: Madden L.V. et al. The Study of Plant Disease Epidemics. American Phytopathological Society, 2007; 4: Nutter F.W. et al. Assessing the accuracy, intra-rater repeatability, and inter-rater reliability of disease assessment systems. Phytopathology. 1993; 83: 806-812]. Reliability relates the magnitude of the measurement error in observed measurements to the inherent variability in the 'true' sample.
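As a brief aside on the quantification measures defined above, incidence and severity both reduce to simple percentages. The sketch below is a minimal illustration, not a method from the review: the function names, the toy counts, and the mask encoding (1 = diseased, 0 = healthy, None = background) are all assumptions made for the example.

```python
def disease_incidence(num_diseased, num_total):
    """Percentage of diseased units (leaves on a plant, or plants in a plot)."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    return 100.0 * num_diseased / num_total


def disease_severity(lesion_mask):
    """Percentage of plant-tissue pixels that are flagged as diseased.

    lesion_mask is a 2D list (hypothetical encoding):
    1 = diseased tissue, 0 = healthy tissue, None = background.
    """
    tissue = [p for row in lesion_mask for p in row if p is not None]
    if not tissue:
        raise ValueError("mask contains no plant tissue")
    return 100.0 * sum(tissue) / len(tissue)


# Toy plot: 12 diseased plants out of 48.
plot_incidence = disease_incidence(num_diseased=12, num_total=48)

# Toy leaf mask: 12 tissue pixels, 3 of them diseased.
leaf_mask = [
    [None, 0, 0, None],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [None, 0, 0, None],
]
leaf_severity = disease_severity(leaf_mask)
print(plot_incidence, leaf_severity)
```

In an image-based workflow, the mask would typically come from a segmentation step; here it is written by hand purely to show the arithmetic.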
The two types of reliability in plant disease assessment are intra- and interrater reliability. Intrarater reliability is the agreement between assessments (trait measurements) made by the same rater at different times, while interrater reliability is the agreement between assessments of the same plot or sample made by different raters. (ii) Accuracy: the closeness of an estimate to the actual value. This is heavily influenced by the rater's experience level.

It is now well understood that these requirements are met with automated phenotyping workflows that enhance both reliability and accuracy by reducing bias (e.g., use of images rather than rater subjectivity), by removing rater fatigue (use of automated phenotyping systems), and by correct feature extraction (currently done using ML tools). This has spurred the development of various tools like the iPad app 'Estimate' [5: Pethybridge S.J., Nelson S.C. Estimate, a new iPad application for assessment of plant disease severity using photographic standard area diagrams. Plant Dis. 2017; 102: 276-281], a smartphone app for sugar beet disease detection [6: Hallau L. et al. Automated identification of sugar beet diseases using smartphones. Plant Pathol. 2018; 67: 399-410], and the smartphone app 'Leaf Doctor' [7: Pethybridge S.J., Nelson S.C. Leaf Doctor: a new portable application for quantifying plant disease severity. Plant Dis. 2015; 99: 1310-1316]. These tools seek to improve disease rating data quality by decreasing human error and, to some extent, inter- and intrarater variation. While conventional image processing has proved useful, the wide variability in quality and complexity (i.e., occlusion, debris, changes in illumination intensity and shading, and loss of function) of the images makes it challenging to apply standard image processing strategies consistently within the ICQP paradigm.
This is where ML (and especially DL) tools enable the creation of reliable workflows for feature identification. ML enables algorithmic learning from experience. According to Arthur Lee Samuel, ML is defined as the 'field of study that gives computers the ability to learn without being explicitly programmed'. Here, we distinguish between ML techniques that use 'handcrafted features' (i.e., a priori user-identified features) for ICQP [1] and the more recent and promising DL techniques that do not require any hand-crafting of features, as they are able to automatically 'learn' the features or representations from the image data (Figure 2).

In ML, feature hand-crafting refers to the exercise of choosing appropriate parts (e.g., one color channel from an RGB image) or transformations [e.g., scale invariant feature transform (SIFT)] that are applied to the raw datasets before training to enhance the performance of an ML model. While ML experts have developed different procedures of feature extraction depending on data and model types, they are still predominantly heuristic. Therefore, the feature extraction process in traditional ML involves time-consuming trial-and-error steps, and success may depend on the level of experience of the data scientist. In this regard, one of the fundamental advantages of DL is that it involves an automatic hierarchical feature extraction process via learning a large bank of nonlinear filters prior to performing decision-making, such as classification. Hence, DL models typically work quite well with raw data and do not require the trial-and-error-based hand-crafted feature extraction process.
Note that this applies to all ML problems regardless of whether they are supervised (i.e., learning from data with target labels) or unsupervised (i.e., learning from data without target labels). With traditional ML, both supervised and unsupervised learning approaches typically require feature extraction for better performance. For example, for large dimensional data, both support vector machine (SVM, a supervised technique) and K-means clustering (an unsupervised technique) may benefit from a principal component analysis (PCA)-based feature extraction (i.e., performing PCA and selecting only a few top principal components for learning). On the other hand, a deep convolutional neural network (CNN, a supervised model) or a deep autoencoder (an unsupervised model) may not need any such feature extraction and can leverage the raw data directly.

Our focus in this review is on the applicability and promise of DL tools, as they have shown exceptional success in recent years on complicated phenotyping problems with good predictive ability. We hope that the plant science community can leverage the rapid advances and application of DL tools (from successful non-agricultural applications) to enable transformative advances in agriculture.

We next introduce the basic idea of DL. DL is a class of ML techniques that utilizes a stack of multiple processing layers, where each succeeding layer uses the output from the previous layer as input to learn representations of data with multiple levels of abstraction (Figure 3). Typically, DL models are built using multilayer neural networks where two subsequent layers of features are connected by neurons that essentially represent various parametrized nonlinear transformations. Examples of model parameters include weights and biases of the neurons that get multiplied and added to the input, respectively. In the forward direction, input data gets transformed in a layer-by-layer fashion until the target layer or the decision layer.
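The layer-by-layer forward transformation described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions: the layer sizes, the ReLU and softmax nonlinearities, the random initialization range, and the four-element input standing in for image data are all hypothetical choices, not details from the review.

```python
import math
import random


def relu(values):
    """Elementwise nonlinearity used in the hidden layers (assumed choice)."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn decision-layer scores into class probabilities."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Multiply inputs by weights and add biases (one weight row per neuron)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Randomly chosen initial parameters for one layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, layers):
    """Transform the input layer by layer; the last layer is the decision layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


rng = random.Random(0)
layers = [init_layer(4, 8, rng), init_layer(8, 8, rng), init_layer(8, 3, rng)]
pixel_features = [0.2, 0.7, 0.1, 0.9]  # toy stand-in for raw image data
class_probabilities = forward(pixel_features, layers)
print(class_probabilities)
```

With untrained random parameters the three output probabilities carry no meaning yet; they only become useful once the parameters are fitted to labeled examples.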
For example, for an image classification problem, an input image is transformed using the hierarchical nonlinear transformations and finally the image class becomes the output at the target layer. Training such a model begins with an initial parameter set (often randomly chosen) for the deep neural network (DNN). Errors are computed at the target layer between the actual outputs and the desired outputs given by the training data labels for a large number of examples. Then the errors are used in a feedback mechanism in a layer-by-layer fashion (from target to input) to update the parameters until a satisfactory level of decision accuracy is achieved at the target layer. Typically, a method called the error back-propagation algorithm is used for this training process. The typical features learnt by a DNN are hierarchical in nature, that is, while initial layers capture low-complexity fundamental features (such as edges and corners for an image classification problem), more complex features are formed at the higher layers of abstraction via complex combinations of the low-complexity features. Stochastic gradient descent (SGD) and its variants, such as minibatch gradient descent [8: Goodfellow I. et al. Deep Learning. MIT Press, 2016] and ADAM [9: Kingma D.P., Ba J. Adam: a method for stochastic optimization. CoRR. 2014; abs/1412.6980], as well as gradient-free alternatives such as ADMM [10: Taylor G. et al. (2016) Training neural networks without gradients: a scalable ADMM approach. In Proceedings of the 33rd International Conference on Machine Learning (Vol. 48), pp. 2722-2731, JMLR], have been used to train DNNs.

The foundations of DL were laid in the mid-1960s, when Ivakhnenko and Lapa used multiple layers of nonlinear features with polynomial activation functions in an approach similar to DL as we know it today [11: Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015; 61: 85-117]. The next significant milestone was the use of neural networks [12: Fukushima K. Neural-network model for a mechanism of pattern recognition unaffected by shift in position. Trans. IECE Japan. 1979; J62-A: 658-665]; however, weights were manually assigned. This was followed by application of back-propagation of errors to train deep models that could yield useful distributed representation. Then during the 1990s, the concept of CNNs was introduced [13: Lecun Y. et al. Gradient-based learning applied to document recognition. Proc. IEEE. 1998; 86: 2278-2324] specifically for image recognition problems. However, the power and possibilities of DL remained unrealized primarily because of three reasons: (i) lack of very large datasets, (ii) lack of computing power, and (iii) certain algorithmic deficiencies. These issues started to get resolved in the mid-2000s with the advent of 'big data', which provided very large datasets; graphics processing units (GPUs), which provided computing resources; and algorithmic advancements. Neural network capabilities started to be appreciated, with these models showing significantly better performance than traditional ML models for most of the benchmark classification and prediction problems. In a seminal paper [14: Hinton G.E., Salakhutdinov R.R. Reducing the dimensionality of data with neural networks. Science. 2006; 313: 504-507], Hinton and Salakhutdinov demonstrated the impressive capability of DNNs, which opened the door for a wide variety of uses of DL. Pioneering DL development and implementation studies [15: Cireşan D. et al. (2012) Multi-column deep neural networks for image classification. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3642-3649, IEEE; 16: Cireşan D.C. et al. Deep, big, simple neural nets for handwritten digit recognition. Neural Comput. 2010; 22: 3207-3220; 17: Krizhevsky A. et al. (2012) ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, NIPS 2012 (Vol. 1) (Pereira F. et al., eds), pp. 1097-1105, Curran Associates Inc.], specifically on deep convolutional neural networks (DCNNs) in the areas of speech processing and image processing, attracted technological giants, such as Google, Facebook, Amazon, NVIDIA, and Microsoft, to invest significant resources in DL development. We see the impact of such investment in our everyday lives in the form of smartphone apps and consumer microtargeting. While traditional ML and computer vision techniques retain a role in specific instances (where large datasets are not available or there is a computation constraint), many current and future impactful scientific and technological innovations are becoming a reality via leveraging DL concepts.

DL models can be trained in both supervised and unsupervised ways. In supervised DL, labeled input data (such as an image of a diseased leaf) are mapped to output (e.g., a soybean disease such as sudden death syndrome) via a weights vector, and errors are back-propagated (adjusting the weights) from the output layer to the input; whereas, in the case of unsupervised DL, the objective is to identify patterns from the data for various purposes, such as clustering and hashing. A key difference between traditional supervised ML and supervised DL is that DL combines the two-step process of feature extraction and decision-making of traditional ML within one model and avoids the often suboptimal manual handcrafting (Figure 2).
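The supervised training procedure described above (random initial parameters, an error computed at the target layer, back-propagation of that error, and iterative SGD updates) can be sketched for a tiny one-hidden-layer network. Everything specific here is an assumption for illustration: the two toy input features, the labels, the tanh and sigmoid nonlinearities, the cross-entropy loss, the hidden-layer size, the learning rate, and the number of epochs.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Return hidden activations and the output probability for one example."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output


def mean_loss(data, params):
    """Mean binary cross-entropy between network outputs and labels."""
    total = 0.0
    for x, y in data:
        _, p = forward(x, params)
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def sgd_step(x, y, params, lr):
    """Back-propagate the output error for one example and update parameters."""
    w1, b1, w2, b2 = params
    hidden, p = forward(x, params)
    delta_out = p - y  # error signal at the output (sigmoid + cross-entropy)
    # Error propagated back to each hidden unit (tanh derivative is 1 - h^2),
    # computed before the output weights are modified.
    delta_hidden = [delta_out * w2[j] * (1.0 - hidden[j] ** 2)
                    for j in range(len(hidden))]
    for j in range(len(hidden)):
        w2[j] -= lr * delta_out * hidden[j]
        b1[j] -= lr * delta_hidden[j]
        for i in range(len(x)):
            w1[j][i] -= lr * delta_hidden[j] * x[i]
    params[3] = b2 - lr * delta_out


# Hypothetical toy data: two image-derived features, label 1 = stressed.
data = [([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.3, 0.3], 0),
        ([0.8, 0.7], 1), ([0.9, 0.6], 1), ([0.7, 0.9], 1)]

rng = random.Random(0)
n_hidden = 3
params = [
    [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(n_hidden)],
    [0.0] * n_hidden,
    [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
    0.0,
]

loss_before = mean_loss(data, params)
for epoch in range(500):
    rng.shuffle(data)          # visit examples in random order each epoch
    for x, y in data:
        sgd_step(x, y, params, lr=0.5)
loss_after = mean_loss(data, params)
print(loss_before, loss_after)
```

Updating after every single example is the plain stochastic form; averaging the same gradients over small groups of examples before each update would give the minibatch variant.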
As discussed earlier, training DL models typically involves SGD, an iterative optimization algorithm (or variants thereof, such as ADAM), to accomplish the back-propagation-based (to update weights) model parameter learning. However, the choice of hyperparameters in the training process determines (to a large extent) a successful DNN model. Important hyperparameters include the network architecture (such as the number of units in a layer and the number of layers), the learning rate, and the choice of activation functions. While network architecture can be chosen carefully for an individual problem at hand, it is common practice in the community to begin with a preselected architecture that has been shown to be successful in various data domains and problems and adapt it to the problem under consideration. Examples of such popular architectures include AlexNet [17], ZFNet [18: Zeiler M.D., Fergus R. Visualizing and understanding convolutional networks. CoRR. 2013; abs/1311.2901], VGGNet [19: Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. CoRR. 2014; abs/1409.1556], InceptionNet [20: Szegedy C. et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning. CoRR. 2016; abs/1602.07261], Xception [21: Chollet F. Xception: deep learning with depthwise separable convolutions. CoRR. 2016; abs/1610.02357], and ResNet [22: He K. et al. Deep residual learning for image recognition. CoRR. 2015; abs/1512.03385], which are briefly introduced in the next section.
The process of leveraging such pre-existing network architectures effectively is called 'transfer learning' and is discussed in a later section (Table 1).

Table 1. Examples of Deep Learning Approaches in Plant Stress Image-Based Phenotyping

| DL algorithm application (ICQP) | DL algorithm type | Plant | Platform | Stress type | Stress name | Refs |
| --- | --- | --- | --- | --- | --- | --- |
| Identification | LeNet architecture | Banana | Manual | Biotic stress | Early scorch, cottony mold, ashen mold, late scorch, tiny whiteness | 67: Amara J. et al. A deep learning-based approach for banana leaf diseases classification. Lecture Notes in Informatics (LNI). Gesellschaft für Informatik, 2017: 79-88 |
| Identification | AlexNet, GoogLeNet, VGGNet-16, ResNet-20 | Apple | Manual | Biotic stress | Alternaria leaf spot, mosaic, rust, brown spot | 68: Liu B. et al. Identification of apple leaf diseases based on deep convolutional neural networks. Symmetry. 2018; 10: 11 |
| Identification | Inception-v3, ImageNet | Cassava | Manual | Biotic stress | Cassava brown streak disease, cassava mosaic disease, brown leaf spot, cassava green mite damage, cassava red mite damage | 69: Ramcharan A. et al. Deep learning for image-based cassava disease detection. Front. Plant Sci. 2017; 8: 1852 |
| Identification | AlexNet, AlexNetOWTBn, GoogLeNet, Overfeat, VGG | Apple, banana, blueberry, cabbage, cantaloupe, cassava, celery, cherry, corn, cucumber, eggplant, gourd, grape, onion, orange | Manual | Biotic stress | Bacterial spot, apple scab, cedar apple rust, black rot, banana sigatoka, banana speckle, brown leaf spot, cassava green spider mite, Cercospora leaf spot, common rust, northern leaf blight, esca (black measles), late and early blight, cucumber mosaic, downy mildew, powdery mildew, frogeye leaf spot, leaf scorch, Septoria leaf spot, Septoria leaf blight, spider mites, tomato mosaic virus, leaf mold, target spot, TYLCV, huanglongbing | 66: Ferentinos K.P. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 2018; 145: 311-318 |
| Identification | AlexNet, GoogLeNet | Apple, blueberry, cherry, corn, grape, peach, bell pepper, potato, raspberry, soybean, squash, strawberry, tomato | Manual | Biotic stress | Apple scab, apple black rot, apple cedar rust, cherry powdery mildew, corn gray leaf spot, corn common rust, corn northern leaf blight, grape black rot, grape black measles, grape leaf blight, orange huanglongbing (citrus greening), peach bacterial spot, bell pepper bacterial spot, potato early blight, potato late blight, squash powdery mildew, strawberry leaf scorch, tomato bacterial spot, tomato early blight, tomato late blight, tomato leaf mold, tomato Septoria leaf spot, tomato two-spotted spider mite, tomato target spot, tomato mosaic virus, tomato yellow leaf curl virus | 53: Mohanty S.P. et al. Using deep learning for image-based plant disease detection. Front. Plant Sci. 2016; 7: 1419. https://doi.org/10.3389/fpls.2016.01419 |
| Identification | Modified LeNet | Olive | Manual | Biotic stress | Olive quick decline syndrome | 71: Cruz A.C. et al. X-FIDO: an effective application for detecting olive quick decline syndrome with deep learning and data fusion. Front. Plant Sci. 2017; 8: 1741 |
| Identification | CNN | Cucumber | Manual | Biotic stress | Melon yellow spot virus, zucchini yellow mosaic virus, cucurbit chlorotic yellows virus, cucumber mosaic virus, papaya ring spot virus, watermelon mosaic virus, green mottle mosaic virus | 72: Fujita E. et al. (2016) Basic investigation on a robust and practical plant diagnostic system. In 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 989-992, IEEE |
| Identification | CaffeNet, ImageNet | Pear, cherry, peach, apple, grapevine | Manual | Biotic stress | Porosity (pear, cherry, peach), powdery mildew (peach), peach leaf curl, fire blight (apple, pear), apple scab, powdery mildew (apple), rust (apple, pear), grey leaf spot (pear), wilt (grapevine), mites (grapevine), downy mildew (grapevine), powdery mildew (grapevine) | 88: Sladojevic S. et al. Deep neural networks based recognition of plant diseases by leaf image classification. Comput. Intell. Neurosci. 2016; 2016 (3289801) |
| Identification | AlexNet, ZFNet, VGG-16, GoogLeNet, ResNet-50, ResNet-101, ResNeXt-101, Faster RCNN, R-FCN, SSD | Tomato | Manual | Biotic and abiotic stress | Gray mold, canker, leaf mold, plague, leaf miner, whitefly, low temperature, nutritional excess or deficiency, powdery mildew | 52: Fuentes A. et al. A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors (Basel, Switzerland). 2017; 17: 2022 |
| Identification | CNN | Maize | UAV | Biotic stress | Northern corn leaf blight | 75: DeChant C. et al. Automated identification of northern leaf blight-infected maize plants from field imagery using deep learning. Phytopathology. 2017; 107: 1426-1432 |
| Identification | VGG-FCN, VGG-CNN | Wheat | Manual | Biotic stress | Powdery mildew, smut, black chaff, stripe rust, leaf blotch, leaf rust | 73: Lu J. et al. An in-field automatic wheat disease diagnosis system. CoRR. 2017; abs/1710.08299 |
| Identification | VGG-A, CNN | Radish | UAV | Biotic stress | Fusarium wilt | 74: Ha J.G. et al. Deep convolutional neural network for classifying Fusarium wilt of radish from unmanned aerial vehicles. J. Appl. Remote Sens. 2017; 11: 042621 |
| Identification | SCRNN | Tomato | Manual | Biotic stress | Bacterial leaf spot, early blight tomato, late blight, Septoria leaf spot, two-spotted spider mite, tomato mosaic virus, tomato leaf mold, target spot of tomato, tomato yellow leaf curl virus | 70: Yamamoto K. et al. Super-resolution of plant disease images for the acceleration of image-based phenotyping and vigor diagnosis in agriculture. Sensors (Basel). 2017; 17: E2557 |
| Classification | AlexNet, GoogLeNet | Tomato | Manual | Biotic stress | Tomato yellow leaf curl virus, tomato mosaic virus, target spot, spider mites, Septoria spot, leaf mold, late blight, early blight, bacterial spot | 77: Brahimi M. et al. Deep learning for tomato diseases: classification and symptoms visualization. Appl. Artif. Intell. 2017; 31: 299-315 |
| Identification, classification, quantification | AlexNet | Soybean | Manual | Biotic and abiotic stress | Bacterial blight, bacterial pustule, frogeye leaf spot, Septoria brown spot, sudden death syndrome, iron deficiency chlorosis, potassium deficiency, herbicide injury | 81: Ghosal S. et al. An explainable deep machine vision framework for plant stress phenotyping. Proc. Natl. Acad. Sci. U. S. A. 2018; 115: 4613-4618 |
| Quantification | VGG-16, VGG-19, Inception-v3, ResNet50 | Apple | Manual | Biotic stress | Black rot | 78: Wang G. et al. Automatic image-based plant disease severity estimation using deep learning. Comput. Intell. Neurosci. 2017; 2017 (2917536) |
| Prediction | DNN | Tomato | Manual | Abiotic stress | Water stress | 84: Kaneda Y. et al. Multi-modal sliding window-based support vector regression for predicting plant water stress. Knowl. Based Syst. 2017; 134: 135-148 |

Apart from hyperparameter choices for DL training, another key aspect that requires attention when training such models is overfitting, which occurs when the model learns excessive details about the training data (such as modeling the noise in the training data), leading to significantly poorer performance on unseen test data (i.e., when the model is deployed). Before the resurgence of artificial neural networks (ANNs) in the form of DL models, the lack of approaches to prevent overfitting was one of the key reasons ANNs became unpopular in the ML community. Although large DL models, often with millions of learnable parameters, can suffer from similar issues, the use of very large datasets has substantially reduced such problems. Therefore, if a sufficiently large dataset is not available, it is often very useful to augment the dataset (primarily by various transformations of the original images, such as rotation). Apart from data augmentation and traditional regularization techniques to avoid overfitting, novel regularization techniques such as dropout [23: Srivastava N. et al. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014; 15: 1929-1958] have emerged to reduce overfitting in DL models. Another algorithmic modification called batch normalization also helps deep network training by reducing internal covariate shift [24: Ioffe S., Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv. 2015; 1502.03167]. We refer the interested reader to comprehensive reviews, one that focuses
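Two of the overfitting remedies mentioned above, rotation-based data augmentation and dropout, can be sketched in plain Python. This is a minimal illustration under stated assumptions: images are represented as nested lists, only 90-degree rotations are used, and the dropout function uses the 'inverted' rescaling that is one common implementation variant rather than anything specified in the review. All function names are hypothetical.

```python
import random


def rotate90(image):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment_with_rotations(image):
    """Return the original image plus its 90, 180 and 270 degree rotations."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout (assumed variant).

    During training, each unit is zeroed with probability drop_prob and the
    surviving units are rescaled so the expected activation is unchanged.
    At inference time the activations pass through untouched.
    """
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


# Toy 2x2 "image": one labeled example becomes four training examples.
image = [[1, 2],
         [3, 4]]
variants = augment_with_rotations(image)

# Apply dropout to a toy layer of 1000 unit activations.
rng = random.Random(0)
dropped = dropout([1.0] * 1000, drop_prob=0.5, rng=rng)
print(len(variants), sum(1 for v in dropped if v == 0.0))
```

Each rotated copy keeps the label of the original image, which is what lets augmentation enlarge a labeled dataset without any new annotation effort.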
