Abstract
Many common, biologically interpretable features have been identified in type III, IV, and VI effectors, and various bioinformatic approaches use these features to predict new effectors. Many effectors have been identified with bioinformatic assistance. Because experimental validation of candidate effectors is costly, the precision of bioinformatic predictions matters most. To maximize precision, comprehensive analysis of typical features, ensemblers, and species-specific models are the preferred choices. Bioinformatic prediction has also been used for effectorome analysis. Ensemblers are also favored because they often show high prediction accuracy. Natural language processing models can predict effectors accurately. These models can also identify new effector features and provide insights into the mechanisms of effector secretion.

Gram-negative bacteria deliver effector proteins into host cells through type III, IV, or VI secretion systems (T3SSs, T4SSs, and T6SSs), causing infections and diseases. In general, the effector proteins of each of these distinct secretion systems lack homology and are difficult to identify. Sequence analysis has revealed many common features, helping us to understand the evolution, function, and secretion mechanisms of the effectors. In combination with various algorithms, the known common features have facilitated accurate prediction of new effectors. Ensemblers or integrated pipelines, which combine multiple computational models or modules with multidimensional features, achieve better prediction performance. Natural language processing (NLP) models also show merit: they could enable the discovery of novel features and, in turn, facilitate more precise effector prediction, extending our knowledge of each secretion mechanism.
Glossary
Computing utility, similar to ‘computing power’ or ‘HashRate’.
Domain shuffling: random combination of domains in a protein. In Legionella, effector proteins contain a large number of domains showing homology to eukaryotic proteins. These domains are combined in diverse forms within type IV effector proteins, leading to the large variety of effector repertoires in different species or strains.
Effectorome: the complete collection of effectors encoded by the genome(s) of a bacterial strain, species, or other taxon.
Effectors: according to the early definition, bacterial proteins that are translocated into eukaryotic cells and exert their function there. In Gram-negative bacteria, the term now refers to proteins that can be translocated by T3SSs, T4SSs, or T6SSs. Therefore, proteins translocated by T6SSs or some subtypes of T4SSs into competing bacterial cells are also called effectors.
FIX: a conserved domain found in type VI effectors. It can be used as another marker for T6SS effectors besides MIX. FIX and MIX are often mutually exclusive in effectors.
MIX: a marker for type VI effectors. It is a conserved motif in the N-terminal region of many polymorphic T6SS effectors.
Terminal reassortment: recombination of a nucleotide sequence encoding the N-terminal secretion signal of a type III effector and/or the promoter with another sequence encoding a functional domain, generating a fusion gene that encodes a new type III effector.