Abstract
The function or dysfunction of a protein depends on its primary structure. Protein sequencing, which provides key information on protein-related biological processes and diseases, therefore plays an important role in biological, biomedical, and clinical research and applications. To obtain precise protein sequences, researchers have developed a range of methods over the past few decades, which can be divided into conventional methods and newly developed methods. The former include Edman degradation and mass spectrometry (MS); the latter include single-molecule detection, nanopores, and other recently developed techniques. In the 1960s, classic Edman degradation was first developed to sequence protein molecules from the N-terminus using a cyclic chemical reaction. Solid-phase and gas-phase Edman degradation were developed afterwards, and they still play a significant role in modern technologies. This review discusses the principle and limitations of Edman degradation. It also discusses the advantages and shortcomings of MS-based approaches, which are the current standard methods for protein sequencing applications. Single-molecule approaches could revolutionize proteomics by providing the sensitivity needed for low-abundance protein detection and single-cell proteomics. With the development of single-molecule nucleic acid sequencing, the four bases of DNA/RNA can be detected effectively using label-free or fluorescence labelling strategies. However, labelling and analyzing all twenty kinds of amino acid residues remains a challenge. Sensitive optical detection has nevertheless been used for high-throughput protein sequencing based on fluorescence labelling. In this approach, selected residues of the peptides are labelled, the C-terminus is anchored onto a glass substrate, and the N-terminus is degraded through Edman cycles. The sequence is then analyzed from the wide-field fluorescence signals.
This method has the potential for large-scale, sensitive, and parallel detection, and we discuss its principle and characteristic features in detail. Nanopores, including biological nanopores and solid-state nanopores, have emerged as powerful technologies for protein sequencing. A nanopore provides a single-molecule sensing interface and a controlled nano-confined space, enabling ultimate sensitivity and high spatiotemporal resolution. Nanopore-based technologies rely on the interaction between the functional groups of the analyte and the nanopore, which modulates the ionic current. Information about the peptides can therefore be obtained by monitoring the ionic current response. Arrayed nanopores have the potential for high-throughput detection of low-abundance targets. However, the approach is still at an early stage of development, and several challenges need to be addressed. As a “fingerprint” signal, the Raman spectrum is an ideal candidate for protein sequencing. However, its very weak signals significantly restrict its application, especially at low concentrations of the target molecule. Surface-enhanced Raman spectroscopy (SERS) can enhance the Raman signal enough to achieve detection at the single-molecule scale. The combination of SERS and nanopores has demonstrated a powerful capability for label-free detection of ten kinds of amino acids, and it offers a new strategy for protein sequencing. Compared with the weak Raman signal, fluorescence signals are more accessible, even at the single-molecule level. Several molecular dynamics (MD) simulations are discussed to show the feasibility of sequencing fluorescence-labelled proteins within a nanopore. Nevertheless, some drawbacks need to be addressed, especially the high cost of nanopore fabrication and the difficulty of translocating proteins through a pore.
This review also discusses future challenges and summarizes recent efforts to break the bottlenecks of current protein sequencing, with the aim of promoting the development of medical treatment, disease diagnosis, and related fields.