Authors
Inna Levin,Mitchell D. Miller,Robert Schwarzenbacher,Daniel McMullan,Polat Abdubek,Eileen Ambing,Tanya Biorac,Jamison Cambell,Jaume M. Cánaves,Hsiu‐Ju Chiu,Ashley M. Deacon,Michael DiDonato,Marc‐André Elsliger,Adam Godzik,Carina Grittini,Slawomir K. Grzechnik,Joanna Hale,Eric Hampton,Gye Won Han,Justin Haugen,Michael Hornsby,Lukasz Jaroszewski,Cathy Karlak,Heath E. Klock,Eric Koesema,Andreas Kreusch,Peter Kühn,Scott A. Lesley,Andrew T. Morse,Kin Moy,Edward Nigoghossian,Jiahong Ouyang,Rebecca Page,Kevin Quijano,Ron Reyes,Alyssa Robb,Eric Sims,Glen Spraggon,Raymond C. Stevens,Henry van den Bedem,Jeff Velasquez,Juli Vincent,Xianhong Wang,Bill West,Guenter Wolf,Qingping Xu,Olga Zagnitko,Keith O. Hodgson,John Wooley,Ian A. Wilson
Abstract
TM1464 encodes an indigoidine synthase A (IndA)-like protein from Thermotoga maritima, with a molecular weight of 31,595 Da (residues 1–285) and a calculated isoelectric point of 5.5. IndA is involved in the biosynthesis of indigoidine, a blue pigment first described in Erwinia chrysanthemi and implicated in protection from oxidative stress and pathogenicity.1 IndA is present in yeast, Caenorhabditis elegans, Arabidopsis, and most bacteria.2 Here, we report the crystal structure of TM1464 determined using the semiautomated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG).3
The structure of TM1464 [Fig. 1(A)] was determined to 1.9 Å resolution using the multiwavelength anomalous dispersion (MAD) method. Data collection, model, and refinement statistics are summarized in Table I. The final model includes 6 protein monomers (residues 1–284 plus 8 residues of the N-terminal expression and purification tag), 6 unknown ligand (UNL) molecules, 16 manganese ions, 11 ethylene glycol molecules, and 983 water molecules. The Matthews coefficient (Vm)6 for TM1464 is 2.00 Å³/Da, and the estimated solvent content is 37.9%. The Ramachandran plot, produced by MolProbity,7 shows that 98.15% and 100% of the residues are in favored and allowed regions, respectively.
Figure 1. Crystal structure of TM1464. (A) Stereo ribbon diagram of T. maritima TM1464 color-coded from N-terminus (blue) to C-terminus (red), showing the domain organization. Helices (H1–H12) and β-strands (β1–β11) are indicated. (B) Diagram showing the secondary structure elements in TM1464 superimposed on its primary sequence. The α-helices, 3₁₀-helices, β-bulges, and γ-turns are indicated. The β-sheet strands are indicated by a red A, and β-hairpins are depicted as red loops.
The TM1464 monomer contains 11 β-strands (β1–β11), 12 α-helices (H1–H12), and two 3₁₀-helices (H4′ and H11′) [Fig. 1(A and B)]. The total β-strand, α-helical, and 3₁₀-helical content is 21.9%, 31.3%, and 1.6%, respectively.
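The relation between the Matthews coefficient and the solvent content quoted above can be sketched as follows. This is a minimal illustration, not the procedure used by the authors: the constant 1.23 Å³/Da is Matthews' standard approximation for the volume occupied by protein, and the unit-cell volume in the usage note is a hypothetical value back-calculated from Vm = 2.00 Å³/Da, since the cell dimensions from Table I are not reproduced here.

```python
# Minimal sketch: Matthews coefficient and estimated solvent content.
# PROTEIN_VOLUME_PER_DALTON is Matthews' standard approximation (an
# assumption here; the source does not state which constant was used).
PROTEIN_VOLUME_PER_DALTON = 1.23  # Å^3 per Da


def matthews_coefficient(cell_volume_a3, n_asymmetric_units, copies_per_asu, mw_da):
    """Return Vm (Å^3/Da): unit-cell volume per dalton of protein."""
    total_mass = n_asymmetric_units * copies_per_asu * mw_da
    if total_mass <= 0:
        raise ValueError("protein mass in the unit cell must be positive")
    return cell_volume_a3 / total_mass


def solvent_fraction(vm):
    """Return the estimated solvent fraction (0 to 1) for a given Vm."""
    if vm <= 0:
        raise ValueError("Vm must be positive")
    return 1.0 - PROTEIN_VOLUME_PER_DALTON / vm


if __name__ == "__main__":
    # Hypothetical cell volume, chosen so that Vm comes out at 2.00 Å^3/Da
    # for 4 asymmetric units (P2(1)2(1)2(1)) with 6 monomers of 31,595 Da.
    vm = matthews_coefficient(1_516_560.0, 4, 6, 31_595.0)
    print(f"Vm = {vm:.2f} Å^3/Da, solvent = {100 * solvent_fraction(vm):.1f}%")
```

With the rounded Vm of 2.00 Å³/Da this estimate gives about 38.5%, slightly above the reported 37.9%. The small gap is consistent with rounding of Vm or a slightly different protein-density constant, although the source does not say which applies.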
TM1464 folds into an α/β/α domain with a central, 11-stranded β-sheet surrounded by 12 helices. The β-sheet is of mixed type and is composed of a 9-stranded β-sheet of 143256978 topology (β1–β4, β7–β11), with β-strands β5 and β6 forming an additional β-sheet through interactions with β3 [Fig. 1(A)]. The β-strands are connected by extended loop–helix–loop motifs. A structural similarity search, performed with the coordinates of TM1464 using the DALI server,8 showed no matches, indicating that TM1464 adopts a new fold and is the first structure of the IndA-like protein family (PF04227).2 The parallel portion of the central β-sheet (β2, β7–β11) shows some resemblance to the N-terminal domain of the periplasmic glucose/galactose receptor from Salmonella typhimurium [Protein Data Bank (PDB) code: 1gca],9 but it encompasses only 50% of the structure. The root-mean-square deviation (RMSD) for this structural alignment is 3.6 Å over 112 aligned residues with 9% sequence identity. Models for TM1464 homologues can be accessed at http://www1.jcsg.org/cgi-bin/models/get_mor.pl?key=TM1464.
The crystallographic packing in the orthorhombic and rhombohedral crystal forms of the TM1464 structure indicates that a trimer is the biologically relevant oligomeric form. The trimer has 3-fold symmetry and measures 80 Å in diameter and 55 Å in height, with an 8 Å-wide inner channel [Fig. 2(A)]. The trimer interfaces are formed by interactions between helices H4, H6, H7, H8, and β9. The subunit interactions are stabilized by an intersubunit Arg106–Glu206′ salt bridge (′ marks a neighboring subunit), and by a 4-residue intersubunit salt-bridge network Arg81–Glu82–Lys242–Glu160′ that accounts for a buried surface area of 2444 Å² per monomer.
Figure 2. (A) The TM1464 trimer viewed along the 3-fold axis. One subunit is shown in blue. The 3 UNLs bound to the putative active sites are shown as red spheres. (B) TM1464 shown in surface representation illustrating the narrow active site pocket, with the bound UNL in ball-and-stick configuration. (C) Close-up of the putative active site. TM1464 is shown in ribbon representation with the UNL, manganese, and interacting residues and waters in ball-and-stick configuration (see text). A SigmaA-weighted Fo-Fc omit map contoured at 3.0 σ is shown for the UNL, manganese, and coordinating waters. Putative H-bond interactions between the UNL and the protein are shown in yellow dotted lines.
A SigmaA-weighted Fo-Fc omit map shows compact density in each monomer within a groove on the edge of the β-sheet, adjacent to the side chains of Glu17, Lys77, Ser128, Asp130, and Lys147, and to the backbone amide of Val97. The location and conservation of these side chains indicate a possible location for the TM1464 active site [Fig. 2(C)]. The electron density of the putative ligand shows a heavier moiety at one end, whose geometry is consistent with a phosphate or sulfate. The electron density and putative hydrogen-bonding pattern suggest that the ligand may be a glycerol-3-phosphate analog. However, the ligand is larger than glycerol-3-phosphate, with density extending beyond the carbon 1 atom of glycerol-3-phosphate. Despite extensive modeling efforts, the density could not be definitively identified and was, therefore, modeled as a UNL. A water-mediated interaction is found between the putative phosphate of the UNL and the hydrated Mn ion associated with Asp126 and Glu160′ of the adjacent subunit. This suggests that oligomerization into a trimer is important for the formation of this putative active site [Fig. 2(A and C)]. We were unable to identify a similar active site configuration in the PDB, which suggests that TM1464 represents a functionally novel enzyme.
The TM1464 structure reported here is a new fold and represents the first indigoidine synthase A (IndA)-like protein whose structure has been determined by X-ray crystallography.
A glycerol-3-phosphate analog is bound to the protein's active site. The information reported here, in combination with further biochemical and biophysical studies, will yield valuable insights into the functional role of the indigoidine synthase A (IndA)-like protein.
TM1464 (TIGR: TM1464; Swissprot: Q9X1H5) was amplified by polymerase chain reaction (PCR) from genomic DNA using PfuTurbo (Stratagene) and primer pairs encoding the predicted 5′- and 3′-ends. The PCR product was cloned into plasmid pMH1, which encodes an expression and purification tag (MGSDKIHHHHHH) at the amino terminus of the full-length protein. The cloning junctions were confirmed by sequencing. Protein expression was performed in a selenomethionine-containing medium using the Escherichia coli methionine auxotrophic strain DL41. Lysozyme was added to the culture at the end of fermentation to a final concentration of 250 μg/mL. Bacteria were lysed by sonication after a freeze-thaw procedure in Lysis Buffer [50 mM Tris, pH 7.9, 50 mM NaCl, 10 mM imidazole, 0.25 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP)], and the cell debris was pelleted by centrifugation at 3400 × g for 60 min. The soluble fraction was applied to nickel resin (Amersham Biosciences) pre-equilibrated with Lysis Buffer. The nickel resin was washed with Wash Buffer [50 mM potassium phosphate, pH 7.8, 40 mM imidazole, 300 mM NaCl, 10% (v/v) glycerol, 0.25 mM TCEP], and the protein was eluted with Elution Buffer [20 mM Tris, pH 7.9, 300 mM imidazole, 10% (v/v) glycerol, 0.25 mM TCEP]. Buffer exchange was performed to remove imidazole from the eluate, and the protein in Buffer Q [20 mM Tris, pH 7.9, 5% (v/v) glycerol, 0.25 mM TCEP] containing 50 mM NaCl was applied to a Resource Q column (Amersham Biosciences) pre-equilibrated with the same buffer. The protein was eluted using a linear gradient of 50–500 mM NaCl in Buffer Q.
The appropriate fractions were pooled, further purified using a Superdex 200 column (Amersham Biosciences) with elution in Crystallization Buffer [20 mM Tris, pH 7.9, 150 mM NaCl, 0.25 mM TCEP], and concentrated for crystallization assays to 18 mg/mL by centrifugal ultrafiltration (Millipore).
The initial crystallization conditions [17% (w/v) polyethylene glycol (PEG) 1000, 0.1 M N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES), pH 7.4] were found using the nanodroplet vapor diffusion method,10 with standard JCSG crystallization protocols,2 and yielded needle-shaped crystals of rhombohedral (R3) symmetry. Optimization of these conditions using streak seeding in a standard 24-well Nextal hanging-drop vapor diffusion plate led to improved crystals of the same symmetry, which diffracted to 2.4 Å resolution (λR3 in Table I). For crystal mounting, 19% (v/v) ethylene glycol (final concentration) was included as a cryoprotectant. Better quality crystals with orthorhombic (P2₁2₁2₁) symmetry, diffracting to 1.9 Å (λ0, λ1, λ2, and λ3 in Table I), were obtained by adding 10 mM MnCl2 to the drop solution and reducing the precipitant and protein concentrations. Final crystals were grown from 7% (w/v) PEG 1000, 0.1 M HEPES, pH 7.4, 10 mM MnCl2 using a protein concentration of 4.5 mg/mL. Crystallization drops were pre-equilibrated for 1 h and then streak-seeded from previously obtained orthorhombic crystals. For data collection, 17% (v/v) ethylene glycol (final concentration) was included as a cryoprotectant and the PEG 1000 concentration was increased to 10%.
MAD data were collected at the Advanced Light Source (ALS; Berkeley, CA) on beamline 8.2.2 at wavelengths corresponding to the inflection point (λ1), low energy remote (λ2), and the peak (λ3) of a selenium MAD experiment.
In addition, a 1.9 Å high-resolution data set (λ0) was collected from a second crystal on beamline 8.2.1, and a 2.4 Å resolution data set from a rhombohedral crystal (λR3) was collected on beamline 8.2.2. The data sets were collected at 100 K using Quantum 315 or Quantum 210 charge-coupled device (CCD) detectors. Data were integrated and reduced using Mosflm11 and then scaled with the program SCALA from the CCP4 suite.4 Data statistics are summarized in Table I.
The initial structure was determined with the 2.6 Å selenium MAD data (λ1,2,3) using the CCP4 suite4 and SOLVE/RESOLVE.12 Heavy atom positions were refined using SHARP.13 Model building and refinement were performed with the 1.9 Å data set (λ0) using O,14 REFMAC5,4 and CNS.15 Refinement statistics are summarized in Table I. The 2.4 Å data set from the rhombohedral (R3) crystal (see Table I) gave a molecular replacement solution using the orthorhombic structure as a search model, but its refinement was not completed because the higher-resolution orthorhombic structure was available (data available by contacting the JCSG via www.jcsg.org). The manganese assignment in the orthorhombic crystal form was based on anomalous difference Fourier maps, X-ray fluorescence, geometry, and the absence of similar peaks in the rhombohedral crystal form, which was grown without added manganese. The TM1464 expression construct contains an alternate initiation codon that results in a valine instead of methionine at position 1. No electron density was observed for residue 285 or for the rest of the expression and purification tag.
Analysis of the stereochemical quality of the model was accomplished using the AutoDepInputTool (http://deposit.pdb.org/adit/), MolProbity,7 SFcheck 4.0,4 and WHAT IF 5.0.16 Protein quaternary structure analysis used the PQS server (http://pqs.ebi.ac.uk/). Figure 1(B) was adapted from an analysis using PDBsum (http://www.biochem.ucl.ac.uk/bsm/pdbsum/), and all other figures were prepared with PYMOL (DeLano Scientific).
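The MAD wavelengths described in the data collection (inflection point, low energy remote, and peak) are chosen relative to the selenium K absorption edge. The sketch below shows the standard conversion between photon energy and wavelength. It is an illustration only: the edge energy of about 12.658 keV is the commonly tabulated value for Se and is an assumption here, since the actual wavelengths used are given in Table I rather than in the text, and in practice the edge position is measured by a fluorescence scan on each crystal.

```python
# Minimal sketch: converting X-ray photon energy to wavelength and back.
# HC_KEV_ANGSTROM is Planck's constant times the speed of light in keV·Å.
HC_KEV_ANGSTROM = 12.398419843

# Commonly tabulated Se K-edge energy (assumption; not taken from Table I).
SE_K_EDGE_KEV = 12.658


def energy_to_wavelength(energy_kev):
    """Return the wavelength in Å for a photon energy in keV."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


def wavelength_to_energy(wavelength_a):
    """Return the photon energy in keV for a wavelength in Å."""
    if wavelength_a <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_a


if __name__ == "__main__":
    edge = energy_to_wavelength(SE_K_EDGE_KEV)
    print(f"Se K-edge: {SE_K_EDGE_KEV} keV corresponds to {edge:.4f} Å")
```

A low energy remote data set, as used here, lies at a lower energy and therefore a longer wavelength than the edge.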
Atomic coordinates and experimental structure factors of TM1464 have been deposited in the PDB and are accessible under the code 1vkm.
Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory (SSRL) and the Advanced Light Source (ALS). The SSRL is a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences). The ALS is supported by the Director, Office of Science, Office of Basic Energy Sciences, Materials Sciences Division, of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098 at Lawrence Berkeley National Laboratory.