De novo protein design by deep network hallucination

  • 1.

    Xu, J. Distance-based protein folding powered by deep learning. Proc. Natl Acad. Sci. USA 116, 16856–16865 (2019).

  • 2.

    Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577, 706–710 (2020).

  • 3.

    Yang, J. et al. Improved protein structure prediction using predicted interresidue orientations. Proc. Natl Acad. Sci. USA 117, 1496–1503 (2020).

  • 4.

    Biswas, S., Khimulya, G., Alley, E. C., Esvelt, K. M. & Church, G. M. Low-N protein engineering with data-efficient deep learning. Nat. Methods 18, 389–396 (2021).

  • 5.

    Madani, A. et al. ProGen: language modeling for protein generation. Preprint at https://arxiv.org/abs/2004.03497 (2020).

  • 6.

    Anand, N., Eguchi, R. & Huang, P.-S. Fully differentiable full-atom protein backbone generation. In ICLR 2019 Workshop https://openreview.net/forum?id=SJxnVL8YOV (2019).

  • 7.

    Wang, J., Cao, H., Zhang, J. Z. H. & Qi, Y. Computational protein design with deep learning neural networks. Sci. Rep. 8, 6349 (2018).

  • 8.

    Ingraham, J., Garg, V. K., Barzilay, R. & Jaakkola, T. Generative models for graph-based protein design. In ICLR 2019 Workshop https://openreview.net/forum?id=SJgxrLLKOE (2019).

  • 9.

    Anand, N., Eguchi, R. R., Derry, A., Altman, R. B. & Huang, P.-S. Protein sequence design with a learned potential. Preprint at https://doi.org/10.1101/2020.01.06.895466 (2020).

  • 10.

    Strokach, A., Becerra, D., Corbi-Verge, C., Perez-Riba, A. & Kim, P. M. Fast and flexible protein design using deep graph neural networks. Cell Syst. 11, 402–411.e4 (2020).

  • 11.

    Karimi, M., Zhu, S., Cao, Y. & Shen, Y. De novo protein design for novel folds using guided conditional Wasserstein generative adversarial networks. J. Chem. Inf. Model. 60, 5667–5681 (2020).

  • 12.

    Davidsen, K. et al. Deep generative models for T cell receptor protein sequences. eLife 8, e46935 (2019).

  • 13.

    Costello, Z. & Martin, H. G. How to hallucinate functional proteins. Preprint at https://arxiv.org/abs/1903.00458 (2019).

  • 14.

    Eguchi, R. R., Anand, N., Choe, C. A. & Huang, P.-S. IG-VAE: generative modeling of immunoglobulin proteins by direct 3D coordinate generation. Preprint at https://doi.org/10.1101/2020.08.07.242347 (2020).

  • 15.

    Repecka, D. et al. Expanding functional protein sequence spaces using generative adversarial networks. Nat. Mach. Intell. 3, 324–333 (2021).

  • 16.

    Hawkins-Hooker, A. et al. Generating functional protein variants with variational autoencoders. PLoS Comput. Biol. 17, e1008736 (2021).

  • 17.

    Senior, A. W. et al. Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins 87, 1141–1148 (2019).

  • 18.

    Mordvintsev, A., Olah, C. & Tyka, M. Inceptionism: going deeper into neural networks. Google AI Blog https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html (2015).

  • 19.

    Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  • 20.

    Rohl, C. A., Strauss, C. E. M., Misura, K. M. S. & Baker, D. Protein structure prediction using Rosetta. Methods Enzymol. 383, 66–93 (2004).

  • 21.

    Park, H. et al. Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules. J. Chem. Theory Comput. 12, 6201–6212 (2016).

  • 22.

    Rossi, P. et al. A microscale protein NMR sample screening pipeline. J. Biomol. NMR 46, 11–22 (2010).

  • 23.

    Koga, N. et al. Principles for designing ideal protein structures. Nature 491, 222–227 (2012).

  • 24.

    Dou, J. et al. De novo design of a fluorescence-activating β-barrel. Nature 561, 485–491 (2018).

  • 25.

    Norn, C. et al. Protein sequence design by conformational landscape optimization. Proc. Natl Acad. Sci. USA 118, e2017228118 (2021).

  • 26.

    Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).

  • 27.

    Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  • 28.

    Wang, J. et al. Deep learning methods for designing proteins scaffolding functional sites. Preprint at https://doi.org/10.1101/2021.11.10.468128 (2021).

  • 29.

    Jendrusch, M., Korbel, J. O. & Sadiq, S. K. AlphaDesign: A de novo protein design framework based on AlphaFold. Preprint at https://doi.org/10.1101/2021.10.11.463937 (2021).

  • 30.

    Tischer, D. et al. Design of proteins presenting discontinuous functional sites using deep learning. Preprint at https://doi.org/10.1101/2020.11.29.402743 (2020).

  • 31.

    Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).

  • 32.

    Studier, F. W. Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005).

  • 33.

    Pace, C. N., Vajdos, F., Fee, L., Grimsley, G. & Gray, T. How to measure and predict the molar absorption coefficient of a protein. Protein Sci. 4, 2411–2423 (1995).

  • 34.

    Acton, T. B. et al. Preparation of protein samples for NMR structure, function, and small-molecule screening studies. Methods Enzymol. 493, 21–60 (2011).

  • 35.

    Xiao, R. et al. The high-throughput protein sample production platform of the Northeast Structural Genomics Consortium. J. Struct. Biol. 172, 21–33 (2010).

  • 36.

    Jansson, M. et al. High-level production of uniformly 15N- and 13C-enriched fusion proteins in Escherichia coli. J. Biomol. NMR 7, 131–141 (1996).

  • 37.

    Ottiger, M., Delaglio, F. & Bax, A. Measurement of J and dipolar couplings from simplified two-dimensional NMR spectra. J. Magn. Reson. 131, 373–378 (1998).

  • 38.

    Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).

  • 39.

    Lee, W., Tonelli, M. & Markley, J. L. NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy. Bioinformatics 31, 1325–1327 (2015).

  • 40.

    Favier, A. & Brutscher, B. NMRlib: user-friendly pulse sequence tools for Bruker NMR spectrometers. J. Biomol. NMR 73, 199–211 (2019).

  • 41.

    Hyberts, S. G., Milbradt, A. G., Wagner, A. B., Arthanari, H. & Wagner, G. Application of iterative soft thresholding for fast reconstruction of NMR data non-uniformly sampled with multidimensional Poisson gap scheduling. J. Biomol. NMR 52, 315–327 (2012).

  • 42.

    Ying, J., Delaglio, F., Torchia, D. A. & Bax, A. Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data. J. Biomol. NMR 68, 101–118 (2017).

  • 43.

    Lee, W. et al. I-PINE web server: an integrative probabilistic NMR assignment system for proteins. J. Biomol. NMR 73, 213–222 (2019).

  • 44.

    Moseley, H. N. B., Sahota, G. & Montelione, G. T. Assignment validation software suite for the evaluation and presentation of protein resonance assignment data. J. Biomol. NMR 28, 341–355 (2004).

  • 45.

    Shen, Y. & Bax, A. Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks. J. Biomol. NMR 56, 227–241 (2013).

  • 46.

    Güntert, P., Mumenthaler, C. & Wüthrich, K. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273, 283–298 (1997).

  • 47.

    Herrmann, T., Güntert, P. & Wüthrich, K. Protein NMR structure determination with automated NOE-identification in the NOESY spectra using the new software ATNOS. J. Biomol. NMR 24, 171–189 (2002).

  • 48.

    Huang, Y. J., Powers, R. & Montelione, G. T. Protein NMR recall, precision, and F-measure scores (RPF scores): structure quality assessment measures based on information retrieval statistics. J. Am. Chem. Soc. 127, 1665–1674 (2005).

  • 49.

    Huang, Y. J., Tejero, R., Powers, R. & Montelione, G. T. A topology-constrained distance network algorithm for protein structure determination from NOESY data. Proteins 62, 587–603 (2006).

  • 50.

    Brünger, A. T. et al. Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr. D 54, 905–921 (1998).

  • 51.

    Bhattacharya, A., Tejero, R. & Montelione, G. T. Evaluating protein structures determined by structural genomics consortia. Proteins 66, 778–795 (2007).

  • 52.

    Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326 (1997).

  • 53.

    McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).

  • 54.

    DiMaio, F. et al. Improved low-resolution crystallographic refinement with Phenix and Rosetta. Nat. Methods 10, 1102–1104 (2013).

  • 55.

    Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486–501 (2010).

  • 56.

    Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D 75, 861–877 (2019).

  • 57.

    Theobald, D. L. & Wuttke, D. S. Accurate structural correlations from maximum likelihood superpositions. PLoS Comput. Biol. 4, e43 (2008).

  • 58.

    The PyMOL Molecular Graphics System version 2.4 (Schrödinger, 2021).

  • 59.

    Zweckstetter, M. NMR: prediction of molecular alignment from structure using the PALES software. Nat. Protoc. 3, 679–690 (2008).

  • 60.

    Montelione, G. T. & Wagner, G. 2D Chemical exchange NMR spectroscopy by proton-detected heteronuclear correlation. J. Am. Chem. Soc. 111, 3096–3098 (1989).
