Keywords: Encoder, Resolution (logic), Computational biology, Genome, Computer science, Biology, Genetics, Artificial intelligence, Gene, Operating system
Source
Journal: Cornell University - arXiv
Date: 2024-07-03
Identifier
DOI: 10.48550/arxiv.2407.03392
Abstract
This report describes a linear attention mechanism that extends the context length of an encoder-only transformer, called M5, yielding a multi-million single-nucleotide-resolution foundation model pretrained on bacterial whole genomes. The linear attention mechanism tightly approximates full quadratic attention and has a simple, lightweight implementation for the case where the key-query embedding dimensionality is low. The M5-small model is trained and tested entirely on a single A100 GPU with 40 GB of memory, on sequences of up to 196K nucleotides during training and 2M nucleotides during testing. We evaluate M5-small and record notable performance improvements as whole-genome bacterial sequence length increases, and we demonstrate the stability of the full multi-head attention approximation as sequence length grows.
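The abstract's key idea, replacing quadratic attention with a linear-time approximation that is cheap when the key-query dimensionality is low, can be illustrated with a generic kernelized-attention sketch. This is a hypothetical illustration, not the paper's actual mechanism: the feature map `phi(x) = elu(x) + 1` is one common choice from the linear-attention literature, and all shapes and names below are assumptions.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    # Hypothetical sketch of kernelized linear attention: cost is
    # O(N * d_k * d_v) instead of the O(N^2 * d) of full softmax attention,
    # which is what makes multi-million-token contexts feasible when the
    # key-query dimensionality d_k is small.
    def phi(x):
        # elu(x) + 1: a common positive feature map (an assumption here,
        # not the feature map used by M5).
        return np.where(x > 0, x + 1.0, np.exp(x))

    Qf, Kf = phi(Q), phi(K)        # (N, d_k) feature-mapped queries/keys
    KV = Kf.T @ V                  # (d_k, d_v): summarize all keys/values once
    Z = Qf @ Kf.sum(axis=0) + eps  # (N,): per-query normalization term
    return (Qf @ KV) / Z[:, None]  # (N, d_v): normalized attention output

rng = np.random.default_rng(0)
N, d_k, d_v = 1024, 16, 32         # long sequence, low key-query dimensionality
Q = rng.normal(size=(N, d_k))
K = rng.normal(size=(N, d_k))
V = rng.normal(size=(N, d_v))
out = linear_attention(Q, K, V)
print(out.shape)  # (1024, 32)
```

Note that the `(d_k, d_v)` summary matrix `KV` is computed once and reused for every query, which is why memory and compute grow linearly with sequence length N.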