Authors
Chi Sung Oh, Ki Chul Chun, Young-Yong Byun, Yong-Ki Kim, Soyoung Kim, Yesin Ryu, Jaewon Park, Sinho Kim, Sanguhn Cha, Dong-Hak Shin, Jungyu Lee, Jong-Pil Son, Byung-Kyu Ho, Seong-Jin Cho, Beomyong Kil, Sungoh Ahn, Baek-Min Lim, Yong-Sik Park, Ki-Jun Lee, Myung-Kyu Lee, Seungduk Baek, Junyong Noh, Jae-Wook Lee, Seungseob Lee, Sooyoung Kim, Bo-Tak Lim, Seouk-Kyu Choi, Jin-Guk Kim, Hye-In Choi, Hyuk-Jun Kwon, Jun Jin Kong, Kyomin Sohn, Nam Sung Kim, Kwang-Il Park, Jung-Bae Lee
Abstract
Rapidly evolving artificial intelligence (AI) technology, such as deep learning, has been successfully deployed in various applications, such as image recognition, health care, and autonomous driving. This rapid evolution and successful deployment of AI technology have been possible owing to the emergence of accelerators, such as GPUs and TPUs, that provide high data throughput. This, in turn, requires an enhanced memory system with large capacity and high bandwidth [1]. HBM has been the preferred high-bandwidth memory technology owing to its high-speed and low-power characteristics, the 1024 IOs facilitated by 2.5D silicon-interposer technology, and the large capacity realized by through-silicon-via (TSV) stack technology [2]. The previous-generation HBM2 supports an 8GB capacity with a stack of 8 DRAM dies (i.e., an 8-high stack) and a bandwidth of 341GB/s (2.7Gb/s/pin) [3]. The HBM industry trend has been a speed improvement of 15-to-20% every year, while capacity increases by 1.5-to-2x every two years. In this paper, we present a 16GB HBM2E with circuit and design techniques that increase its bandwidth up to 640GB/s (5Gb/s/pin) while providing stable bit-cell operation in the 2nd generation of a 10nm DRAM process, featuring: (1) a data-bus window-extension technique to cope with reduced $t_{CCD}$ (column-to-column delay), (2) a power-delivery network (PDN) designed for stable operation at high speed, (3) a synergetic on-die ECC scheme to reliably provide large capacity, and (4) an MBIST (memory built-in self-test) solution to efficiently test large-capacity memory at high speed.
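As a quick consistency check (our own arithmetic, not a figure reported in the paper), the headline bandwidth follows directly from the per-pin data rate and the 1024-bit interface:

$$ \mathrm{BW} = \frac{1024\ \text{IOs} \times 5\ \text{Gb/s/pin}}{8\ \text{bits/byte}} = 640\ \text{GB/s} $$

The quoted HBM2 figure is consistent under the same calculation: 341GB/s corresponds to a pin rate of about 2.67Gb/s, which is rounded to 2.7Gb/s/pin above.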