Authors
Rick Bahr, Clark Barrett, Nikhil Bhagdikar, Alex Carsello, Ross Daly, Caleb Donovick, David Durst, Kayvon Fatahalian, Kathleen Feng, Pat Hanrahan, Teguh Hofstee, Mark Horowitz, Dillon Huff, Fredrik Kjolstad, Taeyoung Kong, Qiaoyi Liu, Makai Mann, Jackson Melchert, Ankita Nayak, Aina Niemetz, Gedeon Nyengele, Priyanka Raina, Stephen M. Richardson, Raj Setaluri, Jeff Setter, K. Sreedhar, Maxwell Strange, James D. Thomas, Christopher Torng, Leonard Truong, Nestan Tsiskaridze, Keyi Zhang
Abstract
Although an agile approach is standard for software design, how to properly adapt this method to hardware is still an open question. This work addresses this question while building a system on chip (SoC) with specialized accelerators. Rather than using a traditional waterfall design flow, which starts by studying the application to be accelerated, we begin by constructing a complete flow from an application expressed in a high-level domain-specific language (DSL), in our case Halide, to a generic coarse-grained reconfigurable array (CGRA). As our understanding of the application grows, the CGRA design evolves, and we have developed a suite of tools that tune application code, the compiler, and the CGRA to increase the efficiency of the resulting implementation. To meet our continued need to update parts of the system while maintaining the end-to-end flow, we have created DSL-based hardware generators that not only provide the Verilog needed to implement the CGRA, but also create the collateral that the compiler/mapper/place-and-route system needs to configure its operation. This work provides a systematic approach for designing and evolving high-performance and energy-efficient hardware-software systems for any application domain.