
Nature | Parallel convolutional processing using an integrated photonic tensor core

tutu 2021-1-10 16:27

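An international team from the University of Pittsburgh (USA), the University of Münster (Germany), the University of Oxford and the University of Exeter (UK), EPFL in Lausanne and the IBM Research laboratory in Zurich (Switzerland), among other institutions, has developed a computationally specific integrated photonic hardware accelerator based on a tensor core. By combining phase-change materials with photonic structures ...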

Article link: https://doi.org/10.1038/s41586-020-03070-1

A computationally specific integrated photonic hardware accelerator based on a tensor core

Abstract

With the proliferation of ultrahigh-speed mobile networks and internet-connected devices, along with the rise of artificial intelligence (AI) [1], the world is generating exponentially increasing amounts of data that need to be processed in a fast and efficient way. Highly parallelized, fast and scalable hardware is therefore becoming progressively more important [2]. Here we demonstrate a computationally specific integrated photonic hardware accelerator (tensor core) that is capable of operating at speeds of trillions of multiply-accumulate operations per second (10^12 MAC operations per second, or tera-MACs per second). The tensor core can be considered as the optical analogue of an application-specific integrated circuit (ASIC). It achieves parallelized photonic in-memory computing using phase-change-material memory arrays and photonic chip-based optical frequency combs (soliton microcombs [3]). The computation is reduced to measuring the optical transmission of reconfigurable and non-resonant passive components and can operate at a bandwidth exceeding 14 gigahertz, limited only by the speed of the modulators and photodetectors. Given recent advances in hybrid integration of soliton microcombs at microwave line rates [3,4,5], ultralow-loss silicon nitride waveguides [6,7], and high-speed on-chip detectors and modulators, our approach provides a path towards full complementary metal–oxide–semiconductor (CMOS) wafer-scale integration of the photonic tensor core. Although we focus on convolutional processing, more generally our results indicate the potential of integrated photonics for parallel, fast, and efficient computational hardware in data-heavy AI applications such as autonomous driving, live video processing, and next-generation cloud computing services.
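To make the multiply-accumulate (MAC) framing in the abstract concrete, the following is a minimal software sketch of a convolution written as explicit MAC operations, with a rough timing estimate at a given MAC rate. It is an illustration only, not the authors' method: the function names, the tiny 4 × 4 image, the 2 × 2 kernel, and the assumption of a sustained 10^12 MAC/s rate are all hypothetical, and the optical hardware itself is not modelled.

```python
# Illustrative sketch only: a 2D convolution expressed as multiply-accumulate
# (MAC) operations, the primitive the abstract says the tensor core executes.
# All names, sizes, and the sustained-rate assumption are hypothetical and are
# not taken from the paper.

def im2col(image, k):
    """Flatten every k x k patch of `image` (a list of rows) into one row."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            patches.append([image[r + i][c + j] for i in range(k) for j in range(k)])
    return patches


def convolve_as_macs(image, kernel):
    """Return (output, mac_count) for a valid, stride-1 sliding-window product.

    As in most ML frameworks, the kernel is not flipped (cross-correlation).
    """
    k = len(kernel)
    weights = [w for row in kernel for w in row]
    out_cols = len(image[0]) - k + 1
    flat, macs = [], 0
    for patch in im2col(image, k):
        acc = 0.0
        for x, w in zip(patch, weights):
            acc += x * w  # one multiply-accumulate
            macs += 1
        flat.append(acc)
    output = [flat[i:i + out_cols] for i in range(0, len(flat), out_cols)]
    return output, macs


def seconds_at_rate(macs, macs_per_second=1e12):
    """Time to run `macs` operations at an assumed sustained MAC rate."""
    return macs / macs_per_second


if __name__ == "__main__":
    image = [[1, 2, 3, 0], [4, 5, 6, 0], [7, 8, 9, 0], [0, 0, 0, 0]]
    kernel = [[0.5, 0.0], [0.0, 0.5]]
    result, macs = convolve_as_macs(image, kernel)
    print(result, macs, seconds_at_rate(macs))
```

Each output pixel is a dot product between a flattened image patch and the flattened kernel, so the whole convolution is a matrix-vector product. This is why MAC throughput is a natural figure of merit for convolutional workloads.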
