Cite this article: | CHEN Shenggang, LIU Biwei, QI Juan, et al. Design and physical implementation of non-blocking ring on a multicore processor[J]. Journal of National University of Defense Technology, 2017, 39(1): 67-73. |
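Design and physical implementation of a non-blocking ring for an on-chip multicore processor |
CHEN Shenggang, LIU Biwei, QI Juan, HUA Yingzhao, XING Sufang, DING Yanping |
(College of Computer, National University of Defense Technology, Changsha, Hunan 410073)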
|
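Abstract: |
A non-blocking bidirectional ring architecture is designed for on-chip multicore processors built from a small number of high-performance cores. The architecture contains five ring link layers of three different types, which carry commands, bulk data, and small data, respectively. It uses source routing together with a dedicated congestion control network that prevents packets from overwriting one another. The router is bufferless and non-blocking, and a packet traverses each hop of the ring in a single clock cycle, which reduces transmission latency and makes the latency predictable and deterministic. To address the challenges of long links and wide buses, a suitable repeater insertion method was selected experimentally, and crosstalk optimization methods, such as inserting inverters alternately on adjacent wires and interleaving signal wires that run in opposite directions, were applied in the physical design of the ring and the delay optimization of its long links. Implementation results show that the ring operates at a clock frequency of 1 GHz and provides a link bandwidth of up to 256 GByte/s, fully meeting the demands of high-performance digital signal processing. |
Keywords: non-blocking ring; network-on-chip; delay optimization; crosstalk optimization |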
DOI:10.11887/j.cn.201701011 |
Received: 2015-08-20 |
Funding: National Natural Science Foundation of China (61133007, 61402499) |
|
Design and physical implementation of non-blocking ring on a multicore processor |
CHEN Shenggang, LIU Biwei, QI Juan, HUA Yingzhao, XING Sufang, DING Yanping |
(College of Computer, National University of Defense Technology, Changsha 410073, China)
|
Abstract: |
A bidirectional non-blocking ring architecture is proposed for multicore processors with a relatively small number of high-performance cores. The architecture consists of five ring layers of three different types, which transport commands, bulk data, and small data, respectively. A source routing strategy is employed, and a dedicated congestion control network is designed to prevent packets from overwriting one another. The router has a bufferless, contention-free structure and each hop takes only one clock cycle, which minimizes the transmission delay and makes the latency deterministic. Considering the long links and wide data paths of the ring, experiments were carried out to find a proper repeater insertion method, and crosstalk optimization methods, such as inserting inverters alternately on adjacent wires and interleaving signal wires that run in opposite directions, were applied to the physical design of the ring and the delay optimization of its long links. Implementation results show that the designed ring reaches a bandwidth of 256 GByte/s at 1 GHz, which fulfills the data communication demands of high-performance digital signal processing applications. |
Keywords: non-blocking ring; networks-on-chip; delay optimization; crosstalk optimization |
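
The abstract's claims of one clock cycle per hop and 256 GByte/s at 1 GHz can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the source picks the shorter direction around the ring, that ties go clockwise, and that 1 GByte means 10^9 bytes. The function names and the 8-node ring in the usage note are hypothetical, since the abstract gives no node count.

```python
def shortest_ring_route(src, dst, num_nodes):
    """Return (direction, hops) for the shorter way around a bidirectional ring.

    direction is +1 (clockwise) or -1 (counterclockwise).
    Assumption: the source chooses the shorter direction; ties go clockwise.
    """
    if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
        raise ValueError("node index out of range")
    clockwise_hops = (dst - src) % num_nodes
    counterclockwise_hops = (src - dst) % num_nodes
    if clockwise_hops <= counterclockwise_hops:
        return (1, clockwise_hops)
    return (-1, counterclockwise_hops)


def ring_latency_cycles(src, dst, num_nodes, cycles_per_hop=1):
    """Latency in clock cycles when every hop costs a fixed number of cycles.

    With a bufferless, non-blocking router there is no queuing term, so the
    latency depends only on the hop count.
    """
    _, hops = shortest_ring_route(src, dst, num_nodes)
    return hops * cycles_per_hop


def bytes_per_cycle(bandwidth_bytes_per_s, clock_hz):
    """Aggregate bytes moved per clock cycle for a given bandwidth and clock."""
    return bandwidth_bytes_per_s / clock_hz
```

On a hypothetical 8-node ring, `shortest_ring_route(1, 6, 8)` selects the counterclockwise direction with 3 hops, so `ring_latency_cycles(1, 6, 8)` is 3 cycles. `bytes_per_cycle(256e9, 1e9)` gives 256 bytes per cycle in aggregate. The abstract does not say how that total is divided among the five layers and two directions.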
|
|
|
|
|