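Efficient triangle rasterization in GPUs with a TBR architecture
Abstract
Objective In GPUs based on a tile-based rendering (TBR) architecture, the speed of triangle rasterization strongly affects chip performance. Traditional rasterization methods generate a large number of redundant pixels and cannot exploit the advantages of the TBR architecture. Method An efficient triangle rasterization algorithm for this architecture is proposed. The algorithm takes full advantage of tile-based rendering: a pre-processing step computes the drawing parameters of the triangle within each tile and determines the positional relationship between the triangle and the tile boundaries, and these results are written to memory together with the triangle's tiling information. The rasterization stage uses the Bresenham algorithm: the generated triangle edges yield the horizontal scan lines within each tile, from which every pixel on each scan line is then generated. Result Theoretical analysis shows that the rasterization efficiency of the algorithm exceeds 83% and can approach 100%. The function and performance of the algorithm were verified on an FPGA prototype verification system. Conclusion The proposed triangle rasterization algorithm suits the TBR architecture. Its measured pixel fill rate is comparable to that of the ATI M9, which runs at twice the clock frequency, so the algorithm achieves high rasterization efficiency.
Keywords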
High efficiency triangle raster algorithm in GPU based on TBR architecture
Fu He, Xie Yongfang (School of Information Science and Engineering, Central South University, Changsha 410083, China)
Abstract
Objective In GPUs based on the TBR architecture, the efficiency of triangle rasterization greatly influences chip performance. Traditional algorithms generate many redundant pixels and lose the advantages of the TBR architecture. Method A high-efficiency triangle rasterization algorithm is proposed in this paper. The algorithm fully utilizes the features of the TBR architecture. In the pre-processing stage, the draw parameters of the triangle in each tile are calculated and the positional relationship between the triangle and the tile boundary is obtained. These data are written to memory together with the tiling information of the triangle. In the rasterization stage, the Bresenham algorithm is used to generate the triangle edges, from which the horizontal scan lines in every tile are obtained, and all pixels on those scan lines are then generated. Result Theoretical analysis shows that the rasterization efficiency can reach 83% and even approach 100%, depending on the triangle shape. The function and performance of the algorithm are evaluated on an FPGA prototype verification system. Conclusion A new triangle rasterization algorithm adapted to the TBR architecture is proposed in this paper. Its pixel fill rate is equivalent to that of the ATI M9, which has twice the clock rate. Therefore, the algorithm can achieve high rasterization efficiency.
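The rasterization stage summarized above can be illustrated with a minimal software sketch. It is not the hardware design evaluated in the paper: the function names (`bresenham`, `tile_spans`, `tile_pixels`), the integer vertex coordinates, and the square tile of `tile_size` pixels are all assumptions made for illustration. For simplicity the sketch walks each full edge and discards rows outside the tile, whereas the proposed algorithm relies on per-tile parameters computed in the pre-processing stage to avoid that work.

```python
def bresenham(x0, y0, x1, y1):
    """Yield the integer points on the line from (x0, y0) to (x1, y1)."""
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    while True:
        yield x0, y0
        if x0 == x1 and y0 == y1:
            return
        e2 = 2 * err
        if e2 > -dy:      # step along x
            err -= dy
            x0 += sx
        if e2 < dx:       # step along y
            err += dx
            y0 += sy


def tile_spans(tri, tile_x, tile_y, tile_size):
    """Return {y: (x_left, x_right)} for the part of tri inside one tile.

    tri is a sequence of three (x, y) integer vertices; (tile_x, tile_y)
    is the top-left pixel of the tile (hypothetical conventions).
    """
    x_lo, x_hi = tile_x, tile_x + tile_size - 1
    y_lo, y_hi = tile_y, tile_y + tile_size - 1
    extents = {}  # y -> [min x, max x] over the three edges
    for i in range(3):
        (ax, ay), (bx, by) = tri[i], tri[(i + 1) % 3]
        for x, y in bresenham(ax, ay, bx, by):
            if y_lo <= y <= y_hi:  # keep only scan lines of this tile
                ext = extents.setdefault(y, [x, x])
                ext[0] = min(ext[0], x)
                ext[1] = max(ext[1], x)
    spans = {}
    for y, (xl, xr) in extents.items():
        xl, xr = max(xl, x_lo), min(xr, x_hi)  # clip span to the tile
        if xl <= xr:
            spans[y] = (xl, xr)
    return spans


def tile_pixels(tri, tile_x, tile_y, tile_size):
    """List every pixel of tri that falls inside the given tile."""
    spans = tile_spans(tri, tile_x, tile_y, tile_size)
    return [(x, y)
            for y, (xl, xr) in sorted(spans.items())
            for x in range(xl, xr + 1)]
```

Because each span is clipped to the tile before any pixel is emitted, no pixel outside the tile is generated, which is the property the abstract associates with high rasterization efficiency. Calling `tile_pixels` once per tile that the triangle overlaps covers the whole triangle without emitting any pixel twice.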
Keywords