Abstract: High-precision FMA (Fused Multiply-Add) units are required in high-performance microprocessors. A new 10-stage pipelined architecture for a 128-bit FMA is proposed. In this architecture, the multiplier, adder, LZA (Leading Zero Anticipator), and normalization logic, all of which have wide data paths, are carefully partitioned and optimized to balance the latency of every pipeline stage. Designed and synthesized in SMIC 0.13 μm technology, the FMA reaches a frequency of 465 MHz, which is about 130% better than the previous 128-bit FMA. Furthermore, its frequency reaches 1.075 GHz in TSMC 65 nm technology, which largely meets the requirements of high-performance computing.
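As background on why a fused unit matters, the following minimal sketch contrasts a fused multiply-add (one rounding of the exact result a*b + c) with a separate multiply followed by an add (two roundings). It is only an illustration under stated assumptions: it uses 64-bit doubles rather than the 128-bit format targeted here, it models the fused operation in software with exact rational arithmetic, and the function names are hypothetical and not taken from the proposed design.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of a fused multiply-add: compute a*b + c exactly, round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no intermediate rounding
    return float(exact)  # single rounding to the nearest double


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the add."""
    product = a * b  # first rounding
    return product + c  # second rounding


# Illustrative inputs (assumed): the exact product is 1 - 2**-60, which is
# not representable as a double and rounds to 1.0 in the unfused path.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused = fused_multiply_add(a, b, c)      # keeps the small term -2**-60
unfused = unfused_multiply_add(a, b, c)  # the small term is lost

print("fused  :", fused)
print("unfused:", unfused)
```

With these inputs the unfused path loses the low-order term entirely, while the fused path preserves it. A hardware FMA obtains the same single-rounding behavior by keeping the full-width product in the data path, which is one reason the multiplier, adder, and normalization stages become so wide.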