Abstract:To optimize the software implementation of block cipher uBlock algorithm, the AVX2(Advanced Vector Extension 2) instruction set supporting 256-bit data width was implemented, the automatic optimization level of the compiler was increased, optimizing the calling process of functions, and the methods of data storage structure optimization, high-level parallelism and low latency instruction logic optimization were used in order to implement parallel computing under the single-thread condition. Using this efficient combination method, the speed values of the software implementation of uBlock-128/128 algorithm, uBlock-128/256 algorithm and uBlock-256/256 algorithm are 269%, 182% and 49% higher than the original code. Based on these optimization methods,the implementation of single-key scenario and multi-key scenario are given for three algorithm versions of uBlock-128 /128, uBlock-128/256 and uBlock-256/256.