Abstract: The real-time performance of the nodes in a cluster varies greatly because of their different configurations and the jobs running on them. To improve cluster performance, NPARSA (node real-time performance adaptive cluster resource scheduling algorithm) is proposed. The real-time performance of a cluster node is represented by its configuration (such as the number of CPU cores, CPU speed, memory capacity, and disk capacity) and its real-time state parameters (such as the remaining CPU, memory, and disk resources). NPARSA chooses the attribute weights for a node according to the type of job to be handled and assigns the nodes with higher priority to the job. Virtual machine experiments and physical cluster experiments demonstrate the effectiveness of NPARSA. Compared with Spark's default scheduling algorithm, an algorithm that does not consider the matching between job type and node, and an algorithm that uses the degree of difference in job and node matching as the basis for resource allocation, NPARSA improves cluster performance and shortens the execution time of user jobs.
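
The node-selection idea summarized above can be illustrated with a minimal sketch. It is not the NPARSA implementation: the abstract does not give the priority formula, the job categories, or the weight values, so the weighted sum of normalized attributes, the job-type names (`cpu_intensive`, `memory_intensive`), the numeric weights, and all identifiers (`Node`, `JOB_WEIGHTS`, `rank_nodes`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Static configuration plus real-time state of one cluster node."""
    name: str
    cpu_cores: int
    cpu_speed_ghz: float
    mem_capacity_gb: float
    disk_capacity_gb: float
    cpu_free: float      # fraction of CPU currently idle, 0..1
    mem_free_gb: float
    disk_free_gb: float


ATTRIBUTES = (
    "cpu_cores", "cpu_speed_ghz", "mem_capacity_gb", "disk_capacity_gb",
    "cpu_free", "mem_free_gb", "disk_free_gb",
)

# Hypothetical attribute weights per job type (each row sums to 1.0).
# These values are placeholders, not taken from the paper.
JOB_WEIGHTS = {
    "cpu_intensive": {
        "cpu_cores": 0.30, "cpu_speed_ghz": 0.20, "cpu_free": 0.30,
        "mem_capacity_gb": 0.05, "mem_free_gb": 0.05,
        "disk_capacity_gb": 0.05, "disk_free_gb": 0.05,
    },
    "memory_intensive": {
        "cpu_cores": 0.05, "cpu_speed_ghz": 0.05, "cpu_free": 0.10,
        "mem_capacity_gb": 0.30, "mem_free_gb": 0.40,
        "disk_capacity_gb": 0.05, "disk_free_gb": 0.05,
    },
}


def rank_nodes(nodes, job_type):
    """Return nodes sorted from highest to lowest priority for job_type."""
    weights = JOB_WEIGHTS[job_type]
    # Normalize each attribute by its maximum over the cluster so that
    # attributes with different units become comparable (assumed scheme).
    maxima = {a: max(getattr(n, a) for n in nodes) or 1.0 for a in ATTRIBUTES}

    def priority(node):
        return sum(weights[a] * getattr(node, a) / maxima[a] for a in ATTRIBUTES)

    return sorted(nodes, key=priority, reverse=True)


if __name__ == "__main__":
    cluster = [
        Node("A", 16, 3.0, 32.0, 500.0, 0.9, 4.0, 250.0),
        Node("B", 4, 2.0, 64.0, 500.0, 0.5, 48.0, 250.0),
    ]
    for job_type in JOB_WEIGHTS:
        print(job_type, [n.name for n in rank_nodes(cluster, job_type)])
```

In this sketch the same two nodes are ranked differently depending on the job type, which is the behavior the abstract attributes to choosing attribute weights per job; a scheduler would then offer the top-ranked nodes to the job first.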