Prediction method of port blocking failure in high performance interconnection networks
Author:
Affiliation:

(College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)

Clc Number:

TP391

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    With the increase of system scale, chip power consumption and link rate, the overall failure rate of high-performance interconnection networks will continue rising, and the traditional operation and maintenance methods will be difficult to sustain, which brings great challenges to the overall reliability and availability of HPC(high performance computing). An unsupervised algorithm prediction model for serious network failures such as network port blocking was proposed. In this model, the symptomatic rules were extracted from the history information of the switch port status register and a new feature vector was formed. The K-means clustering algorithm was used to learn and classify the feature vectors. In the prediction, the DES(double exponential smoothing) algorithm was used to predict the port state in the future through a combination of the current state of the port, and a new feature vector was obtained and K-means algorithm was used to predict whether the port blocking failure would occur. The topology information was used to build independent sub prediction models with ToR switch ports and Spine switch ports respectively, so as to further improve the accuracy of prediction. The experimental results show that the prediction model can maintain the recall rate of 88.2%, and reach the accuracy rate of 65.2%. It can provide effective early warning and guidance for the operation and maintenance personnel in the actual system.

    Reference
    Related
    Cited by
Get Citation
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:November 08,2020
  • Revised:
  • Adopted:
  • Online: September 28,2022
  • Published: October 28,2022
Article QR Code