Abstract: With the rapid growth of image data, large-scale image retrieval faces increasingly stringent efficiency requirements. Deep hashing is a key research direction in this field: it maps high-dimensional features into compact binary codes, enabling deep semantic learning and efficient retrieval at the same time. Existing methods fall into three categories according to the degree of supervision they use: unsupervised, weakly supervised, and fully supervised. Unsupervised methods mine latent semantic information from unlabeled data by modeling its intrinsic structure; weakly supervised methods extract effective supervisory signals from noisy or incomplete user-provided tags; and fully supervised methods rely on complete class labels to model semantic relationships accurately. This paper systematically reviews the core ideas and representative work in each of the three categories and compares the retrieval performance of representative methods on multiple mainstream datasets. Despite significant progress, deep hashing still faces substantial challenges in adapting to dynamically arriving data and in achieving effective collaborative modeling in cross-modal scenarios. Future research should prioritize directions such as incrementally scalable hashing via continual learning and cross-modal hashing that leverages pre-trained models, thereby moving deep hashing toward greater efficiency, scalability, and real-world applicability.
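As a minimal illustration of the retrieval step summarized above, the following sketch binarizes real-valued features and ranks a database by Hamming distance. It is not any specific method from the reviewed literature: the random matrix `projection` is an assumed stand-in for a learned hash layer, the random `database` array stands in for deep image features, and the function names `binarize` and `hamming_rank` are hypothetical.

```python
import numpy as np


def binarize(features, projection):
    """Map real-valued features to {0, 1} codes via the sign of a linear projection."""
    return (features @ projection > 0).astype(np.uint8)


def hamming_rank(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query, and the distances."""
    # Hamming distance = number of bit positions where the codes differ.
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances


rng = np.random.default_rng(0)
projection = rng.standard_normal((128, 32))  # assumption: stand-in for a learned hash layer
database = rng.standard_normal((1000, 128))  # assumption: stand-in for deep image features

db_codes = binarize(database, projection)              # 1000 codes of 32 bits each
query_code = binarize(database[42:43], projection)[0]  # reuse item 42 as the query
order, distances = hamming_rank(query_code, db_codes)
```

In a real deep hashing system the projection would be replaced by a trained network, and the codes would typically be bit-packed so that distances can be computed with XOR and popcount operations.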