Dr. Wei Wang is a Professor in the School of Computer Science and Engineering, The University of New South Wales, Australia. His current research interests include Similarity Query Processing, Artificial Intelligence, Knowledge Graphs, and Security for AI Models. He has published over a hundred research papers, many of them in premier database journals (TODS, VLDB J, and TKDE) and conferences (SIGMOD, VLDB, ICDE, WWW, IJCAI, AAAI, ACL). More information is available on his homepage: http://www.cse.unsw.edu.au/~weiw/
Similarity query processing has been an active research topic for several decades and is an essential procedure in a wide range of applications. Recently, embedding and auto-encoding methods, as well as pre-trained models, have gained popularity. These methods operate on high-dimensional data, and this trend brings new opportunities and challenges to similarity query processing for high-dimensional data. Meanwhile, new techniques have emerged to tackle this long-standing problem both theoretically and empirically. In this tutorial, we summarize existing solutions, especially recent advancements from both the database (DB) and machine learning (ML) communities, and analyze their strengths and weaknesses. We review exact and approximate methods such as cover trees, locality-sensitive hashing, product quantization, and proximity graphs.