About Me

Xunkai Li is currently a second-year Ph.D. student in the School of Computer Science at Beijing Institute of Technology, supervised by Prof. Rong-Hua Li.

Since 2022, he has worked with Dr. Wentao Zhang as a research intern in the Data-centric Machine Learning (DCML) group at the Center of Machine Learning Research, Peking University (PKU). As an active member of the team, Xunkai is a contributor to the scalable graph learning system project SGL.

Prior to that, he received his B.Eng. degree from the computer science department of Shandong University in June 2022.

Email: cs.xunkai.li@gmail.com

WeChat: 18045124943

🌟 🌟 If you are interested in collaborating with me or want to have a chat, always feel free to contact me through e-mail or WeChat :)

Motivated by industrial demand, Xunkai’s research focuses on:

  • General Data-centric Machine Learning:

    In recent years, progress in model design has slowed: many cutting-edge LLM designs still rely on the Transformer architecture. As a result, performance improvements increasingly come from data-related aspects rather than from enhancing models.

    I prioritize optimizing data quality (e.g., addressing imbalance, noise, and out-of-distribution samples), quantity (e.g., through improved annotation and augmentation), efficiency (e.g., through distillation, compression, and selection), and privacy.

    To obtain a large volume of high-quality data at low cost, I carefully consider factors such as acquisition cost and efficiency for large-scale models. My research includes the scientific evaluation of data quality, the design of efficient data selection methods, the creation of effective data composition strategies, and the exploration of how large models can aid data optimization, for example through automated data annotation.

  • Graph Learning in Complex Scenarios:

    It is imperative to explore graph learning for intricate relational data modeling, for scalability on large-scale graphs, and for distributed settings.

    Recognizing the limitations of traditional approaches in handling the multifaceted nature of modern graph-based structures, I believe the significance of this research direction goes beyond theoretical advances. It extends to practical applications that require a nuanced understanding of diverse and intricate data relationships, particularly given the challenges posed by large-scale graph learning.

    By embracing a variety of graph structures, I aim to bolster the adaptability of graph-based models to diverse data architectures. Advanced graph structures, including directed graphs, hypergraphs, heterophilous graphs, and temporal graphs, are crucial for navigating the complexities of contemporary data landscapes. My research aims to develop effective graph models for various kinds of high-order relational data while maintaining scalability in industrial scenarios.

    Furthermore, the emphasis on distributed settings, exemplified by federated graph learning, highlights the global nature of contemporary data distribution. My research in graph learning for distributed scenarios strives to provide scalable and efficient solutions. This ensures the effective utilization of insights derived from graph-based models in a decentralized data landscape.

What's New

  • 2024-04: One paper was accepted by IJCAI 2024.
  • 2024-03: One paper was accepted by ICDE 2024.
  • 2024-02: One paper was accepted by VLDB 2024.
  • 2024-01: One paper was accepted by WWW 2024.
  • 2023-12: One paper was accepted by AAAI 2024.
  • 2023-10: One paper was accepted by ICDE 2024.
  • 2023-08: One paper was accepted by VLDB 2024.