Dr. Tang Nan, Qatar Computing Research Institute (QCRI)


Title: Large Language Models Meets Data Lakes

Abstract:

Large language models (LLMs) have demonstrated impressive capabilities in a variety of applications, including natural language processing and code generation and debugging. In this presentation, I will address two topics: using LLMs for data lake applications, and using multi-modal data lakes to make LLMs more reliable. First, I will examine the strengths and limitations of LLMs through a classical database problem, natural language to SQL (NL2SQL). I will then demonstrate how combining LLMs with pluggable and tunable small language models (SLMs) can solve this problem more effectively. Second, I will discuss how to efficiently retrieve the top-k datasets that can serve either as input to LLMs, to generate more dependable answers, or as evidence to verify LLM outputs.
