Davis Liang

Director of Applied Science at Abridge AI (Formerly Meta AI, Amazon AI)


I lead the Machine Learning Team at Abridge AI and apply my research in multilinguality, automatic speech recognition (ASR), and large language models (LLMs) to reinvent healthcare one conversation at a time. Previously, I was a Senior Research Scientist at Meta AI working on large-scale pretraining of multilingual language models.

Prior to Meta, I focused on question answering, information retrieval, machine translation, and speech recognition as an Applied Scientist at Amazon (AWS) AI. I was also a Software Engineer at Yahoo and obtained my MS degree in Computer Science from UC San Diego, where I was advised by Prof. Gary Cottrell.

Research Interests

I am interested in:

  • ML for Social Good, particularly addressing challenges in underserved sectors like healthcare and education and supporting underserved communities through improved capabilities for low-resource languages.
  • Safe ML, through rigorous evaluation methodologies and well-designed guardrails.
  • ML Beyond LLMs, exploring world models, diffusion architectures, and other emerging approaches that push beyond current paradigms.

Contact

Please send all research and work-related inquiries to davisblaine.liang(at)gmail.com.

News

Aug 19, 2025 Excited to share our new paper, “The Science of Confabulation Elimination,” on building systems that detect and eliminate hallucinations in AI-generated clinical documentation. [Article]
Jun 24, 2025 Proud to be a part of Abridge’s $300M Series E led by Andreessen Horowitz, fueling our next phase of building agentic AI for healthcare conversations. [Article]
Sep 11, 2024 I had the opportunity to talk about the past, present, and future of AI in Healthcare with Out-of-Pocket Health. [Article]
Sep 2, 2023 We are releasing the Belebele dataset, a first-of-its-kind multilingual reading comprehension dataset spanning 122 language variants, 27 language families, and 29 scripts. [Paper] [Github] [Twitter]
Aug 28, 2023 I had a great time chatting with the New York Times about generative AI and the role of ML talent in supercharging the field of healthcare.
Apr 2, 2023 I’m excited to announce that I’m joining Abridge AI to work on reinventing healthcare for doctors and patients alike!
Jan 28, 2023 We are releasing XLM-V, a multilingual model with a 1 million token vocabulary [Link]. The model is also available in Hugging Face Transformers.
Feb 22, 2022 After four years at Amazon, I’ll be moving on to a new role. I’ll officially be joining Meta AI (formerly Facebook AI) as a Senior Research Scientist in March!
Feb 20, 2018 Officially joining Amazon AI in East Palo Alto as an Applied Scientist, where I’ll be working on speech recognition, information retrieval, and question answering.

Selected Publications

Please refer to my Google Scholar for a full list of publications.

(*=equal contribution)

  1. EMNLP
    XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models
    Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, and Madian Khabsa
    EMNLP 2023
  2. Arxiv
    The BELEBELE Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
    Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, and Madian Khabsa
    arXiv preprint arXiv:2308.16884 2023
  3. Arxiv
    Attention-guided generative models for extractive question answering
    Peng Xu*, Davis Liang*, Zhiheng Huang, and Bing Xiang
    arXiv preprint arXiv:2110.06393 2021
  4. Arxiv
    Embedding-based Zero-shot Retrieval through Query Generation
    Davis Liang*, Peng Xu*, Siamak Shakeri, Cicero Nogueira dos Santos, Ramesh Nallapati, Zhiheng Huang, and Bing Xiang
    arXiv preprint arXiv:2009.10270 2020
  5. ACL
    Masked language model scoring
    Julian Salazar, Davis Liang, Toan Q. Nguyen, and Katrin Kirchhoff
    ACL 2020
  6. EMNLP Findings
    Improve transformer models with better relative position embeddings
    Zhiheng Huang, Davis Liang, Peng Xu, and Bing Xiang
    EMNLP Findings 2020
  7. Resistance AI
    Decoding and Diversity in Machine Translation
    Nicholas Roberts, Davis Liang, Graham Neubig, and Zachary C. Lipton
    NeurIPS Resistance AI Workshop 2020
  8. SLT
    Learning noise-invariant representations for robust speech recognition
    Davis Liang, Zhiheng Huang, and Zachary C. Lipton
    IEEE SLT 2018
  9. IJCNLP
    Deep automated multi-task learning
    Davis Liang and Yan Shu
    IJCNLP 2017