Lecture 12 Efficient LLM Inference - Search Videos

EfficientML.ai Lecture 12 - Transformer and LLM (Part I) (MIT 6.5940, Fall 2023)

11.3K views · Oct 20, 2023

YouTube · MIT HAN Lab

2026 Ultimate LLM Inference Framework Guide: 7 Frameworks Compared - No More Confusion • StableLearn | Make AI Your Superpower

stable-learn.com

Faster LLMs: Accelerate Inference with Speculative Decoding

22.1K views · 11 months ago

YouTube · IBM Technology

Speculative Decoding: 3× Faster LLM Inference with Zero Quality Loss

709 views · 4 months ago

YouTube · Tales Of Tensors

LLMs | Efficient LLM Decoding-I | Lec15.1

2.5K views · Oct 4, 2024

CMU LLM Inference (12): Reward Models and Best-of-N

1.7K views · 7 months ago

YouTube · Graham Neubig

Lec 12 | Efficient LLMs: Part 02

595 views · 7 months ago

Introduction · Hugging Face

Kai Sheng Tai: Sparsity for Efficient LLM Inference

432 views · Jan 1, 2025

YouTube · Mayur Naik

Lossless LLM inference acceleration with Speculators

637 views · 5 months ago

Efficient LLM Inference (vLLM KV Cache, Flash Decoding & Lookahe…

9.4K views · Mar 1, 2024

YouTube · Noble Saji Mathews

What is LLM Inference?

251 views · May 3, 2025

YouTube · CodersArts

Scaling Ultra Low Latency LLM Inference

635 views · 9 months ago

YouTube · Toronto Machine Learning Society (TMLS)

KV Caching: Speeding up LLM Inference [Lecture]

923 views · 5 months ago

YouTube · Jordan Boyd-Graber

Lianmin Zheng on Efficient LLM Inference with SGLang

1.9K views · 10 months ago

YouTube · AMD Developer Central

Probabilistic ML - Lecture 24 - Variational Inference

3.7K views · Aug 4, 2023

YouTube · Tübingen Machine Learning

PiLLM: Resource-Efficient LLM Inference Using Workload Predicti…

Efficient LLM RL Training with Experience Replay

20 views · 1 month ago

YouTube · AI Research Roundup

Tour De Force: LLM Inference Optimization From Simple To Sop…

132 views · 3 weeks ago

LLM in a flash: Efficient Large Language Model Inference with Li…

4.8K views · Dec 23, 2023

YouTube · AI Papers Academy

Understanding LLM Inference | NVIDIA Experts Deconstruct How …

24.1K views · Apr 23, 2024

YouTube · DataCamp

LLMs | Efficient LLM Decoding-II | Lec15.2

1.8K views · Oct 9, 2024

Optimize LLM inference with vLLM

14.4K views · 9 months ago

LLM in a flash: Efficient Large Language Model Inference with Li…

1.3K views · Dec 20, 2023

YouTube · Arxiv Papers

Inferential Statistics – Sampling, Probability, and Inference (7-5)

87K views · Aug 23, 2016

YouTube · Research By Design

Mark Moyou, PhD - Understanding the end-to-end LLM training and in…

935 views · Apr 26, 2025

Memory-Efficient LLM Inference on Edge Devices With NNTrainer - Eu…

577 views · 6 months ago

YouTube · The Linux Foundation

vLLM Office Hours - Model Quantization for Efficient vLLM Inf…

1.9K views · Jul 29, 2024

YouTube · Neural Magic

Boost Your AI Predictions: Maximize Speed with vLLM Library for Larg…

9.4K views · Nov 27, 2023

YouTube · Venelin Valkov

LLM inference optimization: Architecture, KV cache and Flash …

15.3K views · Sep 7, 2024

YouTube · YanAITalk
