Red Hat AI tops MLPerf Inference v6.0 with vLLM on Qwen3-VL, Whisper, and GPT-OSS-120B
Red Hat, Wednesday, April 1st, 2026
Red Hat is proud to announce strong results in the latest round of the industry-standard MLPerf Inference benchmark, v6.0.
Our submission covers four AI workloads (Whisper-Large-v3, GPT-OSS-120B, Qwen3-VL-235B-A22B, and Llama-2-70b) on NVIDIA (H200, B200, L40S) and AMD (MI350X) GPUs, running on Red Hat Enterprise Linux (RHEL) and Red Hat OpenShift AI with our open source inference stack: vLLM and llm-d.