8-9 t/s does not fully exploit the 8060S's potential. llama-server in llama.cpp has a small issue that causes poor performance when the server is configured with speculative decoding (it is not specific to any particular hardware): https://github.com/ggml-org/llama.cpp/issues/12968
After a quick manual fix for this issue, Qwen 2.5 72B iq4_xs with a 1.5B draft model can reach roughly 10-12 t/s when the acceptance rate is good.
https://github.com/hjc4869/llama.cpp/commit/0b32f64ffbe973e99e0dc7097be31d4d966d476e
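For reference, a minimal sketch of how a llama-server speculative-decoding setup like this might be launched. The flag names come from recent llama.cpp builds and may differ in older versions; the model filenames, layer counts, and draft limits are placeholders, not the exact configuration used here.

```python
# Sketch: launch llama-server with a large main model plus a small draft model
# for speculative decoding. All paths and numeric values are placeholders.
import subprocess

cmd = [
    "./llama-server",
    "--model", "Qwen2.5-72B-Instruct-IQ4_XS.gguf",       # main model (placeholder path)
    "--model-draft", "Qwen2.5-1.5B-Instruct-Q8_0.gguf",   # draft model (placeholder path)
    "--gpu-layers", "99",         # offload all main-model layers to the GPU
    "--gpu-layers-draft", "99",   # offload all draft-model layers as well
    "--draft-max", "16",          # maximum tokens drafted per step (tune for acceptance rate)
    "--draft-min", "1",           # minimum tokens to draft before verification
    "--port", "8080",
]
subprocess.run(cmd, check=True)
```

The speedup depends heavily on the acceptance rate: the more of the draft model's tokens the 72B model accepts per verification step, the fewer full forward passes it has to run per generated token.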
BY David's random thoughts

