Telegram Group & Telegram Channel
Slamming: Training a Speech Language Model on One GPU in a Day

19 Feb 2025 · Gallil Maimon, Avishai Elmakies, Yossi Adi ·

We introduce Slam, a recipe for training high-quality Speech Language Models (SLMs) on a single academic GPU in 24 hours. We do so through empirical analysis of model initialisation and architecture, synthetic training data, preference optimisation with synthetic data and tweaking all other components. We empirically demonstrate that this training recipe also scales well with more compute getting results on par with leading SLMs in a fraction of the compute cost. We hope these insights will make SLM training and research more accessible. In the context of SLM scaling laws, our results far outperform predicted compute optimal performance, giving an optimistic view to #SLM feasibility. See code, data, models, samples at - https://pages.cs.huji.ac.il/adiyoss-lab/slamming .

Paper: https://arxiv.org/pdf/2502.15814v1.pdf

Code: https://github.com/slp-rl/slamkit



https://www.tg-me.com/deep_learning_proj



tg-me.com/LLM_learning/41
Create:
Last Update:

Slamming: Training a Speech Language Model on One GPU in a Day

19 Feb 2025 · Gallil Maimon, Avishai Elmakies, Yossi Adi ·

We introduce Slam, a recipe for training high-quality Speech Language Models (SLMs) on a single academic GPU in 24 hours. We do so through empirical analysis of model initialisation and architecture, synthetic training data, preference optimisation with synthetic data and tweaking all other components. We empirically demonstrate that this training recipe also scales well with more compute getting results on par with leading SLMs in a fraction of the compute cost. We hope these insights will make SLM training and research more accessible. In the context of SLM scaling laws, our results far outperform predicted compute optimal performance, giving an optimistic view to #SLM feasibility. See code, data, models, samples at - https://pages.cs.huji.ac.il/adiyoss-lab/slamming .

Paper: https://arxiv.org/pdf/2502.15814v1.pdf

Code: https://github.com/slp-rl/slamkit



https://www.tg-me.com/deep_learning_proj

BY Github LLMs




Share with your friend now:
tg-me.com/LLM_learning/41

View MORE
Open in Telegram


telegram Telegram | DID YOU KNOW?

Date: |

The STAR Market, as is implied by the name, is heavily geared toward smaller innovative tech companies, in particular those engaged in strategically important fields, such as biopharmaceuticals, 5G technology, semiconductors, and new energy. The STAR Market currently has 340 listed securities. The STAR Market is seen as important for China’s high-tech and emerging industries, providing a space for smaller companies to raise capital in China. This is especially significant for technology companies that may be viewed with suspicion on overseas stock exchanges.

telegram from sg


Telegram Github LLMs
FROM USA