Telegram Group & Telegram Channel
Forwarded from Github LLMs
Slamming: Training a Speech Language Model on One GPU in a Day

19 Feb 2025 · Gallil Maimon, Avishai Elmakies, Yossi Adi ·

We introduce Slam, a recipe for training high-quality Speech Language Models (SLMs) on a single academic GPU in 24 hours. We do so through empirical analysis of model initialisation and architecture, synthetic training data, preference optimisation with synthetic data and tweaking all other components. We empirically demonstrate that this training recipe also scales well with more compute getting results on par with leading SLMs in a fraction of the compute cost. We hope these insights will make SLM training and research more accessible. In the context of SLM scaling laws, our results far outperform predicted compute optimal performance, giving an optimistic view to #SLM feasibility. See code, data, models, samples at - https://pages.cs.huji.ac.il/adiyoss-lab/slamming .

Paper: https://arxiv.org/pdf/2502.15814v1.pdf

Code: https://github.com/slp-rl/slamkit



https://www.tg-me.com/deep_learning_proj



tg-me.com/Machine_learn/3449
Create:
Last Update:

Slamming: Training a Speech Language Model on One GPU in a Day

19 Feb 2025 · Gallil Maimon, Avishai Elmakies, Yossi Adi ·

We introduce Slam, a recipe for training high-quality Speech Language Models (SLMs) on a single academic GPU in 24 hours. We do so through empirical analysis of model initialisation and architecture, synthetic training data, preference optimisation with synthetic data and tweaking all other components. We empirically demonstrate that this training recipe also scales well with more compute getting results on par with leading SLMs in a fraction of the compute cost. We hope these insights will make SLM training and research more accessible. In the context of SLM scaling laws, our results far outperform predicted compute optimal performance, giving an optimistic view to #SLM feasibility. See code, data, models, samples at - https://pages.cs.huji.ac.il/adiyoss-lab/slamming .

Paper: https://arxiv.org/pdf/2502.15814v1.pdf

Code: https://github.com/slp-rl/slamkit



https://www.tg-me.com/deep_learning_proj

BY Machine learning books and papers




Share with your friend now:
tg-me.com/Machine_learn/3449

View MORE
Open in Telegram


Machine learning books and papers Telegram | DID YOU KNOW?

Date: |

Telegram hopes to raise $1bn with a convertible bond private placement

The super secure UAE-based Telegram messenger service, developed by Russian-born software icon Pavel Durov, is looking to raise $1bn through a bond placement to a limited number of investors from Russia, Europe, Asia and the Middle East, the Kommersant daily reported citing unnamed sources on February 18, 2021.The issue reportedly comprises exchange bonds that could be converted into equity in the messaging service that is currently 100% owned by Durov and his brother Nikolai.Kommersant reports that the price of the conversion would be at a 10% discount to a potential IPO should it happen within five years.The minimum bond placement is said to be set at $50mn, but could be lowered to $10mn. Five-year bonds could carry an annual coupon of 7-8%.

The SSE was the first modern stock exchange to open in China, with trading commencing in 1990. It has now grown to become the largest stock exchange in Asia and the third-largest in the world by market capitalization, which stood at RMB 50.6 trillion (US$7.8 trillion) as of September 2021. Stocks (both A-shares and B-shares), bonds, funds, and derivatives are traded on the exchange. The SEE has two trading boards, the Main Board and the Science and Technology Innovation Board, the latter more commonly known as the STAR Market. The Main Board mainly hosts large, well-established Chinese companies and lists both A-shares and B-shares.

Machine learning books and papers from us


Telegram Machine learning books and papers
FROM USA