DeepSeek 29 Telegram Web

DeepSeek

🚀 Just published: "Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts" 🌐 Introducing Loss-Free Balancing—our latest innovation in MoE models that ditches the need for auxiliary loss. By dynamically adjusting expert biases, we ensure optimal…

To be specific, before the top-K routing decision, Loss-Free Balancing will first apply an expert-wise bias to the routing scores of each expert. By dynamically updating the bias of each expert according to its recent load, Loss-Free Balancing can consistently maintain a balanced distribution of expert load. In addition, since Loss-Free Balancing does not produce any interference gradients, it also elevates the upper bound of model performance gained from MoE training.
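The mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, not DeepSeek's implementation: the function names, the sign-based update rule, the step size, and the synthetic skewed router scores are all assumptions made for the example. The bias is added to the scores only to decide which experts are selected, and it is changed by a plain rule rather than by gradients.

```python
import random


def route_top_k(scores, bias, k):
    """Select the k experts with the highest biased routing score for one token."""
    biased = [s + b for s, b in zip(scores, bias)]
    order = sorted(range(len(scores)), key=lambda e: biased[e], reverse=True)
    return order[:k]


def update_bias(bias, load, step):
    """Lower the bias of overloaded experts and raise it for underloaded ones.

    The fixed-step sign update is an assumption chosen for simplicity.
    """
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, count in zip(bias, load):
        if count > mean_load:
            new_bias.append(b - step)
        elif count < mean_load:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias


def simulate(num_experts=8, k=2, tokens_per_batch=512, batches=100,
             step=0.01, seed=0):
    """Route synthetic tokens and return (first_load, last_load, final_bias)."""
    rng = random.Random(seed)
    # Hypothetical skew: lower-indexed experts get systematically higher scores,
    # which produces an unbalanced load when no bias is applied.
    skew = [0.3 * (num_experts - e) / num_experts for e in range(num_experts)]
    bias = [0.0] * num_experts
    first_load = None
    last_load = None
    for _ in range(batches):
        load = [0] * num_experts
        for _ in range(tokens_per_batch):
            scores = [rng.random() + skew[e] for e in range(num_experts)]
            for e in route_top_k(scores, bias, k):
                load[e] += 1
        if first_load is None:
            first_load = load
        last_load = load
        # The bias reacts to the load of the batch that was just routed.
        bias = update_bias(bias, load, step)
    return first_load, last_load, bias


if __name__ == "__main__":
    first, last, final_bias = simulate()
    print("load in first batch:", first)
    print("load in last batch: ", last)
    print("final bias:", [round(b, 2) for b in final_bias])
```

Running the script prints the per-expert token counts for the first and last batch, so the effect of the bias updates on the load spread can be compared directly.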

453 views · 03:30

DeepSeek

🚀 Exciting news! We’ve officially launched DeepSeek-V2.5 – a powerful combination of DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724! Now, with enhanced writing, instruction-following, and human preference alignment, it’s available on Web and API. Enjoy seamless Function Calling, FIM, and JSON Output all-in-one!

Note: Due to significant updates in this version, if performance drops in certain cases, we recommend adjusting the system prompt and temperature settings for the best results!

455 views · 13:44

DeepSeek

DeepSeek-V2.5 outperforms both DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724 on most benchmarks.

495 views · 13:44

DeepSeek

In our internal Chinese evaluations, DeepSeek-V2.5 shows a significant improvement in win rates against GPT-4o mini and ChatGPT-4o-latest (judged by GPT-4o) compared to DeepSeek-V2-0628.

525 views · 13:44

DeepSeek

DeepSeek-V2.5 is now open-source on HuggingFace!
Check it out:
https://huggingface.co/deepseek-ai/DeepSeek-V2.5

585 views · 13:45

DeepSeek

LMSYS Chatbot Arena Rankings Update: DeepSeek-V2.5 has ranked first among Chinese LLMs, outperforming closed-source models like Yi-Large-Preview, Qwen-Plus-0828, and GLM-4-0520. It’s also closely matched with GPT-4-Turbo-2024-04-09 in the arena score. Download the V2.5 checkpoints here:
https://huggingface.co/deepseek-ai/DeepSeek-V2.5

626 views · edited 14:54

DeepSeek

Compared to DeepSeek-V2 and DeepSeek-Coder-V2, DeepSeek-V2.5 has improved its rankings across all categories.

622 views · 14:55

DeepSeek

🚀 Introducing Janus: a revolutionary autoregressive framework for multimodal AI!
By decoupling visual encoding into separate pathways & processing them with a single unified transformer, it outperforms previous models in both understanding & generation.
⚡️ Powerful, simple, flexible, & next-gen ready! 🔥

📄 Paper: https://arxiv.org/abs/2410.13848
💻 Project page: https://github.com/deepseek-ai/Janus

833 views · 01:30

DeepSeek

🚀 Introducing JanusFlow: harmonizing autoregressive LLMs with rectified flow!
By adopting the best practices in both fields, JanusFlow excels at both image understanding & generation in a single model.
⚡️ Powerful, simple, flexible: the next generation of Janus is here! 🔥
📄 Paper: https://arxiv.org/abs/2411.07975
💻 Code: https://github.com/deepseek-ai/Janus
🤗 Hugging Face demo: https://huggingface.co/spaces/deepseek-ai/JanusFlow-1.3B

1.1K views · 04:31

DeepSeek

🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power!

🔍 o1-preview-level performance on AIME & MATH benchmarks.
💡 Transparent thought process in real-time.
🛠️ Open-source models & API coming soon!

🌐 Try it now at chat.deepseek.com
#DeepSeek

1.1K views · 12:00

DeepSeek

🌟 Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!

1.1K views · 12:00

DeepSeek

🌟 Inference Scaling Laws of DeepSeek-R1-Lite-Preview
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.

1.1K views · 12:00

DeepSeek

DeepSeek has not issued any cryptocurrency. Currently, there is only one official account on the Twitter platform. We will not contact anyone through other accounts. Please stay vigilant and guard against potential scams.

via Twitter @DeepSeek

1.0K views · 11:49

DeepSeek

🎉 Introducing DeepSeek App!

💡 Powered by world-class DeepSeek-V3
🆓 FREE to use with seamless interaction
📱 Now officially available on App Store & Google Play & Major Android markets
🔗Download now: https://download.deepseek.com/app/

🌟 1/3

via Twitter @DeepSeek

1.2K views · 12:52

2025/05/30 08:45:07