Telegram Group & Telegram Channel
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

13 Dec 2024 · Zhiyu Wu, Xiaokang Chen, Zizheng Pan, Xingchao Liu, Wen Liu, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, Zhenda Xie, Yu Wu, Kai Hu, Jiawei Wang, Yaofeng Sun, Yukun Li, Yishi Piao, Kang Guan, Aixin Liu, Xin Xie, Yuxiang You, Kai Dong, Xingkai Yu, Haowei Zhang, Liang Zhao, Yisong Wang, Chong Ruan ·

We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL, through two key major upgrades. For the vision component, we incorporate a dynamic tiling vision encoding strategy designed for processing high-resolution images with different aspect ratios. For the language component, we leverage #DeepSeekMoE models with the Multi-head Latent Attention mechanism, which compresses Key-Value cache into latent vectors, to enable efficient inference and high throughput. Trained on an improved vision-language dataset, DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. Our model series is composed of three variants: DeepSeek-VL2-Tiny, #DeepSeek-VL2-Small and DeepSeek-VL2, with 1.0B, 2.8B and 4.5B activated parameters respectively. DeepSeek-VL2 achieves competitive or state-of-the-art performance with similar or fewer activated parameters compared to existing open-source dense and MoE-based models. Codes and pre-trained models are publicly accessible at https://github.com/deepseek-ai/DeepSeek-VL2.

Paper: https://arxiv.org/pdf/2412.10302v1.pdf

Code: https://github.com/deepseek-ai/deepseek-vl2

Datasets: RefCOCO TextVQA MMBench
DocVQA
💠
https://www.tg-me.com/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM



tg-me.com/LLM_learning/36
Create:
Last Update:

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

13 Dec 2024 · Zhiyu Wu, Xiaokang Chen, Zizheng Pan, Xingchao Liu, Wen Liu, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, Zhenda Xie, Yu Wu, Kai Hu, Jiawei Wang, Yaofeng Sun, Yukun Li, Yishi Piao, Kang Guan, Aixin Liu, Xin Xie, Yuxiang You, Kai Dong, Xingkai Yu, Haowei Zhang, Liang Zhao, Yisong Wang, Chong Ruan ·

We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL, through two key major upgrades. For the vision component, we incorporate a dynamic tiling vision encoding strategy designed for processing high-resolution images with different aspect ratios. For the language component, we leverage #DeepSeekMoE models with the Multi-head Latent Attention mechanism, which compresses Key-Value cache into latent vectors, to enable efficient inference and high throughput. Trained on an improved vision-language dataset, DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. Our model series is composed of three variants: DeepSeek-VL2-Tiny, #DeepSeek-VL2-Small and DeepSeek-VL2, with 1.0B, 2.8B and 4.5B activated parameters respectively. DeepSeek-VL2 achieves competitive or state-of-the-art performance with similar or fewer activated parameters compared to existing open-source dense and MoE-based models. Codes and pre-trained models are publicly accessible at https://github.com/deepseek-ai/DeepSeek-VL2.

Paper: https://arxiv.org/pdf/2412.10302v1.pdf

Code: https://github.com/deepseek-ai/deepseek-vl2

Datasets: RefCOCO TextVQA MMBench
DocVQA
💠
https://www.tg-me.com/deep_learning_proj

BY Github LLMs




Share with your friend now:
tg-me.com/LLM_learning/36

View MORE
Open in Telegram


telegram Telegram | DID YOU KNOW?

Date: |

The STAR Market, as is implied by the name, is heavily geared toward smaller innovative tech companies, in particular those engaged in strategically important fields, such as biopharmaceuticals, 5G technology, semiconductors, and new energy. The STAR Market currently has 340 listed securities. The STAR Market is seen as important for China’s high-tech and emerging industries, providing a space for smaller companies to raise capital in China. This is especially significant for technology companies that may be viewed with suspicion on overseas stock exchanges.

Export WhatsApp stickers to Telegram on Android

From the Files app, scroll down to Internal storage, and tap on WhatsApp. Once you’re there, go to Media and then WhatsApp Stickers. Don’t be surprised if you find a large number of files in that folder—it holds your personal collection of stickers and every one you’ve ever received. Even the bad ones.Tap the three dots in the top right corner of your screen to Select all. If you want to trim the fat and grab only the best of the best, this is the perfect time to do so: choose the ones you want to export by long-pressing one file to activate selection mode, and then tapping on the rest. Once you’re done, hit the Share button (that “less than”-like symbol at the top of your screen). If you have a big collection—more than 500 stickers, for example—it’s possible that nothing will happen when you tap the Share button. Be patient—your phone’s just struggling with a heavy load.On the menu that pops from the bottom of the screen, choose Telegram, and then select the chat named Saved messages. This is a chat only you can see, and it will serve as your sticker bank. Unlike WhatsApp, Telegram doesn’t store your favorite stickers in a quick-access reservoir right beside the typing field, but you’ll be able to snatch them out of your Saved messages chat and forward them to any of your Telegram contacts. This also means you won’t have a quick way to save incoming stickers like you did on WhatsApp, so you’ll have to forward them from one chat to the other.

telegram from sg


Telegram Github LLMs
FROM USA