Excited to share that we have released RuBLiMP (Russian Benchmark of Linguistic Minimal Pairs), a novel benchmark for evaluating Russian language models (LMs).

RuBLiMP consists of 45,000 minimal pairs covering 12 grammatical phenomena that are well represented in Russian linguistics and span morphology, syntax, and semantics. A minimal pair consists of a grammatical and an ungrammatical sentence (e.g., The cat is on the mat / *The cat are on the mat); an LM is expected to assign a higher score to the grammatical one under a given scoring function.
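
To make the evaluation idea concrete, here is a minimal sketch of minimal-pair scoring. It is only an illustration: a toy add-one-smoothed bigram model stands in for a real LM, sentence log-probability is assumed as the scoring function (the pre-print may use a different one), and all names (`train_bigram`, `log_prob`, `accuracy`) and the tiny English corpus are hypothetical.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Sentence log-probability under the bigram model with add-one smoothing."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def accuracy(pairs, score):
    """Share of pairs where the grammatical sentence gets the strictly higher score."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)


# Toy training data (an assumption for illustration, not RuBLiMP data).
corpus = [
    "the cat is on the mat",
    "the cats are on the mat",
    "the dog is in the house",
    "the dogs are in the house",
]
model = train_bigram(corpus)

# Each pair is (grammatical, ungrammatical).
pairs = [
    ("the cat is on the mat", "the cat are on the mat"),
    ("the dogs are in the house", "the dogs is in the house"),
]
result = accuracy(pairs, lambda s: log_prob(s, model))
print(f"accuracy: {result:.2f}")
```

With a real LM, only the `score` callable would change; the pairwise comparison and the accuracy computation stay the same.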

Our approach makes it possible to:
🔸 generate minimal pairs at scale from any text domain
🔸 estimate whether a grammatical sentence appears in the LM's pretraining corpus

💡RuBLiMP can be used for evaluating the sensitivity of LMs to grammatical phenomena in Russian and for developing ranking and grammatical error detection methods.

🔸 Read more in our pre-print: https://arxiv.org/abs/2406.19232
🔸 HuggingFace: https://huggingface.co/datasets/RussianNLP/rublimp
🔸 GitHub: https://github.com/RussianNLP/RuBLiMP



tg-me.com/nlp_seminar/130

BY исследовано




