PRIME Intellect has published INTELLECT-1 (Instruct + Base), the first 10-billion-parameter language model trained collaboratively over 50 days by 30 participants worldwide.
PRIME Intellect used its own PRIME framework, designed to address the core problems of decentralized training: network unreliability and the dynamic management of compute nodes. The platform ran on a network of 112 H100 GPUs across three continents and achieved 96% compute utilization under optimal conditions.
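To make the fault-tolerance idea concrete, here is a minimal sketch of the local-update / periodic-sync pattern that decentralized training schemes of this kind build on. Everything below (the worker count, the SYNC_EVERY interval, the dummy objective) is an illustrative assumption, not PRIME's actual protocol:

import copy
import torch
from torch import nn

def average_params(models):
    # Synchronization step: average each parameter across all live workers.
    with torch.no_grad():
        for params in zip(*(m.parameters() for m in models)):
            mean = torch.stack([p.detach() for p in params]).mean(dim=0)
            for p in params:
                p.copy_(mean)

# Two simulated workers; in a real deployment each runs on its own node.
base = nn.Linear(8, 1)
workers = [copy.deepcopy(base) for _ in range(2)]
opts = [torch.optim.AdamW(w.parameters(), lr=1e-2) for w in workers]

SYNC_EVERY = 100  # hypothetical: communicate only every N local steps
for step in range(300):
    for w, opt in zip(workers, opts):
        x = torch.randn(4, 8)          # stand-in for a local data shard
        loss = w(x).pow(2).mean()      # dummy objective
        opt.zero_grad()
        loss.backward()
        opt.step()
    if (step + 1) % SYNC_EVERY == 0:
        average_params(workers)        # infrequent cross-node synchronization

Because workers exchange parameters only every SYNC_EVERY steps, a node that drops out between syncs can simply be excluded from the next averaging round without stalling the others, which is what makes this family of methods tolerant of unreliable networks.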
The training corpus comprised 1 trillion tokens from public datasets, distributed as follows: 55% fineweb-edu, 10% fineweb, 20% Stack v1, 10% dclm-baseline, and 5% open-web-math.
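For illustration, the same mixture can be written as per-source sampling weights; the sampler below is a generic sketch, not PRIME's actual data pipeline:

import random

# The corpus mixture from the post, expressed as sampling weights.
DATA_MIX = {
    "fineweb-edu": 0.55,
    "fineweb": 0.10,
    "stack-v1": 0.20,
    "dclm-baseline": 0.10,
    "open-web-math": 0.05,
}
assert abs(sum(DATA_MIX.values()) - 1.0) < 1e-9  # shares must cover the full budget

def sample_source(rng=random):
    # Pick the source dataset for the next training document
    # in proportion to its share of the 1T-token budget.
    return rng.choices(list(DATA_MIX), weights=list(DATA_MIX.values()), k=1)[0]

print(sample_source())  # e.g. "fineweb-edu" a little over half the time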
INTELLECT-1 achieved 37.5% accuracy on MMLU and 72.26% on HellaSwag, and its WinoGrande score of 65.82% outperformed several other open-source models.
While these figures trail today's popular models somewhat, the experiment is a critical step toward democratizing AI development and preventing the consolidation of AI capabilities within a few organizations.

A quick-start inference example with Transformers:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Run everything on the GPU by default (assumes a CUDA device is available).
torch.set_default_device("cuda")

# Download the weights and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("PrimeIntellect/INTELLECT-1")
tokenizer = AutoTokenizer.from_pretrained("PrimeIntellect/INTELLECT-1")

# Replace the %prompt% placeholder with your own prompt.
input_text = "%prompt%"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate one sequence of at most 50 tokens (prompt included) and decode it.
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(output_text)
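Note that this snippet loads the base model; per the release, an instruction-tuned variant is published alongside it under its own model id.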
@Machine_learn