Scientists are training a gargantuan one-trillion-parameter generative AI system dubbed ‘ScienceGPT’ on scientific data, using the newly established Aurora supercomputer.
The model, also referred to as AuroraGPT, is being trained by researchers at Argonne National Laboratory (ANL) in Illinois, USA. It runs on Intel’s Ponte Vecchio GPUs, which provide the main computing power, and is backed by the US government.
Training could take months to complete, according to HPCwire. It is currently limited to 256 of Aurora’s roughly 10,000 nodes and will be scaled up over time. Even within that limit, Intel and ANL are for now only testing model training on a string of 64 nodes, proceeding cautiously because of Aurora’s unique design as a supercomputer.
At one trillion parameters, ScienceGPT will be one of the largest LLMs out there. While it won’t quite hit the size of the reported 1.7-trillion-parameter GPT-4, developed by OpenAI, it’ll be almost twice as large as the 540-billion-parameter Pathways Language Model, which powers Google’s Bard.
“It combines all the text, codes, specific scientific results, papers, into the model that science can use to speed up research,” said Ogi Brkic, Intel’s vice president and general manager for data center and HPC solutions, in a press briefing.
It’ll operate like ChatGPT, but it’s not yet clear whether it will be multimodal, meaning able to generate different kinds of media such as text, images, and video.
Aurora, which will be the second exascale supercomputer in US history, has just established itself on the Top500 list of the most powerful supercomputers after years of development.
It’s the second-most powerful supercomputer after Frontier, and is powered by 60,000 Intel GPUs while boasting 10,000 computing nodes over 166 racks, alongside more than 80,000 networking nodes.
It is still being finished, however, and will likely exceed Frontier’s performance once it’s fully up to speed and all testing and fine-tuning are complete, said Top500.