Meta announces a GPT3-size language model you can download

Meta announces a GPT3-size language model you can download

This is another unbelievable constituent!

[Submitted on 2 May 2022]

Authors:Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer

Download PDF

Abstract: Large language models, which are often trained for hundreds of thousands of
compute days, have shown remarkable capabilities for zero- and few-shot
learning. Given their computational cost, these models are difficult to
replicate without significant capital. For the few that are available through
APIs, no access is granted to the full model weights, making them difficult to
study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only
pre-trained transformers ranging from 125M to 175B parameters, which we aim to
fully and responsibly share with interested researchers. We show that OPT-175B
is comparable to GPT-3, while requiring only 1/7th the carbon footprint to
develop. We are also releasing our logbook detailing the infrastructure
challenges we faced, along with code for experimenting with all of the released

Submission history From: Susan Zhang [view email]

Mon, 2 May 2022 17:49:50 UTC (9,196 KB)

Read More
Share this on to discuss with people on this topicSign Up on now if you’re not registered yet.

About the author: Charlie
Fill your life with experiences so you always have a great story to tell

Get involved!

Get Connected!
One of the Biggest Social Platform for Entrepreneurs, College Students and all. Come and join our community. Expand your network and get to know new people!


No comments yet
Knowasiak We would like to show you notifications so you don't miss chats & status updates.
Allow Notifications