GPT-3 stands for “Generative Pre-trained Transformer 3” (GPT-3.5 is a later revision of the same model). It is a transformer network.

The architecture is a standard transformer network (with a few engineering tweaks) at unprecedented scale: a 2048-token context window and 175 billion parameters, requiring 800 GB of storage.
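As a rough sanity check on the 175-billion figure, the hyper-parameters published for the largest GPT-3 model (96 layers, a hidden size of 12,288, a ~50K-token BPE vocabulary, 2048-token context) can be plugged into the standard transformer parameter-count formula. The sketch below ignores layer norms and biases, so the total is approximate, not an exact accounting of the released model:

```python
# Back-of-the-envelope parameter count for GPT-3 175B, using the
# hyper-parameters reported in the GPT-3 paper (Brown et al., 2020).
# Layer norms and biases are omitted, so the result is approximate.

n_layers = 96        # transformer blocks
d_model = 12288      # hidden size
n_ctx = 2048         # context window (tokens)
n_vocab = 50257      # BPE vocabulary size

# Per block: attention projections (Q, K, V, output) = 4 * d_model^2,
# feed-forward network (two 4x-expansion matrices) = 8 * d_model^2.
per_block = 12 * d_model ** 2
blocks = n_layers * per_block

# Token embeddings plus learned positional embeddings.
embeddings = (n_vocab + n_ctx) * d_model

total = blocks + embeddings
print(f"approx. parameters: {total / 1e9:.1f} B")      # ~174.6 B
print(f"fp32 storage:       {total * 4 / 1e9:.0f} GB")  # ~700 GB
```

At four bytes per parameter this comes to roughly 700 GB, the same order of magnitude as the 800 GB storage figure quoted above (checkpoint formats and optimizer state add overhead).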

The model was trained on massive amounts of web text: a filtered version of Common Crawl contributed 410 billion tokens and was given a 60% weight in the training mix.
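The 60% figure is a sampling weight rather than a share of the raw token count: during training, examples are drawn from each corpus in proportion to its weight, not its size. A minimal sketch of that kind of mixture-weighted sampling follows; only the Common Crawl weight comes from the text above, and the corpus names and helper function are illustrative, not the actual training code.

```python
import random

# Illustrative mixture-weighted sampling: each draw picks a corpus in
# proportion to its sampling weight, independent of corpus size.
# Only the Common Crawl weight (60%) is taken from the text above;
# the other entry is a placeholder, not the real GPT-3 mixture.
corpora = {
    "common_crawl": 0.60,   # 410B tokens, 60% sampling weight
    "other_corpora": 0.40,  # placeholder for the remaining sources
}

def sample_corpus(weights: dict) -> str:
    """Pick a corpus name with probability proportional to its weight."""
    names, probs = zip(*weights.items())
    return random.choices(names, weights=probs, k=1)[0]

# Over many draws, roughly 60% of examples come from Common Crawl.
draws = [sample_corpus(corpora) for _ in range(10_000)]
print(draws.count("common_crawl") / len(draws))  # ≈ 0.6
```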