BERT
GPT-1, GPT-2, GPT-3
PaLM
UL2
post-training
RLHF
DeepSeek-R1
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems