BERT

GPT-1, GPT-2, GPT-3

PaLM

UL2

Post-training

RLHF

DeepSeek-R1

Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Attention

Multimodal

RL

Technical Reports