History of open alignment, part 1
History of open alignment, part 2
History of open alignment, part 3
History of open alignment, part 4
which coincides with LLaMA 1
then there's the phase of reproducing core results, but people were mostly using LoRA, which I think hurt results at the time
DPO came out as a paper in late May 2023 and was added to libraries such as TRL soon after, but it didn't catch on for a while: https://github.com/huggingface/trl/issues/405
Talks:
Papers / methods:
Evaluation Tools: