Jason’s Kindle Notes & Highlights

On the Edge: The Art of Risking Everything, by Nate Silver

This answer might minimize the loss function in the training data because the moon being made out of cheese is a centuries-old trope. But this is still misinformation, however harmless in this instance. So LLMs undergo another stage in their training: what’s called RLHF, or reinforcement learning from human feedback. Basically, it works like this: the AI labs hire cheap labor—often from Amazon’s Mechanical Turk, where you can employ human AI trainers from any of roughly fifty countries—to score the model’s answers in the form of an A/B test: A: The Moon is made out of cheese. B: The Moon is ...
