That feedback is then used for additional training, fine-tuning the AI’s performance to fit the human’s preferences. This added learning reinforces good answers and discourages bad ones, which is why the process is called Reinforcement Learning from Human Feedback (RLHF).
Co-Intelligence: Living and Working with AI
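The passage names the steps but not the mechanics, so here is a minimal sketch (not from the book, and not how any particular lab implements it) of the feedback-to-reward step that RLHF builds on. Everything in it is assumed for illustration: answers are reduced to small numeric feature vectors, human feedback arrives as pairwise preferences ("answer A was better than B"), and a simple linear reward model is fit with the Bradley-Terry loss before scoring new answers.

# Illustrative sketch of the reward-modeling step in RLHF.
# Assumptions (not from the source): answers are feature vectors,
# feedback is pairwise preferences, the reward model is linear.
import numpy as np

rng = np.random.default_rng(0)

def bradley_terry_fit(preferred, rejected, lr=0.1, steps=500):
    """Fit a linear reward model r(x) = w @ x from preference pairs."""
    w = np.zeros(preferred.shape[1])
    for _ in range(steps):
        # P(preferred beats rejected) = sigmoid(r(preferred) - r(rejected))
        margin = (preferred - rejected) @ w
        p = 1.0 / (1.0 + np.exp(-margin))
        # Gradient ascent on the log-likelihood of the human's choices
        grad = ((1.0 - p)[:, None] * (preferred - rejected)).mean(axis=0)
        w += lr * grad
    return w

# Synthetic stand-in for human feedback: raters prefer answers whose
# features align with a hidden "taste" vector.
taste = np.array([1.0, -0.5, 2.0])
answers_a = rng.normal(size=(200, 3))
answers_b = rng.normal(size=(200, 3))
a_better = (answers_a @ taste) > (answers_b @ taste)
preferred = np.where(a_better[:, None], answers_a, answers_b)
rejected = np.where(a_better[:, None], answers_b, answers_a)

w = bradley_terry_fit(preferred, rejected)

# The learned reward now ranks fresh answers: in full RLHF, high-scoring
# answers would be reinforced in further fine-tuning, low-scoring ones
# discouraged.
candidates = rng.normal(size=(5, 3))
scores = candidates @ w
print("best candidate:", int(np.argmax(scores)), "scores:", np.round(scores, 2))

In a real system the reward model is a neural network scoring full model outputs, and a reinforcement-learning step (commonly PPO) then fine-tunes the language model against that reward; the sketch above isolates only the idea the quote describes, turning human preferences into a learned signal for good versus bad answers.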