That feedback is then used for additional training, fine-tuning the AI’s performance to fit the human’s preferences. This added learning reinforces good answers and discourages bad ones, which is why the process is called Reinforcement Learning from Human Feedback (RLHF).
Co-Intelligence: Living and Working with AI
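The passage names the steps but not the mechanics, so here is a minimal sketch (not from the book, and not how any particular lab implements it) of the feedback-to-reward step that RLHF builds on. Everything in it is assumed for illustration: answers are reduced to small numeric feature vectors, human feedback arrives as pairwise preferences ("answer A was better than B"), and a simple linear reward model is fit with the Bradley-Terry loss before scoring new answers.

# Illustrative sketch of the reward-modeling step in RLHF.
# Assumptions (not from the source): answers are feature vectors,
# feedback is pairwise preferences, the reward model is linear.
import numpy as np

rng = np.random.default_rng(0)

def bradley_terry_fit(preferred, rejected, lr=0.1, steps=500):
    """Fit a linear reward model r(x) = w @ x from preference pairs."""
    w = np.zeros(preferred.shape[1])
    for _ in range(steps):
        # P(preferred beats rejected) = sigmoid(r(preferred) - r(rejected))
        margin = (preferred - rejected) @ w
        p = 1.0 / (1.0 + np.exp(-margin))
        # Gradient ascent on the log-likelihood of the human's choices
        grad = ((1.0 - p)[:, None] * (preferred - rejected)).mean(axis=0)
        w += lr * grad
    return w

# Synthetic stand-in for human feedback: raters prefer answers whose
# features align with a hidden "taste" vector.
taste = np.array([1.0, -0.5, 2.0])
answers_a = rng.normal(size=(200, 3))
answers_b = rng.normal(size=(200, 3))
a_better = (answers_a @ taste) > (answers_b @ taste)
preferred = np.where(a_better[:, None], answers_a, answers_b)
rejected = np.where(a_better[:, None], answers_b, answers_a)

w = bradley_terry_fit(preferred, rejected)

# The learned reward now ranks fresh answers: in full RLHF, high-scoring
# answers would be reinforced in further fine-tuning, low-scoring ones
# discouraged.
candidates = rng.normal(size=(5, 3))
scores = candidates @ w
print("best candidate:", int(np.argmax(scores)), "scores:", np.round(scores, 2))

In a real system the reward model is a neural network scoring full model outputs, and a reinforcement-learning step (commonly PPO) then fine-tunes the language model against that reward; the sketch above isolates only the idea the quote describes, turning human preferences into a learned signal for good versus bad answers.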