Colin’s Reviews > The Alignment Problem: Machine Learning and Human Values > Status Update

Colin
Colin is 80% done
But when they do get it wrong, it's with absurdly high confidence
Jul 27, 2021 10:07PM
The Alignment Problem: Machine Learning and Human Values
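A minimal sketch of why "must classify" comes with "absurdly high confidence" (NumPy standing in for a classifier head; everything here is illustrative): softmax has no "none of the above" output, and even a modest logit gap reads as near-certainty.

```python
import numpy as np

def softmax(logits):
    z = logits - np.max(logits)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)

# "Static": random logits standing in for what a network might
# output on pure noise. Softmax still returns a full probability
# distribution, so some class is always picked.
noise_probs = softmax(rng.normal(size=10))
assert np.isclose(noise_probs.sum(), 1.0)   # no "none of the above"

# A moderately dominant logit (common on out-of-distribution
# inputs) already reads as near-certainty.
confident_probs = softmax(np.array([8.0, 0.0, 0.0, 0.0]))
print(round(confident_probs[0], 3))   # 0.999
```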


Colin’s Previous Updates

Colin
Colin is 10% done
DL systems are brittle - they MUST classify even static as one thing or another
Jul 27, 2021 10:06PM


Colin
Colin is 10% done
We don't want algorithmic recommenders naively reinforcing our worst impulses. We are not perfect.
Jul 26, 2021 07:31AM


Colin
Colin is 10% done
RL - telling the AI what we want (coding a reward function)
IRL - showing the AI what we want (having it infer the reward from modeled behavior)
Synthesis - knowing what we want when we see it and communicating that to the AI
Jul 24, 2021 10:21PM
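A toy sketch of the RL/IRL contrast (names and numbers invented, not from the book): hand-coding the reward versus crudely inferring one from demonstrated behavior, here via the simplest possible proxy, state visitation frequency.

```python
# RL: we *tell* the agent what we want by coding the reward.
def coded_reward(state):
    return 1.0 if state == "goal" else 0.0

# IRL: we *show* the agent what we want; it infers a reward from
# demonstrations. Crudest possible inference: states the
# demonstrator visits often are presumed valuable.
demonstrations = ["start", "mid", "goal", "start", "goal", "goal"]

def inferred_reward(state, demos=demonstrations):
    return demos.count(state) / len(demos)

print(coded_reward("goal"))      # 1.0, by fiat
print(inferred_reward("goal"))   # 0.5, read off the demonstrations
```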


Colin
Colin is 10% done
Dopamine - the currency of reinforcement learning in the brain, signaling that your expectations are in error and you should learn from it. This is why cocaine is addictive: you feel a rush from thinking things are about to get better, but that's just the artificial dopamine speaking.
Jul 20, 2021 10:35PM
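This dopamine story is usually formalized as the temporal-difference error; a minimal sketch with made-up values:

```python
# Temporal-difference (TD) error: delta = r + gamma * V(next) - V(current),
# i.e. "how much better or worse than expected did things just get?"
# All values below are invented for illustration.

gamma = 0.9                       # discount factor
V = {"cue": 0.5, "reward": 1.0}   # current value estimates

def td_error(r, state, next_state):
    return r + gamma * V[next_state] - V[state]

# Outcome better than expected -> positive error -> reinforce.
delta = td_error(r=1.0, state="cue", next_state="reward")
print(round(delta, 2))   # 1.4

# A drug-like stimulus forces this "better than expected" signal
# directly, regardless of real outcomes: the artificial dopamine.
```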


Colin
Colin is 10% done
Predictive policing - what if model predictions determine policing rates? Selection bias arising from poor-quality data might tag a place as OK when it is not, pulling police resources away and exacerbating crime there - initiating a feedback loop. Selection bias meets confirmation bias.
Jul 17, 2021 05:06AM
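A stripped-down simulation of that feedback loop (all numbers invented): two districts with identical true crime, but crime is recorded only where police patrol, and the model shifts patrols toward wherever crime was recorded.

```python
# Selection bias: crime is recorded only where police are looking.
# Confirmation bias: patrols follow the records those patrols created.

true_crime = [10.0, 10.0]     # the two districts are genuinely identical
records = [5.0, 4.0]          # slightly uneven historical records
patrol = [0.5, 0.5]           # share of patrols per district

for _ in range(6):
    # Model: predict the hotspot from records, shift patrols toward it
    hot = 0 if records[0] >= records[1] else 1
    patrol[hot] = min(0.9, patrol[hot] + 0.1)
    patrol[1 - hot] = 1.0 - patrol[hot]
    # Recorded crime is proportional to patrol presence
    found = [c * p for c, p in zip(true_crime, patrol)]
    records = [r + f for r, f in zip(records, found)]

# District 0 ends with ~3x the recorded crime of district 1,
# despite identical true crime.
print(round(patrol[0], 2), [round(r) for r in records])   # 0.9 [53, 16]
```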


Colin
Colin is 10% done
It is impossible to guarantee equal fairness across all error metrics in a classification algorithm if the base rates between categories are not themselves equal. If you optimize for false negatives, you will lose on false positives. That means every algorithm will be open to critique, and you need to optimize for whatever is most fit for purpose.
Jul 17, 2021 12:21AM
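A back-of-the-envelope instance of this impossibility (numbers invented): hold the per-class error rates equal across two groups, and the predictive value of a positive diverges as soon as base rates differ, so some fairness metric must give.

```python
# Same per-class error rates (TPR, FPR) for both groups; the value
# of a positive prediction still differs once base rates differ.

def ppv(base_rate, tpr, fpr):
    """Positive predictive value via Bayes' rule."""
    tp = tpr * base_rate
    fp = fpr * (1.0 - base_rate)
    return tp / (tp + fp)

tpr, fpr = 0.8, 0.2   # identical error profile for both groups

print(round(ppv(0.5, tpr, fpr), 3))   # group A, 50% base rate -> 0.8
print(round(ppv(0.1, tpr, fpr), 3))   # group B, 10% base rate -> 0.308
```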


Colin
Colin is 5% done
Being bias-blind doesn’t work for fairness - not only are correlated variables still encoded regardless, but removing these variables from the analysis makes it harder to correct for real-world bias. The question is not what data is in the set but how it is used.
Jul 16, 2021 09:11PM
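A toy illustration of that redundant encoding (data invented): drop the protected attribute, and a correlated proxy still recovers it well above chance.

```python
# (group, zip_code) rows: the protected group is never shown to the
# "bias-blind" model, but zip code tracks group in 6 of 8 rows here.
data = [("A", 1), ("A", 1), ("A", 1), ("A", 2),
        ("B", 2), ("B", 2), ("B", 2), ("B", 1)]

def guess_group_from_zip(zip_code):
    # Recover the dropped attribute from the correlated proxy alone
    return "A" if zip_code == 1 else "B"

correct = sum(guess_group_from_zip(z) == g for g, z in data)
print(correct / len(data))   # 0.75, well above the 0.5 chance level
```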


Colin
Colin is 5% done
Often it is not the algorithm that is at fault but the representativeness of the training dataset
Jul 12, 2021 11:17PM

