To Holland, the obvious answer was to implement a kind of Hebbian reinforcement. Whenever the agent does something right and gets positive feedback from the environment, it should strengthen the classifiers responsible. Whenever it does something wrong, it should likewise weaken the classifiers responsible. And either way, it should leave the irrelevant classifiers alone.
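The idea can be sketched in a few lines of code. This is only an illustrative toy, not Holland's actual classifier-system machinery: the names `Classifier`, `reinforce`, and the learning rate are assumptions introduced here, and each classifier is reduced to a single numeric "strength" that gets nudged up or down by feedback.

```python
# Toy sketch of Hebbian-style strength updates (names are illustrative,
# not Holland's): only the classifiers that fired get credited or blamed.

class Classifier:
    def __init__(self, name, strength=1.0):
        self.name = name
        self.strength = strength

def reinforce(active, reward, rate=0.1):
    """Adjust the strengths of the classifiers responsible for an action.

    active -- the classifiers that fired (the "responsible" ones)
    reward -- positive feedback strengthens them, negative weakens them
    rate   -- fraction of the reward applied per update (assumed parameter)
    """
    for c in active:
        # Strengths are clamped at zero so a rule can die out but not
        # accrue negative strength.
        c.strength = max(0.0, c.strength + rate * reward)

rules = [Classifier("a"), Classifier("b"), Classifier("c")]
reinforce(rules[:2], reward=+1.0)  # "a" and "b" fired and were rewarded
# "c" was irrelevant to the action, so it is simply left untouched.
```

Note that the irrelevant classifier never appears in the `active` list, which is exactly how it gets "ignored": credit and blame flow only to the rules that actually produced the behavior.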

