Intuitively, H(D) measures the minimum number of random bits that you'd need to generate a single sample from D – on average, if you were generating lots of independent samples. It also measures the minimum number of bits that you'd need to send your friend, if you wanted to tell her which element from D was chosen – again on average, if you were telling her about lots of independent draws.