Saturday, October 31, 2009

GATE 1999 CS question 1.1 (Probability)

1.1 Suppose the expectation of a random variable X is 5. Which of the following is true?
(a) There is a sample point at which X has a value 5.
(b) There is a sample point at which X has a value greater than 5.
(c) There is a sample point at which X has a value greater than or equal to 5.
(d) None of the above
First of all, what is this ‘expectation of a random variable’ business?
Well, an ‘expectation’ is a sort of sum: a ‘weighted’ sum. As a formula, it is
E = summation of xi*pi
where the xi are the different values the random variable takes (its values at the sample points), and each pi is the probability of the corresponding xi.
Imagine you were a gambler and also happened to know a dice maker. You ask him to make a die on which the numbers 1 to 4 each appear with probability 1/12, while 5 and 6 each appear with probability 1/3 (verify that these add up to a total probability of 1). Now, what is the ‘expected value’ of this die? Well, as I said, expectation is a weighted sum, with the weights being the probabilities. So here, it would be
E = 1*1/12 + 2*1/12 + 3*1/12 + 4*1/12 + 5*1/3 + 6*1/3
E = 4.5
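If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of the same weighted sum. The function name `expectation` and the dictionary `loaded_die` are my own names for illustration; exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def expectation(dist):
    """Weighted sum of the values, with the probabilities as weights.

    `dist` maps each value of the random variable to its probability.
    """
    total = sum(dist.values())
    if total != 1:
        raise ValueError(f"probabilities sum to {total}, not 1")
    return sum(x * p for x, p in dist.items())

# The loaded die from the text: 1 to 4 with probability 1/12 each,
# 5 and 6 with probability 1/3 each.
loaded_die = {face: Fraction(1, 12) for face in range(1, 5)}
loaded_die.update({5: Fraction(1, 3), 6: Fraction(1, 3)})

e = expectation(loaded_die)
print(e, float(e))  # 9/2 4.5
```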
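Ha, what is this, the expected value of the die roll is 4.5? How can a die roll a non-integer value?
Well, this mathematical expected value isn’t the value you should expect to see on any single roll. It is the average you would get closer and closer to if you rolled the die a very large number of times. Yeah, that’s how strange math is. :)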
Intuitively, it’s the single number that stays as close as possible, on average, to the actual value of the random variable (to be precise, it is the number that minimizes the average squared difference). Since we don’t know the actual value of the random variable beforehand, this number sits close to the most probable values while not getting too distant from the other values. That is why we got a 4.5 above: it’s a compromise between the high probability of 5 and 6 and the low probabilities of 1, 2, 3 and 4.
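One way to make this ‘compromise’ idea concrete is to measure the average squared distance between a candidate number and the roll, and see which candidate does best. The sketch below tries candidates from 1.0 to 6.0 in steps of 0.1 for the loaded die; the helper name `mean_squared_distance` and the step size are my own choices, not anything standard.

```python
from fractions import Fraction

# The loaded die: 1 to 4 with probability 1/12 each, 5 and 6 with 1/3 each.
loaded_die = {face: Fraction(1, 12) for face in range(1, 5)}
loaded_die.update({5: Fraction(1, 3), 6: Fraction(1, 3)})

def mean_squared_distance(c, dist):
    """Average of (value - c)^2, weighted by the probabilities."""
    return sum(p * (x - c) ** 2 for x, p in dist.items())

# Candidates 1.0, 1.1, ..., 6.0 as exact fractions.
candidates = [Fraction(k, 10) for k in range(10, 61)]
best = min(candidates, key=lambda c: mean_squared_distance(c, loaded_die))
print(float(best))  # 4.5
```

The winner is exactly the expectation we computed by hand, which is what we would hope for if the expectation really is the best compromise in this sense.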
OK, that covers the basics of expectation; now let’s dive into the problem. We are told that the expectation of a random variable X is 5. Let’s think through the options.
Consider (a). Do we really need one of the sample points to be 5 to get an expectation of 5? What about the expectation of an RV (random variable) that takes values 3, 4, 6, 7 with equal probabilities? By now you should be able to calculate that it’s (3 + 4 + 6 + 7)/4 = 5, even though the variable never takes the value 5. So, (a) is false.
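A quick check of this counterexample, as a small sketch (the variable names are just for illustration):

```python
from fractions import Fraction

values = [3, 4, 6, 7]
p = Fraction(1, len(values))  # equal probabilities: 1/4 each

e = sum(x * p for x in values)
print(e)            # 5
print(5 in values)  # False
```

So the expectation is 5 while the value 5 itself never shows up.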
What about (b) then? Well, that appears about right, doesn’t it? A variable which takes only values 1, 2, 3, and 4 can never have an expectation of 5, because a weighted average can never be larger than the largest value. Even a variable which takes values 4 and 5 with non-zero probabilities for each can’t have an expected value of 5: the two probabilities add up to 1, so if 4 has probability p, the expectation is 4*p + 5*(1 - p) = 5 - p, which is less than 5 whenever p is greater than 0. Only 5*1.0 gives 5. So, is (b) the answer?
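To see the 4-and-5 case in numbers, here is a small sketch that tries a few probabilities for the value 4 (the particular probabilities are arbitrary picks of mine). Every time, the expectation comes out as 5 minus the probability given to 4, so it stays below 5.

```python
from fractions import Fraction

results = []
for p4 in [Fraction(1, 100), Fraction(1, 4), Fraction(1, 2), Fraction(99, 100)]:
    p5 = 1 - p4              # the two probabilities must add up to 1
    e = 4 * p4 + 5 * p5      # expectation of a variable taking only 4 and 5
    results.append((p4, e))
    print(f"P(4) = {p4}, E = {e}, below 5: {e < 5}")
```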
But wait, I mentioned 5*1.0 at the end there; can we make the expectation equal to that? Ah, clever idea. What do we need for that? The ‘summation of xi*pi’ should be just 5*1.0, which means the variable takes only one value, 5, with probability 1.0. In this case, the expectation is of course 5.
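So, it's not necessary that there is a sample point greater than 5: a single sample point at 5 itself also gives that same expectation. That makes (b) false as well. What we can say for sure is that the values cannot all be less than 5, because then the weighted sum would be less than 5 too.
So, the answer is (c): There is a sample point at which X has a value greater than or equal to 5.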
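As a final sanity check, the sketch below takes the two distributions used in the discussion, confirms that each has expectation 5, and reports which of the options (a), (b), (c) hold for it. The helper names `expectation` and `options_held` are mine, and two examples are of course only an illustration, not a proof.

```python
from fractions import Fraction

def expectation(dist):
    """Weighted sum of values; `dist` maps value -> probability."""
    return sum(x * p for x, p in dist.items())

def options_held(dist):
    """Which of the options (a), (b), (c) are true for this distribution."""
    values = [x for x, p in dist.items() if p > 0]
    return {
        "a": any(x == 5 for x in values),
        "b": any(x > 5 for x in values),
        "c": any(x >= 5 for x in values),
    }

quarter = Fraction(1, 4)
examples = {
    "3, 4, 6, 7 equally likely": {3: quarter, 4: quarter, 6: quarter, 7: quarter},
    "always 5": {5: Fraction(1)},
}

for name, dist in examples.items():
    assert expectation(dist) == 5
    print(name, options_held(dist))
```

The first example breaks (a), the second breaks (b), and (c) survives both.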