Regression toward the mean


Reread what I wrote about his example. I said that someone who scored a 145 on the first test would, on average, score a 140 the second time around. Obviously, 50% of the time those retaking the test would get a 145 again, and the other 50% they'd get a 135. That works out to an average outcome of 140.

875749[/snapback]

You said "someone", so it looks like you're talking about "one person". If you're talking about a group of people, please answer the questions in my previous post about whether "regression toward the mean" applies to individuals.

 

With one person taking this test, he only gets one outcome, either 135 or 145. There's nothing to average after the score is out.

 

Also, it's called "expected value", not "average outcome". You can say, before taking the test, that the expected value of the score is 140. However, once the test is taken, there is only one score, 135 or 145. Even though the expected value is 140, you cannot score 140 on this test.

 

It's common for the expected value NOT to be one of the possible outcomes. Need another example? Take the famous dice example in this thread: the expected value of one throw is 3.5, but you can never throw a 3.5. Please don't confuse expected value with possible outcomes.

 

When talking about "regression toward the mean", it's about the possibility of the score upon retaking the test being closer to the mean. It's not about whether the expected value is closer to the mean or not.

Link to comment
Share on other sites

You said "someone", so it looks like you're talking about "one person". If you're talking about a group of people, please answer the questions in my previous post about whether "regression toward the mean" applies to individuals.

 

With one person taking this test, he only gets one outcome, either 135 or 145. There's nothing to average after the score is out.

 

Also, it's called "expected value", not "average outcome". You can say, before taking the test, that the expected value of the score is 140. However, once the test is taken, there is only one score, 135 or 145. Even though the expected value is 140, you cannot score 140 on this test.

 

It's common for the expected value NOT to be one of the possible outcomes. Need another example? Take the famous dice example in this thread: the expected value of one throw is 3.5, but you can never throw a 3.5. Please don't confuse expected value and possible outcomes.

 

When talking about "regression toward the mean", it's about the possibility of the score upon retaking the test being closer to the mean. It's not about whether the expected value is closer to the mean or not.

875790[/snapback]

Thanks for that "expected value" link. :lol: Here's an interesting quote from it:

In probability theory the expected value (or mathematical expectation) of a random variable is the sum of the probability of each possible outcome of the experiment multiplied by its payoff ("value"). Thus, it represents the average amount one "expects" as the outcome of the random trial when identical odds are repeated many times. Note that the value itself may not be expected in the general sense; it may be unlikely or even impossible. For example, the expected value from the roll of an ordinary six-sided die is 3.5, which is not one of the possible outcomes.
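
The definition quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `expected_value` and the two outcome tables are made up for this example, with the 135/145 table taken from the two-outcome test discussed in this thread.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum of value * probability over (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)


# A fair six-sided die: each face has probability 1/6.
die = [(face, Fraction(1, 6)) for face in range(1, 7)]

# The hypothetical test from this thread: 135 or 145, each with probability 1/2.
two_point_test = [(135, Fraction(1, 2)), (145, Fraction(1, 2))]

die_ev = expected_value(die)
test_ev = expected_value(two_point_test)

# In both cases the expected value is not itself a possible outcome.
print(float(die_ev), die_ev in [face for face, _ in die])
print(int(test_ev), test_ev in [score for score, _ in two_point_test])
```

Exact fractions are used instead of floats so that the comparison against the possible outcomes is not affected by rounding.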

To return to the discussion at hand: clearly in your example, it's impossible for any one person to regress towards the population's mean. But as a group, those who score high or low on the first test regress toward the mean on being retested.

Link to comment
Share on other sites

Do you imply that "regression toward the mean" only applies to a group of people and doesn't apply to individuals?

 

Anyway, answer this question, does "regression toward the mean" apply to individuals?

 

Or if you want an example, here is one. If a person's real IQ is 140 (known from other, more accurate tests) and he scores a 160 or 120 on a test with zero-mean, normally distributed error, is he likely to get a score closer to the mean when retaking the test?

 

If your answer is yes, please answer the next question.

 

Q: If a person (not a group of people) gets a 135 on this test with abnormally distributed error, is he likely to get a score (either 135 or 145) closer to the mean when retaking the test?

875763[/snapback]

In real-world examples (that is, a normally distributed population and normally distributed measurement error with a mean of zero), regression toward the mean applies to individuals in a certain sense. That is, if someone got a 140 on an I.Q. test, that person is more likely to get a lower score the second time around than he or she is to get the same score or better. This doesn't mean that all the people who got 140s the first time around will get lower scores upon being retested, just that most will.
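
The "most will" claim can be made concrete under explicit assumptions. The sketch below assumes true I.Q. is normally distributed with mean 100 and standard deviation 15, and that each test adds independent zero-mean normal error with standard deviation 5. These numbers are not from the thread; they are picked only for illustration. It also assumes the person is drawn at random from the population and is known only by the first score of 140, not by a true I.Q.

```python
from statistics import NormalDist

POP_MEAN, POP_SD = 100.0, 15.0  # assumed distribution of true I.Q.
ERR_SD = 5.0                    # assumed sd of the zero-mean test error


def retest_given_first(first_score):
    """Distribution of the retest score given the first observed score.

    Model: first = true + e1, retest = true + e2, all terms independent
    and normal, so (first, retest) is bivariate normal.
    """
    var_true, var_err = POP_SD ** 2, ERR_SD ** 2
    var_obs = var_true + var_err
    reliability = var_true / var_obs  # correlation between the two tests
    mean = POP_MEAN + reliability * (first_score - POP_MEAN)
    variance = var_obs - var_true ** 2 / var_obs
    return NormalDist(mean, variance ** 0.5)


retest = retest_given_first(140)
p_lower = retest.cdf(140)  # chance the retest comes out below 140
print(round(retest.mean, 1), round(p_lower, 2))
```

With these assumed numbers the expected retest score is 136, and the chance of scoring below 140 on the retest is roughly 0.72. A larger error standard deviation would pull the expected retest score further toward 100.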

 

To address your example, the most likely outcome for the person's retest is 140, because that is this person's true I.Q. If you're saying that the test only allows someone to score a 120 or 160; then either outcome is equally likely both for the initial test score and for the retest.

 

I can't really answer your second question without knowing more. Assuming the person to which you're referring has a true I.Q. of 140; that person's expected value on the retest is 140 (50% chance of getting a 135; 50% chance of getting 145). But based on the way the I.Q. test is set up, it's impossible for any specific person to do anything other than score exactly five points away from the population's mean I.Q. of 140.

Link to comment
Share on other sites

To return to the discussion at hand: clearly in your example, it's impossible for any one person to regress towards the population's mean. But as a group, those who score high or low on the first test regress toward the mean on being retested.

875850[/snapback]

Good. We now agree that this abnormally distributed error doesn't cause any individual to regress toward the mean. In other words, even when error exists, "regression toward the mean" in individuals may not happen.

 

I think I just showed you that error doesn't always cause "regression toward the mean" in individuals.

Link to comment
Share on other sites

In real-world examples (that is, a normally distributed population and normally distributed measurement error with a mean of zero), regression toward the mean applies to individuals in a certain sense. That is, if someone got a 140 on an I.Q. test, that person is more likely to get a lower score the second time around than he or she is to get the same score or better. This doesn't mean that all the people who got 140s the first time around will get lower scores upon being retested, just that most will.

As you may know, I agree that normally distributed errors cause regression toward the mean. However, here I'm trying to show you, by using this extremely abnormally distributed error, that not all errors cause regression toward the mean.

 

To address your example, the most likely outcome for the person's retest is 140, because that is this person's true I.Q. If you're saying that the test only allows someone to score a 120 or 160; then either outcome is equally likely both for the initial test score and for the retest.

Please note I stated the error here is normally distributed. Let me repeat the question:

 

If a person's real IQ is 140 (known from other, more accurate tests) and he scores a 160 or 120 on a test with zero-mean, normally distributed error, is he likely to get a score closer to the mean when retaking the test?

 

It seems like you agree "regression toward the mean" normally applies to individuals.

 

I can't really answer your second question without knowing more. Assuming the person to which you're referring has a true I.Q. of 140; that person's expected value on the retest is 140 (50% chance of getting a 135; 50% chance of getting 145). But based on the way the I.Q. test is set up, it's impossible for any specific person to do anything other than score exactly five points away from the population's mean I.Q. of 140.

875854[/snapback]

Right, the person's true IQ is 140. And because of an abnormally distributed error, when this person takes this test, the only outcomes are 135 (50%) and 145 (50%). Also, as I mentioned in another post, it's common that the expected value is not one of the possible outcomes.

 

I'm aware this case is unrealistic. However, this example of abnormally distributed error shows that not all errors cause "regression toward the mean" in individuals.
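
The two-outcome example is small enough to enumerate exactly. The sketch below assumes, as in the posts above, that the true I.Q. (and the population mean) is 140 and that each sitting yields 135 or 145 with equal probability, independently of the previous sitting. The variable names are invented for illustration.

```python
from itertools import product

TRUE_IQ = 140        # also the population mean in this example
SCORES = (135, 145)  # the only two outcomes, each with probability 1/2

# All four (first score, retest score) pairs are equally likely.
pairs = list(product(SCORES, repeat=2))

# Individual view: how far is each possible retest score from 140?
retest_distances = {abs(retest - TRUE_IQ) for _, retest in pairs}

# Group view: average retest score of those who scored 145 the first time.
high_group = [retest for first, retest in pairs if first == 145]
high_group_mean = sum(high_group) / len(high_group)

print(retest_distances)  # a single distance: every retest is 5 points away
print(high_group_mean)   # the group average lands on 140
```

Both sides of the exchange show up here: no individual retest score is ever closer to 140 than the first score was, yet the group that scored 145 averages 140 on the retest.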

Link to comment
Share on other sites

Good. We now agree that this abnormally distributed error doesn't cause any individual to regress toward the mean. In other words, even when error exists, "regression toward the mean" in individuals may not happen.

 

I think I just showed you that error doesn't always cause "regression toward the mean" in individuals.

875857[/snapback]

No, not always. But under normal, real life circumstances, an error-prone test is generally associated with regression toward the population's mean in test/retest situations.

Link to comment
Share on other sites

As you may know, I agree that normally distributed errors cause regression toward the mean. However, here I'm trying to show you, by using this extremely abnormally distributed error, that not all errors cause regression toward the mean.

Please note I stated the error here is normally distributed. Let me repeat the question:

 

If a person's real IQ is 140 (known from other, more accurate tests) and he scores a 160 or 120 on a test with zero-mean, normally distributed error, is he likely to get a score closer to the mean when retaking the test?

 

It seems like you agree "regression toward the mean" normally applies to individuals.

Right, the person's true IQ is 140. And because of an abnormally distributed error, when this person takes this test, the only outcomes are 135 (50%) and 145 (50%). Also, as I mentioned in another post, it's common that the expected value is not one of the possible outcomes.

 

I'm aware this case is unrealistic. However, this example of abnormally distributed error shows that not all errors cause "regression toward the mean" in individuals.

875863[/snapback]

It looks like we're in agreement on this one.

Link to comment
Share on other sites

No, not always. But under normal, real life circumstances, an error-prone test is generally associated with regression toward the population's mean in test/retest situations.

876235[/snapback]

It seems like we also agree on this one.

 

Under normal circumstances, a normally distributed error would cause "regression toward the mean". When talking about math or statistics, in theory, errors don't always cause "regression toward the mean".

 

Thus, you can NOT simply say "errors cause regression toward the mean", since this statement is not always true. You have to be more specific and state the conditions when your statement is true. Be more scientific, for example,

 

"In normal circumstances, errors cause regression toward the mean"

or

"When the error is normally distributed, error causes regression toward the mean".

Link to comment
Share on other sites

"In normal circumstances, errors cause regression toward the mean"

or

"When the error is normally distributed, error causes regression toward the mean".

876830[/snapback]

 

Of course, the correct answer is NEITHER. :nana: Error does not CAUSE regression toward the mean. Error itself can regress toward the mean of the error...

Link to comment
Share on other sites

Of course, the correct answer is NEITHER.  :nana:  Error does not CAUSE regression toward the mean.  Error itself can regress toward the mean of the error...

877008[/snapback]

Didn't I state that the "mean" in "regression toward the mean" is "the mean of error" when I brought up two questions?

Link to comment
Share on other sites

Didn't I state that the "mean" in "regression toward the mean" is "the mean of error" when I brought up two questions?

877021[/snapback]

 

But that still doesn't equate to "cause". That may just be inexactness due to you trying to explain math to a pinhead (in fact, I'll give you the benefit of the doubt and say it is). But it's just that inexactness that's led to this discussion going on for two thousand posts.

 

What you mean to say, I think, is that the regression of normally distributed error towards the mean of the error can look like regression to the mean of the population. That's not a "causal" relationship...in fact, it's really not any sort of relationship, since you're talking about two independent statistical distributions. It just appears to be, to people like HA who think they're mathematicians yet can't define terms like "variance".

Link to comment
Share on other sites

What you mean to say, I think, is that the regression of normally distributed error towards the mean of the error can look like regression to the mean of the population. 

877066[/snapback]

No, I'm not talking about mean of population or population distribution. I'm only talking about error distribution and normally distributed error causing "regression toward the mean (of error)" phenomenon.

 

You need to explain things step by step. It seems like HA agrees that not all errors cause regression toward the mean. Although he and I may not be referring to the same definition of "mean", that's not the point of my example, which is whether errors always cause regression toward the mean. In my example, the "mean of population" and the "mean of error" are the same, so we can focus on the effect of the error distribution. It looks like the discussion can now move to the next topic (for example, mean of error vs. mean of population).

 

Anyway, I'm done with the discussion and have already let HA know what I was trying to show him. Now, let's go back to the usual name-calling shouting match.

Link to comment
Share on other sites

But that still doesn't equate to "cause". That may just be inexactness due to you trying to explain math to a pinhead (in fact, I'll give you the benefit of the doubt and say it is). But it's just that inexactness that's led to this discussion going on for two thousand posts.
I've been trying to explain math to a pinhead for at least the last 50 pages. It's been less than fun.

 

You're right in saying that "inexactness" has led to this discussion becoming exceptionally long. To say, "you're too stupid to know the difference between error and variance," translates into, "I think the difference between error and variance somehow undermines some portion of what you're saying. But because I'm an inconsiderate jerk who isn't afraid of a little board pollution, I won't bother giving specifics about which portions of Arm's examples are supposed to be incorrect. In fact, specifics aren't that important to me anyway. If I say often enough that Arm is an idiot who doesn't understand statistics, people will believe me. This, even if they don't understand the underlying debate. Especially if they don't understand the underlying debate."

 

The bottom line is that you've mocked me for saying the same thing people at Stanford and the University of Chicago are saying. This is a reflection on your credibility, not mine.

Link to comment
Share on other sites

I've been trying to explain math to a pinhead for at least the last 50 pages. It's been less than fun.

 

You're right in saying that "inexactness" has led to this discussion becoming exceptionally long. To say, "you're too stupid to know the difference between error and variance," translates into, "I think the difference between error and variance somehow undermines some portion of what you're saying. But because I'm an inconsiderate jerk who isn't afraid of a little board pollution, I won't bother giving specifics about which portions of Arm's examples are supposed to be incorrect. In fact, specifics aren't that important to me anyway. If I say often enough that Arm is an idiot who doesn't understand statistics, people will believe me. This, even if they don't understand the underlying debate. Especially if they don't understand the underlying debate."

 

The bottom line is that you've mocked me for saying the same thing people at Stanford and the University of Chicago are saying. This is a reflection on your credibility, not mine.

 

No, it's a reflection on your math skills. You haven't been trying to explain math at all...you haven't used any. :doh: You don't understand what Stanford and UC's web sites (not "Stanford and UC," you moron) are saying. If you could derive the effect MATHEMATICALLY, you'd know exactly what said pages are trying to say and why you misunderstand them. You can't.

Link to comment
Share on other sites

It seems like we also agree on this one.

 

Under normal circumstances, a normally distributed error would cause "regression toward the mean". When talking about math or statistics, in theory, errors don't always cause "regression toward the mean".

 

Thus, you can NOT simply say "errors cause regression toward the mean", since this statement is not always true. You have to be more specific and state the conditions when your statement is true. Be more scientific, for example,

 

"In normal circumstances, errors cause regression toward the mean"

or

"When the error is normally distributed, error causes regression toward the mean".

It sounds like you've got a firm grasp of this issue. You're right to say there are circumstances where measurement error would not be associated with regression towards the population's mean. But the example that started the whole debate was an example of I.Q. test scores. I argued that the average person who scores a 140 on an I.Q. test will, on average, obtain a somewhat lower score upon being retested. Bungee Jumper called me an idiot who doesn't understand the first thing about statistics, and the debate was on. Happily, I've found sources like Stanford, the University of Chicago, and others which state that those who obtain very high or very low scores on their first tests tend to regress somewhat towards the population's mean upon being retested. An intelligent, unbiased person paying attention to this debate from start to finish would clearly realize I've long since won.

 

Bungee Jumper's notion of "regression toward the mean of error" can seem seductive. It's true that whether a person gets lucky or unlucky on the first test, that person is expected to be luck-neutral upon being retested. However, the group of people who scored above the population mean on the first test will typically contain more lucky people than unlucky. Likewise, the group of people who scored below the population's mean will typically contain more unlucky people than lucky people. (I'm assuming a normally distributed population, and a normally distributed measurement error term.) If you were to retest everyone who scored above the population's mean, or any given subset of that group, you would find that those whom you retested would see their scores regress somewhat toward the population's mean. It's the population mean that's being referred to in the phrase, "regression toward the mean," and not the mean of the error.
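
The lucky/unlucky argument above can be tried out with a small simulation. This is a rough sketch, not a derivation. The population parameters (mean 100, sd 15), the error sd of 5, the cutoff of 130 and the fixed random seed are all assumptions chosen for illustration, and "lucky" is defined here as having a positive error on the first test.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
POP_MEAN, POP_SD, ERR_SD = 100.0, 15.0, 5.0  # assumed parameters
CUTOFF = 130.0          # retest only people who scored above this

first_scores, retest_scores, lucky = [], [], 0
for _ in range(200_000):
    true_iq = rng.gauss(POP_MEAN, POP_SD)
    first_error = rng.gauss(0.0, ERR_SD)
    first = true_iq + first_error
    if first > CUTOFF:
        first_scores.append(first)
        # The retest draws a fresh, independent error term.
        retest_scores.append(true_iq + rng.gauss(0.0, ERR_SD))
        lucky += first_error > 0

n = len(first_scores)
first_mean = sum(first_scores) / n
retest_mean = sum(retest_scores) / n
lucky_share = lucky / n

print(n, round(first_mean, 1), round(retest_mean, 1), round(lucky_share, 2))
```

In a run like this the selected group contains more lucky than unlucky people, and its average retest score falls between its average first score and the population mean of 100. The retest errors themselves still average about zero, so the drop comes from selecting on a score that includes error.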

Link to comment
Share on other sites

Mr. Mohra: So, I'm tendin' bar there at Ecklund and Swedlin's last Tuesday and this little guy's drinkin' and he says, "So where can a guy find some action? I'm goin' crazy out there at the lake." And I says, "What kinda action?" and he says, "Woman action, what do I look like?" And I says, "Well, what do I look like, I don't arrange that kinda thing," and he says, "I'm goin' crazy out there at the lake," and I says, "Well, this ain't that kinda place."

 

Officer Olson: Uh-huh.

 

Mr. Mohra: So he says, "So I get it, so you think I'm some kinda jerk for askin'," only he doesn't use the word jerk.

 

Officer Olson: I understand.

 

Mr. Mohra: And then he calls me a jerk and says the last guy who thought he was a jerk was dead now. So I don't say nothin' and he says, "What do ya think about that?" So I says, "Well, that don't sound like too good a deal for him then."

 

Officer Olson: Ya got that right.

 

Mr. Mohra: And he says, "Yah, that guy's dead and I don't mean of old age." And then he says, "Geez, I'm goin' crazy out there at the lake."

 

Officer Olson: White Bear Lake?

 

Mr. Mohra: Well, Ecklund & Swedlin's, that's closer ta Moose Lake, so I made that assumption.

 

Officer Olson: Oh sure.

 

Mr. Mohra: So, ya know, he's drinkin', so I don't think a whole great deal of it, but Mrs. Mohra heard about the homicides down here and she thought I should call it in, so I called it in.

 

End o' story.

Link to comment
Share on other sites

No, it's a reflection on your math skills. You haven't been trying to explain math at all...you haven't used any. :doh: You don't understand what Stanford and UC's web sites (not "Stanford and UC," you moron) are saying. If you could derive the effect MATHEMATICALLY, you'd know exactly what said pages are trying to say and why you misunderstand them. You can't.

Once again, you're trying to convince people I must have somehow misunderstood something or another, while providing no specifics on what this something might actually be. Quite frankly, if you try to provide specifics, you'll once again fall on your face. I don't blame you for remaining intentionally vague.

 

As far as the math angle goes, I've provided a lot more math to this discussion than you. You just didn't understand it.

Link to comment
Share on other sites

Once again, you're trying to convince people I must have somehow misunderstood something or another, while providing no specifics on what this something might actually be. Quite frankly, if you try to provide specifics, you'll once again fall on your face. I don't blame you for remaining intentionally vague.

 

As far as the math angle goes, I've provided a lot more math to this discussion than you. You just didn't understand it.

 

I have provided specifics. And math. Regression is a function of the variance in a statistical distribution. Period. That's specific, and it's math. And it's right. And it's exactly what you haven't been saying.

Link to comment
Share on other sites
