
Regression toward the mean



I'm not disputing the quotes from Stanford or anyone (save Berkeley, which was really weird).  They all say exactly what I've been saying: excessive error will regress to the mean OF THE ERROR.

 

I'm disputing your interpretation of them, that it represents regression to the population mean.  It doesn't.  You don't understand the difference between error and population variance.  Hell, you don't even understand the difference between an individual and a population, apparently.  And you still don't understand what that effect means: it means, simply, that you've arbitrarily picked a sample with high net error.  Period.  That's why I called it a completely fictitious effect earlier - because it only exists if you specifically look for it.  In the limit of an entire population, it doesn't mean a damned thing.

 

But unless you are saying now that this is NOT regression toward the population mean, you're still wrong.  And if you're saying that now...you're trying to weasel out of your earlier idiot statements.

874924[/snapback]

I've remained very consistent throughout this debate. Someone who scored a 140 on an I.Q. test the first time around will, in general, tend to score somewhat closer to the population's mean upon retaking the test. This debate began because you felt the phenomenon described was untrue. The quotes from Stanford, Berkeley, the University of Chicago, the EPA, and the other sources I cited clearly demonstrate that the phenomenon I've been describing for the last 50 pages is real. Your attempts to ridicule me for having described this phenomenon have caused you to look foolish and ignorant.


I've remained very consistent throughout this debate. Someone who scored a 140 on an I.Q. test the first time around will, in general, tend to score somewhat closer to the population's mean upon retaking the test. This debate began because you felt the phenomenon described was untrue. The quotes from Stanford, Berkeley, the University of Chicago, the EPA, and the other sources I cited clearly demonstrate that the phenomenon I've been describing for the last 50 pages is real. Your attempts to ridicule me for having described this phenomenon have caused you to look foolish and ignorant.

875007[/snapback]

 

No, you originally said regression toward the mean was caused by error. Then you changed your story four times. Then you linked to stuff you still don't understand, because you still don't know the math (if you did, you'd be using mathematical terms like "variance", and "correlation" - terms that, unlike "luck" and "error", actually do explain regression toward the mean).

 

And that is why you're still wrong, and still an idiot. You can't relate your non-mathematical examples to reality. All you've been doing is parroting other people's writing without any insight whatsoever into what it means. Anyone can parrot other people. It takes actual knowledge to do the math and figure these things out for yourself - which is well within the capabilities of myself and MANY people here, most notably NOT including yourself.

 

 

Never mind the fact that, if you did understand any of it, you'd realize that you've actually proven your original conjecture of a eugenics program to be completely impossible. :w00t: But again, that would require understanding the math.


No, you originally said regression toward the mean was caused by error.  Then you changed your story four times.  Then you linked to stuff you still don't understand, because you still don't know the math (if you did, you'd be using mathematical terms like "variance", and "correlation" - terms that, unlike "luck" and "error", actually do explain regression toward the mean).

 

And that is why you're still wrong, and still an idiot.  You can't relate your non-mathematical examples to reality.  All you've been doing is parroting other people's writing without any insight whatsoever into what it means.  Anyone can parrot other people.  It takes actual knowledge to do the math and figure these things out for yourself - which is well within the capabilities of myself and MANY people here, most notably NOT including yourself. 

Never mind the fact that, if you did understand any of it, you'd realize that you've actually proven your original conjecture of a eugenics program to be completely impossible.  :w00t:  But again, that would require understanding the math.

875044[/snapback]

You talk big about math, and I remember several instances of you promising Monte Carlo simulations. At no point did you deliver on those promises. The only one who's contributed any real math to this debate is me--which I did through my Monte Carlo simulation.

 

Your statement that I've changed my tune even once is dead wrong. Consistently, from the very beginning, I've said the following: suppose an I.Q. test involves measurement error. Someone with a true I.Q. of 130 could get lucky and score a 140, or unlucky and score a 120. Given this (reasonable) assumption, someone who scores a 140 on an I.Q. test is more likely to be a lucky 130 than an unlucky 150. Therefore, people who score 140s on I.Q. tests are expected to, on average, obtain somewhat lower scores upon being retested. I said this in the beginning of the debate, I said it in the middle, and I'm saying it now. Oh, and by the way, I've got articles from Stanford, Berkeley, the University of Chicago, and a number of other sources with which to back up what I've been saying. You imply that I'm ignorant because I use words like "luck" and "error" to explain the phenomenon. How, then, do you explain the fact that the author of the Stanford article also used the words "luck" and "error" when saying exactly the same things I've been saying?
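
The test/retest setup described in this post can be sketched as a small simulation. This is only an illustrative sketch, not the simulation referred to in the thread: the population mean of 100, the true-score SD of 15, the error SD of 5, the 139-141 selection window, and the function name `retest_high_scorers` are all assumptions made here.

```python
import random
from statistics import mean


def retest_high_scorers(n=200_000, pop_mean=100.0, true_sd=15.0, error_sd=5.0,
                        low=139.0, high=141.0, seed=0):
    """Return (mean first score, mean true score, mean retest score)
    for simulated people whose first score fell in [low, high].

    All numeric defaults are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    firsts, trues, retests = [], [], []
    for _ in range(n):
        true_iq = rng.gauss(pop_mean, true_sd)          # underlying ability
        first = true_iq + rng.gauss(0.0, error_sd)      # first test = truth + error
        if low <= first <= high:
            # Retest uses the same true score with fresh, independent error.
            retest = true_iq + rng.gauss(0.0, error_sd)
            firsts.append(first)
            trues.append(true_iq)
            retests.append(retest)
    return mean(firsts), mean(trues), mean(retests)


if __name__ == "__main__":
    first, true_iq, retest = retest_high_scorers()
    print(f"first: {first:.1f}  true: {true_iq:.1f}  retest: {retest:.1f}")
```

Under these assumptions, the group selected for scoring about 140 has an average true score and an average retest score that both sit below 140. Changing `error_sd` or `true_sd` shows how the size of the drop depends on the ratio of error variance to true-score variance.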


You talk big about math, and I remember several instances of you promising Monte Carlo simulations. At no point did you deliver on those promises. The only one who's contributed any real math to this debate is me--which I did through my Monte Carlo simulation.

 

Actually, that was entirely unlike a Monte Carlo simulation. Not that it's the least bit surprising that you don't know what one of those is, either.

 

Your statement that I've changed my tune even once is dead wrong. Consistently, from the very beginning, I've said the following: suppose an I.Q. test involves measurement error. Someone with a true I.Q. of 130 could get lucky and score a 140, or unlucky and score a 120. Given this (reasonable) assumption, someone who scores a 140 on an I.Q. test is more likely to be a lucky 130 than an unlucky 150. Therefore, people who score 140s on I.Q. tests are expected to, on average, obtain somewhat lower scores upon being retested. I said this in the beginning of the debate, I said it in the middle, and I'm saying it now.

 

And you've consistently called it "regression toward the mean", which is wrong. And you've specifically said "regression toward the mean is caused by error", which is wrong.

 

Which means either you've been consistent...and wrong. Or you've been inconsistent...and still wrong. Either way, you're an idiot.

 

Oh, and by the way, I've got articles from Stanford, Berkeley, the University of Chicago, and a number of other sources with which to back up what I've been saying. You imply that I'm ignorant because I use words like "luck" and "error" to explain the phenomenon. How, then, do you explain the fact that the author of the Stanford article also used the words "luck" and "error" when saying exactly the same things I've been saying?

875061[/snapback]

 

I explain it because IT'S NOT MATH. Again, do the math. Link to math. Relate the math to what the articles say (but skip the Berkeley article; as I've said multiple times, it's crap. It's almost as stupid as you are.) This is why you keep devolving into semantic arguments: you refuse to discuss THE MATH. Explain what your linked articles mean...using MATH. I've only been encouraging you for several hundred posts already...but you still can't, because you don't know what "variance" is, or why it's important, or why it's not the same as "error".

 

 

 

And this is hardly the only point you're confused on, by the way. I'm still sitting on about eight more egregious errors you've made. There's just no point in discussing them when you still don't know the basics.


Actually, that was entirely unlike a Monte Carlo simulation.  Not that it's the least bit surprising that you don't know what one of those is, either.

If you understand the way my simulation was set up, and if you understand the definition of a Monte Carlo simulation, you'll know that what I did was in fact a Monte Carlo simulation.

And you've consistently called it "regression toward the mean", which is wrong.

The EPA quote also referred to the phenomenon as "regression toward the mean," as did the statistics textbook quote I found, as did the Hyperstat link.

 

And you've specifically said "regression toward the mean is caused by error", which is wrong.

In the absence of measurement error, the phenomenon I've described would disappear. Someone with a true I.Q. of 140 would score a 140 on an I.Q. test the first time, the second time, and the third time. Nobody who scored a 140 would be a lucky 130 or an unlucky 150.

I explain it because IT'S NOT MATH.  Again, do the math.  Link to math.  Relate the math to what the articles say (but skip the Berkeley article; as I've said multiple times, it's crap.  It's almost as stupid as you are.)

The Berkeley article is correct. As for your demand that I "do the math" the answer is no. I've done enough work already. I created a Monte Carlo simulation. I've found quotes from numerous credible sources which support exactly what I've been saying. If those quotes aren't mathematical enough for you, too bad. Find quotes you like then. They're out there, and they say the same thing I've been saying.

This is why you keep devolving into semantic arguments: you refuse to discuss THE MATH. 

It's you who keeps devolving into semantic arguments by constantly and incorrectly accusing me of not knowing the definitions of specific terms. My arguments have been strictly conceptual, not semantic. You say you want a math discussion, yet almost nothing you've attempted to contribute to this debate has been even remotely mathematical.

 

As far as explaining what my linked articles mean; I've been doing that for dozens of pages now, long before I even found the articles themselves. But I'll do it once again for your benefit. Suppose you're considering those individuals who scored 2 standard deviations above the population mean. Assume measurement error is normally distributed with a mean of zero. There is a probability of X that someone with a true score of 1.9 SDs will get lucky and score 2.0 SDs on the test. Therefore, a given individual with a true score of 2.1 SDs will also have a probability of X of getting unlucky and scoring 2.0 SDs on the test. With me so far? You know and I know that there are more people at 1.9 SDs above the population mean than there are at 2.1 SDs above the mean. The number of lucky 1.9s who scored 2.0 will be X * (the number of people 1.9 standard deviations above the mean). The number of unlucky 2.1s who scored 2.0 will be X * (the number of people 2.1 standard deviations above the mean). Do the same thing again, using a probability of Y, and comparing the lucky 1.8s to the unlucky 2.2s. Then use a probability of Z, and compare the lucky 1.7s to the unlucky 2.3s. In each case, you find that more lucky people are flowing in from below, than unlucky people are flowing in from above. In selecting the group that scored 2 standard deviations above the population mean, you're selecting a group that, collectively, has more people who got lucky on the first test than unlucky. Give those people a second test, and the group's score will be closer to the population's mean.
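
The counting argument in this post (lucky 1.9s versus unlucky 2.1s) can be checked numerically. The sketch below assumes true scores follow a standard normal distribution measured in SD units and that the error distribution is symmetric, so the same probability X applies to an error of +d and of -d and cancels out of the ratio. The function names are hypothetical.

```python
import math


def normal_pdf(x, mu=0.0, sd=1.0):
    """Density of a normal distribution at x."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def inflow_ratio(observed=2.0, offset=0.1):
    """Ratio of people just below `observed` to people just above it.

    With symmetric error, the chance X of being off by +offset equals the
    chance of being off by -offset, so it cancels and only the population
    densities at (observed - offset) and (observed + offset) remain.
    """
    below = normal_pdf(observed - offset)   # candidates to "get lucky"
    above = normal_pdf(observed + offset)   # candidates to "get unlucky"
    return below / above


if __name__ == "__main__":
    for d in (0.1, 0.2, 0.3):
        print(f"offset {d}: lucky-to-unlucky ratio = {inflow_ratio(2.0, d):.2f}")
```

For a standard normal population the ratio works out to exp(2 * observed * offset), which is greater than 1 for any observed score above the mean. That is the "more people flowing in from below than from above" step of the argument, under the stated assumptions.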


In the absence of measurement error, the phenomenon I've described would disappear. Someone with a true I.Q. of 140 would score a 140 on an I.Q. test the first time, the second time, and the third time. Nobody who scored a 140 would be a lucky 130 or an unlucky 150.

875193[/snapback]

"Regression toward the mean" is caused by error being normally distributed. If error is not normally distributed, there's no guarantee that "regression toward the mean" will happen.

 

In other words, if error exists but is not normally distributed, "regression toward the mean" may not happen.
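
One way to probe this claim is to rerun a test/retest simulation with an error distribution that is not normal. The sketch below is an illustration only: it assumes true scores drawn from a normal distribution (mean 100, SD 15), errors drawn uniformly from -10 to +10, and a selection cutoff of 130. The function name `uniform_error_retest` is hypothetical.

```python
import random
from statistics import mean


def uniform_error_retest(n=200_000, pop_mean=100.0, true_sd=15.0,
                         half_width=10.0, cutoff=130.0, seed=3):
    """Return (mean first score, mean retest score) for people whose
    first score was at or above `cutoff`, using uniform (non-normal) error.

    All numeric defaults are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    firsts, retests = [], []
    for _ in range(n):
        true_iq = rng.gauss(pop_mean, true_sd)
        first = true_iq + rng.uniform(-half_width, half_width)
        if first >= cutoff:
            firsts.append(first)
            # Fresh, independent uniform error on the retest.
            retests.append(true_iq + rng.uniform(-half_width, half_width))
    return mean(firsts), mean(retests)


if __name__ == "__main__":
    first, retest = uniform_error_retest()
    print(f"mean first score: {first:.1f}  mean retest score: {retest:.1f}")
```

In this sketch the selected group's retest average still comes out below its first-test average even though the error is uniform, which suggests that, at least under these assumptions, the effect does not depend on the error being normally distributed. Other error shapes can be tried by swapping out the `rng.uniform` calls.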


If you understand the way my simulation was set up, and if you understand the definition of a Monte Carlo simulation, you'll know that what I did was in fact a Monte Carlo simulation.

In the absence of measurement error, the phenomenon I've described would disappear. Someone with a true I.Q. of 140 would score a 140 on an I.Q. test the first time, the second time, and the third time. Nobody who scored a 140 would be a lucky 130 or an unlucky 150.

 

It's you who keeps devolving into semantic arguments by constantly and incorrectly accusing me of not knowing the definitions of specific terms. My arguments have been strictly conceptual, not semantic. You say you want a math discussion, yet almost nothing you've attempted to contribute to this debate has been even remotely mathematical.

 

As far as explaining what my linked articles mean; I've been doing that for dozens of pages now, long before I even found the articles themselves. But I'll do it once again for your benefit. Suppose you're considering those individuals who scored 2 standard deviations above the population mean. Assume measurement error is normally distributed with a mean of zero. There is a probability of X that someone with a true score of 1.9 SDs will get lucky and score 2.0 SDs on the test. Therefore, a given individual with a true score of 2.1 SDs will also have a probability of X of getting unlucky and scoring 2.0 SDs on the test. With me so far? You know and I know that there are more people at 1.9 SDs above the population mean than there are at 2.1 SDs above the mean. The number of lucky 1.9s who scored 2.0 will be X * (the number of people 1.9 standard deviations above the mean). The number of unlucky 2.1s who scored 2.0 will be X * (the number of people 2.1 standard deviations above the mean). Do the same thing again, using a probability of Y, and comparing the lucky 1.8s to the unlucky 2.2s. Then use a probability of Z, and compare the lucky 1.7s to the unlucky 2.3s. In each case, you find that more lucky people are flowing in from below, than unlucky people are flowing in from above. In selecting the group that scored 2 standard deviations above the population mean, you're selecting a group that, collectively, has more people who got lucky on the first test than unlucky. Give those people a second test, and the group's score will be closer to the population's mean.

875193[/snapback]

 

You know what's fascinating? I don't even have to read this to know it's wrong. You keep repeating the same ignorant nonsense over and over, and the response is always the same: regression toward the mean occurs because of statistical variance, and error and variance are not the same thing, and until you can define "variance" (and from that, define "correlation", from which you can define "regression") you don't know what you're talking about. Period.

 

Someone may point out again that we're arguing semantics. We are. Because you continually insist on getting the semantics wrong, and they happen to be important: in getting the semantics wrong, you're confusing two different distributions, and attributing the behavior of one distribution to the other (or even dumber: somehow believing that the two distributions are dependent when they're completely independent). But like I've been saying: you're too much of an idiot to twig to any of this.


"Regression toward the mean" is caused by error being normally distributed. If error is not normally distributed, there's no guarantee that "regression toward the mean" will happen.

 

In other words, if error exists but is not normally distributed, "regression toward the mean" may not happen.

875207[/snapback]

 

Only of the error. In other words, regression of the error toward the mean of the error is caused by the error being normally distributed (or, more accurately, being not evenly distributed).

 

In other words, an individual's error on a set of tests is normally distributed, and that uneven distribution causes extreme amounts of error by the individual to regress on retest to lesser error...but you can't extrapolate that individual performance to the entire population and say "A-ha! It's the individual's error that causes regression toward the average score of the entire population!" like HA is doing. Not the least of which is because anyone who can do simple integration can check that the sum of all the individuals' error over the entire set of individuals is exactly zero; by definition, error CANNOT cause regression toward the mean like HA is misled to believe.
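
The two quantities being argued over here, the average error across the whole population and the average error within a group selected for high scores, can be computed side by side. The sketch below assumes normally distributed true scores (mean 100, SD 15), zero-mean normal error (SD 5), and a selection cutoff of 130. These values and the function name `error_summary` are assumptions for illustration.

```python
import random
from statistics import mean


def error_summary(n=200_000, pop_mean=100.0, true_sd=15.0, error_sd=5.0,
                  cutoff=130.0, seed=1):
    """Return (mean error over everyone, mean error among people whose
    observed score was at or above `cutoff`).

    All numeric defaults are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    all_errors, selected_errors = [], []
    for _ in range(n):
        true_iq = rng.gauss(pop_mean, true_sd)
        error = rng.gauss(0.0, error_sd)
        all_errors.append(error)
        if true_iq + error >= cutoff:        # selected on the observed score
            selected_errors.append(error)
    return mean(all_errors), mean(selected_errors)


if __name__ == "__main__":
    overall, selected = error_summary()
    print(f"mean error, whole population: {overall:+.2f}")
    print(f"mean error, scored >= 130:    {selected:+.2f}")
```

Under these assumptions the error averages out to roughly zero over the whole population, while the subgroup selected for high observed scores carries a positive average error. Both statements hold at once in this sketch; which one is relevant depends on whether the question is about the population or about the selected subgroup.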


Only of the error.  In other words, regression of the error toward the mean of the error is caused by the error being normally distributed (or, more accurately, being not evenly distributed).

875234[/snapback]

Correct.

 

And I notice that in HA's post, he uses assumptions like "Assume measurement error is normally distributed with a mean of zero" to ignore that the "mean" in "regression toward the mean" is "the mean of error" and "regression toward the mean (of error)" may not occur when error is not normally distributed.


Only of the error.  In other words, regression of the error toward the mean of the error is caused by the error being normally distributed (or, more accurately, being not evenly distributed).

 

In other words, an individual's error on a set of tests is normally distributed, and that uneven distribution causes extreme amounts of error by the individual to regress on retest to lesser error...but you can't extrapolate that individual performance to the entire population and say "A-ha!  It's the individual's error that causes regression toward the average score of the entire population!" like HA is doing.  Not the least of which is because anyone who can do simple integration can check that the sum of all the individuals' error over the entire set of individuals is exactly zero; by definition, error CANNOT cause regression toward the mean like HA is misled to believe.

875234[/snapback]

Wrong. If you look at those who scored, say, a 140 on an I.Q. test, they're going to be more lucky than average. Retest that particular group, and on average they'll score closer to the population's mean I.Q. Conversely, the group of people who scored 60s on the I.Q. test will contain more people who got unlucky than people who got lucky. Retest the 60s, and they'll score closer to the population's mean I.Q. the second time around.


HA, please answer the following two simplified questions:

 

(1) Regression toward the mean

 

If the real IQ is 140 and the error is normally distributed with a mean of +5, what will the score regress toward?

 

(A) 140

(B) 145

 

 

(2) abnormally distributed error

 

Assume the real IQ is 140 and the test has an abnormally distributed error as follows:

 

137-: 10%

138: 20%

139: 15%

140: 10%

141: 15%

142: 20%

143+: 10%

 

Will "regression toward the mean" happen?

 

(A) Yes

(B) No

 

My answers are (B) for both questions. What are your answers?


HA, please answer the following two simplified questions:

 

(1) Regression toward the mean

 

If the real IQ is 140 and the error is normally distributed with a mean of +5, what will the score regress toward?

 

(A) 140

(B) 145

You seem to be describing the following:

1. Every member of the population has a real I.Q. of 140.

2. Due to a bias in the I.Q. test, the average person taking the test scores 5 points too high.

3. Therefore, the apparent population mean is 145, and not the real 140.

 

I'll agree that under these circumstances, a test/retest situation would cause I.Q. scores to regress towards the population's measured mean of 145, and not its true mean of 140.

(2) abnormally distributed error

 

Assume the real IQ is 140 and the test has an abnormally distributed error as follows:

 

137-: 10%

138: 20%

139: 15%

140: 10%

141: 15%

142: 20%

143+: 10%

In this case, you seem to be describing a situation where every member of the population has a true I.Q. of 140. However, due to measurement error, 10% of the population scored a 137 on the relevant I.Q. test, 20% scored a 138, etc.

 

Assuming this interpretation of your example is correct, those who scored a 137 on the test the first time around should expect to score a 140 on retaking the test. (This is an average expectation, as 10% of them will score a 137, 20% a 138, etc.) As a group, those who scored a 137 the first time around will regress toward the population's mean I.Q. of 140. Look at those who scored a 143 on that I.Q. test. The mean I.Q. for that group is also 140 (if I understand your example correctly). On being retested, 10% of them will score a 137, 20% of them will score a 138, etc. As a group, those who scored 143 the first time around will, on average, regress towards the population's mean I.Q. of 140 upon being retested.
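
The interpretation in this reply can be sketched directly: everyone has a true I.Q. of 140, and each test result is drawn independently from the stated distribution. Treating "137-" as exactly 137 and "143+" as exactly 143 is an assumption carried over from the reply, and the function name `mean_retest_by_first_score` is hypothetical.

```python
import random
from statistics import mean

# Score distribution from the question, with the open-ended bins
# "137-" and "143+" treated as exactly 137 and 143 (an assumption).
SCORES = [137, 138, 139, 140, 141, 142, 143]
WEIGHTS = [10, 20, 15, 10, 15, 20, 10]   # percentages


def mean_retest_by_first_score(n=100_000, seed=2):
    """Map each first-test score to the average retest score of the
    people who got that first score. Both tests are independent draws."""
    rng = random.Random(seed)
    first = rng.choices(SCORES, weights=WEIGHTS, k=n)
    second = rng.choices(SCORES, weights=WEIGHTS, k=n)
    by_first = {score: [] for score in SCORES}
    for a, b in zip(first, second):
        by_first[a].append(b)
    return {score: mean(values) for score, values in by_first.items()}


if __name__ == "__main__":
    for score, avg in mean_retest_by_first_score().items():
        print(f"first score {score}: mean retest {avg:.2f}")
```

Because the distribution is symmetric around 140 and the two tests are independent in this sketch, every first-score group averages close to 140 on the retest, including the 137 and 143 groups.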


(HA, let me rephrase the questions to see if they're clearer to you.)

HA, please answer the following two simplified questions:

 

(1) Regression toward the mean

 

I see you agree with me on this case, so I won't repeat the question here. Can we conclude that the "mean" in "regression toward the mean" is "the mean of error"?

 

 

(2) abnormally distributed error

 

Assume the real IQ is 140 and the test has an abnormally distributed error as follows:

 

(Again, the real IQ is known from other, more accurate tests. We then take another test with an abnormally distributed error. I've changed the error distribution again to make it extremely abnormal.)

 

135: 50%

145: 50%

 

Will "regression toward the mean" happen?

 

(A) Yes

(B) No

 

My answers are (B) for both questions. What are your answers?


Given this (reasonable) assumption, someone who scores a 140 on an I.Q. test is more likely to be a lucky 130 than an unlucky 150. Therefore, people who score 140s on I.Q. tests are expected to, on average, obtain somewhat lower scores upon being retested.

875061[/snapback]

And you keep insisting on this point? Jeez. Could you picture what would happen if the variance of the retests of these 140s were not regressing toward the mean?

......................

.....................

....................

...................

..................

.................

.................

................

...............

..............

.............

............

...........

..........

.........

........

.......

......

.....

....

...

..

.

 

OK, now that you've imagined it, realize that this is reality, take note of what the others are saying, and try to learn from your mistakes and theirs alike.


My TSW Christmas Wish is for one of the mods to put a long-overdue, merciful end to this madness :w00t:

875296[/snapback]

 

Screw that. I have said it before and I'll say it again: this thread is the gift that keeps on giving. I have had a frustrating week at work and Tom has brightened my day each time I get caught up with the latest :P


Wrong. If you look at those who scored, say, a 140 on an I.Q. test, they're going to be more lucky than average. Retest that particular group, and on average they'll score closer to the population's mean I.Q. Conversely, the group of people who scored 60s on the I.Q. test will contain more people who got unlucky than people who got lucky. Retest the 60s, and they'll score closer to the population's mean I.Q. the second time around.

875266[/snapback]

 

Why do you not understand that just because a given value A regresses toward B, which is coincidentally in the direction of another given value C, it does not mean it's regressing toward C? What is your mental deficiency that you don't understand this?

 

You see, "regression" in this context (statistics) has a very specific definition. Again, it has to do with "variance", that pesky little vocabularium you keep ignoring to your detriment (and yes, "vocabularium" is actually a real word.) And "variance" itself has a very specific definition, involving among other things the statistical distribution in which it exists. What you are doing is taking the variance of one distribution and attributing it to a completely different distribution, in the mistaken belief that they're equivalent. But they're not...as I keep saying, measurement error and variance are NOT THE SAME THING.

 

Is this sinking in yet? Because after a hundred pages of watching you flounder, I've decided it's finally time to give you just a little hint at what MATHEMATICS is, versus stupid little vignettes from Stanford and Berkeley. I've been hoping that just MAYBE you'd stumble across it yourself...but you're bound and determined to remain wilfully ignorant.

 

Not that that'll change with this post either...but you're getting boring. You were much more entertaining when you were varying your bull sh-- explanations between "a rubber band stretches because of error" and "a die has a true value of 3.5" and pretending they meant something.


Correct.

 

And I notice that in HA's post, he uses assumptions like "Assume measurement error is normally distributed with a mean of zero" to ignore that the "mean" in "regression toward the mean" is "the mean of error" and "regression toward the mean (of error)" may not occur when error is not normally distributed.

875238[/snapback]

 

Which is a perfectly valid assumption. But as I say in the above post, he keeps confusing that variance with the population variance (more rigorously, he confuses the regression of an individual score with extreme error towards the mean error with the regression of an individual score toward the population mean...just because they're both in the same direction. Which is basically the same as confusing the two distributions...but FAR more idiotic.)

 

What's utterly phenomenal is that, after all this, he still can't see it.


Which is a perfectly valid assumption.  But as I say in the above post, he keeps confusing that variance with the population variance (more rigorously, he confuses the regression of an individual score with extreme error towards the mean error with the regression of an individual score toward the population mean...just because they're both in the same direction.  Which is basically the same as confusing the two distributions...but FAR more idiotic.) 

 

What's utterly phenomenal is that, after all this, he still can't see it.

875341[/snapback]

It's a valid assumption, but may not be a good assumption to explain what you said above. To explain the difference to him, I think one of the better ways is to show him the case where mean of error is not zero. With the assumption of normally distributed error with mean of zero, it's kind of hard to explain the difference of the two.


It's a valid assumption, but may not be a good assumption to explain what you said above. To explain the difference to him, I think one of the better ways is to show him the case where mean of error is not zero. With the assumption of normally distributed error with mean of zero, it's kind of hard to explain the difference of the two.

875356[/snapback]

 

Great explanation in the posts above. But you know the moron is going to keep parroting his "there are more lucky 130s than unlucky 150s, so it will regress to the population mean" B.S.
