Thursday, September 6, 2007

Statistical Fallacies For Political Purposes

Most people have trouble interpreting statistics, and that's understandable. Statistical analysis is largely taught in college to the few students whose majors require it. But Dr. Ileana Arias, director of the CDC's National Center for Injury Prevention and Control, should know better. I would assume that the top people at the CDC, who deal with statistics daily, know how to interpret them and are aware of what is significant and what is not. Guess again.
Today they came out with a story saying that:
The suicide rate among preteen and teenage girls rose to its highest level in 15 years, and hanging surpassed guns as the preferred method, federal health officials reported Thursday.
The biggest jump — about 76 percent — was in the suicide rate for girls ages 10-14 from 2003 to 2004. There were 94 suicides in that age group in 2004, compared to 56 in 2003. That's a rate of fewer than one per 100,000 population.
For full story:
http://www.foxnews.com/story/0,2933,295953,00.html

I know something about statistics: I took advanced courses in it as a graduate student and applied it in my dissertation. When I read this, I knew immediately that something else was going on, because the "jump" from 56 to 94 suicides in that population is insignificant. It is not even noteworthy, given that there are approximately 10,000,000 girls ages 10-14 in the US. To check my gut feeling, I ran the numbers through a t-test for significance.
For those of you who know what it all means, the t value in this case is 116,343. Given a significance level of .0005, this value does not even register as a minor blip.
To put it another way, the number of suicides is so small compared with the population that an increase or decrease of a similar number has no real meaning; it is well within the margin of error.
Furthermore, teen girl suicide (while it is very real for the families who suffer it, and I commiserate with them) is quite rare if you put it into perspective. The average person personally knows about 500 people, so only one person in 200 even vaguely knows a girl who has committed suicide.
There is a further mistake in the article, though I suspect the journalist who wrote it is to blame: the statement "...a rate of fewer than one per 100,000 population."
No, it is fewer than one per 100,000 girls ages 10-14. In the general population, it is about one per 3 million people.
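For anyone who wants to check the arithmetic behind those two rates, here is a minimal Python sketch. It uses the 94 suicides quoted from the article and the rough figure of 10,000,000 girls ages 10-14 given above. The total US population of 300 million is a round number assumed here for illustration; it does not come from the article.

```python
# Figures quoted in the post; the US total is an assumed round number.
SUICIDES_2004 = 94             # girls ages 10-14, from the quoted article
GIRLS_10_TO_14 = 10_000_000    # the post's approximation
US_POPULATION = 300_000_000    # assumption: roughly the 2007 US population


def rate_per_100k(events, population):
    """Events per 100,000 members of the given population."""
    return events / population * 100_000


def one_in_n(events, population):
    """Express the same rate as 'one per N people'."""
    return population / events


girls_rate = rate_per_100k(SUICIDES_2004, GIRLS_10_TO_14)
general_one_in = one_in_n(SUICIDES_2004, US_POPULATION)

print(f"{girls_rate:.2f} per 100,000 girls ages 10-14")
print(f"about 1 per {general_one_in:,.0f} people in the general population")
```

Under these assumptions the first figure comes out just under one per 100,000 girls, and the second comes out a little over one per 3 million people.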

There are two possible explanations for why this "jump" is being reported by the CDC. Either Dr. Arias is ignorant of statistical probability and is shocked by the figure, or the CDC has an agenda behind reporting it.
Let's examine the article to see what we find in that direction.
"The CDC is advising health officials to consider focusing suicide-prevention programs on girls ages 10-19 and boys between 15-19 to reverse the trends."

One thing right off the bat: using the word "trend" in a statistical study is unprofessional. Either an increase or decrease is statistically significant, in which case you would say so, or it is not. The word "trend" is used by sloppy social scientists to denote a change in a number that they can't support with statistical analysis. In other words, it is a biased opinion. They want it to mean something that it does not.
Now, I imagine this advice means health officials should stick their noses into children's lives to prevent that one-in-3-million suicide. Given that an individual health worker may have access to perhaps a couple hundred children, it is quite unlikely that he or she would actually come across a potential suicide, even if he or she were capable of spotting one. Meanwhile, health workers would be fretting, opening case files, and intruding where they should not a dozen times a day, every time they saw a child who was sad because her best friend went out with a boy she likes.
