Posts tagged 〈Statistics〉 (10)

How Many Conclusions Would Be Shaken?

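【2016-07-25】

@whigzhou: One problem with research driven mainly by statistical methods is that it easily leads people to overlook factors that are fundamentally important but show little statistical variation. Take height: in a society where children's nutrition is generally assured, researchers may conclude that "nutrition is not an important factor in height," and that conclusion may hold up for many years, until one day some population goes through a bout of severe malnutrition...

@whigzhou: In a controlled experiment this problem can be avoided by intervening on the nutrition parameter. The social sciences, however, often lack the conditions for intervening on parameters at will and can only use statistical methods to simulate a controlled experiment. When the sampled values of some variable lack diversity, that simulation cannot be carried out, and a blind spot is left behind.

@whigzhou: In recent years there have been many country-level studies in political science that quantify lots of indicators and handle statistical tools quite skillfully, but I always have the feeling that some basic background conditions have not received enough attention. For example, a basic background of every country's politics since the Napoleonic Wars has been the existence of the British or the American empire. This condition is so universal and so stable that no variation in it can be observed. Once it is removed, how many conclusions would be shaken?

@whigzhou: What makes the problem trickier are variables that do vary enough but whose "marginal effect drops off sharply beyond some threshold." Take calcium intake and height: in the range "from zero to the adequate level," calcium intake has a significant effect on height, while above the adequate level the marginal effect falls steeply to almost nothing. Here it is even easier to reach a wrong conclusion.

@慕容飞宇gg: Yes. Comparisons between public and private schools have a similar problem: the existing conclusions apply only against the current background in which 90% of students attend public schools. For an academia dominated by liberals (李伯儒), that background is taken for granted.

@whigzhou: Mm.

@whigzhou: We often hear statements like "60% of the variation in some trait is attributable to genes and 40% to environment," as if that ratio were some intrinsic constant. In fact these ratios depend heavily on the living conditions of the target population: eliminate all lead pollution in a population, and the environmental "share" of intelligence drops at once.

@whigzhou: This is the theme Taleb's The Black Swan set out to discuss, but he was too clumsy about it: he wrote a thick brick of a book that looks very philosophical and still did not make the point clear.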
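The blind spot described in this thread can be seen in a toy simulation. The sketch below is only an illustration under invented assumptions: nutrition is a number where 1.0 means "adequate," height gains 20 cm as nutrition rises from 0 to 1.0 and nothing beyond that, and the noise has a 6 cm standard deviation. None of these figures come from the thread. The point is that a well-fed sample shows almost no correlation even though nutrition is, by construction, decisive.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate(n, lo, hi, seed=0):
    """Draw n people with nutrition uniform on [lo, hi]; 1.0 means 'adequate'.

    Height rises with nutrition up to 1.0 and is flat above it.
    All numbers are invented for illustration.
    """
    rng = random.Random(seed)
    nutrition = [rng.uniform(lo, hi) for _ in range(n)]
    height = [150 + 20 * min(x, 1.0) + rng.gauss(0, 6) for x in nutrition]
    return nutrition, height


# Society A: everyone is at or above the adequate level.
well_fed = pearson(*simulate(5000, 1.0, 2.0))
# Society B: part of the population is malnourished.
mixed = pearson(*simulate(5000, 0.0, 2.0))

print(f"correlation, well-fed society: {well_fed:+.3f}")
print(f"correlation, mixed society:    {mixed:+.3f}")
```

The same saturating curve produces both results; only the range of sampled nutrition values differs, which is the situation the thread describes for calcium intake.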
[Translation] Narrow Tails on the Bell Curve

Tails of Great Soccer Players 伟大足球运动员的窄尾分布 作者:Jacob @ 2015-11-19 译者:Veidt(@Veidt) 校对:Drunkplane(@Drunkplane-zny) 来源:Put A Number On It!,http://putanumonit.com/2015/11/10/003-soccer1/ Isn’t it strange that the Chinese aren’t world champions in every single team sport? Here’s why it’s strange: China has 19% of the world’s population. For individual sports that may not be a huge deal: if tennis ability and opportunity are distributed equally around the world, there would be only a 19% chance that the best tennis player hails from China and 81% that he is Swiss, Serbian, Spanish, Scottish or from any other country. It is somewhat surprising seeing the top 5 superior servers and strikers of soft springy spheres with swings of stringed racquets all come from sovereign states that start with “S”, but that’s a separate story. 中国没能在所有团队运动项目中成为世界冠军着实是件奇怪的事情。这之所以奇怪,是因为中国拥有着全世界19%的人口。对于个人运动项目来说,也许这个数字还并不算太大:如果打网球的能力和机会在全球均等地分布,那么全世界最好的网球运动员来自中国的概率仅有19%,而他来自于瑞士,塞尔维亚,西班牙,苏格兰或者任何其它国家的概率则有81%。全世界最具统治力的5名网球选手都来自于国名以“S”开头的国家这件事情的确有点令人吃惊,但那是另一件事情。 In team sports that should be different. If soccer talent was equally spread China should have on average 19 of the top 100 players in each generation, almost never less than 11. Countries like Spain, Germany and France on the other hand would expect to have 1 player in the top 100, maybe 2 or 3 if they’re lucky. That would be no match for the loaded Chinese squad. Even a top 3 player can’t dominate all by himself in a team-based sport like soccer, as evidenced by the below picture of sad Ronaldo. 在团队运动中情况则完全不同。如果踢足球的天赋在世界上均等地分布,那么平均而言,在每一代世界上最好的100名球员中,中国会拥有19个,而这个数字几乎绝不可能低于11。另一方面,西班牙,德国和法国这些国家则通常只会有1名球员进入全球前100名,即使幸运的话也最多只有2或3名。而他们的队伍应该完全无法与皆由精英组成的中国队抗衡。毕竟,即使是排名世界前3的球员也无法在足球这样的一项团队运动中靠一己之力统治比赛,下图中C罗悲伤的表情充分证明了这一点。 And yet, the Chinese team is not good at soccer, and I’m putting that milder than some. 
The Chinese men’s national soccer team is ranked 84th in the world, a few spots below Antigua and Barbuda – a nation with a population of 90,000. That’s roughly equal to a single neighborhood in Shanghai. 但实际上中国足球队的水平并不高,而我的这种表述方式已经比一些人温和得多了。中国男子国家足球队的世界排名是第84位,他们的积分比安提瓜和巴布达还要低上几分,而这个国家的人口仅有9万,几乎只相当于上海的一个街区。 Motivation is often brought up as an explanation: perhaps the Chinese have the talent and opportunity to play soccer, but all 1.3 billion of them choose not to. Perhaps instead of playing soccer they choose to study. Those that play soccer the least and study the most can go into medicine, and those that study hardest of all and have no room for soccer make it into top medical schools in the US. 常被提到的一个理由是动力不足:也许中国人拥有踢足球的天赋和机会,但是13亿中国人却选择不去踢。也许他们宁愿把时间花在学习上。那些踢球踢得少,读书读得多的孩子可以去学医,而那些在学习上最用功以至完全没时间踢球的孩子将在未来进入美国最好的医学院。 Certainly we don’t expect those Chinese to play soccer at all, and yet below is a group photo of the Emory University medical school soccer club. The summer I was there we played at least 4 hours a week. You can easily find me on the photo, I’m one of three non-Chinese people on the team. 显然我们不会相信中国人完全不踢足球,下面是一组埃默里大学医学院足球俱乐部的照片。在那个夏天,我每周至少在那里踢上4个小时足球。你可以轻松地在照片上找到我,我是那支球队里仅有的三名非华人球员之一。 The success of a national soccer team should depend on two factors: the pool of available players (population) and some combination of natural talent, infrastructure and opportunity that determine roughly how successful an average person in that country can be at soccer. I’ll call the combined second thing national soccer affinity, and will immediately note that it’s a huge simplification to throw so many disparate things into a single factor. 一支国家足球队的成功主要依赖于以下两个因素:可供他们选择的球员人数,还有某种天赋、基础设施和机遇的组合,这大体上决定了这个国家的普通人能在足球方面所能达到的平均高度。在后文中我会将这种组合称作一个国家的“足球亲和性”,并会很快提到将如此多不相关的东西整合到一个因子里实际上是一种极大的简化做法。 My goal is to separate the effects of population, so affinity is basically everything that’s independent of a country’s total size. 
I am making no guesses regarding the components of soccer affinity (maybe it’s all about having enough sunshine days for kids to play outdoors), only in the comparison between countries. The question I want to investigate is: 我的目标是将人口因素单独分离出来,所以“亲和性”这个概念基本就是所有与一个国家的人口数量不相关的因素。我也不会对“足球亲和性”这个概念的具体组成做任何的猜测(也许它只涉及有足够多晴朗的日子让孩子们在室外踢球),而仅仅是在国家之间进行比较。我想探索的问题是: Relative to their population, which countries are the best and worst at soccer? And why? 相对于其人口数量,哪些国家在足球方面做得最好?而哪些国家又做得最差?为什么? #165-3 If we imagine that soccer affinity is normally distributed, a country’s population is the size of the bell curve and the national affinity is how far to the right on the ability axis the center of the bell curve is. The level of a country’s national team is how far on the ability axis the best 11 men and women are. 如果我们假设“足球亲和性”这个因子服从正态分布,一国的人口就是钟形曲线的面积,而一个国家的“足球亲和性”则可以被定义为钟形曲线的中心线在能力轴上的投影与原点之间的距离。而该国国家队的水平则取决于该国最优秀的11名男球员和女球员在能力轴上所处的位置。 Clearly, having a larger bell curve (more people at every level of play) and shifting the curve to the right (better players on average) should both contribute to boosting the level of the national team. The fact that there are over 15,000 Chinese for each Antiguan, and yet the soccer teams are comparable in level, presents the following puzzle: 很显然,拥有一个面积更大的钟形曲线(在各种水平上都拥有更多的人口)以及让钟形曲线向右移动(更高的球员平均水平)都有助于提升一国国家队的水平。而中国的人口是安提瓜人口的15000倍,但这两国的国家队水平却处于同一档次这一事实则向我们提出了如下的难题: Why does it seem that national team level depends on affinity much more than on population? 为什么国家足球队的水平对“足球亲和性”的依赖程度要远远高于对人口的依赖程度? The answer to that puzzle is: Because the tails of a normal distribution fall much faster than you think. 而这个问题的答案是:因为一个正态分布的尾部下降的速率比你想象的要快得多。 In plain(er) English: every point on a bell curve is some distance away from the middle (the mean). The further away from the mean you go the less points there are (lower curve). These distances are often measured in standard deviations, or SD, shown by the vertical red lines on the picture. 
On a standard bell curve, just over 68% of the points are found a distance of less than 1 SD from the mean in either direction. 更直白的就是:钟形曲线上每个点和中心(也就是平均值)都存在一个距离。与平均值的距离越远,这个水平上的点数也就越少(在曲线上就越低)。而与中心的距离通常是以标准差计的(在图中用红色的垂直线条表示)。在一个标准的钟形曲线上,有68%的点都会落在均值两端一个标准差的距离之内。 #165-4 Looking naively at the familiar bell picture, it seems that the curve drops sharply over the first 2 or 3 SD to either side and then levels off around 0 when you move further away. That’s extremely misleading: the relative height of the curve actually drops faster the further out you go. It’s invisible on the chart because the line further than 3 SD out is squished very close to 0. The height of the curve at 1 SD is 4.5 times higher than that at 2 SD. The curve at 5 SD is 250 times higher than that at 6 SD and it keeps getting steeper and steeper. 如果我们直观地看一下这条熟悉的钟形曲线,看起来曲线两端在距离中心最初的两三个标准差内下降得非常快,而在之后更远的距离上就会在零附近以一种接近水平的方式缓慢下降。而这实际上会造成巨大的误导:事实上,距离中心越远,曲线的相对高度下降的速度越快。但由于在3个标准差之外,曲线被压缩到了非常接近0的高度,所以在图上我们看不到。曲线上1标准差处的高度是2标准差处的4.5倍,而5标准差处的高度则是6标准差处的250倍,而随着离中心越来越远,曲线的陡峭程度还在不断上升。 The best male soccer player in China (Zheng Zhi?) is almost literally one in a billion, which means that he’s almost 6 standard deviation better than the average Chinese. If the population of China doubled (they’re working on it!), there would be 2 players as good as Zheng is. However, if the population of China became just one standard deviation better at soccer, there would be over 200 players at least as good, and a few dozen who are much better. 中国最好的男性球员(是郑智吗?)在中国差不多是十亿里挑一了,这意味着他的水平比中国人的平均足球水平要高6个标准差。如果中国的人口增加一倍(他们的确在努力这么干!),那么中国将会出现两个和郑智一样优秀的球员。然而,如果中国人的平均足球水平能够提高一个标准差的话,那么中国就会有超过200名球员和郑智水平一样高了,而且还会有几十名球员的水平比他高得多。 It could be that a normally distributed soccer skill model is wholly wrong, but it does seem to explain some of what we see in reality. 
For anything that’s distributed roughly like a bell curve, the quality of the best people in a large enough group (like a country) depends much more on small differences in the average level than on large differences in total population. Hey, I wonder if that’s why so many Nobel prize winners are… *gets repeatedly electrocuted* 实际上这个正态分布的足球水平模型可能是完全错误的,但是它看起来的确解释了一些我们在现实中观察到的现象。对于任何一个分布接近钟形曲线的群体,在一个足够大的群体(比如一个国家)中,水平最高者的能力更多地取决于平均水平上的微小差异,而人口总数上的巨大差异所发挥作用则要小得多。嘿,现在我开始怀疑这就是为什么如此多的诺贝尔奖得主都死于触电的原因了。 Whoops, sorry about that. Let’s see this effect in action on the one trait that we can all agree is close to normally distributed and varies among nations: human height. 抱歉这个梗有点欠。让我们通过一个特征来看看这种效应的实际力量,该特征的近似正态分布得到了大家认可,而且在国家间存在差异:那就是人的身高。 The average Indian dude (sorry for the androcentrism, ladies, there’s just better data on male heights and male soccer teams) is 165 cm (5′ 5″) and there are roughly 630 million of them. The average Norwegian dude is 180 cm (5′ 11″) and there are 2.5 million. The standard deviation of male height is around 6 cm around the world. If heights were distributed in a perfect normal bell curve with those parameters they would look like: 印度6.3亿成年男性(女士们,抱歉了,这里看起来似乎有点大男子主义,但有关男性身高和男子足球队的数据质量的确更好)的平均身高是165厘米(5英尺5英寸)。而挪威250万成年男性的平均身高则是180厘米(5英尺11英寸)。全世界身高的标准差大约是6厘米。如果身高完全服从一个由这些参数构建的正态钟形分布,那么看起来将会像下图这样: #165-5 As we plot them side by side, the Indian curve completely dwarfs the Norwegian one, even for pretty tall dudes. There are 9 Indians who are exactly 180 cm (5′ 11″) tall for every Norwegian. 5′ 11″ is tall, but not super tall. The higher mean effect only kicks in for the real outliers, so let’s zoom the above plot in to the really tall dudes. 当我们把整个分布画在一起,印度的曲线看起来完全压倒了挪威的曲线,即使对于身高很高的成年男性也是这样。印度和挪威身高180(5英尺11英寸)厘米的人口数量比例是9比1。5英尺11英寸算是高了,但并不是非常高。高均值效应只有在那些真正的异常值上才会起作用,那么让我们将图上那些真的很高的成年男性所对应的部分放大看看。 #165-6 Here, the picture reverses completely. There are 100 times as many Norwegians above 195 cm (6′ 4″) as there are Indians. 
Under a normal distribution assumption, the tallest Indian at 6′ 7″ would only match the 1,000th tallest Norwegian. 在这里,情况完全颠倒了过来。身高超过195厘米(6英尺4英寸)的成年男性数量,挪威和印度的比例是100比1。在正态分布的假设之下,印度最高的成年男性的身高将是6英尺7英寸,而这个身高在挪威人中只能排在第1000位。 It’s important to remember that a normal bell curve is a very simplistic model, real life is messy, and Dharmendra Singh is 8′ 1″. Even inside the realm of mathematics, a normal distribution has narrower tails (the height drops faster as you get away from the mean) than most other widely used distributions that look sorta like a bell curve (like the student’s t or the gamma distributions). A normal model underestimates the number of outliers and overstates the importance of shifting the mean. 我们必须记住的是,正态分布的钟形曲线是一种非常简化的模型,真实情况要复杂得多,实际上印度最高的男性Dharmendra Singh的身高是8英尺1英寸。即使在数学王国中,相比其他大多数常用的看起来像钟形曲线的分布(例如学生t分布或gamma分布),正态分布也有着窄得多的尾部(这意味着在远离均值时,曲线下降的速度更快)。一个正态分布模型会低估异常值点的数量,同时会高估平均值移动的重要性。 With that said, my main point stands: it should not surprise anyone that the achievement of extreme performers doesn’t strongly depend on the population of a country but does on the average. There doesn’t have to be something horribly wrong with China to account for its disappointing soccer team, they could be just a little bit to the left of other countries on national soccer affinity. 但即使考虑到这些情况,我的主要观点仍然成立:那些表现极端出众的个体的出现并不太依赖于一国的人口数量,而非常依赖于该国在这方面的平均水平。中国国家足球队令人失望的表现背后也许并没有什么错得离谱的东西,这也许只是因为中国在“足球亲和性”的分布上稍微靠左了一些而已。 We still don’t know what makes up soccer affinity, just that it’s enough to explain the disconnect between populations and team performance. With the math lesson behind us comes the fun part: in the next posts we’ll rank the world’s countries by average soccer affinity, throw a bunch of data at it to see what it correlates with, and see if can get any insight into what makes countries good or bad at soccer. 
我们仍然不知道“足球亲和性”是由哪些因素构成的,但它足以解释人口和团队表现之间脱节的现象。在我们的这节数学课之后才是真正有趣的部分:在接下来的几篇文章中,我们将会把世界各国按照平均的“足球亲和性”进行排名,通过一系列的数据来看看它与哪些因素相关,并试着获得一些关于是什么让一个国家在足球方面表现得好或不好的深入见解。 (编辑:辉格@whigzhou) *注:本译文未经原作者授权,本站对原文不持有也不主张任何权利,如果你恰好对原文拥有权益并希望我们移除相关内容,请私信联系,我们会立即作出响应。
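The tail arithmetic in the soccer article above can be rechecked with the standard library. This is a rough sketch, not the author's own calculation: it assumes a perfect normal distribution, takes the population and height parameters quoted in the article (630 million Indian men with mean 165 cm, 2.5 million Norwegian men with mean 180 cm, standard deviation 6 cm), and the helper names `pdf`, `upper_tail`, and `count_above` are made up here. The results should land in the same ballpark as the article's rounded figures, not necessarily on them.

```python
import math


def pdf(z):
    """Standard normal density at z standard deviations from the mean."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# How much higher the curve is at 1 SD than at 2 SD, and at 5 SD than at 6 SD.
ratio_1_vs_2 = pdf(1) / pdf(2)
ratio_5_vs_6 = pdf(5) / pdf(6)

# Shifting the mean by 1 SD turns a 6 SD outlier into a 5 SD outlier;
# this is the factor by which the number of such people grows.
gain_from_shift = upper_tail(5) / upper_tail(6)

# Parameters as quoted in the article.
INDIA_N, INDIA_MEAN = 630e6, 165.0
NORWAY_N, NORWAY_MEAN = 2.5e6, 180.0
SD = 6.0


def count_above(n, mean, cutoff):
    """Expected number of people taller than cutoff under the normal model."""
    return n * upper_tail((cutoff - mean) / SD)


india_195 = count_above(INDIA_N, INDIA_MEAN, 195)
norway_195 = count_above(NORWAY_N, NORWAY_MEAN, 195)

print(f"height at 1 SD / height at 2 SD: {ratio_1_vs_2:.1f}")
print(f"height at 5 SD / height at 6 SD: {ratio_5_vs_6:.0f}")
print(f"gain from a 1 SD shift of the mean: {gain_from_shift:.0f}x")
print(f"Norwegians per Indian above 195 cm: {norway_195 / india_195:.0f}")
```

Doubling the population doubles the count at every level, while a one-SD shift of the mean multiplies the count of 6 SD outliers by a few hundred, which is the article's main point.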

海德沙龙 Translation Group: bringing good writing from the English-speaking world into the Chinese-speaking world

[Microblog] The Wealth Gap and Statistical Illusions

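【2015-08-11】

@海德沙龙 [Focus topic] Figures on the wealth gap often astonish the public, but many eye-catching "gaps" and "changes" are really statistical illusions, with none of the facts behind them that people take them to reveal. Whether the same data look mundane or sensational depends largely on how they are organized and presented. This article analyzes one such case, and we will introduce more in future. http://t.cn/RLmCHik

@whigzhou: So before marveling at a statistic, first find out how the indicator was designed, then ask whether the difference or change is really caused by the factor you were about to marvel at. A simple example: if the income Gini coefficient takes the household as its unit, then a single factor, young people leaving home earlier to live on their own and household size shrinking as a result, is enough to raise the Gini coefficient significantly.

@用户3548260260: This piece... is exactly the Chayanov cycle in reverse. I read in a book by Qin Hui about a Russian populist scholar named Chayanov, who championed Russia's traditional village-commune economy. To rebut the Westernizers' criticism of rural inequality he picked out the factor of changes over the years. According to his research, when a family has small children its costs rise,

@whigzhou: Household size was my own example; the original article is about time preference.

@用户3548260260: Yes, but I think they all point to the same thing: a wealth gap does not necessarily mean social stratification. It holds other possibilities too, which well-meaning people often overlook, just as you said, and as the saying goes: statistics is a language, not a science.

@whigzhou: Mm-hm.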
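The household-size example in this thread can be made concrete with a few lines of code. The sketch below uses invented numbers (a parent earning 60 and a young adult earning 30 in each of 100 identical families) and a hypothetical helper `gini`. Nobody's income changes; the only change is that half of the young adults move out and are counted as households of their own.

```python
def gini(values):
    """Gini coefficient of a list of non-negative incomes."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Invented numbers: every family has one parent and one young adult.
PARENT_INCOME = 60
YOUNG_INCOME = 30
N_FAMILIES = 100

# Before: all young adults live at home, so every household earns 90.
before = [PARENT_INCOME + YOUNG_INCOME] * N_FAMILIES

# After: half of the young adults form their own households.
movers = N_FAMILIES // 2
after = (
    [PARENT_INCOME + YOUNG_INCOME] * (N_FAMILIES - movers)
    + [PARENT_INCOME] * movers
    + [YOUNG_INCOME] * movers
)

print(f"total income before/after: {sum(before)} / {sum(after)}")
print(f"household Gini before: {gini(before):.3f}")
print(f"household Gini after:  {gini(after):.3f}")
```

The household-level Gini rises from zero to roughly 0.22 while total income and every individual income stay the same, so the change reflects the unit of measurement rather than any change in who earns what.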
[Translation] The Population Secret in the Wealth Gap

Top 1%, across states 最富有的1%,州与州间的比较 作者:Salil Mehta @ 2015-1-31 译者: 一声叹息    校对:小册子(@昵称被抢的小册子) 来源:Statistical Ideas,http://statisticalideas.blogspot.com/2015/01/top-1-across-states.html Short-term update: this article has been fancied by some of the country's most esteemed economists and featured in popular outlets such as Marginal Revolution (by a frequent NYT Upshot writer), supported by EconLog's Arnold Kling, favorably tweeted by research head at Oxfam, and aknowledged by the Deans of three schools (one in public policy, and two in business).  Within the first day generated >100 facebook shares, and >100 tweets, from various media here. 近期动态:本文深受本国一些声誉卓著的经济学家的喜爱,被“边际革命”等流行站点作为专题文章推荐。本文也得到了来自EconLog的Arnold Kling的大力支持,被乐施会的研究院领导热情转发,并得到三个学院(一个公共政策学院,两个商学院)院长的认可。本文发出第一天,就被这里的媒体在Facebook上分享100多次,在Twitter上推送100多次。 It's an appealing chart from this week's Economic Policy Institute's report, leveraging the fashionable, French economist Piketty's statistics, in order to illustrate how well the "top 1%" are doing in each of the 50 states. The report is provokingly titled: "The Increasingly Unequal States of America". 经济政策研究所(EPI)本周的一份报告中,给出了一张富有吸引力的图表,引用了法国新潮的经济学家皮凯蒂的统计资料,用以说明“最富有的1%”在每个州过得有多滋润。这份报告有个煽动性的标题:美利坚大不平等国。 But the report creates distortions in the truth.  An important matter affecting hundreds of millions should also include a straight acknowledgement of probability theory.  We see through this article, that beyond the obvious national-level inequality (those at the top versus those at the bottom), targeting state-level differences in values is perverse.  The latter is more a matter of probability theory, involving large sample sizes. 可是该报告扭曲了事实。做一件影响亿万人的事情显然应该考虑到概率论知识。通过本文,我们将会看到,越出全国层次上的显著不平等(顶层收入对底层收入)之外,而将焦点对准各州层次上的数值差异,是有悖常理的做法。在样本足够大的情况下,后者更多的只是一个概率问题。 Let's start by looking at this chart below.  
It shows the differences in state-level ratios, contrasting the typical incomes at the top 1% versus the typical incomes at the bottom 1%【译注:原文如此,疑似99%之讹】: 让我们先从下表开始。该表显示了美国各州最富有的1%和其余99%的收入比率。 【插图-How unequal is your state 你所在的州有多不均等 The ratio between the average incomes of the top 1% and the bottom 99% in each state 最富的1%和其余99%的平均收入比率】 Everyone from the press, to news readers, gawk at how much each state's levels are, in relation to the levels of other arbitrary states.  But this is irrational.  Are "liberal" states such as California and New York, twice as biased (or twice as unfair) as "conservative" states such of Arkansas and Maine? 无论是媒体业者还是新闻阅读者,都会被表中所见任意两个州之间的数字差异吓得目瞪口呆。但这是非理性的。难道加利福尼亚和纽约这样的“进步”州,真的比阿肯色和缅因这样的“保守”州贫富悬殊程度高两倍,或者说不公平程度高两倍吗? Of course not.  But that's the poor logic one would convey from the former two states showing nearly twice the inequality values on the chart, versus the latter two states.  Ex-post examination of living costs doesn't fully explain things either, as expenses are generally higher in states such as Vermont, Alaska, and Hawaii, versus the expenses in states such as Texas, Illinois, and most of Appalachia. 当然不是。可按照表中数据,前两个州的不均等程度几乎是后两州的两倍,人们就很容易会得出这样简单的结论。即使把生活成本差异考虑在内,也不能完全解释上述现象,因为在诸如佛蒙特、阿拉斯加和夏威夷等州,生活成本通常还要高于德克萨斯、伊利诺斯和阿巴拉契亚地区的多数州。 This week I devoted a couple hours spelling out to a confused Wall Street Journal reporter how there is some pertinence here, related to probability theory.  Population size theoretically impacts these statistics, and only a small number of these states are grossly unequal enough to warrant exhibiting them through a charming, 50-state map.  Whenever we are forced to explore state-level analysis though, it should be done through the prism of simply explaining relative variation, well beyond what random luck would suggest. 
本周,我花了几个小时向一位为此感到困惑的《华尔街日报》记者详细解释,为什么这件事和概率论有些关系。理论上,人口数量会影响这些统计数据,而仅有少数州的不均等程度会真的高到像这幅漂亮的50州地图中所显示的那样。然而当我们不得不在各州层次上做分析时,应该通过具有直白解释力的相对差异,而不是被随机巧合造成的结果所迷惑。 The most liberal people suggest that even thinking about this math is unnecessary.  Perhaps any glorification of wrongs that need to be righted, justifies the means that it would take to get there.  Over time this can conflate math ideas with one's ideological bias.  We must separate the discussion of national and structural inequality, among a population, from one where there is a perceived advantage for some groups relative to others. 最坚定的自由派认为根本没必要考虑这个数学问题。或许,那些有待纠正的错误,只是因为其结论被颂扬,得出这些结论所用的方法也就被认为是正当的了。长此以往,人们就会根据自己的意识形态偏见来选择性地理解运用数学概念了。我们必须将有关人口群体之间的全国性和结构性不均等的讨论,和大家所感觉到的部分人相对于其他人的优势区分开来。 We can't prove the inappropriateness of inequality, by looking at the differences in relative inequality between states.  We enjoy the right in the U.S. to pursue different outcomes.  To take risks on the margins.  We've been making many billions of these choices, across generations.  This means that we always enjoy some separation in outcomes, particularly among the largest populations. 我们不能仅仅依据相对不均等程度在各州之间的差异,来证明不均等状况的不恰当性。在美国,我们享有追求不同成就的权利。为了利润,我们甘愿冒险,代复一代,我们的冒险决定价值数以十亿计。这说明我们总是乐于接受各人的回报有所差异,特别是在我们这么庞大的人口之中。 That's how probability theory impacts all of our lives, even in "equal" conditions.  We would prefer a safety net against hard times.  Yet we will still take risks such as how much and what we learn, what we feed our bodies, what we do on vacation, what financial investments we accept, and when we plot a career change... the actions here can't be deemed some unfair inequality.  They are the mystical elements we call life! 所以即使是在完全“平等”的条件下,概率也会造成我们生活水平的差异。我们倾向于有一个安全网来渡过艰难时期。但是对于诸如学习什么、吃什么、如何度假、如何投资以及何时转换工作等等这些事情,我们仍然愿意承受风险。这些有差别的选择并不是不公平的。这些神秘元素,按我们的说法,恰恰就是生活! 
In a divergent context, we will always see these interesting differences (based upon population size alone), in areas that have nothing to do with the topic of inequality.  Such as the state-level distribution of newborn baby sizes, or the performance distributions of high-school athletes.  We'll mention others still later in this article.  And all of this collectively confirms our understanding that there is something important to the probability math, explaining the relative dispersions connected to population sizes.  All other hypotheses are secondary. 在一个千差万别的世界里,我们总能看到这些有趣的差异(样本总量够大就行了),哪怕在没有不平等的情况下也是如此。比如各州新生儿重量值分布,或者高中生的体育水平分布。后面我们还会提到其他例子。所有这些综合起来,支持了我们的这一观察发现:在这里所涉及的概率分析中,以人口规模解释相对离散度才是要点所在,所有其他假说都是次要的。 Before moving too far ahead, let's first show that the Economic Policy Institute (EPI) chart above has a clear concordance between income dispersion and the population size itself.  We'll show this with simple arithmetic(!) as well, substituting for a complex probability area known as copula math (here, and here).  If there were no connection between a state's inequality calculations and the population rank, then how many states would be in the top 10 of both?  What about in the bottom 10 of both?  The answers are quite low: (10/50)*(10/50)*50 = 2 states in the top 10 of both of both variables (10/50)*(10/50)*50 = 2 states in the bottom 10 of both variables 在此问题上,首先让我们来证明,上述EPI的图表显示了,收入离差度(dispersion)和人口规模有着明确的一致性。我们将用简单的算术(!),来代替被称为关联结构(又称耦合)的复杂概率论数学方法。如果一个州的收入不均与其人口总量排名无关的话,有多少个州能在两项同时排前十名?又有多少个州能在两项同时排后十名呢?答案是:数量相当低。 (10/50)*(10/50)*50 = 2 个州同时在两项排名进入前十 (10/50)*(10/50)*50 = 2个州同时在两项排名进入后十 So only 4 (2+2) states total.  But it's easy to certify from the chart that there is much more of a match among these variables than 4. 所以仅有四个州符合上述条件。但在上述EPI图表中,我们很容易找到多于4个州有这种配对关系。 Of the top 10 populated states, 5 were also among the top 10 "unequal" states: CA, TX, FL, NY, IL.  
Of the 10 least populated states, 4 were also among the 10 least "unequal" states: VT, AK, ME, HI.  So instead of 4 overlapping states, we have a significantly higher 9 (5+4) states overlapping.  Additionally, there are no crossover states (e.g., a highly "unequal" less-populated state, nor a less "unequal" highly-populated state).  The easy math (9>4 with no crossovers) shows something, and it's not structural inequality. 在人口最多的十个州里面,其中五个同样也在“不均等”程度最高的十个州里:加利福尼亚、得克萨斯、佛罗里达、纽约和伊利诺伊。在人口最少的十个州里面,有四个同样也跻身于“不均等”程度最低的十个州之列:佛蒙特、阿拉斯加、缅因和夏威夷。所以,重叠的州不止四个,而有九个之多。并且,没有交叉的州出现(例如“不均等”程度高且人口少,或“不均等”程度低且人口多)。简单的数学(9>4且没有交叉)确实说明了些什么,这不是结构性不平等。 The only common variable between the selection of the top 10 (and in the selection of the bottom 10) populated states is just population size itself!  Does population size coerce inequality?  Again, no.  Otherwise we could just split California into two smaller states, making citizens suddenly feel there is somehow "greater equality".  Or we could reunite Virginia and West Virginia, making the new super-state's citizens feel there is magically "greater inequality".  But this sort of statistical reasoning is crazy.  It leads one to think inequality can be solved with scissors and glue. 在选择人口最多的10个州时(以及选择人口最少10个州时),唯一的共同变量就只是人口规模。难道人口规模大就催生不平等吗?也不是。否则,我们可以把加利福尼亚拆成两个较小的州,然后那里的居民不知怎么就突然感到更公平了。又或者我们把维吉尼亚和西维吉尼亚重新合并在一起,使得这个新超级大州的居民魔幻般地感觉到更不公平了。但这种统计推理是疯狂的,好像会使人以为不公平可以通过剪刀和胶水解决。 In our popular "Aristocrats in flyovers" article (a name suggesting easier state-level wealth in less-populated flyover states), we dig into the probability theory of extreme data.  And there we continue to see this pattern show up repeatedly in diverse datasets and distribution types.  Such as the wealthiest individual per state, or the number of cumulative Miss America winners per state.  Again this couldn't be coincidence, and we can also mathematically solve for the theoretical expected values for parametric most extreme individual, as we did in this article here. 
在我们颇受欢迎的文章《内陆州的贵族》(标题暗示了在较少人口的内陆州里较易达到州内顶级财富水平【译注:原文这句话意思不太清楚,不过从所引文章内容看,好像是这个意思】)中,我们深入探讨了有关极端数据的概率论。我们能够看到这种模式在多样化的数据集和分布类型中不断出现。比如每个州最富有的人,或是每个州的美国小姐累计人数。这也不可能是巧合,我们可以在数学上解出对于极端个体参数的理论期望值,就像我们在这篇文章里做的那样。 Take a look at the bar chart below (of the top 1% income), and see in dots how tight the trend is for relative inequality, versus state population.  We would theoretically calculate that the more populated states to have considerably higher top 1% income (a double-digit percent increase!), versus the top 1% income in the less populated states- and this relates to the EPI chart above.  Connecticut was the single, unreliable outlier removed, using a parallel statistical process others also do (notice Wyoming is missing in the aforementioned chart.) 从下面(有关最富的1%)的柱状图我们可以看到,各州的相对不均等水平与其人口规模之间的联系相当紧密。我们能从理论上计算出,人口大州相对于人口小州来说,其最富的1%的收入也明显较高(两位数百分点的提高!),这与上述EPI图表密切相关。康涅狄格州是唯一的例外,当使用其他州同样使用的平行统计处理时,显示为异常样本,所以被剔除了(注意怀俄明州不在上述图表中)。 【图标 – 柱状图 – 相对不均与人口数量】 We also show a related transformed bar chart, below, instead fixating on changes in the the relative standard confidence interval (as one moves across the chart from the less populated states, to the most populated states).  We can now confirm that we mustn't ignore probability modeling as part of this story.  We can't persistently pretend that less-progressively larger (a generally concave inequality dispersion function, similar to how it is with most economic data) inequality doesn't exist, for the most populated states. 我们还在下面做了一张与此相关但经过整理的图表,在表中我们将各州按人口规模排序,固定标准置信区间的相对变化。于是我们可以确信不能忽略概率模型在此问题上的作用了。对于人口大州来说,我们再也不能假装逐渐变大(一个大致凹形的不均等分布函数,与其他经济数据类似)的不均等不存在。 【图标 – 柱状图 – 经整理后的相对不均与人口数量】 Don't assume -as many lay people and activists do- that these are sampling errors that must vanish, as the sample population sprouts into the millions.  
This would be deceitful and cause most people to further jump on top of similar "research" as the EPI chart, falsely connecting most of the state-level calculation differences to genuine differences in inequality. 不要像很多外行人和活动家那样,以为这些只是因为样本太小带来的错误结果,然后随着样本大小增加到百万级后就会消失。这个设想是欺骗性的,而且将令大多数人追随EPI图表之类的“研究”结果,错误的把按州计算的差异同真正的不均等差异联系起来。 The conclusions of this article are again as pertinent for the top 1% in a population, as it is for the most extreme person in any group.  This is since the top 1% is still extreme enough along the probability distribution (from 0%, to 100%), so that larger populations will lead to less-progressively larger, top percentile thresholds.  Of course this is not true for sampling (Ch.5 in Statistics Topics) closer to the middle of a peer distribution (e.g., top 49%, or bottom 49%), where most of us in society have performed  through the ages. 本文的结论不仅适用于人口中最富有的1%,也同样适用于任何群体中的极端个体。从概率分布(从0%到100%)的角度来讲,最富有的1%已经足够极端,所以人口数量越大,在顶端部分的跳升就会越不渐进。当然,当取样规模(《统计问题》第五章)接近整体的中间位置时(例如最高的49%,或者最低的49%),结论就不一样了,自古以来我们当中的大多数人都在这一阶层里。 (编辑:辉格@whigzhou) *注:本译文未经原作者授权,本站对原文不持有也不主张任何权利,如果你恰好对原文拥有权益并希望我们移除相关内容,请私信联系,我们会立即作出响应。
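The article's "simple arithmetic" for the overlap between two top-10 lists can be spelled out a little further. The sketch below reproduces the expected count of 2 under independence and then adds a step the article does not take: the chance of seeing 5 or more overlaps if the two rankings were unrelated, using the hypergeometric distribution. Treating one ranking as a uniformly random permutation relative to the other is an assumption of this sketch, and the function names are made up here.

```python
from math import comb

N_STATES = 50
TOP_K = 10


def expected_overlap(n=N_STATES, k=TOP_K):
    """Expected number of states in both top-k lists if the rankings are unrelated.

    Same as the article's (10/50) * (10/50) * 50.
    """
    return k * k / n


def prob_overlap_at_least(m, n=N_STATES, k=TOP_K):
    """P(overlap >= m) when one top-k list is a random k-subset of n states."""
    total = comb(n, k)
    favourable = sum(comb(k, j) * comb(n - k, k - j) for j in range(m, k + 1))
    return favourable / total


print(f"expected overlap: {expected_overlap():.1f}")
print(f"P(overlap >= 5):  {prob_overlap_at_least(5):.4f}")
```

Under these assumptions an overlap of 5 in the top 10 would turn up by chance in roughly 2% of cases, which supports the article's claim that inequality rank and population rank are related. It says nothing on its own about why they are related.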


[Microblog] Temper and Wages

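【2012-12-24】

@P_Slacker: A survey from Cornell University in the US says that agreeable employees are paid markedly less than employees with worse tempers.

@宮鈴_胡同台妹: Is this telling everyone to be a bit more bad-tempered?

@局外人c的空间: Quite right. The so-called "二球" (the reckless hothead) gets a real advantage in China. Asking @高利明 @whigzhou

@whigzhou: This kind of thing is best studied longitudinally. From a few numbers alone you cannot tell what is actually going on.

@whigzhou: Possibility one: bad-tempered people get promoted more slowly, so they have more seniority than others in the same position, hence higher pay. Possibility two: bad-tempered people have a higher unemployment rate, but those who stay employed are also paid more. Possibility three: bad-tempered people cause more trouble and also earn more, but their net payoff is not high. Possibility four: a bad temper is a special resource that helps the boss solve certain specific problems and so earns extra pay...

@被打飞: Most likely, bad-tempered people who still manage to stay either have some unique skill or are connected to the boss, as a mistress or some such relation, hence the high pay. Those with neither connections nor ability naturally need a better temper, and then they earn less too.

@whigzhou: Mm.

@高利明: Don't points 1 and 2 rest on a hidden premise that "a bad temper is a disadvantage"? If we treat both a bad temper and a good temper as competitive strategies, what is the reasoning behind choosing the bad-temper strategy?

@whigzhou: Every personality type probably has a strategic origin, but mismatches or changed circumstances can turn a once-adaptive strategy into a disadvantage. "Bad temper" is too coarse a category: different combinations of anger triggers and post-anger reactions amount to different strategies.

@whigzhou: From a strategic point of view, "anger" itself is only a tool. It becomes a strategy only when assembled into some behavioral sequence, used for bluffing, intimidation, marking boundaries, retaliation, enforcing rules, and so on.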
[Microblog] Lessons from the Engel Coefficient

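【2012-09-28】

@whigzhou: #读史笔记# Over long spans of decades or a century, indicators such as GDP, income, or consumption spending seem to mean little as measures of development, except for cross-sectional comparison. Indicators that capture changes in the structure of consumption, like the Engel coefficient, are more telling. For example, one could design this pair of indicators: 1) Ci is the typical consumption bundle of a middle-income consumer in year i; 2) Pi = (price of Cj in year i) / (median income in year i), where j = i - 10.

@whigzhou: When computing the year-i price of Cj, the condition can be relaxed: the bundle purchased need not be identical, only functionally no worse, because goods from 10 years ago may already be antiques that cannot be bought at an ordinary price.

@whigzhou: This indicator roughly captures, for any given year, what share of his current income a middle-income earner must spend to live the kind of life he lived ten years earlier. I think this is a proper measure of long-run development.

@whigzhou: Under traditional indicators, the improvement to life from the fastest-improving, fastest-developing fields is greatly underestimated.

@whigzhou: For example, someone spends 3,000 every two years on a new phone and 100,000 every five years on a new car. Traditional indicators show no change, yet the convenience the phone and the car give him is very different ten years apart.

[Postscript] A finer-grained measure could be computed separately for each income stratum.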
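The indicator Pi defined in this thread is simple enough to write down as a function. The sketch below is one possible reading of it: the function name `affordability_index`, the data layout, and every number are invented for illustration, and the relaxation to "functionally no worse" goods is not modeled (the old bundle is simply repriced at current prices).

```python
def affordability_index(year, bundles, prices, median_income, lag=10):
    """Pi: cost in `year` of the bundle from `year - lag`, divided by
    the median income in `year`."""
    old_bundle = bundles[year - lag]
    cost_now = sum(qty * prices[year][good] for good, qty in old_bundle.items())
    return cost_now / median_income[year]


# Invented data: quantities per year in the typical middle-income bundle.
bundles = {2002: {"food": 100, "phone": 0.5, "rent": 12}}
prices = {
    2002: {"food": 10, "phone": 2000, "rent": 300},
    2012: {"food": 20, "phone": 600, "rent": 700},
}
median_income = {2002: 8000, 2012: 24000}

# Share of 2002 income that the 2002 bundle cost at the time (lag of zero).
share_2002 = affordability_index(2002, bundles, prices, median_income, lag=0)
# Share of 2012 income needed to live the 2002 life in 2012.
p_2012 = affordability_index(2012, bundles, prices, median_income)

print(f"2002 bundle as share of 2002 median income: {share_2002:.2f}")
print(f"2002 bundle as share of 2012 median income: {p_2012:.2f}")
```

With these made-up figures the 2002 lifestyle takes 70% of the median income in 2002 and about 45% in 2012; a falling value of Pi over time is what the thread proposes to read as development.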
[Microblog] VAT and the Consumption Rate

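【2012-03-05】

@whigzhou: I was just discussing VAT with a friend and something occurred to me: private consumption/GDP is only twenty-odd percent, implausibly low. Might that have to do with tax avoidance? Perhaps many small and medium supermarkets (or their suppliers; the principle is the same) sell off their invoice quota, so that a lot of retail sales get counted as business expenditure? Just a wild guess.

@小野猪君: Yes. Shopping cards, the aunties collecting receipts at supermarket entrances, the long queues at the invoice desks of supermarkets and IKEA: all of these turn private consumption into business spending... The best route these days is online shopping. JD, Amazon and the like will invoice whatever you buy as "office supplies."

@学经济家: This resonates. I have long meant to sort out the relationship between the VAT system and the private consumption/GDP ratio, but without someone to spur me on I cannot get it done. One more unfilled promise, heh.

@西峯: In the statistics, purchases of cars and housing are both counted under investment.

@whigzhou: That in itself is not wrong, provided the purchase is amortized into consumption over the asset's lifetime.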
What Does "50% of the Variation in XX Is Attributable to YY" Mean?

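In the post before last I mentioned that "about 50% of the variation in personality is attributable to heredity." I was citing Matt Ridley (see Nature via Nurture, 《先天,后天》). Later I saw several leading figures in the field say much the same, so I judge this to be the mainstream view.

The twin studies that 李敖之 mentioned are covered in detail in Matt Ridley's book; clearly he would not miss information of this kind.

What I want to clarify here is the statistical concept. Saying "about 50% of the variation in personality is attributable to heredity" does not mean that if two people have exactly the same genetic basis, with other conditions random, their personality similarity is 50%, or 50% + p with p a random number between 0 and 0.5.

Not so. The proposition "50% of the variation in XX is attributable to YY" means only this: if the influence of YY is removed, the variance (mean squared deviation) of the sample set is reduced by half.

That variance may have been small to begin with. Perhaps human personalities share far more common features than individual differences, so a pair of identical twins adopted apart can have a personality similarity far greater than 50%. This does not conflict with the statement I cited.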
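The distinction drawn above, between a share of the variance and a degree of similarity, can be illustrated with a small simulation. The model here is an assumption made only for illustration: a trait equal to a large common baseline plus an independent "genes" term and an independent "environment" term of equal spread. "Removing the influence of YY" is modeled as holding the genes term constant.

```python
import random
import statistics

rng = random.Random(1)
N = 20000
BASELINE = 100.0  # what everyone shares; invented scale

# Independent, equally variable contributions (an assumption of this sketch).
genes = [rng.gauss(0, 1) for _ in range(N)]
env = [rng.gauss(0, 1) for _ in range(N)]
trait = [BASELINE + g + e for g, e in zip(genes, env)]

# Variance of the trait, and variance once the genes term is held constant.
total_var = statistics.pvariance(trait)
var_without_genes = statistics.pvariance([BASELINE + e for e in env])
share_from_genes = 1 - var_without_genes / total_var

# How far apart are two unrelated people, relative to the common baseline?
pairs = [(trait[i], trait[i + 1]) for i in range(0, N - 1, 2)]
mean_relative_gap = statistics.mean(abs(a - b) / BASELINE for a, b in pairs)

print(f"share of variance attributable to genes: {share_from_genes:.2f}")
print(f"mean gap between unrelated people: {mean_relative_gap:.1%} of baseline")
```

Genes account for about half of the variance, yet any two unrelated people differ by only a percent or two of the baseline, so "50% attributable" and "50% similar" are different quantities.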

On the Harvard Medical School Quiz: Praise for 牛友

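I have often proclaimed to friends that, on average, the readers of 牛博 are of a higher caliber than its authors, and by a good margin. It seems this was no idle claim. Hm, wait, that sentence hardly seems fit to apply to myself? Heh, an occasional lapse.

Few 牛友 simply answered 0.95, which already puts them well above the Harvard level. Personally I think the best answer came from foo: he not only gave the correct method and answer but also correctly pointed out that the false negative rate is a relevant variable, something I had not realized before reading his reply, to my shame. (For this particular question, though, the false negative rate has only a negligible effect on the result.)

In a closed-book exam with no chance to confer, most wrong answers are 0.95. The cause of this error is that the respondent unwittingly takes the false positive rate to mean "the proportion of actually disease-free samples among those that test positive" (Definition A), rather than its correct definition, "the number of samples that test positive and are actually disease-free, as a proportion of all disease-free samples" (Definition B).

To this, some friends may ask: since you gave no definition of the false positive rate, why can't I read it that way? My answer is that under Definition A the false positive rate could not have any statistical meaning. Consider: if I run the test on a sample set known to be entirely disease-free and obtain a subset of positives, then whatever the test's error level, the false positive rate under Definition A will be 100%. Clearly a concept defined this way is meaningless; it is much like the kind of definition found in Russell's barber paradox.

On this basis I believe no knowledge of epidemiology is needed. Statistical knowledge alone is enough to rule out Definition A and arrive at the correct answer.

As laoyao said, this example shows that our intuition often goes wrong on probability questions. I recall another, more entertaining example, the puzzle about the goats and the car behind three doors, which once fooled a great many clever people, mathematics professors included.

[Aside]: Reportedly, even in the US many doctors have not got this question straight and so often convey wrong information to patients, causing needless panic and even suicide.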

A Question Put to 60 Staff and Students at Harvard Medical School

Reportedly only 18% answered correctly:

If a test [to detect a disease whose prevalence is 1/1000] has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming you know nothing about the person's symptoms or signs?

Notes: 1) the square brackets are mine, added to prevent misparsing; 2) false positive = 假阳性.
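For readers who want to check their answer, the question can be worked through with Bayes' rule. The sketch below takes the false positive rate in its standard sense (positives among the disease-free, as a share of all the disease-free). The false negative rate is not given in the question, so the first call assumes a test that never misses a true case, and the second uses an arbitrary 20% miss rate purely to see how much the answer moves. The function name is made up here.

```python
def posterior_given_positive(prevalence, false_positive_rate,
                             false_negative_rate=0.0):
    """P(disease | positive test) by Bayes' rule."""
    sensitivity = 1.0 - false_negative_rate
    true_pos = prevalence * sensitivity                    # sick and positive
    false_pos = (1.0 - prevalence) * false_positive_rate   # healthy and positive
    return true_pos / (true_pos + false_pos)


# Question as posed, assuming no false negatives.
p_perfect = posterior_given_positive(1 / 1000, 0.05)

# Same question with an arbitrary 20% false negative rate (an assumption).
p_fnr_20 = posterior_given_positive(1 / 1000, 0.05, false_negative_rate=0.20)

print(f"P(disease | positive), no false negatives:  {p_perfect:.4f}")
print(f"P(disease | positive), 20% false negatives: {p_fnr_20:.4f}")
```

Out of 1,000 people tested, about 1 is sick and about 50 of the 999 healthy ones test positive anyway, so a positive result means roughly a 2% chance of disease, nowhere near 95%.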