The challenge of finding the right view through data

Edward Hugh Simpson, a statistician and former cryptanalyst at Bletchley Park, described the statistical phenomenon that takes his name in a technical paper in 1951. Simpson’s paradox highlights one of my favourite things about data: the need for good intuition regarding the real world, and the fact that most data is a finite-dimensional representation of a much larger, much more complex domain. The art of data science is seeing beyond the data: using and developing methods and tools to get an idea of what that hidden reality looks like. Simpson’s paradox showcases the importance of skepticism and of interpreting data with respect to the real world, and also the dangers of oversimplifying a more complex truth by trying to see the whole story from a single data viewpoint.

The paradox is relatively simple to state, and is often a cause of confusion and misinformation for audiences without statistical training:

Simpson’s Paradox: A trend or result that is present when data is put into groups, but that reverses or disappears when the data is combined.

One of the most famous examples of Simpson’s paradox is UC Berkeley’s suspected gender bias. At the beginning of the academic year in 1973, UC Berkeley’s graduate school had admitted roughly 44% of its male applicants and 35% of its female applicants. The story usually goes that the school was sued for gender discrimination, although this isn’t actually true. The school did, however, fear a lawsuit, and so it had statistician Peter Bickel look at the data.
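
To make the reversal in the definition of Simpson’s paradox concrete, here is a minimal sketch in Python. The counts are invented purely for illustration and are not the Berkeley figures; the department names and the `rate` and `overall_rate` helpers are likewise hypothetical.

```python
# Hypothetical (admitted, applied) counts, invented purely for illustration.
# These are NOT the real UC Berkeley admissions figures.
counts = {
    "Dept A": {"men": (80, 100), "women": (18, 20)},
    "Dept B": {"men": (4, 20), "women": (30, 100)},
}


def rate(admitted, applied):
    """Admission rate as a fraction between 0 and 1."""
    return admitted / applied


def overall_rate(group):
    """Admission rate for one group after pooling every department."""
    admitted = sum(counts[dept][group][0] for dept in counts)
    applied = sum(counts[dept][group][1] for dept in counts)
    return rate(admitted, applied)


# Grouped view: compare the two groups within each department.
for dept, groups in counts.items():
    print(dept, {g: round(rate(*c), 2) for g, c in groups.items()})

# Combined view: the same data with the departments pooled together.
print("Overall", {g: round(overall_rate(g), 2) for g in ("men", "women")})
```

With these made-up numbers, women have the higher admission rate in each department taken separately, yet the lower rate once the departments are pooled. The reversal comes from the mix of applications: most of the hypothetical women apply to the department that admits few people, while most of the hypothetical men apply to the one that admits many.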