How Do You Know if It Is a Correlation
Correlation is not causation
Why the confusion of these concepts has profound implications, from healthcare to business management
Introduction
In correlated data, a pair of variables are related in that one is likely to change when the other does. This relationship might lead us to assume that a change to one thing causes the change in the other. This article clarifies that kind of faulty thinking by explaining correlation, causation, and the bias that frequently lumps the two together.
The human brain simplifies incoming information so we can make sense of it. Our brains often do that by making assumptions about things based on slight relationships, or bias. But that thinking process isn't foolproof. An example is when we mistake correlation for causation. Bias can make us conclude that one thing must cause another if both change in the same way at the same time. This article clears up the misconception that correlation equals causation by exploring both of those subjects and the human brain's tendency toward bias.
About correlation and causation
Correlation is a relationship or connection between two variables where whenever one changes, the other is likely to also change. But a change in one variable doesn't necessarily cause the other to change. That's a correlation, but it's not causation. Your growth from a child to an adult is an example. When your height increased, your weight increased too. Getting taller didn't make you also become wider. Instead, maturing to adulthood caused both variables to increase. That's causation.
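The height and weight example can be sketched as a small simulation. This is only an illustration under made-up assumptions: the growth numbers are invented, and the names `pearson`, `residuals`, `raw_corr`, and `corr_given_age` are ours, not from any study. Age is the only thing that drives height and weight here, yet the two still come out strongly correlated.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumed toy model: age drives height and weight separately.
# Height has no direct effect on weight in this simulation.
ages = [rng.uniform(2, 18) for _ in range(2000)]
heights = [80 + 5.5 * age + rng.gauss(0, 5) for age in ages]  # cm
weights = [8 + 3.0 * age + rng.gauss(0, 4) for age in ages]   # kg

raw_corr = pearson(heights, weights)
# Correlation that remains once the straight-line effect of age is removed from both.
corr_given_age = pearson(residuals(heights, ages), residuals(weights, ages))

print(f"height vs weight: {raw_corr:.2f}")
print(f"height vs weight, age accounted for: {corr_given_age:.2f}")
```

In this toy setup the first number should be high and the second close to zero. Once the common cause is accounted for, the apparent link between height and weight mostly disappears.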
Causation in business
Let's say that we want to offer a promotion or discount to some of our customers. Our marketing department wants to maximize the delta, in other words, the increase in sales as a result of the promotion. So we need to decide which customers will give us the best return on our investment in the promotion or discount. Do we want to offer it only to our top 10% of clients? Or the bottom 10%?
You might assume that the users who drive more sales are the ones more responsible for your business success. However, this assumption could be wrong. The best choice of which customers to offer the promotion to might be totally different. In the absence of valid experimentation or analytics, you don't have accurate answers to those questions.
Cognitive bias
There are many forms of cognitive bias or irrational thinking patterns that frequently lead to faulty conclusions and economic decisions. These types of cognitive bias are some reasons why people presume false causations in business and marketing:
- Confirmation bias. People want to be right. They often can't admit or accept that they're wrong about something, even if that attitude causes eventual harm and loss.
- The illusion of causality. Putting too much weight on your own personal beliefs, over-confidence, and other unproven sources of information frequently produces an illusion of causality. An economic example is the recent U.S. housing bubble. Millions of people believed that buying a home for much more than its actual value would continue to result in a return on the investment just because that happened in the past.
- Money. You want to sell your product. You might spend more than your return on investment (ROI) on marketing and other business expenses if the desire to make money clouds your logic.
- Major marketing implications. Marketing statistics and data are often complicated and confusing. It can be easy to see relationships between changing sales numbers and the many other variables in your business when no causation exists.
Experimentation
To know that something is valuable takes experimentation. Experimentation helps you understand if you're making the right choices. But it has a cost. If you hold a group of users back by not giving them a feature that brings in value, you'll lose money. But you'll learn the importance of that feature.
The value of an experiment lies in the accomplishment of these two things:
- Decide between different choices.
- Quantify the value of the best choice.
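A rough sketch of both goals, applied to the promotion question from earlier: randomly give the promotion to half of each customer segment, compare average spend with and without it, and pick the segment with the larger measured delta. Everything here is hypothetical, including the segment names, the baseline and lift figures in `SEGMENTS`, and the helpers `simulate_spend` and `estimate_uplift`. In a real experiment the true lift is unknown and only the measured difference is available.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical ground truth that the analyst cannot see directly:
# baseline spend per customer and the true extra spend caused by the promotion.
SEGMENTS = {
    "top_10_percent": {"baseline": 200.0, "lift": 2.0},
    "bottom_10_percent": {"baseline": 20.0, "lift": 8.0},
}


def simulate_spend(segment, gets_promo):
    """Simulated spend of one customer, with or without the promotion."""
    params = SEGMENTS[segment]
    spend = rng.gauss(params["baseline"], 5.0)
    if gets_promo:
        spend += params["lift"]
    return spend


def estimate_uplift(segment, n_customers=4000):
    """Randomly assign the promotion and return mean(treated) - mean(control)."""
    treated, control = [], []
    for _ in range(n_customers):
        gets_promo = rng.random() < 0.5  # random assignment is what makes this causal
        group = treated if gets_promo else control
        group.append(simulate_spend(segment, gets_promo))
    return sum(treated) / len(treated) - sum(control) / len(control)


uplift = {segment: estimate_uplift(segment) for segment in SEGMENTS}
best_segment = max(uplift, key=uplift.get)

for segment, delta in uplift.items():
    print(f"{segment}: estimated delta per customer = {delta:.2f}")
print("offer the promotion to:", best_segment)
```

With these invented numbers, the customers who spend the most are not the ones who respond most to the promotion, which is exactly the kind of answer you can't get from sales totals alone.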
Experimental variables
A scientifically valid experiment needs to have three types of variables: controlled, independent, and dependent:
- A controlled variable is kept constant, so other variables that change in relation to each other can be measured in a static environment.
- An experiment's independent variable is the only one that can be changed.
- Dependent variables are the results that are observed when changes are made to independent variables.
Any uncontrolled variables, or mediator variables, can cloud an experiment's accuracy. So they need to be identified and eliminated in order to properly assess the experiment's results. Differences in uncontrolled variables can also impact the relationship between independent and dependent variables.
Uncontrolled variables add the influence of unrelated factors to an experiment's results. Correlations might be assumed, and a hypothesis might be formed, where none exists. Accurate analysis becomes hard or impossible. Examples of conclusions drawn from uncontrolled variables are shown in the children's music lessons and mobile phone cancer examples that follow.
How our brain tricks us
It's easy to watch correlated data change in tandem and assume that one thing causes the other. That's because our brains are wired for cause-relation cognitive bias. We need to make sense of large amounts of incoming information, so our brain simplifies it. This process is called heuristics, and it's often useful and accurate. But not always. An example of where heuristics goes wrong is whenever you believe that correlation implies causation.
Spurious correlations
A spurious correlation is a mathematical relationship in which two or more events or variables are associated but not causally related, due to either coincidence or the presence of a certain third, unseen factor.
Children and music lessons
After a study of human brain development, researchers concluded that kids between four and six years old who took music lessons showed evidence of boosted brain development in the areas related to memory and attention. Based on this study, our biased brain might connect the dots quickly and conclude that music lessons improve brain development. But there are other variables to consider. The fact that the children took music lessons is an indicator of wealth. So they probably had access to other resources that are known to boost brain development, like good nutrition.
The point of this example is that researchers can't assume from just this much data that music lessons affect brain development. Yes, there's clearly a correlation, but there's no actual evidence of causation. We need more data to get a true causal explanation.
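To see how a third variable can produce this pattern, here is a minimal simulation. It is not the study's data. The probabilities, the score scale, and the names `children` and `mean_gap` are all assumptions chosen for illustration. In this toy world, lessons have no effect at all, and only family wealth moves the score.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

children = []
for _ in range(20000):
    wealthy = rng.random() < 0.5
    # Assumption: wealthier families are more likely to pay for music lessons.
    took_lessons = rng.random() < (0.7 if wealthy else 0.2)
    # Assumption: the score depends on wealth only (for example via nutrition),
    # and not on whether the child took lessons.
    score = 100 + (10 if wealthy else 0) + rng.gauss(0, 5)
    children.append((wealthy, took_lessons, score))


def mean_gap(rows):
    """Mean score of children with lessons minus mean score of those without."""
    with_lessons = [score for _, took, score in rows if took]
    without = [score for _, took, score in rows if not took]
    return sum(with_lessons) / len(with_lessons) - sum(without) / len(without)


naive_gap = mean_gap(children)
gap_wealthy = mean_gap([row for row in children if row[0]])
gap_not_wealthy = mean_gap([row for row in children if not row[0]])

print(f"all children: {naive_gap:.2f}")
print(f"wealthy families only: {gap_wealthy:.2f}")
print(f"other families only: {gap_not_wealthy:.2f}")
```

Looking at all children together, the lessons group should score noticeably higher. Inside each wealth group the gap should be close to zero, because wealth, not lessons, is doing the work in this simulation.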
Cancer and mobile phones
If you study a chart that shows both the number of cancer cases and the number of mobile phones, you'll notice that both numbers went up in the last 20 years. If your brain processes this information with cause-relation cognitive bias, you might decide that mobile phones cause cancer. But that's ridiculous. There's no proof other than both data points happening to increase. A lot of other things have also increased in the past 20 years, and they can't all cause cancer or be caused by mobile phone use.
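Any two series that both trend upward will look correlated, whatever they measure. The sketch below uses invented monthly numbers, not real phone or cancer statistics, and the names `phones`, `cases`, and `changes` are ours. It compares the correlation of the raw levels with the correlation of the month-to-month changes, which strips out the shared trend.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def changes(series):
    """Month-over-month differences, which remove a steady upward trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


rng = random.Random(3)  # fixed seed so the run is repeatable

# Made-up monthly series over 20 years (240 months).
# Each one drifts upward on its own; neither influences the other.
phones, cases = [100.0], [50.0]
for _ in range(239):
    phones.append(phones[-1] + 10 + rng.gauss(0, 3))
    cases.append(cases[-1] + 2 + rng.gauss(0, 1))

level_corr = pearson(phones, cases)
change_corr = pearson(changes(phones), changes(cases))

print(f"correlation of levels: {level_corr:.2f}")
print(f"correlation of monthly changes: {change_corr:.2f}")
```

The levels should correlate almost perfectly simply because both go up over time, while the monthly changes should show little or no relationship.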
Explainability
To find causation, we need explainability. In the era of artificial intelligence and big data analysis, this topic becomes increasingly important. AIs make data-based recommendations. Sometimes, humans can't see any reason for those recommendations except that an AI made them. In other words, they lack explainability.
Explainability in medicine
The FDA won't approve cancer treatments that lack explainability. Think about this situation for a minute. Do you want the best possible treatment for your cancer, based on an AI's analysis of your genome, your cancer DNA, millions of other cases, and more data, even if you can't explain how the computer's neural network came up with that exact treatment? Or would you rather have a suboptimal treatment that you can explain the reasoning for?
Medical explainability will probably be one of the biggest topics of this century.
One way versus two way
Correlations go both ways. We can say that mobile phone usage correlates to increased cancer risk and that cancer cases correlate to the number of mobile phones. Basically, you can swap the correlation. In causal relationships, we can say that a new marketing campaign caused an increase in sales. But saying that the increase in sales (after the campaign ran) caused the marketing campaign doesn't make any sense.
Any causal statement, by definition, is one way. That's a big clue about whether you're dealing with correlation or causation.
The big dilemma
In "The causal effect of education on earnings," David Card says that better education is correlated to higher earnings. But the most important thing he says is that if we can't do an experiment, with all our variables constant, we can't infer causation from a correlation. We can always bring explainability to the table. But in real life and with big enough issues, causations based on explainability are hard to prove. From a scientific viewpoint, they can't be called anything more than a theory.
In the absence of experimental evidence, it is very difficult to know whether the higher earnings observed for better-educated workers are caused by their higher education, or whether individuals with greater earning capacity have chosen to acquire more schooling.
David Card, "The causal effect of education on earnings"
Does higher earning cause higher education? Does higher education cause higher earning potential? We don't know. However, we can make predictions. We can use this correlation to predict the earning potential of an individual based on their education. We can also predict their education based on their earnings.
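Here is a small sketch of predicting in both directions from the same correlated data. The numbers are synthetic and have nothing to do with Card's data. The helper `fit_line` and the figures for schooling and earnings are assumptions for illustration. The simulation has to generate the data somehow, but the two fitted lines only use the observed pairs and would look the same whichever causal story produced them.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


rng = random.Random(4)  # fixed seed so the run is repeatable

# Synthetic pairs: years of schooling and yearly earnings in thousands.
education = [rng.uniform(8, 20) for _ in range(5000)]
earnings = [10 + 3.0 * years + rng.gauss(0, 8) for years in education]

# Direction 1: predict earnings from education.
slope_fwd, intercept_fwd = fit_line(education, earnings)
# Direction 2: predict education from earnings, using the very same pairs.
slope_back, intercept_back = fit_line(earnings, education)

predicted_earnings = intercept_fwd + slope_fwd * 16
predicted_education = intercept_back + slope_back * 58

print(f"16 years of schooling -> about {predicted_earnings:.1f}k earnings")
print(f"58k earnings -> about {predicted_education:.1f} years of schooling")
```

Both predictions are usable, and neither one says anything about which variable causes the other.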
Good predictions are based on correlations
It sounds like a contradiction, given the context of this article. Correlation is about analyzing static historical datasets and considering the correlations that might exist between observations and outcomes. However, predictions don't change a system. That's decision making. To make software development decisions, we need to understand the difference it would make in how a system evolves if we take an action or don't take it. Decision making requires a causal understanding of the impact of an action.
What are predictions?
We don't make better predictions by developing a better causal understanding. Instead, we need to know the precise limits of the techniques we use to make predictions and what each method can do for us.
References
Lovestats (2019). "Cartoons." The LoveStats Blog. Retrieved from lovestats.wordpress.com.
Card, D. (1999). "The causal effect of education on earnings." Handbook of Labor Economics, vol. 3.
Source: https://towardsdatascience.com/correlation-is-not-causation-ae05d03c1f53