What is the Difference Between a Statistician and a Data Scientist?


On the surface, the roles of statisticians and data scientists can sound similar. Both are concerned with gathering data, analyzing it, and using what they find to make predictions.

In reality, although some of their core skills are similar, statisticians and data scientists can work in quite different ways, which can sometimes lead to conflict when they are working together.

What is a statistician?

A statistician is principally a mathematician. Their role is to gather large amounts of data and then apply mathematical theories in order to spot patterns and trends and make predictions.

Statisticians use mathematical disciplines such as algebra and probability theory to take information gathered about a sample of people or things and use it to make predictions about the behavior of the wider group.

A statistician working for a business might gather data about the spending habits of a sample drawn from one segment of the population. They could, for example, gather information about how a group of one hundred people aged twenty to twenty-five in a certain city spend their money. They would then use this information to make predictions about how the rest of the population within that age range will behave.

This prediction can be used by businesses to drive marketing decisions.
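As a rough illustration of the sampling example above, the sketch below estimates average monthly spending for the wider age group from a sample of one hundred people and attaches a 95% confidence interval. The spending figures are synthetic, the function name `mean_confidence_interval` is hypothetical, and the interval relies on a normal approximation, which is only one of several ways a statistician might approach this.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, lower, upper) for the population mean.

    Uses a normal approximation, which is an assumption that is
    reasonable for a sample of around one hundred observations.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # z-score for the requested two-sided confidence level.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * std_err
    return mean, mean - margin, mean + margin


# Synthetic (made-up) monthly spending for 100 people aged 20 to 25.
random.seed(0)
sample = [random.gauss(850, 200) for _ in range(100)]

mean, lower, upper = mean_confidence_interval(sample)
print(f"Estimated mean spend: {mean:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The interval, rather than the single average, is what makes the prediction about the wider group usable: it states how far the true figure could plausibly sit from the sample estimate.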

What is a data scientist?

Data scientists apply similar principles to those used by statisticians, but their methods are different. Data scientists are concerned with gathering large amounts of data, and they generally use powerful computer algorithms to do this.

Once the data is gathered, it must be cataloged and organized in such a way that it can be worked with. Then algorithms are run on the data to find patterns and trends and make predictions about the future.

Data science is powerful in a business setting because it allows companies to model a very large number of scenarios, which gives them a better chance of making good-quality business decisions.

Where are their similarities?

Both statisticians and data scientists are concerned with gathering accurate data and using this data to create models of past behaviors. These models are then used to try to understand the reasons for the behaviors.

In the context of business, you might try to understand why there are peaks and troughs in sales at certain times, for example.

These ideas can then be applied to make more informed decisions going forward. The data can be used by both data scientists and statisticians to create theories about what has driven past behavior, and therefore what can be done in the future so that the outcome is more favorable to the company.

Where are their differences?

The differences between data scientists and statisticians arise in their ways of working, which is why they can sometimes struggle to see eye to eye if they are working together.

Statisticians put a great deal of emphasis on rigorously analyzing their data and on incorporating mathematical theory into their problem-solving methods. These are well-established theories that help to lend scientific credibility and robustness to their predictions. Statisticians will tend to have a robust model formed based on mathematical theory, which they will then test using the data.

Data scientists tend to focus more on the methods that they use to retrieve their data, designing algorithms to mine for data as efficiently as possible. They are also concerned with the quality of their data and will devote a lot of their time to making sure that their data is ‘clean’ and accurate. There is also a strong emphasis on machine learning, which means designing algorithms to run and find patterns in data. Data scientists tend to work in a more data-led way, allowing the data to inform the models they create.

How can data scientists and statisticians work together?

Data scientists and statisticians can sometimes struggle when they need to work together.

Statisticians can feel that data scientists aren’t paying enough attention to the mathematical theory when they are developing their models and that it’s difficult to robustly test a constantly evolving model as more data is acquired.

Data scientists can feel as though statisticians are too concerned with mathematical theory, that they aren’t allowing the data to inform their models enough, and that this approach can be a little slow.

In order for them to work together effectively, it’s important that data scientists and statisticians each have an understanding of the merits of the approach of the other. There will be situations where each approach will be more effective, depending on the problem that needs to be solved and the data that is available.

With a full understanding of both disciplines, businesses can achieve better outcomes.

How does this benefit businesses?

There are a number of ways in which having data scientists and statisticians working together may benefit a business:

  • Better decision-making. Using a combination of statistical and data-driven methods to make business decisions ultimately means that the quality of those decisions will be better. Not only that, but the evidence that you have to back up those decisions will be more robust, which is essential when you are presenting findings to your stakeholders or board members.
  • More relevant products. Better quality analysis of the market and of how consumers behave means that you can make sure you are providing a product that people will actually want to buy. You can use data to inform your product design and distribution, and to monitor uptake among your customer base.
  • Recruitment. Data can be a great tool when you’re trying to recruit talented staff for your business. Good quality data on potential candidates is widely available on social media and in recruitment databases. You can use this to decide who will be the best fit with your existing team based on past experiences and likely future outcomes.