These are the types of data scientists going around. No offence meant; all types play a crucial role in their own way.
Theorists
What They Do: The word “theory” itself can mean many things to many people. Generally, in the machine learning community, theorists are computer scientists, mathematicians, and statisticians who primarily study algorithms that are provably efficient and provably correct, even if they must rely on unrealistically strong assumptions. Theory papers contain proofs of correctness, proofs of convergence, and guarantees on performance.
Tools of the Trade: Theorists rely mostly upon paper, writing utensils, and occasionally email and MATLAB to be productive.
Where they Work: Most pure theorists aim for academic jobs. However, large private research institutions like Microsoft Research and IBM Research employ a large number of top researchers. Some large companies, such as Google, do have problems so novel as to erode the disconnect between theory and practice.
Machine Learning Scientists
What They Do: Machine learning scientists sit somewhere between theorists and data miners. For these scientists, a single method whose behavior is understood is preferable to a system that wins a Kaggle competition by cobbling together a gaggle of algorithms into an ensemble. Generally, these scientists develop new algorithms which may be heuristic but are theoretically motivated. They also care about empirical performance on real-world tasks.
Tools of the Trade: Implementation is a significant part of machine learning work, and machine learning scientists should have strong coding skills in both high- and low-level languages, as well as the ability to rapidly prototype with existing machine learning frameworks like scikit-learn.
Where they Work: While academia is a siren that calls many in machine learning, university jobs are hard to score. Fortunately, a healthy Silicon Valley job market has gobbled up the vast majority of machine learning scientists in recent years. Major employers include Google, Microsoft, Amazon, Facebook, and more. Finance companies also employ a substantial number of these research engineers.
Data Miners
What They Do: Unlike machine learning researchers, who consider many abstract tasks, such as the active learning paradigm, and are often content to show state-of-the-art performance on widely studied datasets, data miners work on two types of problems. The first: “here is a dataset, produce insights.” The second: “here is a dataset and a task, win.” Generally, these are the folks who win competitions.
Tools of the Trade: These engineers are often strong programmers and combine domain-specific intuition with a knowledge of algorithms to generate valuable insights. They have a strong knowledge of available libraries and implement quickly.
Where they Work: Data miners work at a broader range of companies than pure machine learning workers. They can be found at the traditional Silicon Valley powerhouses, but also in the health space, or mining data for companies that may not be primarily in the business of building high-tech solutions.
Script Kiddies
What They Do: Like their security hacker counterparts, script kiddies are the end users of data science products. They may know roughly what a support vector machine does, but wouldn’t code one from scratch.
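For a sense of what "coding one from scratch" might involve, here is a minimal sketch of a linear support vector machine trained by subgradient descent on the regularised hinge loss. It is only an illustration, not a production implementation: the function names, hyperparameter defaults, and toy data are all assumptions made up for this example, and real libraries use far more careful solvers.

```python
# A toy linear SVM in plain Python (illustrative sketch only).
# Minimises  lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))
# by full-batch subgradient descent. Labels must be +1 or -1.

def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Return (w, b) for a linear decision rule sign(w.x + b)."""
    n = len(points)
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the L2 regulariser.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Only points inside the margin contribute a hinge subgradient.
            if margin < 1:
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 according to the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable toy data.
points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0),
          (-1.0, -1.0), (-2.0, -1.5), (-1.5, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print([predict(w, b, p) for p in [(3.0, 3.0), (-3.0, -2.0)]])
```

Even this stripped-down version requires choosing a loss, a regulariser, and an optimiser, which is roughly the knowledge an end user of a packaged tool can get by without.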
Tools of the Trade: Azure ML, IBM Watson, KNIME.
Where they Work: Everywhere.
What They Do: As with all trends, everyone wants in. Just as everyone wants to work at a startup, traditional business analysts want technical-sounding job titles. These individuals may have no coding skills or mathematical background, but why should qualifications stand in the way of ambition?
Tools of the Trade: PowerPoint, Excel, laser pointers, buzzwords.
Where they Work: Everywhere, but they are most celebrated at management consultancies.