Data Science Tools

In this section, you'll find guides to a wide variety of tools used by data scientists.

Data scientists are inquisitive and often seek out new tools that help them find answers. They also need to be proficient with the tools of the trade, even though there are dozens upon dozens of them. A data scientist should have a working knowledge of statistical programming languages, of databases, and of visualization tools, and should be able to use them to build data processing systems. Many in the field also consider programming an integral part of data science. Yet not all data science students study programming, so it helps to be aware of tools that replace code with a graphical interface. With those tools, a data scientist's knowledge of algorithms is enough to build predictive models.

What is great about data science is that there are many pathways to becoming a data scientist. You don't need a degree in computer science or mathematics. If you have subject matter expertise in a field such as biostatistics, geography, or political science, you can pick up the skills to apply data science to it in many ways. There is a plethora of online resources, boot camps, and local meetups where you can immerse yourself in the data science community (see resources below).

There are a few tools that you can start learning to get into data science. R remains the leading tool, with a 49% share. Python is growing fast and is approaching the popularity of R. RapidMiner remains the most popular general data science platform. Big Data tools are used by almost 40% of practitioners, and Deep Learning usage has doubled.

Data Science is OSEMN (Obtain, Scrub, Explore, Model, iNterpret) the Data.
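To make the OSEMN steps concrete, here is a minimal sketch in Python that uses only the standard library. The data, the column names (hours, score), and the function names are all made up for illustration; a real project would obtain data from a file, database, or API and would likely use a dedicated modeling library rather than a hand-written least-squares fit.

```python
import csv
import io
from statistics import mean

# Hypothetical in-memory data standing in for a real source (file, database, API).
RAW_CSV = """hours,score
1,52
2,55
,60
3,61
4,64
5,70
"""


def obtain(text):
    """Obtain: read the raw records as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Scrub: drop rows with missing values and convert strings to numbers."""
    clean = []
    for row in rows:
        if row["hours"] and row["score"]:
            clean.append((float(row["hours"]), float(row["score"])))
    return clean


def explore(data):
    """Explore: compute a few summary statistics before modeling."""
    xs, ys = zip(*data)
    return {"n": len(data), "mean_hours": mean(xs), "mean_score": mean(ys)}


def model(data):
    """Model: fit score = slope * hours + intercept by ordinary least squares."""
    xs, ys = zip(*data)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def interpret(slope, intercept):
    """iNterpret: turn the fitted numbers into a plain-language statement."""
    return (
        f"Each extra hour is associated with about {slope:.1f} more points "
        f"(baseline {intercept:.1f})."
    )


rows = obtain(RAW_CSV)
data = scrub(rows)
summary = explore(data)
slope, intercept = model(data)
print(summary)
print(interpret(slope, intercept))
```

Each function maps to one letter of the acronym, so any single stage can be swapped out (for example, a different model) without touching the others.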

[Resources for Data Science]

Contributing to the Guide

This open source guide is curated by thousands of contributors. You can help by researching, writing and updating these articles. It is an easy and fun way to get started with contributing to open source.