Introduction to data visualization

A picture is worth a thousand words.

Data visualization is one of the quickest and most powerful ways to understand new and existing information. During an initial exploration phase, data scientists try to reveal the underlying features of a dataset, such as distributions, correlations, or other visible patterns. This process is called exploratory data analysis (EDA) and marks the starting point of each data science project.

The graphs produced during EDA show the data scientist which directions to take on the journey ahead. The patterns they reveal can inspire hypotheses about the underlying processes, features to extract from the dataset, or modelling techniques to test. Last but not least, visualizations uncover outliers and data errors that the data scientist needs to take care of.
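
To illustrate how even a crude plot can expose an outlier, here is a minimal sketch using only the Python standard library. The sample values, the bin width, and the helper name text_histogram are all made up for this example; in practice a plotting library would normally draw the histogram.

```python
from collections import Counter

BIN_WIDTH = 10  # assumed bin size, chosen only for this illustration


def text_histogram(values, bin_width):
    """Group integer values into bins and return sorted (bin_start, count) pairs."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return sorted(counts.items())


# Hypothetical measurements; the entry 250 is a deliberately planted outlier.
sample = [12, 15, 14, 18, 21, 19, 16, 250, 13, 17]

histogram = text_histogram(sample, BIN_WIDTH)

# Draw one row per non-empty bin, with one '#' per observation in that bin.
for bin_start, count in histogram:
    bin_end = bin_start + BIN_WIDTH - 1
    print(f"{bin_start:>4}-{bin_end:<4} {'#' * count}")
```

Most observations pile up in one bin, while the planted value 250 appears as a lone bar far away from the rest. That is the kind of visual cue that prompts a closer look at a possible data error.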

The biggest role of data visualization, however, is communicating data science findings to colleagues and customers through presentations, reports, or dashboards. Effort spent on EDA and visualizations is therefore time well spent, since the resulting graphs can be reused directly to communicate findings.

Why data visualization is important