Available Plot Types

Many plot types are available, each suited to revealing different features and relationships in a dataset.

During the exploratory data analysis phase, we typically want to detect the most obvious patterns, either by looking at each variable in isolation or by examining relationships between variables. The choice of plot type also depends on the data type of the input variables, such as numeric or categorical.

Scatter Plots

Scatter plots visualize the relationship between two numeric variables. Each point represents one observation, and its position gives the values of the two variables on the x- and y-axes.
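
As a minimal sketch of a scatter plot, the following assumes matplotlib is installed and uses made-up height and weight values. The variable names and the output file name are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical sample data: height (cm) and weight (kg) of six people.
heights_cm = [152, 160, 165, 171, 178, 185]
weights_kg = [51, 58, 63, 68, 77, 84]

fig, ax = plt.subplots()
points = ax.scatter(heights_cm, weights_kg)  # one marker per (x, y) pair
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.set_title("Weight against height")
fig.savefig("scatter.png")  # hypothetical output file name
```

Each list index corresponds to one observation, so both lists must have the same length.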

Line Graphs

Line graphs visualize how one numeric variable changes against another by connecting consecutive data points with lines. They are well suited to values that change continuously, such as temperature over time.
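
The temperature example can be sketched as follows, again assuming matplotlib is installed. The readings are invented for illustration, and the markers are added only to show where the measured points lie.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical temperature readings (degrees Celsius), one every three hours.
hours = [0, 3, 6, 9, 12, 15, 18, 21]
temps_c = [11.0, 9.5, 9.0, 13.5, 18.0, 19.5, 16.0, 12.5]

fig, ax = plt.subplots()
# plot() joins consecutive points with straight line segments.
(line,) = ax.plot(hours, temps_c, marker="o")
ax.set_xlabel("Hour of day")
ax.set_ylabel("Temperature (°C)")
fig.savefig("line.png")  # hypothetical output file name
```

The connecting segments imply that the value changes smoothly between measurements, which is why this plot type fits continuously changing quantities.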

Bar Charts and Histograms

Bar charts visualize numeric values grouped by category. Each category is represented by one bar whose height is given by that category's numeric value. Histograms are a specific kind of bar chart that summarizes how often numeric values occur within a set of value ranges (or bins). They are typically used to examine the distribution of numeric values.
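
To illustrate the difference, the sketch below computes the bar heights a histogram would draw, using only the Python standard library. The helper `histogram_counts`, the equal-width binning, and the sample data are assumptions made for this example; plotting libraries may treat bin edges differently.

```python
def histogram_counts(values, lo, hi, n_bins):
    """Count how many values fall into each of n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not lo <= v <= hi:
            continue  # ignore values outside the covered range
        # The last bin is closed on the right so that v == hi is counted.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Bar chart input: one numeric value per category (hypothetical sales figures).
sales_by_region = {"North": 120, "South": 95, "West": 143}

# Histogram input: raw numeric values that are first counted per bin.
ages = [21, 23, 25, 31, 34, 38, 39, 42, 47, 60]
bin_counts = histogram_counts(ages, lo=20, hi=60, n_bins=4)
print(bin_counts)  # [3, 4, 2, 1] for the bins 20-30, 30-40, 40-50, 50-60
```

A bar chart would draw the values of `sales_by_region` directly, whereas the histogram draws the derived counts in `bin_counts`.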

Others

Other frequently used plot types in data science include:

  • Box plots: Show distributional information of numeric values grouped by category as boxes. Well suited to quickly comparing multiple distributions.
  • Violin plots: Similar to box plots, but show the shape of each distribution as a mirrored density curve (the "violin").
  • Heat maps: Show interactions between variables, typically correlations, as a color-coded grid that highlights areas of strong interaction.
  • Network graphs: Show connections between nodes.
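
As a rough sketch of the distributional information behind a box plot, the following computes a five-number summary per category with the Python standard library. The function name, the groups, and the choice of the "inclusive" quantile method are assumptions for this example. Plotting libraries may compute quartiles differently, and many draw whiskers at 1.5 times the interquartile range rather than at the minimum and maximum.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical numeric values grouped into two categories.
groups = {
    "A": [2, 4, 4, 5, 6, 7, 8, 9, 12],
    "B": [1, 3, 5, 7, 9, 11, 13, 15, 17],
}

summaries = {name: five_number_summary(vals) for name, vals in groups.items()}
for name, summary in summaries.items():
    print(name, summary)
# A (2, 4.0, 6.0, 8.0, 12)
# B (1, 5.0, 9.0, 13.0, 17)
```

The box spans the first to the third quartile with a line at the median, so placing the two summaries side by side shows that group B is more spread out than group A.
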
Why data visualization is important