MLA 008 Exploratory Data Analysis
Oct 26, 2018

EDA + charting. DataFrame info/describe, imputing strategies. Useful charts like histograms and correlation matrices.

Show Notes

Nulls, mean, median

  • df.info(): dtypes, non-null counts
  • df.describe(): count, mean, std, min/max, quartiles
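
A minimal sketch of the inspection and imputing steps above, assuming pandas and NumPy are installed. The DataFrame and its column names (`age`, `income`) are made up for illustration. Median imputation is only one possible strategy; the mean or a model-based fill may suit other data better.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are invented for this example.
df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 41.0, 29.0],
    "income": [40_000.0, 52_000.0, 61_000.0, np.nan, 1_000_000.0],
})

df.info()                      # prints dtypes and non-null counts per column
summary = df.describe()        # count, mean, std, min, quartiles, max
null_counts = df.isna().sum()  # explicit number of nulls per column

# Fill each column's nulls with that column's median.
# The median is less sensitive to extreme values (like the 1,000,000 income)
# than the mean would be.
imputed = df.fillna(df.median())
```

Comparing `summary.loc["mean"]` with `summary.loc["50%"]` is a quick way to spot skewed columns before choosing between mean and median imputation.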

Line, scatter

Outliers: histogram, box plots

  • Remove outliers? RobustScaler?
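
One way to weigh the two options above, sketched with pandas only. The 1.5 × IQR cutoff is the common box-plot convention, not something these notes prescribe, and the sample values are made up. The scaling line mirrors the idea behind scikit-learn's RobustScaler, which by default centers on the median and scales by the interquartile range; it is a hand-rolled stand-in, not the library call itself.

```python
import pandas as pd

# Hypothetical sample with one obvious outlier (95).
s = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0], name="value")

# Interquartile range, the same quantity a box plot draws.
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1

# Option 1: flag and drop points beyond the box-plot whiskers (assumed 1.5 * IQR rule).
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (s < lower) | (s > upper)
trimmed = s[~is_outlier]

# Option 2: keep every row, but scale with statistics that outliers barely move.
robust_scaled = (s - s.median()) / iqr
```

Dropping rows loses data, while robust scaling keeps the outlier in the set without letting it dominate the center and spread of the scaled column.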

Correlation matrices
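
A small sketch of building a correlation matrix with pandas, assuming numeric columns. The column names and values are invented for illustration. `DataFrame.corr()` uses Pearson correlation by default.

```python
import pandas as pd

# Hypothetical data: y rises with x, z falls with x, w is loosely related to x.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "z": [5.0, 4.0, 3.0, 2.0, 1.0],
    "w": [1.0, 3.0, 2.0, 5.0, 4.0],
})

# Pairwise Pearson correlations; the result is a square DataFrame
# indexed by column name on both axes.
corr = df.corr()

# Correlations of every other column with x, strongest first by absolute value.
ranked = corr["x"].drop("x").abs().sort_values(ascending=False)
```

The matrix can then be passed to a heatmap function in a plotting library to get the chart form of the same information.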