Reading the dataset and preparing cleaning functions