This project is a classification problem, so the SGDClassifier from sklearn.linear_model is used. The data is prepared with the following steps:

- Dummy code categorical variables using the OneHotEncoder class from sklearn.preprocessing, making sure to set the parameter drop='first' so that if there are 4 categories in a variable, for example, only 3 are retained and the 4th is inferred when all the others are 0.
- Standardise continuous variables using the StandardScaler class from sklearn.preprocessing.
- Feature engineering: after some initial data exploration, new features can be created or existing features can be modified.

To split the data into training and test sets, the train_test_split function from sklearn.model_selection is used (making sure that shuffle=True, which is also the default, because SGD works best when the training data is shuffled).
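The steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy data, the column names (colour, income), the target definition, and the use of ColumnTransformer and Pipeline to wire the steps together are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one categorical column with 4 levels
# and one continuous column. Not taken from the original project.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "colour": rng.choice(["red", "green", "blue", "yellow"], size=n),
    "income": rng.normal(50_000, 10_000, size=n),
})
y = (df["income"] > 50_000).astype(int)  # made-up binary target

# Split first, shuffling the rows (shuffle=True is also the default).
X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, shuffle=True, random_state=0
)

# Dummy code the categorical column (dropping the first level)
# and standardise the continuous column.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(drop="first"), ["colour"]),
    ("num", StandardScaler(), ["income"]),
])

# Chain preprocessing and the classifier so both are fitted on training data only.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", SGDClassifier(random_state=0)),
])
model.fit(X_train, y_train)

# 4 categories become 3 dummy columns, plus 1 scaled continuous column.
X_train_encoded = model.named_steps["preprocess"].transform(X_train)
print(X_train_encoded.shape)
print(model.score(X_test, y_test))
```

Putting the encoder and scaler inside a pipeline means they learn their categories, means, and standard deviations from the training split only, so no information from the test split leaks into the fitted model.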