I learned how to split a dataset into training, validation, and test sets using random sampling.
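A minimal sketch of such a split in R, assuming a 50/25/25 partition (the proportions, seed, and use of iris as a stand-in dataset are illustrative):

```r
# Sketch: random 50/25/25 train/validation/test split (proportions assumed).
data <- iris                        # stand-in dataset
n <- nrow(data)
set.seed(12345)
id_train <- sample(1:n, floor(n * 0.5))
train <- data[id_train, ]
remaining <- setdiff(1:n, id_train)
id_valid <- sample(remaining, floor(n * 0.25))
valid <- data[id_valid, ]
test <- data[setdiff(remaining, id_valid), ]
```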
I practiced applying the kNN algorithm with the kknn package and classified data points based on their nearest neighbors.
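A small sketch with kknn, again using the built-in iris data as a stand-in (the choice of k = 5 and the rectangular kernel are assumptions):

```r
library(kknn)
set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]
# Fit kNN: each test point is classified by its 5 nearest training neighbors.
fit <- kknn(Species ~ ., train = train, test = test, k = 5,
            kernel = "rectangular")
head(fitted(fit))   # predicted classes for the test set
```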
I learned to evaluate classification models with confusion matrices, understanding true positives, true negatives, false positives, and false negatives.
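Continuing the kknn sketch above, a confusion matrix is just a contingency table of true versus predicted labels:

```r
pred <- fitted(fit)
conf <- table(True = test$Species, Predicted = pred)
print(conf)
# Diagonal cells are correct predictions; off-diagonal cells are the errors.
misclass_rate <- 1 - sum(diag(conf)) / sum(conf)
```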
I explored different values of the hyperparameter k in kNN to find the one that minimizes the misclassification error.
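A sketch of that sweep over a validation set; the candidate range 1..30 is an assumption:

```r
set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
valid <- iris[-idx, ]
# Evaluate each k on held-out data and keep the one with the lowest error.
errors <- sapply(1:30, function(k) {
  fit_k <- kknn(Species ~ ., train = train, test = valid, k = k,
                kernel = "rectangular")
  mean(fitted(fit_k) != valid$Species)   # misclassification error
})
best_k <- which.min(errors)
```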
I used cross-entropy to compare different values of k in kNN, providing an alternative method for hyperparameter tuning.
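A sketch of the cross-entropy criterion, reusing the split from the sweep above; kknn exposes class probabilities through the fit's prob matrix, and the small epsilon guarding log(0) is an assumption:

```r
fit <- kknn(Species ~ ., train = train, test = valid, k = 10)
p <- fit$prob                                   # per-class probabilities
# Pick out each observation's probability for its true class.
rows <- cbind(seq_len(nrow(valid)), match(valid$Species, colnames(p)))
cross_entropy <- -sum(log(p[rows] + 1e-15))     # lower is better
```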
I applied Generalized Linear Models, specifically logistic regression, for binary classification tasks.
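A minimal sketch with glm(); the binary target derived from iris, the chosen features, and the 0.5 threshold are all illustrative:

```r
# Sketch: logistic regression on a toy two-class problem.
df <- iris[iris$Species != "setosa", ]
df$y <- as.integer(df$Species == "versicolor")
model <- glm(y ~ Sepal.Length + Sepal.Width, data = df, family = binomial)
prob <- predict(model, type = "response")   # P(y = 1 | x)
pred <- as.integer(prob > 0.5)
```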
I expanded the feature space with new columns containing powers of existing features and used them as inputs to the logistic regression model.
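Continuing the toy glm example; which powers the lab actually used is not stated, so degree 3 here is an assumption:

```r
# Sketch: add quadratic and cubic columns of one feature, then refit.
df$SL2 <- df$Sepal.Length^2
df$SL3 <- df$Sepal.Length^3
model_poly <- glm(y ~ Sepal.Length + SL2 + SL3, data = df, family = binomial)
# poly(Sepal.Length, 3) would build the same expansion inline.
```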
I explored L1 regularization (LASSO) and determined the optimal lambda using cross-validation.
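A sketch with glmnet, reusing the toy data frame above (alpha = 1 selects the L1 penalty):

```r
library(glmnet)
# Sketch: LASSO with lambda chosen by cross-validation.
x <- as.matrix(df[, c("Sepal.Length", "Sepal.Width",
                      "Petal.Length", "Petal.Width")])
cvfit <- cv.glmnet(x, df$y, alpha = 1, family = "binomial")
cvfit$lambda.min                  # lambda with the lowest CV error
coef(cvfit, s = "lambda.min")     # sparse coefficients at that lambda
```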
I worked with decision trees and optimized them using pruning.
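A sketch with the tree package (the course may have used rpart instead); cross-validation picks the subtree size before pruning:

```r
library(tree)
fit <- tree(Species ~ ., data = iris)
cv  <- cv.tree(fit)                       # CV deviance per subtree size
best_size <- cv$size[which.min(cv$dev)]
pruned <- prune.tree(fit, best = best_size)
```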
I applied Principal Component Analysis (PCA) to reduce data dimensionality while preserving as much variance as possible.
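A minimal sketch with prcomp(); keeping the first two components is an arbitrary choice for illustration:

```r
# Sketch: PCA on standardized features; summary() reports variance explained.
pca <- prcomp(iris[, 1:4], scale. = TRUE)
summary(pca)            # cumulative proportion shows how much is preserved
scores <- pca$x[, 1:2]  # data projected onto the first two components
```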
I learned that scaling features enhances optimization, prevents bias towards high-magnitude features, and improves model performance.
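The standard tool for this in base R is scale(), sketched here:

```r
# Sketch: standardize each column to mean 0 and standard deviation 1.
X <- scale(iris[, 1:4])
round(colMeans(X), 10)   # ~0 for every feature
apply(X, 2, sd)          # exactly 1 for every feature
```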
I learned to create a temperature forecast using a kernel method with three Gaussian kernels (physical distance, date distance, and hour distance).
I practiced selecting appropriate smoothing factors for each kernel, such as 100 km for distance, 10 days for date, and 2 hours for time, to achieve accurate forecasts.
I practiced comparing forecast results obtained using a summation kernel and a product kernel to determine which one is more accurate and reliable.
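A minimal sketch of the forecast with the three Gaussian kernels and the smoothing factors above; the measurement gaps and temperatures here are invented placeholders, not real station data:

```r
gauss <- function(d, h) exp(-(d / h)^2)   # Gaussian kernel
h_dist <- 100   # km
h_date <- 10    # days
h_hour <- 2     # hours

# Hypothetical gaps from each past measurement to the target place/time,
# plus the temperatures observed at those measurements.
d_km  <- c(20, 150, 80)
d_day <- c(1, 30, 5)
d_hr  <- c(0.5, 3, 1)
temps <- c(4.2, -1.0, 2.5)

k_sum  <- gauss(d_km, h_dist) + gauss(d_day, h_date) + gauss(d_hr, h_hour)
k_prod <- gauss(d_km, h_dist) * gauss(d_day, h_date) * gauss(d_hr, h_hour)

forecast_sum  <- sum(k_sum  * temps) / sum(k_sum)    # summation kernel
forecast_prod <- sum(k_prod * temps) / sum(k_prod)   # product kernel
```

The product kernel only gives weight to measurements that are close in all three senses at once, while the summation kernel lets closeness in one dimension compensate for the others, which is why the two forecasts can differ noticeably.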
I learned to evaluate different support vector machine (SVM) models by choosing an appropriate regularization parameter C to achieve the best generalization performance.
I gained the ability to implement SVM predictions manually by calculating the dot product between the support vectors and new data points.
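A sketch with kernlab; a linear kernel (vanilladot) keeps the manual dot-product step explicit, though the actual models may have used a different kernel, and the C grid is an assumption. scaled = FALSE keeps the stored support vectors comparable to raw inputs:

```r
library(kernlab)
data(spam)                        # built-in kernlab dataset, label column "type"
set.seed(1)
idx <- sample(nrow(spam), 1000)
train <- spam[idx, ]
valid <- spam[-idx, ][1:500, ]

# Pick C by validation error.
C_grid <- c(0.1, 1, 10)
errs <- sapply(C_grid, function(C) {
  m <- ksvm(type ~ ., data = train, kernel = "vanilladot", C = C,
            scaled = FALSE)
  mean(predict(m, valid) != valid$type)
})
best_C <- C_grid[which.min(errs)]

# Manual prediction: decision value = sum_i coef_i * <sv_i, x_new> - b.
fit <- ksvm(type ~ ., data = train, kernel = "vanilladot", C = best_C,
            scaled = FALSE)
sv  <- xmatrix(fit)[[1]]          # support vectors
co  <- coef(fit)[[1]]             # alpha_i * y_i for each support vector
x_new <- as.matrix(valid[1, -58]) # drop the label column
decision <- sum(co * (sv %*% t(x_new))) - b(fit)
```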
I learned to implement, predict with, and visualize neural networks, and analyzed their limitations in predicting the inverse of the sine function, where multiple inputs produce the same output.
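A sketch of that experiment with the neuralnet package; the hidden-layer size, sample size, and input range are assumptions:

```r
library(neuralnet)
set.seed(1)
x <- runif(50, 0, 10)
df <- data.frame(x = x, y = sin(x))

# Forward direction: y = sin(x) is a proper function and is learned well.
nn_fwd <- neuralnet(y ~ x, data = df, hidden = 10)

# Inverse direction: predict x from sin(x). Since e.g. sin(0.5) == sin(pi - 0.5),
# one input maps to several targets, so the network can at best average them.
nn_inv <- neuralnet(x ~ y, data = df, hidden = 10)
plot(df$y, predict(nn_inv, df))   # visualizes the failure of the inverse fit
```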