I learned how to split a dataset into training, validation, and test sets using random sampling.
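A minimal sketch of such a split in R, assuming a 50/25/25 partition (the proportions, seed, and use of iris as a stand-in dataset are illustrative):

```r
# Sketch: random 50/25/25 train/validation/test split (proportions assumed).
data <- iris                        # stand-in dataset
n <- nrow(data)
set.seed(12345)
id_train <- sample(1:n, floor(n * 0.5))
train <- data[id_train, ]
remaining <- setdiff(1:n, id_train)
id_valid <- sample(remaining, floor(n * 0.25))
valid <- data[id_valid, ]
test <- data[setdiff(remaining, id_valid), ]
```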
I practiced applying the kNN algorithm with the kknn package and classified data points based on their nearest neighbors.
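A small sketch with kknn, again using the built-in iris data as a stand-in (the choice of k = 5 and the rectangular kernel are assumptions):

```r
library(kknn)
set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]
# Fit kNN: each test point is classified by its 5 nearest training neighbors.
fit <- kknn(Species ~ ., train = train, test = test, k = 5,
            kernel = "rectangular")
head(fitted(fit))   # predicted classes for the test set
```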
I learned to evaluate classification models with confusion matrices, understanding true positives, true negatives, false positives, and false negatives.
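Continuing the kknn sketch above, a confusion matrix is just a contingency table of true versus predicted labels:

```r
pred <- fitted(fit)
conf <- table(True = test$Species, Predicted = pred)
print(conf)
# Diagonal cells are correct predictions; off-diagonal cells are the errors.
misclass_rate <- 1 - sum(diag(conf)) / sum(conf)
```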
I explored different values of the hyperparameter k in kNN to find the one that minimizes the misclassification error.
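A sketch of that sweep over a validation set; the candidate range 1..30 is an assumption:

```r
set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
valid <- iris[-idx, ]
# Evaluate each k on held-out data and keep the one with the lowest error.
errors <- sapply(1:30, function(k) {
  fit_k <- kknn(Species ~ ., train = train, test = valid, k = k,
                kernel = "rectangular")
  mean(fitted(fit_k) != valid$Species)   # misclassification error
})
best_k <- which.min(errors)
```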
I used cross-entropy to compare different values of k in kNN, providing an alternative method for hyperparameter tuning.
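A sketch of the cross-entropy criterion, reusing the split from the sweep above; kknn exposes class probabilities through the fit's prob matrix, and the small epsilon guarding log(0) is an assumption:

```r
fit <- kknn(Species ~ ., train = train, test = valid, k = 10)
p <- fit$prob                                   # per-class probabilities
# Pick out each observation's probability for its true class.
rows <- cbind(seq_len(nrow(valid)), match(valid$Species, colnames(p)))
cross_entropy <- -sum(log(p[rows] + 1e-15))     # lower is better
```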
I applied Generalized Linear Models, specifically logistic regression, for binary classification tasks.
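A minimal sketch with glm(); the binary target derived from iris, the chosen features, and the 0.5 threshold are all illustrative:

```r
# Sketch: logistic regression on a toy two-class problem.
df <- iris[iris$Species != "setosa", ]
df$y <- as.integer(df$Species == "versicolor")
model <- glm(y ~ Sepal.Length + Sepal.Width, data = df, family = binomial)
prob <- predict(model, type = "response")   # P(y = 1 | x)
pred <- as.integer(prob > 0.5)
```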
I expanded the feature space with new columns containing powers of existing features and used them as inputs to the logistic regression model.
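Continuing the toy glm example; which powers the lab actually used is not stated, so degree 3 here is an assumption:

```r
# Sketch: add quadratic and cubic columns of one feature, then refit.
df$SL2 <- df$Sepal.Length^2
df$SL3 <- df$Sepal.Length^3
model_poly <- glm(y ~ Sepal.Length + SL2 + SL3, data = df, family = binomial)
# poly(Sepal.Length, 3) would build the same expansion inline.
```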
I explored L1 regularization (LASSO) and determined the optimal lambda using cross-validation.
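A sketch with glmnet, reusing the toy data frame above (alpha = 1 selects the L1 penalty):

```r
library(glmnet)
# Sketch: LASSO with lambda chosen by cross-validation.
x <- as.matrix(df[, c("Sepal.Length", "Sepal.Width",
                      "Petal.Length", "Petal.Width")])
cvfit <- cv.glmnet(x, df$y, alpha = 1, family = "binomial")
cvfit$lambda.min                  # lambda with the lowest CV error
coef(cvfit, s = "lambda.min")     # sparse coefficients at that lambda
```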
I worked with decision trees and optimized them using pruning.
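A sketch with the tree package (the course may have used rpart instead); cross-validation picks the subtree size before pruning:

```r
library(tree)
fit <- tree(Species ~ ., data = iris)
cv  <- cv.tree(fit)                       # CV deviance per subtree size
best_size <- cv$size[which.min(cv$dev)]
pruned <- prune.tree(fit, best = best_size)
```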
I applied Principal Component Analysis (PCA) to reduce data dimensionality while preserving as much variance as possible.
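A minimal sketch with prcomp(); keeping the first two components is an arbitrary choice for illustration:

```r
# Sketch: PCA on standardized features; summary() reports variance explained.
pca <- prcomp(iris[, 1:4], scale. = TRUE)
summary(pca)            # cumulative proportion shows how much is preserved
scores <- pca$x[, 1:2]  # data projected onto the first two components
```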
I learned that scaling features enhances optimization, prevents bias towards high-magnitude features, and improves model performance.
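The standard tool for this in base R is scale(), sketched here:

```r
# Sketch: standardize each column to mean 0 and standard deviation 1.
X <- scale(iris[, 1:4])
round(colMeans(X), 10)   # ~0 for every feature
apply(X, 2, sd)          # exactly 1 for every feature
```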
I learned to create a temperature forecast using a kernel method with three Gaussian kernels (physical distance, date distance, and hour distance).
I practiced selecting appropriate smoothing factors for each kernel, such as 100 km for distance, 10 days for date, and 2 hours for time, to achieve accurate forecasts.
I practiced comparing forecast results obtained using a summation kernel and a product kernel to determine which one is more accurate and reliable.
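A minimal sketch of the forecast with the three Gaussian kernels and the smoothing factors above; the measurement gaps and temperatures here are invented placeholders, not real station data:

```r
gauss <- function(d, h) exp(-(d / h)^2)   # Gaussian kernel
h_dist <- 100   # km
h_date <- 10    # days
h_hour <- 2     # hours

# Hypothetical gaps from each past measurement to the target place/time,
# plus the temperatures observed at those measurements.
d_km  <- c(20, 150, 80)
d_day <- c(1, 30, 5)
d_hr  <- c(0.5, 3, 1)
temps <- c(4.2, -1.0, 2.5)

k_sum  <- gauss(d_km, h_dist) + gauss(d_day, h_date) + gauss(d_hr, h_hour)
k_prod <- gauss(d_km, h_dist) * gauss(d_day, h_date) * gauss(d_hr, h_hour)

forecast_sum  <- sum(k_sum  * temps) / sum(k_sum)    # summation kernel
forecast_prod <- sum(k_prod * temps) / sum(k_prod)   # product kernel
```

The product kernel only gives weight to measurements that are close in all three senses at once, while the summation kernel lets closeness in one dimension compensate for the others, which is why the two forecasts can differ noticeably.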
I learned to evaluate different support vector machine (SVM) models by choosing an appropriate regularization parameter C to achieve the best generalization performance.
I gained the ability to implement SVM predictions manually by calculating the dot product between the support vectors and new data points.
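A sketch with kernlab; a linear kernel (vanilladot) keeps the manual dot-product step explicit, though the actual models may have used a different kernel, and the C grid is an assumption. scaled = FALSE keeps the stored support vectors comparable to raw inputs:

```r
library(kernlab)
data(spam)                        # built-in kernlab dataset, label column "type"
set.seed(1)
idx <- sample(nrow(spam), 1000)
train <- spam[idx, ]
valid <- spam[-idx, ][1:500, ]

# Pick C by validation error.
C_grid <- c(0.1, 1, 10)
errs <- sapply(C_grid, function(C) {
  m <- ksvm(type ~ ., data = train, kernel = "vanilladot", C = C,
            scaled = FALSE)
  mean(predict(m, valid) != valid$type)
})
best_C <- C_grid[which.min(errs)]

# Manual prediction: decision value = sum_i coef_i * <sv_i, x_new> - b.
fit <- ksvm(type ~ ., data = train, kernel = "vanilladot", C = best_C,
            scaled = FALSE)
sv  <- xmatrix(fit)[[1]]          # support vectors
co  <- coef(fit)[[1]]             # alpha_i * y_i for each support vector
x_new <- as.matrix(valid[1, -58]) # drop the label column
decision <- sum(co * (sv %*% t(x_new))) - b(fit)
```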
I learned to implement, predict with, and visualize neural networks, and analyzed their limitations in predicting the inverse of the sine function, where multiple inputs produce the same output.
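A sketch of that experiment with the neuralnet package; the hidden-layer size, sample size, and input range are assumptions:

```r
library(neuralnet)
set.seed(1)
x <- runif(50, 0, 10)
df <- data.frame(x = x, y = sin(x))

# Forward direction: y = sin(x) is a proper function and is learned well.
nn_fwd <- neuralnet(y ~ x, data = df, hidden = 10)

# Inverse direction: predict x from sin(x). Since e.g. sin(0.5) == sin(pi - 0.5),
# one input maps to several targets, so the network can at best average them.
nn_inv <- neuralnet(x ~ y, data = df, hidden = 10)
plot(df$y, predict(nn_inv, df))   # visualizes the failure of the inverse fit
```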