Model Selection

I’ve written a couple of blog posts in the past on statistical learning; see here if you haven’t read them yet. In this blog post I’ll explain the most popular way to compare models and decide which one is best. It’s known as the test-train split, and it is really only useful for supervised problems.

The test set approach

The test-set approach is quite intuitive once you hear about it. You have \(n\) observed data points, each with an explanatory variable \(x_i\) and a response \(y_i\), and the end goal is to predict the \(y_i\). »
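As a rough illustration of the test-set idea, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example rather than something taken from the post: the synthetic data, the 25% test fraction, the two candidate models (a least-squares line and a constant mean), and the helper names `train_test_split`, `fit_line`, `fit_mean` and `mse`. The point is only that each model is fitted on the training part and judged on the held-out part.

```python
import random

def train_test_split(xs, ys, test_fraction=0.25, seed=0):
    """Shuffle the indices, then split the paired data into train and test parts."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(xs) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    pick = lambda seq, ids: [seq[i] for i in ids]
    return (pick(xs, train_idx), pick(ys, train_idx),
            pick(xs, test_idx), pick(ys, test_idx))

def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def fit_mean(xs, ys):
    """Baseline model that ignores x and always predicts the training mean."""
    return sum(ys) / len(ys), 0.0

def mse(model, xs, ys):
    """Mean squared error of the model (a, b) on the given data."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (hypothetical): a noisy straight line.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

# Fit on the training part only, compare on the held-out test part.
x_tr, y_tr, x_te, y_te = train_test_split(xs, ys)
line_mse = mse(fit_line(x_tr, y_tr), x_te, y_te)
mean_mse = mse(fit_mean(x_tr, y_tr), x_te, y_te)
best = "line" if line_mse < mean_mse else "mean"
print(f"test MSE line={line_mse:.2f}, mean={mean_mse:.2f}, best={best}")
```

The model with the lower test error is the one you would pick. The test points are never used for fitting, which is what makes the comparison fair.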

An Intro to Classification

This is a follow-on from the Supervised or not? blog post, where I looked at how to decide whether a problem is supervised or unsupervised and worked through a simple example on the iris dataset. As in that post, I’ll look at classification here, but we’ll go more in depth into some issues with classification.

Linear Discriminant Analysis

In the previous post we used K-nn; here we’ll use linear discriminant analysis (LDA), which is slightly more complicated. »
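To give a feel for what LDA does, here is a minimal sketch for a single feature and two classes, written in plain Python. It is not the code from the post: the data are made-up numbers rather than the iris dataset, and the helper names `fit_lda_1d` and `predict_lda_1d` are hypothetical. It assumes each class is Gaussian with its own mean and a shared (pooled) variance, and assigns a point to the class with the largest linear discriminant score \(\delta_k(x) = x\mu_k/\sigma^2 - \mu_k^2/(2\sigma^2) + \log \pi_k\).

```python
import math

def fit_lda_1d(xs, labels):
    """Estimate class means, class priors and the pooled variance."""
    classes = sorted(set(labels))
    n = len(xs)
    means, priors = {}, {}
    within_ss = 0.0
    for k in classes:
        xk = [x for x, lab in zip(xs, labels) if lab == k]
        means[k] = sum(xk) / len(xk)
        priors[k] = len(xk) / n
        within_ss += sum((x - means[k]) ** 2 for x in xk)
    pooled_var = within_ss / (n - len(classes))
    return means, priors, pooled_var

def predict_lda_1d(model, x):
    """Return the class whose linear discriminant score is largest at x."""
    means, priors, var = model

    def score(k):
        return x * means[k] / var - means[k] ** 2 / (2 * var) + math.log(priors[k])

    return max(means, key=score)

# Made-up one-dimensional data with two well-separated classes.
xs = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.2, 2.8, 3.1, 2.9]
labels = ["a"] * 5 + ["b"] * 5

model = fit_lda_1d(xs, labels)
print(predict_lda_1d(model, 1.5), predict_lda_1d(model, 2.5))
```

With equal priors and a shared variance, the decision boundary sits halfway between the two class means, so points below it go to the first class and points above it to the second.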

AI ain't here yet!

This blog post is about a talk given by Prof. Michael Jordan at the SysML conference on the 15th of February. He’s a professor in the Department of Electrical Engineering and Computer Science and the Department of Statistics at the University of California, Berkeley. He’s extremely well known, with over 130,000 citations on Google Scholar. This is very much a follow-on from my previous blog post, The Two Cultures of Data Analysis. »

Author image Mike

The Two Cultures of Data Analysis

Much of the reading I do ends up leading me to papers that seem to come from the machine learning field rather than statistics, and I always ask myself what the difference is. There’s so much blurring between the two areas: topic modelling, for example, is based on Bayesian statistics but is still worked on primarily by people in the machine learning community. Then there are areas which fall more in the statistics field, like Expectation-Maximization, and areas which fall more in the computer science field, like neural networks. »


Supervised or not?

MACHINE LEARNING!!! So, if you’ve not heard of machine learning yet, you probably haven’t been watching any TV for the last decade. The problem is that machine learning is an absolutely massive field. This is the intro to a series of blog posts I plan on doing on various areas in machine learning, more specifically statistically backed methods, which is why I call it statistical learning. In this blog post I aim to break down the two main areas that are generally focused on. »