  • Econometric Analysis of Suicide Rates

    An exploration of suicide rates and how they vary across demographic cohorts. Linear regression is used to model the relationship between suicide rates and per capita GDP, and a small positive effect is found.
  • Overfitting in Neural Networks

    A brief illustration of the problem of overfitting in neural network classification, showing that denser is not always better. The 'Human Activity Recognition' dataset is used, which consists of smartphone accelerometer readings recorded during different activities.
  • A Survey of Shrinkage Methods

    I examine some of the common shrinkage methods employed to combat the problem of overfitting. Specifically, the LASSO, ridge regression, and the elastic net are detailed. The techniques are motivated by common issues that arise in the estimation of a known real-world parameter.
  • Speed Dating and Revealed Preferences

    The classification technique of logistic regression is introduced, alongside a discussion of revealed preferences. This is done using a dataset on speed dating, generated experimentally as part of a paper by two professors at Columbia University.
  • Topic Modelling with LSA and LDA

    Two typical NLP techniques are explored in relation to the problem of topic modelling. These are applied to the 'A Million News Headlines' dataset, which is a corpus of over one million news article headlines published by the ABC.
