Tag: Machine Learning
-
Sketulate: Sketchable Functions & Densities for Data Science

I simulate a lot of data to test my ideas, particularly the more complex ones. With non-standard stuff, it can be pretty time-consuming to find the right function and/or parameters to do what I want. If the shape you want isn’t a common function or distribution, you can spend too much time searching for…
-
Calibration Plots with Generalized Additive Models (GAMs)

Calibration of probabilistic predictions is a crucial task in many machine learning applications, especially when the outputs are used downstream for decision-making. It’s actually a simple concept: calibration just ensures that the predicted probabilities align with the actual outcomes. Despite that, I’ve noticed many people neglect this step in their modelling process! Typically, when…
-
XGBoost Can’t Extrapolate

A common pattern that I observe among inexperienced Data Scientists is the following: they often default to XGBoost or a similar Gradient Boosted Model for their problem without any thought as to whether it’s the right choice for the job. Given how powerful these methods are, this isn’t the most egregious mistake one can…
-
Small Data: Creativity, Explainability & Precision

This post discusses the significance of small data in the context of data science and modeling, detailing tips and tricks for working with small data sets.
-
Genetic Algorithms with PyGAD and PyTorch

A deep dive into Genetic Algorithms (GAs), a class of optimization algorithms inspired by natural evolution, including using a GA to train a PyTorch model with the PyGAD library.
-
Stochastic Time Delay in Regression Analysis

I revisit a previous article on designing a regression model for stochastic time delay problems, where input-output delays vary randomly. The proposed model treats time delay components as part of the analysis, achieving improved results over standard regression methods in simulated experiments. Potential applications include marketing and medical settings. Future extensions might tackle multiple regression…