The central limit theorem (CLT) is a powerful tool that lets us make inferences about a population even when we don’t know its exact distribution. Just wave your wand (i.e., use the CLT) over a sample of data, and the distribution of the sample means will be...
[Read More]
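As a quick taste of the idea, here is a minimal sketch (using a hypothetical exponential population, chosen only because it is clearly non-normal): averages of repeated samples cluster tightly around the population mean, regardless of the population's shape.

```python
import numpy as np

# Illustrative setup: an exponential population with mean 1.0 (very skewed,
# nothing like a bell curve).
rng = np.random.default_rng(42)
population_mean = 1.0

# Draw 5,000 independent samples of size 100 and take each sample's mean.
sample_means = rng.exponential(scale=1.0, size=(5000, 100)).mean(axis=1)

# The CLT says these means are approximately normal, centered on the
# population mean, with spread sigma / sqrt(n) = 1 / sqrt(100) = 0.1.
print(round(sample_means.mean(), 3))
print(round(sample_means.std(), 3))
```

Plot a histogram of `sample_means` and you'll see the familiar bell shape emerge, even though the underlying population is heavily skewed.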
The Law of Large Numbers!
How to Make Sense of Randomness.
The law of large numbers is a statistical theorem stating that as the number of independent, identically distributed random variables increases, their sample mean approaches the theoretical (expected) mean.
[Read More]
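You can watch the theorem at work with a few lines of code. A sketch using a simulated fair die (an illustrative example, not from the article): the running average of the rolls wanders early on, then settles toward the theoretical mean of 3.5.

```python
import numpy as np

rng = np.random.default_rng(0)

# 100,000 rolls of a fair six-sided die; theoretical mean is 3.5.
rolls = rng.integers(1, 7, size=100_000)

# Running average after each roll.
running_mean = rolls.cumsum() / np.arange(1, len(rolls) + 1)

# Early averages are noisy; later ones converge toward 3.5.
print(round(running_mean[9], 2))     # after 10 rolls
print(round(running_mean[-1], 3))    # after 100,000 rolls
```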
The Art of Comparing Apples to Oranges!
Relative Estimation for Data Science teams.
Relative estimation is a concept that simply means “comparing two things to each other.” If you’ve ever said something like, “Hey, this tree is twice as tall as that tree,” you already know how to do it.
[Read More]
AdaBoost for the win!
Adaptive Boosting - The first of its kind!
AdaBoost (Adaptive Boosting) is a popular boosting algorithm that combines multiple weak classifiers to create a strong classifier.
The algorithm works iteratively: in each round it reweights the training instances, increasing the weight of misclassified examples so that the next weak learner focuses on them via a weighted sampling distribution.
[Read More]
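The reweighting loop above can be sketched in a few dozen lines. This is a toy, from-scratch illustration (not scikit-learn's implementation) using one-dimensional threshold "stumps" as the weak learners on a made-up dataset:

```python
import numpy as np

# Toy 1-D dataset: no single threshold stump can classify it perfectly.
X = np.array([0.1, 0.2, 0.3, 0.45, 0.6, 0.7, 0.8, 0.9])
y = np.array([1, 1, 1, -1, -1, -1, 1, 1])

def best_stump(X, y, w):
    """Find the (threshold, polarity) stump with the lowest weighted error."""
    best = (None, None, np.inf)
    for thr in X:
        for polarity in (1, -1):
            pred = np.where(X < thr, polarity, -polarity)
            err = w[pred != y].sum()
            if err < best[2]:
                best = (thr, polarity, err)
    return best

n = len(X)
w = np.full(n, 1 / n)       # start with uniform instance weights
ensemble = []               # list of (alpha, threshold, polarity)

for _ in range(10):
    thr, pol, err = best_stump(X, y, w)
    err = max(err, 1e-10)                    # avoid division by zero
    alpha = 0.5 * np.log((1 - err) / err)    # this learner's vote weight
    pred = np.where(X < thr, pol, -pol)
    w *= np.exp(-alpha * y * pred)           # upweight misclassified points
    w /= w.sum()                             # renormalize
    ensemble.append((alpha, thr, pol))

# Final strong classifier: sign of the alpha-weighted vote of all stumps.
scores = sum(a * np.where(X < t, p, -p) for a, t, p in ensemble)
final = np.sign(scores)
print((final == y).mean())  # training accuracy
```

No individual stump gets this dataset right, but the weighted committee does, which is the whole point of boosting.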
An overview of variable correlations!
Hey, variable - are we (co)related?
Correlations are like two peas in a pod - they just can’t be separated. In machine learning, correlations are like the secret ingredient that makes our models stand out. They help us identify patterns and relationships between features, and guide us in selecting the most relevant variables to predict our...
[Read More]
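Measuring correlation takes one line once you have the data. A small sketch with synthetic features (hypothetical data, just for illustration): one feature is built from the other, a third is pure noise, and the correlation coefficients reflect exactly that.

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.normal(size=200)
related = 2 * x + rng.normal(scale=0.5, size=200)  # strongly tied to x
noise = rng.normal(size=200)                        # unrelated to x

# Pearson correlation coefficients, in [-1, 1].
r_related = np.corrcoef(x, related)[0, 1]
r_noise = np.corrcoef(x, noise)[0, 1]

print(round(r_related, 2))  # close to 1: strong linear relationship
print(round(r_noise, 2))    # near 0: no linear relationship
```

In feature selection, coefficients like `r_related` flag redundant or predictive variables, while values near zero (like `r_noise`) suggest a feature carries no linear signal for the target.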