Top posts

The Business Intelligence Project

~ 08 Sep 2020

Business Intelligence (BI) combines data and business analytics with data tools and visualizations to analyse data and present meaningful information that helps modern-day businesses and organisations make informed, data-driven decisions.


~ 28 Jul 2020

Some time back we started our Voyage-à-Machine-Learning series to give everyone an opportunity to learn about Machine Learning and its applications. It culminated in a small challenge involving the various techniques covered in the series.


~ 30 Jun 2020

Ever heard things like “number of astronauts dying in space is directly correlated with no. of people not wearing a seatbelt in cars” or “no. of students against DU conducting online exams is directly correlated with no. of people using Netflix”? You may find these things funny, but there is a statistical concept behind them. It is known as Spurious Correlation. Let’s find out what interesting things this concept has in store for us.

All posts

Benford's Law

~ 10 Dec 2021

What if you were told that you could catch manipulations in your dataset just by observing how often the digits 1 to 9 occur as leading digits? Sounds intriguing, doesn’t it? Read on to find out how.
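
As a rough illustration of the idea, the sketch below compares the leading-digit shares Benford's Law predicts, log10(1 + 1/d), with the shares observed in a sample. The helper names and the sample (the first 100 powers of 2) are assumptions chosen for illustration, not part of the post.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit_shares(values):
    """Observed share of each leading digit among the non-zero values."""
    # Strip leading zeros and the decimal point to reach the first
    # significant digit, e.g. 0.0045 -> "45" -> 4.
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Illustrative sample (an assumption): powers of 2 are known to follow
# Benford's Law closely.
sample = [2 ** n for n in range(100)]
expected = benford_expected()
observed = leading_digit_shares(sample)
```

A dataset whose observed shares sit far from the expected ones is not proof of manipulation, but it is a reasonable cue to look closer.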

Ethics in Data Science

~ 29 Nov 2021

With data science now used in nearly every aspect of life, it is crucial for the minds behind this powerful instrument to be mindful of the influence their models and algorithms have on every individual who comes into contact with them.

Cool Python Libraries

~ 11 Nov 2021

Python is a high-level, interpreted and interactive scripting language. It is used in many firms and companies because it supports multiple programming paradigms. It also performs automatic memory management. It was developed by Guido van Rossum in the late 80s.

Getting Started with Support Vector Machine

~ 29 Oct 2021

Support Vector Machine, or SVM, is a supervised ML algorithm useful for both regression and classification problems. It is built to work on smaller but highly complex datasets, producing stronger, more efficient models.

Law of Large Numbers

~ 14 Oct 2021

We are all curious to experiment and discover new and intriguing ideas in different parts of our lives and studies. But one question we must ask ourselves is: can we get accurate results from the limited experiments we run on a particular subject?
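
To get a feel for the question, here is a minimal simulation sketch, assuming a fair six-sided die as the experiment (our choice for illustration). It tracks the average of the rolls so far, which should drift towards the theoretical mean of 3.5 as the number of rolls grows.

```python
import random

def running_means(n_rolls, seed=0):
    """Average of the first 1, 2, ..., n_rolls rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)
        means.append(total / i)
    return means

means = running_means(100_000)
# means[9] can be far from 3.5; means[-1] is typically very close to it.
```

A handful of rolls can give a misleading average, while a very large number of rolls rarely does.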

The Gambler's Ruin

~ 30 Sep 2021

The Gambler’s Ruin Problem is a famous statistical scenario centered around probabilities and experimental outcomes. It is also presented as an application of Markov chains with interesting properties.
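
A small sketch of the classic setup, under assumptions we state ourselves: the gambler starts with `start` units, bets one unit per round, wins each round with probability `p`, and stops on reaching either 0 or `target`. The function name and parameters are hypothetical; the formula is the standard closed-form result for this random walk.

```python
def ruin_probability(start, target, p=0.5):
    """Probability of hitting 0 before `target`, betting one unit per round."""
    if not 0 <= start <= target or target <= 0:
        raise ValueError("need 0 <= start <= target and target > 0")
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if p == 0.5:
        # Fair game: the chance of reaching the target is start / target.
        return 1 - start / target
    ratio = (1 - p) / p
    return 1 - (1 - ratio ** start) / (1 - ratio ** target)

fair = ruin_probability(10, 100)             # about 0.9
unfair = ruin_probability(10, 100, p=0.49)   # even closer to 1
```

Even in a fair game, a gambler with a small stake facing a distant target is ruined most of the time.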

The Potato Paradox

~ 16 Sep 2021

One less potato from the potatoes we have is half the amount of potato we had originally. Already confused? Well, don't be. The potato paradox will help you understand how doubling one portion's share of the whole affects the whole in drastic ways. If that portion's amount stays fixed, the whole has to shrink to half its size, whether the share doubles from 1%, 10% or 0.00001%.
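
The arithmetic can be checked with a few lines. The figures below (100 kg of potatoes that are 99% water, dried until they are 98% water) are the usual textbook statement of the paradox and are an assumption here, as is the function name.

```python
def weight_after_drying(initial_weight, water_before, water_after):
    """Total weight once the water share drops, with dry matter unchanged."""
    dry_matter = initial_weight * (1 - water_before)
    # The same dry matter must now make up (1 - water_after) of the total.
    return dry_matter / (1 - water_after)

new_weight = weight_after_drying(100, 0.99, 0.98)  # about 50 kg
```

The dry share doubles from 1% to 2% while the dry matter itself stays at 1 kg, so the total has to halve.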

An Introduction To The Confusion Matrix

~ 02 Sep 2021

A confusion matrix is a tool used to summarize the performance and accuracy of a classification model in machine learning. It comes in handy after the data has been cleaned, processed, fed into a model, and has given out results. It helps in determining the effectiveness of the classification.
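
For a binary classifier, the four cells of the matrix can be counted directly from the actual and predicted labels. This is a minimal sketch with made-up labels and hypothetical helper names, not the code from the post.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

def accuracy(counts):
    """Share of all predictions that were correct."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())

# Made-up labels for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(actual, predicted)
# counts == {"TP": 3, "FP": 1, "FN": 1, "TN": 3}, accuracy(counts) == 0.75
```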

Gradient Descent

~ 19 Aug 2021

Over the years, many optimization techniques have been developed for machine learning algorithms and neural networks. One such sound technique is Gradient Descent. Read along to learn about this amazing algorithm and how it can be implemented to help us solve complex optimization problems.
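
As a taste of the algorithm, here is a minimal one-dimensional sketch. The objective f(x) = (x - 3)^2, the learning rate and the step count are all assumptions picked for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With this learning rate each step shrinks the distance to the minimum by a constant factor; a rate that is too large would make the iterates overshoot and diverge instead.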

The Turing Test

~ 05 Aug 2021

Have you ever imagined a world where machines think, act and make decisions just the way humans do? The question of whether it is possible for machines to think like humans has a very long history. The Turing Test, originally known as The Imitation Game, was introduced in 1950 by the English mathematician Alan M. Turing to determine whether a computer can "think" the way humans do.

Logistic Regression And Surviving The Titanic

~ 22 Jul 2021

Logistic Regression is the go-to method for binary classification in machine learning. In this article, we'll be learning about the logit function, and using it to solve one of the most popular problems on Kaggle, the Titanic Dataset.
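
For reference, the logit function maps a probability to its log-odds, and the sigmoid maps log-odds back to a probability. The sketch below uses only the standard library; the function names are our own.

```python
import math

def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1 - p))

def sigmoid(z):
    """Inverse of logit: turns log-odds z back into a probability."""
    return 1 / (1 + math.exp(-z))

log_odds = logit(0.8)        # odds of 4 to 1, so log(4)
recovered = sigmoid(log_odds)
```

Logistic regression models the log-odds as a linear function of the features, which is why the sigmoid appears when converting its output to a probability.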

Linear Discriminant Analysis

~ 08 Jul 2021

Linear Discriminant Analysis is a dimensionality reduction technique used as a preprocessing step in Machine Learning and pattern classification applications.

Conjoint Analysis

~ 17 Jun 2021

Conjoint analysis is a survey-based statistical technique used in market research that helps in determining how people value different attributes that make up an individual product or service.

Fibonacci and The Golden Ratio

~ 02 Jun 2021

People rarely speak of Mathematics and Fun in the same sentence. The first three suggestions when searching ‘why is maths’ on Google are - ‘so boring’, ‘so difficult’ and ‘important’.


~ 26 May 2021

The logistic map is a polynomial mapping (equivalently, a recurrence relation) of degree 2 which shows how complex, chaotic behavior can arise from very simple non-linear dynamical equations.
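
The recurrence in question is x_{n+1} = r * x_n * (1 - x_n). A minimal sketch for iterating it follows; the parameter values and starting point are assumptions chosen for illustration.

```python
def logistic_map(r, x0, n):
    """Return x0 followed by n iterates of x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

settled = logistic_map(2.5, 0.2, 200)   # converges to the fixed point 1 - 1/r = 0.6
erratic = logistic_map(4.0, 0.2, 200)   # never settles; looks chaotic
```

Changing nothing but `r` takes the same one-line rule from a steady value to erratic behaviour.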

The Monty Hall Problem

~ 19 May 2021

You might have come across this famous paradox before: the Monty Hall Problem.


~ 13 May 2021

The stock market, a most volatile yet fascinating place, is where some make a fortune at one end while others go broke at the other.

The Birthday Problem

~ 05 May 2021

What is the probability of two people sharing the same birthday in a room of n people?
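
One common way to answer this is to compute the chance that all n birthdays are distinct and subtract it from 1. The sketch below assumes 365 equally likely birthdays and reads the question as "at least two people share"; both are simplifying assumptions.

```python
def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # pigeonhole: more people than possible birthdays
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct

p23 = shared_birthday_probability(23)  # about 0.507
```

With just 23 people in the room, a shared birthday is already more likely than not.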

Behavioral Analytics

~ 21 Apr 2021

Did you know that from the moment you click on websites or apps, your every activity is monitored?

Quantitative Forecasting

~ 14 Apr 2021

As the term Quantitative suggests, we rely on mathematical data or numbers to forecast.

Principal Component Analysis

~ 16 Dec 2020

When someone discovers that you are writing a research paper, one of the many questions asked is “Why are you writing on this particular topic?”. The question seems fairly easy to answer. You are writing a research paper because you are not entirely satisfied with the available reading on that particular topic. The same logic has been applied here. Principal Component Analysis is an important technique to understand in the field of statistics and data science, but the resources available for it were too technical to comprehend. Therefore, we’ll have a look at the What, Why and How of PCA to better grasp the topic.


~ 02 Dec 2020

Autocorrelation: the word itself is suggestive of its very nature, signifying a mathematical representation of the relationship between a time series in its original form and its lagged form.
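
One common estimator of the lag-k autocorrelation divides the sum of products of mean-centred values k steps apart by the sum of squared deviations. The sketch below uses that estimator on a made-up alternating series; the function name and data are assumptions.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a non-constant series at the given lag."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
    return numerator / denominator

# Made-up series that flips sign at every step.
alternating = [1, -1] * 10
lag1 = autocorrelation(alternating, 1)  # strongly negative
lag2 = autocorrelation(alternating, 2)  # strongly positive
```

A series that flips sign every step is negatively correlated with itself one step back and positively correlated two steps back.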

Random Forest

~ 27 Oct 2020

So, you must have read one of our last blogs on decision trees in ML; here in this blog, we will talk about a much wider concept in terms of application as well as scope. Random Forest is often considered the panacea for all data science problems. On a lighter note, when one can’t think of any algorithm, use random forest (irrespective of the situation)!

Decoding APIs?

~ 08 Sep 2020

Let’s decode one of the fanciest terms in data science, the API.

Data Visualization in R using ggplot2 (PART 2)

~ 14 Jul 2020

In the previous part of this series of blogs, we covered the basics of visualization through ggplot2. But visualization is not always simple and easy; there are many more dimensions to it. Read along as we explore some of them in today’s blog.

Data Visualization in R using ggplot2 (PART 1)

~ 16 Jun 2020

How difficult is it to look at loads of data in front of you and not be able to summarize it? Insightful visualization is an important aspect of Data Analytics. But don’t worry, that’s what we are here for: to show you one of the best Data Visualization tools in R, which will make your life easier. By now, you must have guessed the answer yourself. Yes, ggplot2 it is. It is based on the Grammar of Graphics, which builds plots layer by layer, with each layer refining the plot in its own way. Let’s have a look at these layers before proceeding further.

Accounting Analytics

~ 03 Jun 2020

One of the buzzwords in business schools these days is data analytics or, in an accounting school, accounting analytics.

Modelling the 4P Mix

~ 30 May 2020

How far should I hike the price of my product? Is my distribution strategy working as it should? Should I spend more on traditional media or social media? Am I getting the right ROI for my marketing expenses?

K-Means Clustering

~ 20 May 2020

In our real lives we often face problems while choosing the perfect group for us. We look for the attributes and features among group members that suit us the most and then decide whether to be a part of the group or not. This process is often time-consuming and may lead to inappropriate results. But in programming languages like R, a simple few-line script forms the best clusters for you in, say, less than a minute. This is another feature that makes machines more efficient and unbiased in many areas, including classification and clustering. So, let us start our tour of applications of machine learning with a simple but useful algorithm: K-means clustering.

Python Basics- Get them Intact

~ 13 May 2020

Only two days matter! The day you said ‘hello’ to the world and the day you printed ‘hello world’ in Python. But we’re sure you wanted to say more than just a ‘hello’ to the world and wanted to print more than just ‘hello world’.


~ 06 May 2020

Simulation is basically a process of designing a model and conducting simultaneous experiments within the model to understand the behaviour of the system.

Supply Chain Analytics

~ 30 Apr 2020

Wondering how your nearest grocery store is still full of your favourite black bourbon, despite the ongoing lockdown? Not to worry, we are here to clear all your doubts. The answer is Supply Chain Management, a simple concept with dynamic applications.


~ 24 Apr 2020

Every human being needs a sense of freedom, some sort of independence in their lives. But did you know that the same is the case with statistics? Many measures in statistics are defined by their degrees of freedom. Let’s find out what this term means.

Decision Trees in Machine Learning

~ 18 Apr 2020

So, in the last blog, we introduced you to the world of Machine Learning; let’s dive into a very important aspect of it, that is, ‘Decision Trees’. A tree has many analogies in real life, and it turns out that it has influenced a wide area of machine learning.


~ 13 Apr 2020

Machine learning, as the name suggests, means “educating” a machine to behave, adapt and perform on its own. Machine learning focuses on building a computer program that can learn and adapt to new data without human interference.

Marketing Analytics

~ 06 Apr 2020

Marketing analytics is the practice of measuring, managing and analyzing marketing performance to maximize its effectiveness and optimize the return on investment (ROI).

Setting up the environment for Python

~ 30 Mar 2020

When you get sick of theory, it’s time to get some hands-on experience with practicals that excite.

Diving Into Data- Statistical Analysis and Reproducibility of Research

~ 23 Mar 2020


Decoding Analytics Buzzwords

~ 16 Mar 2020

Before going into detail, we thought it best to become familiar with the jargon commonly used in the analytics world.