Correlation Vs Causation

The subtle difference between correlation and causation is very important for budding data analysts. Often we get so excited about the patterns in the data that we forget to evaluate whether a pattern is a mere correlation or has a definite cause. It is very easy to get carried away with the idea of […]


Your Ultimate Guide for PCA with DataPandit

Principal component analysis (PCA) is an unsupervised dimensionality-reduction method that is often used to reveal groupings in data. However, the PCA method in DataPandit cannot be called an unsupervised data analysis technique, as the user interface is designed to make it semi-supervised. Therefore, let’s look at how to perform and analyze PCA in DataPandit with the help of the Iris dataset. Arranging the data […]

Linear Regression with Examples

Introduction to linear regression: Whenever you come across a few variables that seem to depend on each other, you might want to explore the linear regression relationship between them. A linear regression relationship can help you assess: the strength of the relationship between the variables; the possibility of using predictive analytics to measure future outcomes […]

Linear Regression Assumptions

Top 7 Linear Regression Assumptions You Must Know

The theory of linear regression is based on certain statistical assumptions. It is crucial to check these regression assumptions before modeling the data using the linear regression approach. In this blog post, we describe the top 7 assumptions that you should check in DataPandit before analyzing your data using linear regression. Let’s take a look […]

Data Visualization

Data Visualization using Box-Plot

Data visualization is the first step in data analysis. DataPandit allows you to visualize box plots as soon as you segregate categorical data from the numerical data. However, the box plot does not appear until you uncheck the ‘Is this spectroscopic data?’ option in the sidebar layout, as shown in Figure 1. The box plot is also […]

Correlation Matrix

How to use the Correlation Matrix?

The correlation matrix in DataPandit shows the relationship of each variable in the dataset with every other variable in the dataset. It is basically a heatmap of Pearson correlation values between corresponding variables. For example, in the correlation matrix above, the first element on the X-axis is high_blood_pressure, while that on the Y-axis is high_blood_pressure too. […]

Pearson's correlation Matrix

What is Pearson’s Correlation Co-efficient?

Introduction: Pearson’s correlation is a statistical measure of the linear relationship between two variables. Mathematically, it is the ratio of the covariance of the two variables to the product of their standard deviations. Therefore the formula for Pearson’s correlation can be written as follows: The result for Pearson’s correlation always varies between -1 and +1. […]
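
The definition in the excerpt above (covariance divided by the product of the standard deviations) can be illustrated with a minimal Python sketch. The function name pearson_r and the sample numbers are hypothetical and chosen only for illustration; they do not come from the original post or from DataPandit.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation: covariance of x and y divided by the
    product of their standard deviations.

    Hypothetical helper for illustration only. The 1/(n-1) factors in
    the covariance and the standard deviations cancel, so plain sums of
    deviations are used.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Made-up sample data: y rises exactly in step with x, z falls exactly.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
z = [8, 6, 4, 2]
r_up = pearson_r(x, y)    # perfect positive linear relationship
r_down = pearson_r(x, z)  # perfect negative linear relationship
```

A perfectly increasing linear relationship gives a value at the top of the range and a perfectly decreasing one gives a value at the bottom, which matches the -1 to +1 bounds stated above.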

Finding the Data Analytics Method that Works for You

Last week I met John, a process expert who works at a renowned cosmetic manufacturing company. John was pretty frustrated over a data scientist who could not give him a plot using the data analytics technique of his choice. He was interested in showing grouping patterns in his data using PCA plots. When I got […]

What is Data Analytics as a Service?

Introduction: Data Analytics is very diverse in the solutions it offers. It covers a range of activities that add value to businesses. It has secured a foothold in nearly every industry, eventually carving out a niche for itself known as Data Analytics as a Service (DAaaS). DAaaS is an operating model platform where a service […]

Internet Of Things-Few Insightful Facts

Introduction: The internet has revolutionized our modern society. It has simplified everything that we do. It has brought all the good things of the world to our fingertips. There has been a wave of internet transformation lately. The traditional internet has evolved into the Internet of Things (IoT) through the convergence of diverse technologies. This evolution […]