data-driven culture

How to Create a Data-Driven Culture in Your Organization

Introduction: A data-driven structure shapes the conditions, influences, and dynamics that affect an organization’s functioning, performance, and overall success. Therefore, many organizations are working towards implementing a data-driven culture in their day-to-day operations. In any organization, the basic functional and operative environment shapes the context in which the organization works. It refers to the specific conditions, factors, and […]

Correlation Vs Causation

Correlation Vs Causation

The subtle difference between correlation and causation is very important for budding data analysts. Often we get so excited by the patterns in the data that we forget to evaluate whether they reflect a mere correlation or a definite cause. It is very easy to get carried away with the idea of […]

PCA

Your Ultimate Guide for PCA with DataPandit

Principal component analysis (PCA) is an unsupervised dimensionality-reduction method. However, the PCA method in DataPandit cannot be called an unsupervised data analysis technique, as the user interface is designed to make it semi-supervised. Therefore, let’s look at how to perform and analyze PCA in DataPandit with the help of the Iris dataset. Arranging the data […]

Linear regression with examples

Linear Regression with Examples

Introduction to linear regression: Whenever you come across a few variables that seem to depend on each other, you might want to explore the linear regression relationship between them. A linear regression relationship can help you assess: the strength of the relationship between the variables; the possibility of using predictive analytics to estimate future outcomes […]

Linear Regression Assumptions

Top 7 Linear Regression Assumptions You Must Know

The theory of linear regression is based on certain statistical assumptions. It is crucial to check these regression assumptions before modeling the data using the linear regression approach. In this blog post, we describe the top 7 assumptions you should check in DataPandit before analyzing your data using linear regression. Let’s take a look […]

Data Visualization

Data Visualization using Box-Plot

Data visualization is the first step in data analysis. DataPandit allows you to visualize box plots as soon as you segregate categorical data from numerical data. However, the box plot does not appear until you uncheck the ‘Is this spectroscopic data?’ option in the sidebar layout, as shown in Figure 1. The box plot is also known […]

Correlation Matrix

How to use the Correlation Matrix?

The correlation matrix in DataPandit shows the relationship of each variable in the dataset with every other variable in the dataset. It is basically a heatmap of Pearson correlation values between corresponding variables. For example, in the correlation matrix above, the first element on the X-axis is high_blood_pressure, while that on the Y-axis is high_blood_pressure too. […]

Pearson's correlation Matrix

What is Pearson’s Correlation Co-efficient?

Introduction: Pearson’s correlation is a statistical measure of the linear relationship between two variables. Mathematically, it is the ratio of the covariance of the two variables to the product of their standard deviations. Therefore, the formula for Pearson’s correlation can be written as follows: r = cov(X, Y) / (σ_X · σ_Y). The result for Pearson’s correlation always lies between -1 and +1. […]