Comprehensive Data Exploration Process with One-Click

EDA overview (image by author)

Exploratory Data Analysis, also known as EDA, has become an increasingly hot topic in data science. Just as the name suggests, it is the process of trial and error in an uncertain space, with the goal of finding insights. It usually happens at the early stage of the data science…

Line chart, bar chart, pie chart … they tell different stories

Chart Type Summary Mindmap (image by author)

In this information-rich age, data visualizations are designed to make knowledge transfer between deliverers and receivers easier. Therefore, it is crucial for dashboard creators to know which chart aligns with the key delivery objectives. On the other hand, having a basic understanding of the underlying meaning…

A Step by Step Guide to K-Means Clustering

clustering analysis infographic (image by author from website)

What is Clustering Algorithm?

In a business context: a clustering algorithm is a technique that assists customer segmentation, the process of classifying similar customers into the same segment. Clustering algorithms help to better understand customers, in terms of both static demographics and dynamic behaviors. …

How to Use Data Visualization to Guide Feature Selection

Feature Selection and EDA Cheatsheet (image by author, from website)

In the machine learning lifecycle, feature selection is a critical process that selects the subset of input features relevant to the prediction. Including irrelevant variables, especially those with poor data quality, can often contaminate the model output.

Additionally, feature selection has the following advantages:

1) avoid the curse of…

Step-by-Step Guide from Data Preprocessing to Model Evaluation

logistic regression Python cheatsheet (image by author)

What is Logistic Regression?

Don’t let the name logistic regression trick you: it usually falls under the category of classification algorithms rather than regression algorithms.

Then, what is a classification model? Simply put, the prediction generated by a classification model is a categorical value, e.g. …

Machine Learning and Predictive Modelling in BigQuery

How to Build ML Model using BigQuery — image by author

While taking the first step into the field of machine learning, it is easy to get overwhelmed by all kinds of complex algorithms and ugly symbols. Hopefully, this article can lower the entry barrier by providing a beginner-friendly guide that allows you to get a sense of achievement by…

