Overview: Generally speaking, Variational Inference (VI) is a statistical inference method that, when the unknown density function is intractable, looks for the best tractable distribution to replace it. In this way, a typical inference problem is converted into a typical optimization problem. The best tractable distribution we look for is the … Continue reading “The simplest way to understand Variational Inference”

# Author Archives: frank xu

## Data scientist essential tools setup(single machine)

As a data scientist, you should have a toolbox of essential tools for data analysis and model development. Due to the popularity of Python in data science, an increasing number of analysts choose Python-related software as their primary analytics tools. In this article, I introduce a popular toolbox for basic data analysis and … Continue reading “Data scientist essential tools setup(single machine)”

## My new paper: Anomaly Detection based on LDA, Autoencoder and GMM

My team recently finished a paper on Anomaly Detection. We propose a novel unsupervised Anomaly Detection model (LAG) based on LDA, Autoencoder, and GMM. Our model can be used on both structured and unstructured data and provides a comprehensive solution for various Anomaly Detection tasks in different industries. In particular, we provide a way to … Continue reading “My new paper: Anomaly Detection based on LDA, Autoencoder and GMM”

## Shape Data for RNN by a single function

When constructing an RNN model in TensorFlow, we always come across the task of reshaping the data to meet the input format of the RNN model. This step is essential but a little confusing. To simplify the work of reshaping data for an RNN, I developed an RNN sample generator … Continue reading “Shape Data for RNN by a single function”
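The reshaping step mentioned in this excerpt can be sketched as follows. This is only a minimal illustration and not the generator from the post: the function name `make_rnn_samples`, its parameters, and the choice of the next row as the target are all assumptions. It slices a series into overlapping windows with the `(samples, timesteps, features)` layout that recurrent layers commonly expect.

```python
import numpy as np


def make_rnn_samples(series, timesteps):
    """Slice a series into overlapping windows for a recurrent model.

    Hypothetical helper, not the post's own function.
    Returns X with shape (n_samples, timesteps, n_features) and
    y with shape (n_samples, n_features), where each y row is the
    observation that immediately follows its window.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        # Treat a flat sequence as a single-feature series.
        series = series[:, None]
    n_samples = len(series) - timesteps
    if n_samples <= 0:
        raise ValueError("series must be longer than timesteps")
    # Stack one window per starting position.
    X = np.stack([series[i:i + timesteps] for i in range(n_samples)])
    y = series[timesteps:]
    return X, y


X, y = make_rnn_samples(np.arange(10), timesteps=3)
print(X.shape, y.shape)
```

Predicting the next row is only one possible target; a different task would change how `y` is built, but the three-dimensional shape of `X` stays the same.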

## The simplest way to understand SVM

Recently, many friends have asked me how to understand SVM intuitively. They were confused by the complicated mathematical formulas. Today I will thoroughly explain what SVM is through eight questions. For those who need the mathematical proofs, please refer to the appendix below. Q1: How to understand SVM most intuitively? Ans1: Firstly, take a … Continue reading “The simplest way to understand SVM”

## A Better OrdinalEncoder for Scikit-learn

If you have ever used the encoder classes in Python's scikit-learn package, you probably know LabelEncoder, OrdinalEncoder, and OneHotEncoder. These encoders transform categorical data into numerical data. In this blog, I develop a new ordinal encoder that makes up for the shortcomings of the current OrdinalEncoder in sklearn. Also, it can be used … Continue reading “A Better OrdinalEncoder for Scikit-learn”

## Suspicious customer detection with LDA + Auto-Encoder part1

Suspicious customer detection is one of the applications of Fraud Detection or Anomaly Detection. There are many data-driven approaches in this area. This blog introduces a novel approach as an alternative solution for suspicious customer detection: LDA + Auto-Encoder. Different business units or business lines have different perspectives on detecting anomalous customers. Continue reading “Suspicious customer detection with LDA + Auto-Encoder part1”

## A workflow design with multiple inheritances

Many languages, such as Objective-C, Ruby, and Java, restrict multiple inheritance, because inheriting from multiple implementation-oriented classes is prone to diamond inheritance, that is, two parent classes inheriting from the same base class. The subclass then receives the grandparent's contents through both parents. You have probably always been told not to involve too many superclass … Continue reading “A workflow design with multiple inheritances”
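The diamond shape described in this excerpt can be illustrated with a small Python sketch. The class names (`Step`, `Load`, `Clean`, `Workflow`) are hypothetical and are not taken from the post. Python permits multiple inheritance and resolves the diamond with its method resolution order (MRO), so the shared base class runs only once when each class cooperates through `super()`.

```python
class Step:
    """Shared base class at the top of the diamond."""

    def run(self):
        return ["Step"]


class Load(Step):
    def run(self):
        # super() follows the MRO of the instance, not just Load's parent.
        return ["Load"] + super().run()


class Clean(Step):
    def run(self):
        return ["Clean"] + super().run()


class Workflow(Load, Clean):
    """Inherits from two parents that share the same base class."""

    def run(self):
        return ["Workflow"] + super().run()


order = Workflow().run()
mro_names = [cls.__name__ for cls in Workflow.__mro__]
print(order)      # ['Workflow', 'Load', 'Clean', 'Step']
print(mro_names)  # ['Workflow', 'Load', 'Clean', 'Step', 'object']
```

Because `Load.run` calls `super().run()` rather than `Step.run(self)`, the call chain passes through `Clean` before reaching `Step`, and `Step.run` is invoked exactly once.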