Training Machine Learning Algorithms for Classification

Neural Network Binary Classification

  1. Initialize the weights to zero or to small random numbers
  2. For each training sample x⁽ⁱ⁾, perform the following steps:
    1. Compute the predicted output value ŷ⁽ⁱ⁾
    2. Update the weights

Weight update

wⱼ := wⱼ + Δwⱼ

and

Δwⱼ = η ( y⁽ⁱ⁾ − ŷ⁽ⁱ⁾ ) xⱼ⁽ⁱ⁾

where η is the learning rate (a constant between 0.0 and 1.0), y⁽ⁱ⁾ is the true class label of the i-th training sample, and ŷ⁽ⁱ⁾ is the predicted class label.

Convergence of the perceptron is guaranteed only if the two classes are linearly separable and the learning rate is sufficiently small. Otherwise, set a maximum number of passes over the training dataset (epochs) or a threshold for the number of tolerated misclassifications.
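As a concrete sketch of this procedure, here is a minimal NumPy perceptron; the class and parameter names (Perceptron, eta, n_iter) are illustrative choices, not taken from the text or from any particular library:

```python
import numpy as np

class Perceptron:
    """Minimal perceptron classifier (illustrative sketch)."""

    def __init__(self, eta=0.01, n_iter=10):
        self.eta = eta        # learning rate between 0.0 and 1.0
        self.n_iter = n_iter  # max passes (epochs) over the training set

    def fit(self, X, y):
        # Initialize weights (plus a bias term at index 0) to zeros
        self.w_ = np.zeros(1 + X.shape[1])
        self.errors_ = []
        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                # Delta-w_j = eta * (y - y_hat) * x_j
                update = self.eta * (target - self.predict(xi))
                self.w_[1:] += update * xi
                self.w_[0] += update
                errors += int(update != 0.0)
            self.errors_.append(errors)  # misclassifications per epoch
        return self

    def net_input(self, X):
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        # Unit step function: class 1 if net input >= 0, else class -1
        return np.where(self.net_input(X) >= 0.0, 1, -1)
```

On a linearly separable dataset, errors_ drops to zero within a few epochs; otherwise n_iter acts as the cap on passes mentioned above.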

Adaptive Linear Neurons (Adaline)

One of the key ingredients of a supervised machine learning algorithm is a well-defined objective function to be optimized during the learning process. Often this is a cost function to be minimized. In Adaline, the cost function is the sum of squared errors (SSE) between the computed outputs and the true class labels.

Also, the weights are updated based on a linear activation function rather than the unit step function used in the perceptron.

Let the cost function be denoted J(w). Because the linear activation makes J(w) differentiable and (for SSE) convex, differentiating J and following the negative gradient leads to the global minimum.
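Spelled out (standard gradient-descent algebra, with φ(z⁽ⁱ⁾) = wᵀx⁽ⁱ⁾ as Adaline's linear activation):

J(w) = ½ Σᵢ ( y⁽ⁱ⁾ − φ(z⁽ⁱ⁾) )²

∂J/∂wⱼ = −Σᵢ ( y⁽ⁱ⁾ − φ(z⁽ⁱ⁾) ) xⱼ⁽ⁱ⁾

Each gradient-descent step therefore moves against the gradient: Δwⱼ = −η ∂J/∂wⱼ = η Σᵢ ( y⁽ⁱ⁾ − φ(z⁽ⁱ⁾) ) xⱼ⁽ⁱ⁾.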

The weight update is calculated from all samples in the training set, instead of incrementally after each sample, hence this approach is called batch gradient descent.
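A minimal sketch of Adaline trained with batch gradient descent, in the same illustrative NumPy style as the perceptron above (AdalineGD and its parameters are assumed names):

```python
import numpy as np

class AdalineGD:
    """Adaline with batch gradient descent (illustrative sketch)."""

    def __init__(self, eta=0.01, n_iter=50):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        self.w_ = np.zeros(1 + X.shape[1])
        self.cost_ = []
        for _ in range(self.n_iter):
            output = self.net_input(X)   # linear activation phi(z) = w^T x
            errors = y - output          # (y - phi(z)) for every sample
            # One weight update computed from the full training set (batch)
            self.w_[1:] += self.eta * X.T.dot(errors)
            self.w_[0] += self.eta * errors.sum()
            self.cost_.append((errors ** 2).sum() / 2.0)  # SSE cost J(w)
        return self

    def net_input(self, X):
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        return np.where(self.net_input(X) >= 0.0, 1, -1)
```

Note that fit performs exactly one weight update per epoch, using the gradient summed over all samples, which is what distinguishes the batch variant.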

Stochastic gradient descent

Stochastic gradient descent instead updates the weights incrementally, after each training sample. It typically reaches convergence faster because of the more frequent weight updates.
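For contrast, a sketch of the stochastic variant: one weight update per sample rather than one per epoch, with the training data shuffled each epoch (a common practice to avoid cycles). The names are again illustrative assumptions:

```python
import numpy as np

class AdalineSGD:
    """Adaline with stochastic gradient descent (illustrative sketch)."""

    def __init__(self, eta=0.01, n_iter=15, seed=1):
        self.eta = eta
        self.n_iter = n_iter
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        self.w_ = np.zeros(1 + X.shape[1])
        for _ in range(self.n_iter):
            # Shuffle so updates are not biased by sample order
            idx = self.rng.permutation(len(y))
            for xi, target in zip(X[idx], y[idx]):
                error = target - self.net_input(xi)  # (y - phi(z)) for one sample
                # Immediate weight update after each sample
                self.w_[1:] += self.eta * error * xi
                self.w_[0] += self.eta * error
        return self

    def net_input(self, X):
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        return np.where(self.net_input(X) >= 0.0, 1, -1)
```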


