Logistic Regression Classifier

Logistic regression is a generalized linear model, and its form is essentially the same as that of linear regression: for an input $\pmb{x}$, the linear combination $\pmb{w}^T \pmb{x} + b$ is passed through the logistic function, which maps its value into the interval $(0, 1)$. Logistic regression therefore tries to learn a function that makes predictions from a linear combination of the attributes:

$$f(\pmb{x}) = \pmb{w}^T \pmb{x} + b$$

Logistic regression tries to learn a suitable weight vector $\pmb{w}$ and real number $b$ such that, for the label vector $\pmb{y}$, $f(\pmb{x}) \approx \pmb{y}$.
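Combining the linear part with the logistic mapping gives the usual probabilistic reading of the classifier (standard material, spelled out here for completeness):

$$P(y = 1 \mid \pmb{x}) = \frac{1}{1 + e^{-(\pmb{w}^T \pmb{x} + b)}}$$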

import numpy as np
import matplotlib.pyplot as plt

Loading the training data set

def load_data_set():
    # Read the data set: each line holds two feature values and a 0/1 label
    data_matrix = []
    label_matrix = []
    with open('testSet.txt', 'r') as file:
        for line in file.readlines():
            data = line.strip().split()
            # Prepend a constant 1.0 so the bias b is folded into the weights
            data_matrix.append([1.0, float(data[0]), float(data[1])])
            label_matrix.append(int(data[2]))
    return data_matrix, label_matrix
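For reference, each line of testSet.txt (the data file that accompanies Machine Learning in Action) is assumed to look like the following, with two feature values followed by the class label; for example, the sample point reused later in this post would appear as:

-0.017612	14.053064	0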

The sigmoid function

The sigmoid function, also called the logistic function, maps any real number into the interval (0, 1), which makes it suitable for binary classification.

$$S(x) = \frac{1}{1 + e^{-x}}$$

The derivative of the sigmoid function is

$$S'(x) = S(x) \, (1 - S(x))$$
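This follows from differentiating $\frac{1}{1 + e^{-x}}$ directly; the one-line derivation, added here because the post states the result without proof, is

$$S'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = S(x) \, (1 - S(x))$$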

def sigmoid(X):
    return 1.0 / (1 + np.exp(-X))
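Note that for large negative inputs np.exp(-X) can overflow and emit a runtime warning. A numerically stable variant, a sketch of my own rather than part of the original post, splits on the sign of the input:

def stable_sigmoid(X):
    # Hypothetical helper (not in the original post): avoids overflow in np.exp
    X = np.asarray(X, dtype=float)
    out = np.empty_like(X)
    pos = X >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-X[pos]))   # safe when X >= 0
    exp_x = np.exp(X[~pos])                    # safe when X < 0
    out[~pos] = exp_x / (1.0 + exp_x)          # e^x / (1 + e^x) == sigmoid(x)
    return out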
data_matrix, label_matrix = load_data_set()
data_matrix = np.mat(data_matrix)                 # m x 3 matrix of inputs
label_matrix = np.mat(label_matrix).transpose()   # m x 1 column of labels
m, n = data_matrix.shape
m, n
(100, 3)
alpha = 0.001     # learning rate: step size of each gradient-ascent update
max_cycles = 500  # number of iterations
weights = np.ones((n, 1))   # start from all-ones weights
weights
array([[1.],
       [1.],
       [1.]])
h = sigmoid(data_matrix * weights)   # predicted probabilities, one per sample
h
matrix([[0.9999997 ],
        [0.98616889],
        [0.99887232],
        ...
        [0.99999196],
        [0.99999989]])

Gradient ascent

Gradient ascent is used to find the maximum of a function: a function increases fastest along the direction of its gradient.

For a function $y = f(x)$, the derivative, written $f'(x)$ or $\frac{dy}{dx}$, gives the slope of $f(x)$ at the point $x$: it specifies how to scale a small change in the input to obtain the corresponding change in the output:

$$f(x + \epsilon) \approx f(x) + \epsilon \frac{dy}{dx}$$

For a function with multidimensional input we need partial derivatives: the partial derivative $\frac{\partial f(\pmb{x})}{\partial x_i}$ measures how $f(\pmb{x})$ changes at the point $\pmb{x}$ when only $x_i$ increases. The gradient is the derivative with respect to a vector: the gradient of $f$ is the vector containing all of its partial derivatives.

The gradient vector points uphill, so moving in the direction of the gradient increases $f$; this is called the method of steepest ascent, or the gradient ascent algorithm.

Gradient ascent proposes the new point

$$\pmb{x}' = \pmb{x} + \epsilon \nabla_{\pmb{x}} f(\pmb{x})$$

Here $\epsilon$ is the learning rate: a positive scalar that determines the size of the step, usually chosen to be a small constant.

The input to the sigmoid function is $z = w_0 x_0 + w_1 x_1 + \dots + w_n x_n$

Through repeated iterations the weight vector $\pmb{w}$ (with $b$ folded in as $w_0$, thanks to the constant feature $x_0 = 1$) is updated until $f(\pmb{x})$ approaches $\pmb{y}$.
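The update in the loop below maximizes the log-likelihood of the data. The original post uses the result without deriving it; the standard derivation gives, with $\pmb{h} = S(X\pmb{w})$ and log-likelihood $\ell(\pmb{w}) = \sum_i \left[ y_i \ln h_i + (1 - y_i) \ln (1 - h_i) \right]$,

$$\nabla_{\pmb{w}} \ell(\pmb{w}) = X^T (\pmb{y} - \pmb{h})$$

which is exactly the alpha * data_matrix.transpose() * error step in the code.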

for i in range(max_cycles):
    h = sigmoid(data_matrix * weights)                  # predictions, m x 1
    error = label_matrix - h                            # y - h
    weights += alpha * data_matrix.transpose() * error  # step along the gradient
weights
array([[ 4.12414349],
       [ 0.48007329],
       [-0.6168482 ]])
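Machine Learning in Action also describes a stochastic variant that updates the weights from one sample at a time, which avoids the full matrix multiplication per step and scales better to large data sets. A minimal sketch along those lines (the name stoc_grad_ascent and the details here are mine, not code from the original post):

def stoc_grad_ascent(data, labels, alpha=0.01):
    # Stochastic gradient ascent: one (sample, label) pair per update
    data = np.array(data)
    m, n = data.shape
    weights = np.ones(n)
    for i in range(m):
        h = sigmoid(np.dot(data[i], weights))   # scalar prediction for sample i
        error = labels[i] - h
        weights += alpha * error * data[i]      # update using this sample only
    return weights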
# Weighted sum w . x for the sample point (-0.017612, 14.053064); the original
# cell multiplied element-wise with broadcasting, producing a 3 x 3 array,
# where a dot product was intended
float(np.dot(weights.ravel(), [1.0, -0.017612, 14.053064]))
-4.55291875

Since sigmoid(-4.55) ≈ 0.01 < 0.5, this point is classified as 0.
def test(x):
    # Classify as positive when the predicted probability exceeds 0.5
    return sigmoid(x * weights)[0, 0] > 0.5

Testing the model's accuracy

accuracy = 0.0
for x, y in zip(data_matrix, label_matrix):
    # Count a correct prediction for either class
    if test(x) == True and y == 1:
        accuracy += 1
    elif test(x) == False and y == 0:
        accuracy += 1
accuracy / len(label_matrix)
0.96
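Since matplotlib.pyplot was imported at the top but never used, it is natural to close by visualizing the result, similar to the book's plotting code. The decision boundary is the line where $w_0 + w_1 x_1 + w_2 x_2 = 0$, i.e. $x_2 = -(w_0 + w_1 x_1) / w_2$. A sketch of my own (plot_best_fit is a hypothetical helper, and the x-axis range is a guess at the data's spread):

def plot_best_fit(weights):
    # Scatter the two classes, then draw the line where sigmoid(w . x) = 0.5
    data, labels = load_data_set()
    data = np.array(data)
    labels = np.array(labels)
    w = np.asarray(weights).ravel()
    pos = labels == 1
    plt.scatter(data[pos, 1], data[pos, 2], marker='s', label='class 1')
    plt.scatter(data[~pos, 1], data[~pos, 2], marker='o', label='class 0')
    x1 = np.arange(-3.0, 3.0, 0.1)    # assumed feature range
    x2 = (-w[0] - w[1] * x1) / w[2]   # decision boundary
    plt.plot(x1, x2)
    plt.xlabel('x1')
    plt.ylabel('x2')
    plt.legend()
    plt.show()

plot_best_fit(weights)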

References

Zhou Zhihua. Machine Learning [M]. Tsinghua University Press, 2016.
Ian Goodfellow, Yoshua Bengio, Aaron Courville. Deep Learning [M]. Posts & Telecom Press, 2017.
Peter Harrington (translated by Li Rui). Machine Learning in Action [M]. Posts & Telecom Press, 2013.

Finally

  • The author's knowledge is limited and omissions are inevitable; readers are welcome to point out mistakes at any time, so as to avoid unnecessary misunderstanding!