PyTorch (4) Logistic Regression
Next up: logistic regression! Despite having "regression" in its name, logistic regression is a classification algorithm. It can be modeled as a neural network with no hidden layers that uses a sigmoid activation (for binary classification) or a softmax activation (for multiclass classification). I implemented it on the Iris and MNIST (see the Notebook) datasets.
Iris dataset
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
Along with PyTorch, we'll also make use of several scikit-learn functions, so import them as well.
# hyperparameters
input_size = 4
num_classes = 3
num_epochs = 10000
learning_rate = 0.01
The Iris dataset has four features (sepal length, sepal width, petal length, petal width), so I set the number of input units to 4. There are three classes (Setosa, Versicolour, Virginica), so I set the number of output units to 3.
iris = load_iris()
X = iris.data
y = iris.target
print(X.shape)  # (150, 4)
print(y.shape)  # (150, )
Loading the data is easy with scikit-learn's load_iris() function. It returns a dictionary-like object: data gives the feature matrix and target gives the class labels. Note that the class labels are not 1-of-K (one-hot) encoded! PyTorch doesn't need them converted to 1-of-K either; it handles integer class labels as-is.
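For instance, a quick look at the returned object (an illustrative check, not part of the original script) shows the labels are plain integers 0 through 2:

print(np.unique(iris.target))  # [0 1 2] -- integer labels, not one-hot
print(iris.target_names)       # ['setosa' 'versicolor' 'virginica']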
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=5)
print(X_train.shape)  # (100, 4)
print(X_test.shape)   # (50, 4)
print(y_train.shape)  # (100, )
print(y_test.shape)   # (50, )
Split the data into a training set and a validation set. This is easy with scikit-learn's train_test_split.
# standardize the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# print(np.mean(X_train, axis=0))
# [ -2.47274423e-15  3.85247390e-16 -4.26603197e-16 -7.66053887e-17]
# print(np.std(X_train, axis=0))
# [ 1.  1.  1.  1.]
Standardize the data so that each feature has mean 0 and standard deviation 1. For the Iris data, training works even without standardization, but I think it makes training more stable. Printing the statistics shows that each of the four features now has mean 0 and standard deviation 1.
class LogisticRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, num_classes)

    def forward(self, x):
        out = self.linear(x)
        return out
Definition of the logistic regression model. It looks exactly the same as the linear regression model. In fact, multiclass logistic regression passes the output of linear through softmax, but the PyTorch convention seems to be to leave softmax out of the model and return the raw logits. Why? Because torch.nn.CrossEntropyLoss, which computes the loss, already includes the softmax computation.
Wondering why it's designed this way, I looked into it: softmax is only needed for the loss computation during training, not at inference time, so leaving it out of the model is more efficient. At inference time you don't have to bother turning the values into probabilities with softmax; the logits can be compared directly, since softmax preserves their ordering.
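A minimal sketch of why logits are enough for prediction (my own snippet): softmax is monotonic, so the argmax of the logits and the argmax of the probabilities pick the same class.

import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0, 0.5]])
probs = F.softmax(logits, dim=1)     # probabilities that sum to 1
print(torch.argmax(logits, dim=1))   # tensor([0])
print(torch.argmax(probs, dim=1))    # tensor([0]) -- same predicted class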
Keras, by contrast, includes the softmax activation at the end of the model so that the model output is a probability, like this:
# Keras example
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))  # <= here!
So at inference time a plain forward pass already produces probabilities.
With a PyTorch model like the one above, the output of forward is not a probability, so watch out! When you want probabilities, you have to apply nn.functional.softmax() yourself. As also suggested on the PyTorch forums, you could exploit PyTorch's dynamic graph and branch between training and inference:
if self.training:
    # code for training
else:
    # code for inference
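As a minimal sketch of this idea (my own variation, not the model used in the rest of this post), a module could return logits while training and probabilities during evaluation; the self.training flag is toggled by calling model.train() or model.eval().

import torch.nn.functional as F

class LogisticRegressionProb(nn.Module):
    def __init__(self, input_size, num_classes):
        super(LogisticRegressionProb, self).__init__()
        self.linear = nn.Linear(input_size, num_classes)

    def forward(self, x):
        out = self.linear(x)
        if self.training:
            # training: return raw logits, CrossEntropyLoss applies softmax itself
            return out
        else:
            # inference: convert logits to probabilities
            return F.softmax(out, dim=1)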
Next, create the model object and define the loss and optimizer.
model = LogisticRegression(input_size, num_classes)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
To double-check, looking at the source code of nn.CrossEntropyLoss() confirms that softmax is indeed included:
def cross_entropy(input, target, weight=None, size_average=True,
                  ignore_index=-100, reduce=True):
    return nll_loss(log_softmax(input, 1), target, weight, size_average,
                    ignore_index, reduce)
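You can also verify this numerically. A quick check (my own snippet) that CrossEntropyLoss gives the same value as log_softmax followed by NLLLoss:

import torch.nn.functional as F

logits = torch.randn(5, 3)               # batch of 5 samples, 3 classes
targets = torch.tensor([0, 2, 1, 1, 0])  # integer class labels

loss1 = nn.CrossEntropyLoss()(logits, targets)
loss2 = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
print(loss1.item(), loss2.item())        # the two values agree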
Next comes the training loop at last. This is mostly the same as last time.
def train(X_train, y_train):
    inputs = torch.from_numpy(X_train).float()
    targets = torch.from_numpy(y_train).long()

    optimizer.zero_grad()
    outputs = model(inputs)

    loss = criterion(outputs, targets)
    loss.backward()
    optimizer.step()

    return loss.item()


def valid(X_test, y_test):
    inputs = torch.from_numpy(X_test).float()
    targets = torch.from_numpy(y_test).long()

    outputs = model(inputs)
    val_loss = criterion(outputs, targets)

    # compute accuracy
    _, predicted = torch.max(outputs, 1)
    correct = (predicted == targets).sum().item()
    val_acc = float(correct) / targets.size(0)

    return val_loss.item(), val_acc


loss_list = []
val_loss_list = []
val_acc_list = []

for epoch in range(num_epochs):
    # shuffle the training data at every epoch
    perm = np.arange(X_train.shape[0])
    np.random.shuffle(perm)
    X_train = X_train[perm]
    y_train = y_train[perm]

    loss = train(X_train, y_train)
    val_loss, val_acc = valid(X_test, y_test)

    if epoch % 1000 == 0:
        print('epoch %d, loss: %.4f val_loss: %.4f val_acc: %.4f'
              % (epoch, loss, val_loss, val_acc))

    # logging
    loss_list.append(loss)
    val_loss_list.append(val_loss)
    val_acc_list.append(val_acc)
A few points to note:
- Shuffle the data at every epoch.
- In PyTorch the input data must be a FloatTensor and the labels a LongTensor. The Iris features come as float64 (double), so the tensor has to be cast with float(). The labels are int64 (long) to begin with, so a cast wasn't strictly necessary; they are cast with long() just in case (see the dtype check after this list).
- The ground-truth labels passed to criterion = nn.CrossEntropyLoss() don't need to be 1-of-K encoded! They can be passed as integer category labels (0, 1, 2, ...) as-is.
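To make the dtype point concrete, here's a quick illustrative check (not part of the original script):

x = torch.from_numpy(X_train)
print(x.dtype)          # torch.float64 -- nn.Linear weights are float32, hence the cast
print(x.float().dtype)  # torch.float32
t = torch.from_numpy(y_train)
print(t.dtype)          # torch.int64 -- already what CrossEntropyLoss expects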
Finally, plotting the loss and accuracy curves confirms that training proceeds cleanly.
# plot learning curve
plt.figure()
plt.plot(range(num_epochs), loss_list, 'r-', label='train_loss')
plt.plot(range(num_epochs), val_loss_list, 'b-', label='val_loss')
plt.legend()

plt.figure()
plt.plot(range(num_epochs), val_acc_list, 'g-', label='val_acc')
plt.legend()
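One caveat: in a Jupyter notebook the figures show up inline, but when running this as a standalone script you need a final call to display them:

plt.show()  # required when running outside a notebook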