Random forest regression practice 1

노는게제일좋아! 2019. 8. 19. 20:17

[Source]

[Dataset]

 

 

1. Importing the libraries and the dataset

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df=pd.read_csv('./input/Position_Salaries.csv')

 

2. Assigning the input and output values

X=df.iloc[:,1].values
y=df.iloc[:,2].values
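
Note that `df.iloc[:, 1].values` returns a 1-D array, while scikit-learn estimators expect the inputs as a 2-D array of shape (n_samples, n_features). The sketch below shows the shapes involved. The small DataFrame is a hypothetical stand-in for the CSV file; its column names and values are assumptions made only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Position_Salaries.csv (names and values are made up).
df = pd.DataFrame({
    "Position": ["Analyst", "Consultant", "Manager", "Partner"],
    "Level": [1, 2, 3, 4],
    "Salary": [45000, 50000, 60000, 80000],
})

X = df.iloc[:, 1].values        # second column as a 1-D array, shape (4,)
y = df.iloc[:, 2].values        # third column as a 1-D array, shape (4,)

X_2d = X.reshape(-1, 1)         # one column, as many rows as needed: shape (4, 1)
X_alt = df.iloc[:, 1:2].values  # slicing with 1:2 keeps the result 2-D

print(X.shape, X_2d.shape, X_alt.shape)
```

Either `reshape(-1, 1)` or the `1:2` slice works; the slice simply avoids the later reshape.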

 

 

3. Fitting Random Forest Regression to the dataset

from sklearn.ensemble import RandomForestRegressor
regressor = RandomForestRegressor(n_estimators = 10, random_state = 0)
# X must be 2-D (n_samples, n_features); y stays 1-D, which avoids a DataConversionWarning
regressor.fit(X.reshape(-1, 1), y)

n_estimators: the number of models, that is, the number of trees in the forest.

random_state: If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random. (reference)
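
To see what the two parameters do in practice, here is a minimal sketch on synthetic data (the data is made up for this illustration and is not the salary dataset). It checks that the fitted forest holds `n_estimators` trees and that two forests built with the same integer `random_state` give identical predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, made up for this illustration only.
X_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = X_demo.ravel() ** 2

forest_a = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)
forest_b = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# The fitted trees are stored in the estimators_ attribute.
n_trees = len(forest_a.estimators_)

# Same seed, same data: the two forests should predict the same values.
same_seed_matches = np.allclose(forest_a.predict(X_demo), forest_b.predict(X_demo))

print(n_trees, same_seed_matches)
```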

 

 

 

4. Predicting a new result

y_pred = regressor.predict([[6.5]])
y_pred

Output: array([167000.])
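
The predicted value is the average of the predictions of the individual trees. The sketch below checks this on synthetic data (made up for this illustration, so the numbers differ from the salary example).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, made up for this illustration only.
X_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = X_demo.ravel() ** 2

forest = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

x_new = np.array([[6.5]])

# Ask every tree in the forest for its own prediction.
tree_preds = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])

# The forest prediction is the mean of the per-tree predictions.
forest_pred = forest.predict(x_new)[0]
matches_mean = np.isclose(tree_preds.mean(), forest_pred)
print(matches_mean)
```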

 

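* What does [[6.5]] mean? predict expects a 2-D array of shape (n_samples, n_features), so [[6.5]] is a single sample whose only feature (the position level) is 6.5.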
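
A short sketch of the input shape that `predict` expects, again on synthetic data made up for this illustration: the outer list holds the samples and each inner list holds the features of one sample. Passing a flat list raises a `ValueError`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, made up for this illustration only.
X_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = X_demo.ravel() ** 2
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

single = np.array([[6.5]])              # 1 sample, 1 feature
batch = np.array([[6.5], [7.5], [8.5]]) # 3 samples, 1 feature each

single_pred = model.predict(single)     # one prediction per sample: shape (1,)
batch_pred = model.predict(batch)       # shape (3,)

# A 1-D input is rejected because scikit-learn cannot tell samples from features.
try:
    model.predict([6.5])
    rejected_1d = False
except ValueError:
    rejected_1d = True

print(single.shape, single_pred.shape, batch_pred.shape, rejected_1d)
```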

 

 

 

5. Visualising the Random Forest Regression results (higher resolution)

X_grid = np.arange(min(X), max(X), 0.01)  # values from min(X) up to (but not including) max(X), in steps of 0.01
X_grid = X_grid.reshape((len(X_grid), 1))  # convert to an n x 1 matrix
plt.scatter(X,y, color='red')
plt.plot(X_grid, regressor.predict(X_grid),color='blue')
plt.title('Truth or Bluff (Random Forest Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
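
The fine 0.01 grid matters because a random forest predicts a piecewise-constant (step-shaped) curve: each tree splits the input range into intervals and returns one value per interval. The sketch below, on synthetic data made up for this illustration, counts how few distinct values the forest produces across a dense grid.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data, made up for this illustration only.
X_demo = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_demo = X_demo.ravel() ** 2
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# Dense grid over the input range, reshaped to (n_samples, 1).
grid = np.arange(X_demo.min(), X_demo.max(), 0.01).reshape(-1, 1)
grid_pred = model.predict(grid)

# Many grid points share the same prediction, which is what makes the plot look like steps.
n_points = len(grid)
n_levels = len(np.unique(grid_pred))
print(n_points, n_levels)
```

Plotting `grid_pred` against `grid` with `plt.plot` would show the same staircase shape as the figure above.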

 
