In this section, we build a gradient boosted trees model that detects exoplanets using the Kepler dataset. Follow these steps in the Jupyter Notebook to build and train the exoplanet finder model:
- We will save the names of all the feature columns in a Python list with the following code:
numeric_column_headers = x_train.columns.values.tolist()
- We will then bucketize each feature column into two buckets, split at that column's mean in the training data, since the TFBT estimator only accepts bucketized features. We do this with the following code:
bc_fn = tf.feature_column.bucketized_column
nc_fn = tf.feature_column.numeric_column
bucketized_features = [
    bc_fn(source_column=nc_fn(key=column),
          boundaries=[x_train[column].mean()])
    for column in numeric_column_headers
]
- Since we only have numeric bucketized features and no other kinds of features, we store them in the all_features variable with the following code:
all_features = bucketized_features
- We will then define the batch size and create a function...
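To make the bucketizing step above more concrete, the following is a minimal sketch in plain Python of what splitting a column into two buckets around its mean does to the values. It does not use TensorFlow. The toy_train dictionary and the bucketize helper are hypothetical names introduced only for illustration, and the sketch assumes the left-inclusive bucket convention that the TensorFlow documentation describes for bucketized_column (values below the boundary fall into bucket 0, values at or above it into bucket 1).

```python
from bisect import bisect_right
from statistics import mean

# Hypothetical toy data standing in for two columns of x_train.
toy_train = {
    "flux_1": [1.0, 2.0, 3.0, 6.0],
    "flux_2": [-4.0, 5.0, 0.0, 7.0],
}


def bucketize(values, boundaries):
    """Return the bucket index of each value (illustrative helper).

    Buckets are left-inclusive: with boundaries [b], values < b get
    index 0 and values >= b get index 1.
    """
    return [bisect_right(boundaries, v) for v in values]


# One boundary per column, placed at that column's mean, gives two buckets.
bucketized = {
    name: bucketize(values, [mean(values)])
    for name, values in toy_train.items()
}

print(bucketized)
```

With a single boundary, every feature is reduced to a binary indicator of whether the value lies below its column mean, so this is a coarse representation of the original values.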