```python
import os

def load_data():
    """Load the raw text data from disk."""
    # TEXT_SAVE_DIR is the path constant defined earlier in the chapter
    input_file = os.path.join(TEXT_SAVE_DIR)
    with open(input_file, "r") as f:
        data = f.read()
    return data
```
- Implement `define_tokens`, as defined in the *Pre-processing the data* section of this chapter. This will help us create a dictionary of the keywords and their corresponding tokens:
```python
def define_tokens():
    """
    Generate a dict to turn punctuation into a token.
    Note that Sym at the start of each token denotes Symbol.
    :return: Tokenize dictionary where the key is the punctuation
             and the value is the token
    """
    # Named token_dict to avoid shadowing the built-in dict
    token_dict = {'.': '_Sym_Period_',
                  ',': '_Sym_Comma_',
                  '"': '_Sym_Quote_',
                  ';': '_Sym_Semicolon_',
                  '!': '_Sym_Exclamation_',
                  '?': '_Sym_Question_',
                  '(': '_Sym_Left_Parentheses_',
                  ')': '_Sym_Right_Parentheses_',
                  '--': '_Sym_Dash_',
                  '\n': '_Sym_Return_'}
    return token_dict
```
The dictionary that we've created will be used to replace each punctuation mark in the raw text with its corresponding token, so that every punctuation mark is treated as a distinct word during training.
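The replacement step can be sketched as follows. This is an illustration, not the chapter's exact code; the helper name `tokenize_punctuation` is hypothetical, and the padding spaces ensure a plain `str.split()` later separates tokens from adjacent words:

```python
def tokenize_punctuation(text, token_dict):
    """Replace each punctuation mark with its token, padded with spaces."""
    for punctuation, token in token_dict.items():
        # Surrounding spaces let str.split() treat the token as its own word
        text = text.replace(punctuation, ' {} '.format(token))
    return text

# Small subset of the dictionary returned by define_tokens()
token_dict = {',': '_Sym_Comma_', '!': '_Sym_Exclamation_',
              '?': '_Sym_Question_', '\n': '_Sym_Return_'}

sample = 'Hello, world!\nHow are you?'
words = tokenize_punctuation(sample, token_dict).split()
# words is now a list in which each punctuation mark appears
# as a separate '_Sym_..._' entry
```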