Time Series with PyTorch
This chapter has covered optimization in general, but applying it to different neural architectures involves some nuance. For example, dropout can be applied to the nodes of any layer in an FFN/MLP, whereas with LSTMs we typically apply it only when there is more than one layer. You will deepen your understanding by experimenting with network designs and running your own experiments. Of course, those who just want to forecast can use Darts or Nixtla’s libraries, which make the task easier but come with fixed design choices.
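The dropout contrast between an MLP and an LSTM can be sketched as follows. This is a minimal illustration, not a recommended design: the class names `MLPForecaster` and `LSTMForecaster`, the layer sizes, and the dropout rate of 0.2 are all assumptions made for the example. It relies on the fact that PyTorch's `nn.LSTM` applies its `dropout` argument to the outputs of every stacked layer except the last, so the argument has no effect on a single-layer LSTM.

```python
import torch
from torch import nn


class MLPForecaster(nn.Module):
    """Hypothetical MLP: dropout can follow any hidden layer."""

    def __init__(self, n_lags, hidden, horizon, p_drop=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_lags, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),  # dropout after the first hidden layer
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),  # and again after the second
            nn.Linear(hidden, horizon),
        )

    def forward(self, x):
        # x: (batch, n_lags)
        return self.net(x)


class LSTMForecaster(nn.Module):
    """Hypothetical LSTM: dropout is only enabled for stacked layers."""

    def __init__(self, n_features, hidden, horizon, num_layers=2, p_drop=0.2):
        super().__init__()
        # nn.LSTM applies dropout between stacked layers only, so we
        # switch it off explicitly when there is a single layer.
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden,
            num_layers=num_layers,
            dropout=p_drop if num_layers > 1 else 0.0,
            batch_first=True,
        )
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        # Forecast from the hidden state at the final time step.
        return self.head(out[:, -1, :])


# Usage with random data: 8 series, 24 lags, 3-step forecast horizon.
mlp = MLPForecaster(n_lags=24, hidden=32, horizon=3)
lstm = LSTMForecaster(n_features=1, hidden=32, horizon=3, num_layers=2)

mlp_out = mlp(torch.randn(8, 24))
lstm_out = lstm(torch.randn(8, 24, 1))
```

Both models return a tensor of shape `(batch, horizon)`. Remember to call `model.eval()` before validation or inference so that dropout is disabled.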
For those building neural architectures from scratch, we would suggest you optimize in the following order:
This ordering follows the chain of dependency: architectural choices (activation functions, layer counts, node counts) constrain which optimizers make sense, and both architecture...