#### Overview of this book

Welcome to the Robot World … and start building intelligent software now! Through his best-selling video courses, Hadelin de Ponteves has taught hundreds of thousands of people to write AI software. Now, for the first time, his hands-on, energetic approach is available as a book. Starting with the basics before easing you into more complicated formulas and notation, AI Crash Course gives you everything you need to build AI systems with reinforcement learning and deep learning. Five full working projects put the ideas into action, showing step-by-step how to build intelligent software using the best and easiest tools for AI programming, including Python, TensorFlow, Keras, and PyTorch. AI Crash Course teaches everyone to build an AI to work in their applications. Once you've read this book, you're only limited by your imagination.
Preface
Welcome to the Robot World
Python Fundamentals – Learn How to Code in Python
AI Foundation Techniques
Your First AI Model – Beware the Bandits!
AI for Sales and Advertising – Sell like the Wolf of AI Street
Welcome to Q-Learning
AI for Logistics – Robots in a Warehouse
Going Pro with Artificial Brains – Deep Q-Learning
AI for Autonomous Vehicles – Build a Self-Driving Car
AI for Business – Minimize Costs with Deep Q-Learning
Deep Convolutional Q-Learning
AI for Games – Become the Master at Snake
Recap and Conclusion
Other Books You May Enjoy
Index

# AI solution refresher

Let's recap the steps of the deep Q-learning process, adapted to our self-driving car application.

Initialization:

1. The memory of the experience replay is initialized to an empty list, called `memory` in the code.
2. The maximum size of the memory is set, called `capacity` in the code.
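
The two initialization steps can be sketched as a small class. This is only a minimal illustration, not necessarily the book's implementation: the names `memory` and `capacity` come from the text above, while the class name `ReplayMemory`, the methods `push` and `sample`, and the rule of dropping the oldest transition when the memory is full are assumptions made for the example.

```python
import random


class ReplayMemory:
    """Experience replay buffer holding at most `capacity` transitions."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of stored transitions
        self.memory = []          # starts as an empty list

    def push(self, transition):
        # Hypothetical helper: store the newest transition and,
        # once over capacity, drop the oldest one.
        self.memory.append(transition)
        if len(self.memory) > self.capacity:
            del self.memory[0]

    def sample(self, batch_size):
        # Hypothetical helper: draw a uniform random batch without replacement.
        return random.sample(self.memory, batch_size)


# Usage: with capacity 3, only the 3 most recent transitions are kept.
replay = ReplayMemory(capacity=3)
for step in range(5):
    replay.push(("state", step))
```

Capping the list keeps memory use bounded and biases training batches toward recent experience, which is why a capacity is set up front.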

At each time t, the AI repeats the following process, until the end of the epoch:

1. The AI predicts the Q-values of the current state $S_t$. Since three actions can be played (0 <-> 0°, 1 <-> 20°, or 2 <-> -20°), it gets three predicted Q-values.
2. The AI performs an action selected by the Softmax method (see Chapter 5, Your First AI Model – Beware the Bandits!).
3. The AI receives a reward, which is one of -1, -0.2, or +0.1.
4. The AI reaches the next state $S_{t+1}$, which is composed of the next three signals from the three sensors, plus the orientation of...
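
Steps 1 and 2 above (three predicted Q-values, then a Softmax-based choice) can be sketched in plain Python. This is a rough illustration under stated assumptions, not the book's code: the Q-values are made-up numbers rather than neural network outputs, and the function names, the `ACTION_TO_ANGLE` table, and the `temperature` parameter are hypothetical.

```python
import math
import random

# Action index -> steering angle in degrees, as listed in step 1.
ACTION_TO_ANGLE = {0: 0, 1: 20, 2: -20}


def softmax(q_values, temperature=1.0):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(q_values)
    exps = [math.exp((q - peak) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(q_values, temperature=1.0):
    # Sample an action index with probability given by the softmax
    # of its Q-value, so higher Q-values are chosen more often.
    probs = softmax(q_values, temperature)
    return random.choices(range(len(q_values)), weights=probs, k=1)[0]


q_values = [0.5, 1.5, -0.3]  # made-up Q-values for the three actions
probs = softmax(q_values)
action = select_action(q_values)
angle = ACTION_TO_ANGLE[action]
```

Sampling from the softmax distribution, rather than always taking the highest Q-value, lets the car keep exploring the other steering angles while still favoring the one that currently looks best.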