#### Overview of this book

Welcome to the Robot World … and start building intelligent software now! Through his best-selling video courses, Hadelin de Ponteves has taught hundreds of thousands of people to write AI software. Now, for the first time, his hands-on, energetic approach is available as a book. Starting with the basics before easing you into more complicated formulas and notation, AI Crash Course gives you everything you need to build AI systems with reinforcement learning and deep learning. Five full working projects put the ideas into action, showing step-by-step how to build intelligent software using the best and easiest tools for AI programming, including Python, TensorFlow, Keras, and PyTorch. AI Crash Course teaches everyone to build an AI to work in their applications. Once you've read this book, you're only limited by your imagination.

# The Thompson Sampling model

You're going to build this model straight away: first a simple implementation of the method, and later the theory behind it. Let's get right into it!

As we defined previously, our problem is to find, out of many slot machines, the one with the highest chance of winning. A not-so-optimal solution would be to play 100 rounds on each of our slot machines and see which one has the highest win rate. A better solution is a method called Thompson Sampling.
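To make the not-so-optimal solution concrete, here is a minimal sketch of it. The function name `naive_best_machine`, the example win probabilities, and the fixed seed are all assumptions made for illustration; they are not taken from the book's project code.

```python
import numpy as np

def naive_best_machine(win_probs, rounds_per_machine=100, seed=0):
    """Play every machine the same number of rounds and pick the best observed win rate.

    win_probs holds the hidden winning chance of each machine (assumed values,
    used only to simulate the plays).
    """
    rng = np.random.default_rng(seed)
    win_probs = np.asarray(win_probs, dtype=float)
    # Number of wins out of rounds_per_machine plays, simulated per machine.
    wins = rng.binomial(rounds_per_machine, win_probs)
    win_rates = wins / rounds_per_machine
    # The machine with the highest observed win rate is our guess for the best one.
    return int(np.argmax(win_rates)), win_rates

# Three hypothetical machines; the last one is the best.
best, rates = naive_best_machine([0.1, 0.3, 0.8])
print("Best machine:", best)
print("Observed win rates:", rates)
```

Notice that this approach spends exactly as many rounds on the worst machine as on the best one, which is the waste a smarter method tries to avoid.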

I won't go too deeply into the theory behind it; we'll cover that later. For now, it is enough to say that Thompson Sampling uses a distribution function (distributions will be explained further in this chapter), called Beta, that takes two arguments. For simplicity's sake, let's say that the higher the first argument is, the better our slot machine is, and the higher the second argument is, the worse our slot machine...
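As a rough preview of how the two arguments can be used, here is a minimal sketch, not the chapter's own implementation. It assumes that the first argument is the machine's number of wins plus one and the second is its number of losses plus one, which is a common way to set up Thompson Sampling. The function name `thompson_sampling`, the win probabilities, the number of rounds, and the seed are hypothetical.

```python
import numpy as np

def thompson_sampling(win_probs, n_rounds=2000, seed=0):
    """Sketch of Thompson Sampling on simulated slot machines.

    win_probs holds the hidden winning chance of each machine (assumed values,
    used only to simulate the plays).
    """
    rng = np.random.default_rng(seed)
    win_probs = np.asarray(win_probs, dtype=float)
    wins = np.zeros(len(win_probs))
    losses = np.zeros(len(win_probs))

    for _ in range(n_rounds):
        # One random draw per machine from Beta(wins + 1, losses + 1).
        # More wins push the draw toward 1; more losses push it toward 0.
        samples = rng.beta(wins + 1, losses + 1)
        machine = int(np.argmax(samples))  # play the machine with the highest draw

        # Simulate the play and record the outcome for that machine only.
        if rng.random() < win_probs[machine]:
            wins[machine] += 1
        else:
            losses[machine] += 1

    return wins, losses

wins, losses = thompson_sampling([0.1, 0.3, 0.8])
plays = wins + losses
print("Plays per machine:", plays)
print("Most played machine:", int(np.argmax(plays)))
```

In this sketch, the machine that keeps winning gets chosen more and more often, so most rounds end up being played on the best machine instead of being split evenly.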