TensorFlow 2.0 Computer Vision Cookbook

By : Martínez
4.3 (7)

Overview of this book

Computer vision is a scientific field that enables machines to identify and process digital images and videos. This book focuses on independent recipes to help you perform various computer vision tasks using TensorFlow. The book begins by taking you through the basics of deep learning for computer vision, along with covering TensorFlow 2.x’s key features, such as the Keras and tf.data.Dataset APIs. You’ll then learn about the ins and outs of common computer vision tasks, such as image classification, transfer learning, image enhancing and styling, and object detection. The book also covers autoencoders in domains such as inverse image search indexes and image denoising, while offering insights into various architectures used in the recipes, such as convolutional neural networks (CNNs), region-based CNNs (R-CNNs), VGGNet, and You Only Look Once (YOLO). Moving on, you’ll discover tips and tricks to solve any problems faced while building various computer vision applications. Finally, you’ll delve into more advanced topics such as Generative Adversarial Networks (GANs), video processing, and AutoML, concluding with a section focused on techniques to help you boost the performance of your networks. By the end of this TensorFlow book, you’ll be able to confidently tackle a wide range of computer vision problems using TensorFlow 2.x.
Table of Contents (14 chapters)

Chapter 7: Captioning Images with CNNs and RNNs

Equipping neural networks with the ability to describe visual scenes in a human-readable fashion has to be one of the most interesting yet challenging applications of deep learning. The main difficulty arises from the fact that this problem combines two major subfields of artificial intelligence: Computer Vision (CV) and Natural Language Processing (NLP).

Most image captioning architectures use a Convolutional Neural Network (CNN) to encode images into a numeric representation that the decoder can consume. The decoder is typically a Recurrent Neural Network (RNN), a kind of network specialized in learning from sequential data, such as time series, video, and text.
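
To make the encoder-decoder idea concrete, here is a minimal sketch in TensorFlow 2.x Keras. It is not the chapter's implementation: the layer sizes, the 64x64 input resolution, the constants (VOCAB_SIZE, MAX_LEN, EMBED_DIM, UNITS), and the function name build_captioning_model are all assumptions chosen for illustration. A small CNN turns the image into a feature vector, and that vector seeds the initial state of a GRU that predicts the next caption token at each position.

```python
import tensorflow as tf

# All constants below are hypothetical values chosen for illustration.
VOCAB_SIZE = 1000  # number of distinct caption tokens
MAX_LEN = 20       # caption length after padding
EMBED_DIM = 64     # token embedding size
UNITS = 128        # size of the image feature vector and the GRU state


def build_captioning_model():
    # Encoder: a tiny CNN that maps an image to a fixed-size feature vector.
    image_in = tf.keras.Input(shape=(64, 64, 3), name="image")
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(image_in)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    image_features = tf.keras.layers.Dense(UNITS, activation="tanh")(x)

    # Decoder: an RNN over the caption tokens seen so far, whose initial
    # state is the encoded image.
    caption_in = tf.keras.Input(shape=(MAX_LEN,), dtype="int32", name="caption")
    y = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(caption_in)
    y = tf.keras.layers.GRU(UNITS, return_sequences=True)(
        y, initial_state=image_features
    )
    # One score per vocabulary entry at every caption position.
    logits = tf.keras.layers.Dense(VOCAB_SIZE)(y)
    return tf.keras.Model(inputs=[image_in, caption_in], outputs=logits)


model = build_captioning_model()

# Forward pass on random data just to check the shapes line up.
images = tf.random.uniform((2, 64, 64, 3))
captions = tf.random.uniform((2, MAX_LEN), maxval=VOCAB_SIZE, dtype=tf.int32)
logits = model([images, captions])
print(logits.shape)  # (batch, MAX_LEN, VOCAB_SIZE)
```

Seeding the GRU state with the image features is only one way to condition the decoder on the image. Other designs feed the features in at every time step or add an attention mechanism over spatial feature maps, and a practical system would usually start from a pretrained CNN rather than the toy encoder used here.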

As we'll see in this chapter, the challenges of building a system with these capabilities start with preparing the data, which we'll cover in the first recipe. Then, we'll implement an image captioning solution...
