Artificial Intelligence with Python - Second Edition

By: Alberto Artasanchez, Prateek Joshi

Overview of this book

Artificial Intelligence with Python, Second Edition is an updated and expanded version of the bestselling guide to artificial intelligence using the latest version of Python 3.x. Beyond providing you with an introduction to artificial intelligence, this new edition gives you the tools you need to explore the world of intelligent apps and create your own applications. This edition also includes seven new chapters on more advanced concepts of artificial intelligence, including fundamental use cases of AI; machine learning data pipelines; feature selection and feature engineering; AI on the cloud; the basics of chatbots; RNNs and DL models; and AI and Big Data. Finally, this new edition explores various real-world scenarios and teaches you how to apply relevant AI algorithms to a wide range of problems, starting with the most basic AI concepts and progressively building from there to solve more difficult challenges. By the end, you will have a solid understanding of these artificial intelligence techniques and of when best to use each one.

Data ingestion

Once you have crafted and polished your question to your satisfaction, it is time to gather the raw data that will help you answer it. This doesn't mean that your question cannot change once you move on to the next steps of the pipeline. You should continuously refine your problem statement and adjust it as necessary.

Collecting the right data for your pipeline can be a tremendous undertaking. Depending on the problem you are trying to solve, relevant datasets can be quite difficult to obtain.

Another important consideration is deciding how the data will be sourced, ingested, and stored:

  • What data provider or vendor should we use? Can they be trusted?
  • How will it be ingested? With Hadoop, Impala, Spark, just Python, or something else?
  • Should it be stored as a file or in a database?
  • What type of database? Traditional RDBMS, NoSQL, or graph?
  • Should it even be stored? If we have a real-time feed into the...
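
As a rough illustration of the "just Python" ingestion route and the file-versus-database storage question raised in the list above, here is a minimal sketch that uses only the standard library. The sample data, the `readings` table, and the function names `ingest_rows` and `store_in_database` are hypothetical and chosen purely for illustration; SQLite stands in for a traditional RDBMS only because it ships with Python, not because the book prescribes it.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a vendor feed or a downloaded file.
RAW_CSV = """sensor_id,reading
a1,20.5
a2,21.0
a1,19.8
"""


def ingest_rows(text):
    """Parse CSV text into a list of (sensor_id, reading) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["sensor_id"], float(row["reading"])) for row in reader]


def store_in_database(rows, conn):
    """Persist the rows in a relational table (SQLite, for illustration only)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, reading REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


rows = ingest_rows(RAW_CSV)

# An in-memory database keeps the sketch self-contained; a real pipeline
# would point at a file path or a database server instead.
conn = sqlite3.connect(":memory:")
store_in_database(rows, conn)

count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(f"Ingested {len(rows)} rows, stored {count} rows")
```

Keeping ingestion and storage in separate functions means the storage decision can change later (a flat file, a NoSQL store, or no storage at all) without touching the parsing step.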