Dancing with Python

By: Robert S. Sutor

Overview of this book

Dancing with Python helps you learn Python and quantum computing in a practical way. It will help you explore how to work with numbers, strings, collections, iterators, and files. The book goes beyond functions and classes and teaches you to use Python and Qiskit to create gates and circuits for classical and quantum computing. Learn how quantum extends traditional techniques using the Grover Search Algorithm and the code that implements it. Dive into some advanced and widely used applications of Python and revisit strings with more sophisticated tools, such as regular expressions and basic natural language processing (NLP). The final chapters introduce you to data analysis, visualizations, and supervised and unsupervised machine learning. By the end of the book, you will be proficient in programming the latest and most powerful quantum computers, the Pythonic way.

15.5 Clustering

Suppose I have a CSV dataset containing 75 (x, y) geometric coordinates. I load these into the xy_df pandas DataFrame and look at its descriptive statistical summary:

import pandas as pd
xy_df = pd.read_csv("src/examples/clustering-xy.csv")
xy_df.describe()
            x       y
count 75.0000 75.0000
mean   7.5733  4.5401
std    4.0102  2.1265
min    1.9796  1.1947
25%    3.4182  2.8896
50%    7.0173  3.6819
75%   12.2170  6.9615
max   13.5643  8.2785

Here is the usual sample of the first five points:

xy_df.head()
        x      y
0 13.4832 3.2657
1  7.6388 7.0170
2  2.9279 2.9603
3  7.4514 6.4439
4  3.3011 2.4642

How are these points spread out geometrically? Are they uniformly distributed within their minimum and maximum ranges?
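
One rough way to approach the uniformity question before plotting is to compare the reported quartiles with where they would fall if the values were spread evenly between the minimum and maximum. The following is a minimal sketch under that assumption, not a formal statistical test. The summary dictionary copies its numbers from the statistical summary above, and the helper names uniform_quantile and quartile_gaps are hypothetical names introduced for illustration. Treating the sample minimum and maximum as the true bounds is itself a simplification.

```python
# Quartiles, minimum, and maximum copied from the describe() summary above.
summary = {
    "x": {"min": 1.9796, "25%": 3.4182, "50%": 7.0173,
          "75%": 12.2170, "max": 13.5643},
    "y": {"min": 1.1947, "25%": 2.8896, "50%": 3.6819,
          "75%": 6.9615, "max": 8.2785},
}


def uniform_quantile(lo, hi, q):
    """Return the q-th quantile of a uniform distribution on [lo, hi]."""
    return lo + q * (hi - lo)


def quartile_gaps(stats):
    """Return observed quartile minus uniform quartile for 25%, 50%, 75%.

    Assumption: the sample min and max stand in for the true bounds.
    """
    gaps = {}
    for label, q in (("25%", 0.25), ("50%", 0.50), ("75%", 0.75)):
        expected = uniform_quantile(stats["min"], stats["max"], q)
        gaps[label] = stats[label] - expected
    return gaps


for column, stats in summary.items():
    for label, gap in quartile_gaps(stats).items():
        print(f"{column} {label}: observed - uniform = {gap:+.4f}")
```

For x, the lower quartile sits about 1.46 below and the upper quartile about 1.55 above the positions that an even spread would give. For y, the median is about 1.05 below the midpoint of the range. Gaps of this size hint that the points may bunch up in places rather than fill the range evenly, though only a plot or a clustering pass can show where.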

A scatter plot would help us see the distribution because we are in two dimensions, but let’s try to collect or cluster the points into k groups first. Here, k is...