Learning NumPy Array

By: Ivan Idris

Describing data with pandas DataFrames


Luckily, pandas has descriptive statistics utilities. We will read the average wind speed, temperature, and pressure values from the KNMI De Bilt data file into a pandas DataFrame. A DataFrame is similar to an R data frame: a table of data like the ones found in a spreadsheet or a database. The columns are labeled, the data can be indexed, and you can run computations on the data. We will then print descriptive statistics and a correlation matrix, as shown in the following steps:

  1. Read the CSV file with the pandas read_csv function. This function works in a similar fashion to the NumPy loadtxt function:

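    import sys
    from datetime import datetime as dt
    import numpy as np
    import pandas as pd

    # Scale raw values by 0.1; blank fields become NaN
    to_float = lambda x: .1 * float(x.strip() or np.nan)
    to_date = lambda x: dt.strptime(x, "%Y%m%d")
    cols = [4, 11, 25]
    conv_dict = dict((col, to_float) for col in cols)

    # Column 1 holds the date, which is used as the index
    conv_dict[1] = to_date
    cols.append(1)

    headers = ['dates', 'avg_ws', 'avg_temp', 'avg_pres']
    df = pd.read_csv(sys.argv[1], usecols=cols, names=headers, index_col=[0], converters=conv_dict)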
  2. Print...
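
The workflow outlined in this section (read converted columns into a DataFrame, then summarize them) can be tried without the KNMI file. The following is a minimal sketch under stated assumptions: the three sample rows are made up for illustration and are not real De Bilt measurements, the in-memory file layout (four columns, no header) is hypothetical, and `describe` and `corr` are used as one plausible way to obtain descriptive statistics and a correlation matrix, not necessarily the exact calls the book uses.

```python
import io
from datetime import datetime as dt

import numpy as np
import pandas as pd

# Made-up sample rows (NOT real KNMI data): date, wind speed, temperature,
# pressure. One temperature field is blank to exercise the NaN path.
raw = io.StringIO(
    "20130101,30,50,10100\n"
    "20130102,40,  ,10150\n"
    "20130103,50,70,10200\n"
)

# Same converters as in step 1: scale by 0.1, map blanks to NaN, parse dates
to_float = lambda x: .1 * float(x.strip() or np.nan)
to_date = lambda x: dt.strptime(x, "%Y%m%d")

headers = ['dates', 'avg_ws', 'avg_temp', 'avg_pres']
converters = {'dates': to_date, 'avg_ws': to_float,
              'avg_temp': to_float, 'avg_pres': to_float}

# Converters are keyed by column name here because the sample has no unused columns
df = pd.read_csv(raw, names=headers, index_col=0, converters=converters)

# Count, mean, standard deviation, min, quartiles, and max per column
print(df.describe())

# Pairwise Pearson correlation between the columns
print(df.corr())
```

Because the blank temperature becomes NaN rather than zero, `describe` reports a lower count for that column instead of skewing its mean, which is the point of routing blanks through `np.nan` in the converter.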