Python Data Analysis

By: Ivan Idris

Calling C code


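We can call C functions from Cython. For example, the C strlen() function returns the length of a C string by counting the bytes before the terminating null character, which makes it the rough equivalent of the Python len() function applied to a string. To make this function available in a Cython .pyx file, import it with cimport as follows: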

from libc.string cimport strlen
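
To see how strlen() and len() relate without compiling anything, the following minimal sketch calls the C library's strlen() through the standard ctypes module. This is a different mechanism from Cython's cimport, used here only for illustration, and it assumes a system where the C standard library can be loaded this way (typical on Linux and macOS). The variable names are made up for this example.

```python
import ctypes
import ctypes.util

# Load the C standard library. If find_library() returns None, CDLL(None)
# falls back to the symbols already linked into the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s).
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

plain = b"hello"
with_nul = b"ab\x00cd"  # contains an embedded null byte

# For ordinary byte strings, len() and strlen() agree.
plain_lengths = (len(plain), libc.strlen(plain))

# strlen() stops at the first null byte, while len() counts every byte.
nul_lengths = (len(with_nul), libc.strlen(with_nul))

print(plain_lengths)
print(nul_lengths)
```

The two functions agree on ordinary words, which is all the module in this section needs; they differ only when the data contains embedded null bytes.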

We can then call strlen() from anywhere else in the .pyx file. A .pyx file can contain any Python code alongside such C calls. Have a look at the cython_module.pyx file in this book's code bundle:

from collections import defaultdict
from nltk.corpus import stopwords
from nltk.corpus import names
from libc.string cimport strlen

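# Build the lookup sets once, at import time, so membership tests are fast.
sw = set(stopwords.words('english'))
all_names = set(name.lower() for name in names.words())

def isStopWord(w):
    # strlen() expects a C string (char *). A Python 2 str converts
    # automatically; on Python 3, Cython by default accepts only bytes here.
    return w in sw or strlen(w) == 1 or not w.isalpha() or w in all_names

def filter_sw(words):
    # Lowercase each word, then drop stop words, single characters,
    # non-alphabetic tokens, and names.
    return [w.lower() for w in words if not isStopWord(w.lower())]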

def freq_dict(words):
    dd = defaultdict(int)

    for word in words:
        dd[word] += 1

    return dd
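
To make the behavior of these functions concrete, here is a pure-Python sketch of the same pipeline that runs without NLTK or a compile step. It rests on several assumptions: the two small sets are hypothetical stand-ins for the NLTK stop word and names corpora, len() takes the place of strlen(), and the sample tokens are invented for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for the NLTK stop word and names corpora.
sw = {"the", "a", "of", "and"}
all_names = {"alice", "bob"}

def is_stop_word(w):
    # len() plays the role of strlen() in this pure-Python version.
    return w in sw or len(w) == 1 or not w.isalpha() or w in all_names

def filter_sw(words):
    # Lowercase once per word, then keep only the words that pass the filter.
    lowered = (w.lower() for w in words)
    return [w for w in lowered if not is_stop_word(w)]

def freq_dict(words):
    # Count occurrences; missing keys start at 0 thanks to defaultdict(int).
    dd = defaultdict(int)
    for word in words:
        dd[word] += 1
    return dd

tokens = ["The", "cat", "and", "Alice", "saw", "a", "cat", "x", "42", "Saw"]
kept = filter_sw(tokens)
counts = freq_dict(kept)

print(kept)          # ['cat', 'saw', 'cat', 'saw']
print(dict(counts))  # {'cat': 2, 'saw': 2}
```

Filtering before counting keeps the frequency dictionary small, since stop words, names, single characters, and non-alphabetic tokens never reach it.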

To compile this code, we need a setup.py file with the following contents...