Now that we have gone through the effort of defining a detailed coding standard, how do we make sure that everybody on the team adheres to it? Even if you are the only one writing code, it would help to have a way of checking that you are following your own standard. Looking through every single file and inspecting it for coding standard adherence would do the job, but it would also be mind-numbingly boring and repetitive. Luckily, repeating things is something at which computers excel.
PHP_CodeSniffer is a free package that parses PHP source files and checks them for compliance with pre-defined coding standards. The software ships with some common coding standards, namely the PEAR, Zend, Squiz, MySource, and PHPCS standards. Better still, the author made the package easily extensible. Defining your own coding standard against which PHP_CodeSniffer can check is simply a matter of extending some classes and implementing some methods. Naturally, PHP_CodeSniffer is written in object-oriented PHP.
Note
In their own words: "PEAR is a framework and distribution system for reusable PHP components."
What this means is that PHP comes with a simple installer that can be used to automatically install any of the various libraries that are categorized, organized, documented, and made available for download from the PEAR site:
Let's start by installing PHP_CodeSniffer directly from the PEAR site using their handy command line installer. When I built and installed PHP on my system, I had it put everything in a sub-folder of my local Apache installation. In my case, the PHP root folder is therefore in /usr/local/apache2/php/. This is where you can find the PEAR executable among other handy PHP utilities. The most recent version at the time of this writing is 1.2.1.
Installing the package is simply a matter of telling the pear executable to download and install it. You should see something similar to the following lines cascading down your screen.
Afterwards, listing the contents of your PHP installation's bin directory should show the phpcs executable in addition to everything else you had in that directory.
Note
Just like PHP itself, the PEAR installer supports many operating systems. I installed PHP_CodeSniffer on Mac OS X, but the same procedure will work on any of the operating systems on which PHP itself runs. Also, if you are running into problems during the installation, I urge you to visit the PEAR site's support section and FAQ page.
Using PHP_CodeSniffer to validate one or more files against a coding standard is pretty straightforward. You simply invoke the phpcs executable and pass the file or directory to check as an argument.
Here is some sample output from checking one of the listings for this chapter:
As you can see from the output, checking against the default coding standard (Zend), phpcs has found three errors and one warning. For each of the problems it finds, phpcs reports the line number of the source file, the severity of the issue (error, warning, and so on), and a description. The three errors it found aren't really part of the coding standard that we defined earlier in this chapter. However, the warning indicates that the maximum line length, which we defined in our own standard to be 80 characters, was exceeded. We could have suppressed the warnings by adding the -n switch to the command line. At this point, we should fix the issue and re-run phpcs to confirm that there are no additional issues detected.
Rather than duplicating the PHP_CodeSniffer documentation here, I want to use this section to briefly list the available runtime options and highlight some of the more useful ones.
There are various command line options that let you customize what to check, how to check, and how to format the output. Type phpcs --help to see all of the arguments understood by the script.
PHP_CodeSniffer comes with several predefined coding standards. Typing phpcs -i will tell you which coding standards are installed. The version of the tool I installed came with the following standards: MySource, PEAR, PHPCS, Squiz, and Zend. To tell phpcs to check against a specific standard, simply add the following option to the command line: --standard=Zend.
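The switches covered so far can be combined in a single invocation. The following is a minimal sketch, not part of PHP_CodeSniffer itself, of a hypothetical helper called buildPhpcsCommand() that assembles such a command line. It assumes that phpcs is on your PATH and uses only the -n and --standard switches described above. The command is built and printed, but not executed.

```php
<?php
// Hypothetical helper (not part of PHP_CodeSniffer): assembles a phpcs
// command line from the switches discussed in this section.
function buildPhpcsCommand($path, $standard = null, $suppressWarnings = false)
{
    $parts = array('phpcs');

    // -n hides warnings so that only errors are reported
    if ($suppressWarnings) {
        $parts[] = '-n';
    }

    // --standard selects one of the installed coding standards
    if ($standard !== null) {
        $parts[] = '--standard=' . escapeshellarg($standard);
    }

    // the file or directory to check comes last
    $parts[] = escapeshellarg($path);

    return implode(' ', $parts);
}

// build (but do not run) a command that checks the src directory
// against the Zend standard while hiding warnings
$command = buildPhpcsCommand('src', 'Zend', true);
echo $command . "\n";
```

Quoting the arguments with escapeshellarg() keeps paths that contain spaces from being split into several arguments by the shell.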
The last useful command line switch I would like to point out is the --report=summary argument. When recursively checking a directory of source files, the output can get rather long. This switch prints a summary report with one line per file rather than outputting each individual issue it identified. Here is the summary report for all code listings in this chapter up to this point.
Here is a list of additional features you can customize from the command line. Please refer to the PHP_CodeSniffer online documentation for details.
Including/excluding files based on their extension
Excluding files or directories based on their name
Limiting the checking to select sub-sections of the coding standard (sniffs)
Verbosity of the output
Permanently setting / deleting configuration options for future invocations of phpcs including defaults for:
Being able to check code against an existing and established coding standard, such as Zend, is useful, but the real power of PHP_CodeSniffer lies in its extensibility. By defining our own coding standard in a format PHP_CodeSniffer understands, we can use it to check conformance to our own standard.
Defining your own coding standard involves a three-step process. The steps are as follows:
Note
Terminology: In PHP_CodeSniffer parlance, Sniffs are individual class files that contain the logic to validate a single coding standard rule. For example, you might have a file called LineLengthSniff.php that contains the definition for class LineLengthSniff. The logic in this class will be invoked each time PHP_CodeSniffer wants to check whether the length of a line of code conforms to the defined standard.
The directory structure for each coding standard definition is simple. From the command line, the following set of commands can be used to create the initial set of directories and files of a hypothetical PHP_CodeSniffer coding standard definition.
We start by creating a directory to hold all the files associated with our coding standard definition, ProjectStandard. In that directory, we create a class file called ProjectStandardCodingStandard.php that identifies our directory as one containing a PHP_CodeSniffer coding standard definition. This class is also responsible for communicating with the main phpcs executable regarding some of the details of our coding standard (more about that later).
Next we create a directory called Sniffs, which will contain descriptions of individual coding standard rules, or collections (directories) thereof. In our case, we plan on creating rules regarding naming conventions and line length, which is why we are creating directories NamingConventions and LineLength inside the Sniffs directory. Each of these sub-directories then contains one or more individual sniff files: LimitLineLengthSniff.php, ClassNamesSniff.php, VariableNamesSniff.php, and PropertyNamesSniff.php.
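The layout just described can also be created from PHP itself. The following sketch builds the skeleton under the system's temporary directory so that it does not touch a real PHP_CodeSniffer installation. Which sniff file belongs in which sub-directory is an assumption inferred from the file names, and the phpcs-skeleton- prefix is made up for this example.

```php
<?php
// Sketch: create the skeleton of the ProjectStandard coding standard.
// Assumption: we build it under the system temp directory so nothing in a
// real PHP_CodeSniffer installation is touched.
$base = sys_get_temp_dir() . '/phpcs-skeleton-' . uniqid();
$standardDir = $base . '/ProjectStandard';

// sniff files grouped by the sub-directory of Sniffs they live in
// (the grouping is inferred from the file names)
$sniffFiles = array(
    'LineLength' => array('LimitLineLengthSniff.php'),
    'NamingConventions' => array(
        'ClassNamesSniff.php',
        'VariableNamesSniff.php',
        'PropertyNamesSniff.php',
    ),
);

// the third argument of mkdir() creates missing parent directories
mkdir($standardDir . '/Sniffs', 0777, true);

// placeholder for the main class file of the standard
touch($standardDir . '/ProjectStandardCodingStandard.php');

foreach ($sniffFiles as $category => $files) {
    mkdir($standardDir . '/Sniffs/' . $category);
    foreach ($files as $file) {
        // empty placeholder; the sniff class itself is written later
        touch($standardDir . '/Sniffs/' . $category . '/' . $file);
    }
}

echo "Created skeleton in $standardDir\n";
```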
In the previous step, we have already created a placeholder for the main class file in our ProjectStandard directory. Now let's put some code in there and identify ourselves as a PHP_CodeSniffer coding standard.
<?php
// make sure the parent class is in our include path
if (class_exists('PHP_CodeSniffer_Standards_CodingStandard', true) === false) {
    throw new PHP_CodeSniffer_Exception(
        'Class PHP_CodeSniffer_Standards_CodingStandard not found'
    );
}

// our main coding standard class definition
class PHP_CodeSniffer_Standards_ProjectStandard_ProjectStandardCodingStandard
    extends PHP_CodeSniffer_Standards_CodingStandard
{
    // include sniffs from other directories or even whole coding standards
    // great way to create your standard and build on it
    public function getIncludedSniffs()
    {
        // return an array of sniffs, directories of sniffs,
        // or coding standards to include
        return array(
            'Generic'
        );
    }

    // exclude sniffs from previously included ones
    public function getExcludedSniffs()
    {
        // return a list of sniffs or directories of sniffs to exclude
        return array(
            'Generic/Sniffs/Files/LineLengthSniff.php'
        );
    }
}
?>
Our main class extends PHP_CodeSniffer_Standards_CodingStandard. This is required of all classes identifying a coding standard to be used with phpcs. The two methods we are implementing, getIncludedSniffs() and getExcludedSniffs(), are pretty important because they let us assemble our own coding standard from parts of existing standards, thus saving us a lot of time since we don't have to write all the sniffs ourselves. Both methods return simple arrays. The items in the array are either names of existing coding standards, paths to directories of sniffs of existing coding standards, or paths to individual sniff files. For example, it turns out that our own coding standard is pretty close to the "Generic" coding standard included with PHP_CodeSniffer. Therefore, to make things easier for us, we include the whole "Generic" coding standard in the getIncludedSniffs() method, but choose to exclude that standard's LineLengthSniff.php in the getExcludedSniffs() method.
Each of the rules we formulated to express our coding standard earlier in the chapter can be described using a sniff class file that PHP_CodeSniffer can use. However, before we jump in and really get our hands dirty, it is a good idea to review how tokenization works. After all, PHP_CodeSniffer builds on and expands PHP's built-in tokenizer extension.
Note
Tokenization is the process of breaking input text into meaningful parts. When combined with a classification and/or description, each such part is considered a token.
PHP uses the Zend Engine's tokenizer at its core to parse and interpret PHP source files. Lucky for us, the tokenizer is also directly accessible via two functions in the language's top-level namespace: token_get_all() and token_name().
Tokenization consists of taking input text and breaking it up into meaningful segments. Each segment can optionally carry a label, explanation, or additional detail. Let's look at an example of PHP code being tokenized by the Zend Engine.
<?php
// get the contents of this file into a variable
$thisFile = file_get_contents(__FILE__);

// get the token stack
$tokenStack = token_get_all($thisFile);

$currentLine = 0;

// output each token & look up the corresponding name
foreach ($tokenStack as $token) {
    // most tokens are arrays
    if (is_array($token)) {
        if ($currentLine < $token[2]) {
            $currentLine++;
            echo "Line $currentLine:\n";
        }
        echo "\t" . token_name($token[0]) . ': ' . rtrim($token[1]) . "\n";
    // some tokens are just strings
    } else {
        echo "\tString: " . rtrim($token) . "\n";
    }
}
?>
The above code snippet runs itself through the tokenizer, which results in the following output:
We formatted our output a little more nicely, but the first token essentially looks like this:
Array(
    [0] => 367
    [1] => <?php
    [2] => 1
)
In this case, 367 is the value of the parser token, which corresponds to T_OPEN_TAG when we look it up with the token_name() function. <?php is the actual text of the token and 1 is the line number on which the token occurs. You can look up the complete list of tokenizer token constants in the online PHP manual, or the following code snippet will list the ones that are defined for your version of PHP.
<?php
// get all constants organized by category
$allTokens = get_defined_constants(true);

// we're only interested in tokenizer constants
print_r($allTokens["tokenizer"]);
?>
As you can see for yourself, tokens contain a lot of information that is useful to programmatically understand what an analyzed portion of code is doing. PHP_CodeSniffer builds upon the existing tokenization extension and built-in tokens by providing additional tokens to provide even finer granularity when examining PHP code.
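To illustrate how token information can drive a check, here is a small sketch that uses only the standard tokenizer, not PHP_CodeSniffer. The function name findVariableNames() is made up for this example. It collects every variable name in a piece of code, which is the kind of raw material a naming convention rule would need.

```php
<?php
// Sketch: collect the names of all variables used in a piece of PHP code.
function findVariableNames($code)
{
    $names = array();

    foreach (token_get_all($code) as $token) {
        // single-character tokens such as ";" are plain strings; skip them
        if (!is_array($token)) {
            continue;
        }

        // $token[0] is the token code, $token[1] the matching source text
        if ($token[0] === T_VARIABLE) {
            $names[] = $token[1];
        }
    }

    return $names;
}

// single quotes keep PHP from interpolating the variables in the sample
$sample = '<?php $fooBar = 1; $baz_qux = $fooBar + 2;';
print_r(findVariableNames($sample));
```

For the sample string, the function returns $fooBar, $baz_qux, and $fooBar again, in the order in which they appear in the code.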
Now that you know what tokens are, it will be much easier to understand what the individual sniffs are doing. First of all, a sniff uses its register() method to tell the main executable which tokens it is interested in. That way, the main code can hand over execution to the sniff's process() method whenever it encounters such a token. For example, a sniff trying to validate that a code file has the proper PHP opening and/or closing tag might register interest in the T_OPEN_TAG token with the parent code. That is exactly what we're doing in the following listing:
<?php
// sniff class definition must implement the
// PHP_CodeSniffer_Sniff interface
class ProjectStandard_Sniffs_Syntax_FullPhpTagsSniff implements PHP_CodeSniffer_Sniff
{
    // register for the tokens we're interested in
    public function register()
    {
        return array(T_OPEN_TAG);
    }

    // process each occurrence of the token in this method
    public function process(PHP_CodeSniffer_File $phpcsFile, $stackPtr)
    {
        $tokens = $phpcsFile->getTokens();

        // warn if the opening PHP tag is not the first token in the file
        if ($stackPtr != 0) {
            $phpcsFile->addWarning('Nothing should precede the PHP open tag.', $stackPtr);
        }

        // error if full PHP open tag is not used
        // (the token content includes the whitespace following the tag)
        if (trim($tokens[$stackPtr]['content']) != '<?php') {
            $phpcsFile->addError('Only full PHP opening tags are allowed.', $stackPtr);
        }

        // all files must have closing tag
        // ('type' holds the token name as a string)
        if ($tokens[sizeof($tokens) - 1]['type'] != 'T_CLOSE_TAG') {
            $phpcsFile->addError('All files must end with a closing PHP tag.', $stackPtr);
        }
    }
}
?>
Let's take a closer look at the process() method, which takes two parameters. The first one is a reference to a PHP_CodeSniffer_File object, which we can use to access the token stack. The second argument is an array index to the current token in the stack.
Armed with that information, we can start validating the code via the token stack. First, we use the PHP_CodeSniffer_File's addWarning() method to display a warning message whenever the very first token is not the PHP open tag. Next, we use the addError() method to display an error message to the user whenever the opening tag doesn't match the string "<?php" since that is the only opening tag that our coding standard allows. Lastly, we display an error message if the last token in the stack is anything other than the closing PHP tag.
That's it. The main phpcs executable does the rest. It tokenizes the input file(s), calls all registered sniffs for each occurrence of their respective token, and displays nicely formatted output to the user.
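To make the hand-over between the main executable and the sniffs more tangible, here is a deliberately simplified model of that loop. It is not how PHP_CodeSniffer is implemented. The names MiniSniff, OpenTagFirstSniff, and runSniffs() are hypothetical, and the sketch relies only on the standard tokenizer so that it runs without PHP_CodeSniffer installed.

```php
<?php
// Simplified stand-in for the real PHP_CodeSniffer dispatch loop.
// MiniSniff and runSniffs() are hypothetical names used for illustration.
interface MiniSniff
{
    // return the token codes this sniff wants to be called for
    public function register();

    // inspect the token at $stackPtr and return a list of messages
    public function process(array $tokens, $stackPtr);
}

class OpenTagFirstSniff implements MiniSniff
{
    public function register()
    {
        return array(T_OPEN_TAG);
    }

    public function process(array $tokens, $stackPtr)
    {
        // the opening tag should be the very first token in the file
        if ($stackPtr !== 0) {
            return array('Nothing should precede the PHP open tag.');
        }
        return array();
    }
}

function runSniffs($code, array $sniffs)
{
    $tokens = token_get_all($code);
    $messages = array();

    foreach ($tokens as $stackPtr => $token) {
        // plain string tokens carry no token code, so no sniff can match
        if (!is_array($token)) {
            continue;
        }
        foreach ($sniffs as $sniff) {
            // hand over only the tokens the sniff registered for
            if (in_array($token[0], $sniff->register(), true)) {
                foreach ($sniff->process($tokens, $stackPtr) as $message) {
                    $messages[] = 'Line ' . $token[2] . ': ' . $message;
                }
            }
        }
    }

    return $messages;
}

$clean = runSniffs("<?php echo 'ok';", array(new OpenTagFirstSniff()));
$dirty = runSniffs("Hello\n<?php echo 'ok';", array(new OpenTagFirstSniff()));
print_r($clean);
print_r($dirty);
```

The first input produces no messages. In the second, the text before the opening tag becomes a token of its own, so the open tag is no longer at index 0 and the sniff reports it on line 2.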
You may have noticed in the above listing that we used the values of the token's 'content' and 'type' attributes. If you recall, the tokens returned by the standard PHP tokenizer did not have those attributes. Instead, PHP_CodeSniffer adds those and other attributes. Following is a list of token attributes that are always available. Depending on the type of token, additional attributes might be available. You should consult the PHP_CodeSniffer API documentation for details.
Attribute name | Example | Description |
---|---|---|
code | 301 | The token type code |
content | if | The token content |
type | T_IF | The token name |
line | 56 | The line number where the token is located |
column | 12 | The column in the line where this token starts (starts from 1) |
level | 2 | The depth of the token within the open scopes |
conditions | Array( 2 => 50, 9 => 353 ) | A list of scope condition token positions => codes that opened the scopes that this token exists in (see conditional tokens) |
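To get a feel for the first few attributes in the table, the following sketch derives code, type, content, and line from the output of the standard tokenizer. This is only an approximation for illustration and not how PHP_CodeSniffer builds its token stack. The function name describeTokens() and the 'STRING' placeholder type are assumptions made for this example.

```php
<?php
// Sketch only: approximate PHP_CodeSniffer-style token records on top of
// token_get_all(). This is not how PHP_CodeSniffer builds its tokens.
function describeTokens($code)
{
    $records = array();
    $line = 1;

    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            $record = array(
                'code' => $token[0],
                'type' => token_name($token[0]),
                'content' => $token[1],
                'line' => $token[2],
            );
        } else {
            // single-character tokens come back as plain strings, so we
            // reuse the line number tracked from the previous token;
            // 'STRING' is a placeholder type invented for this sketch
            $record = array(
                'code' => null,
                'type' => 'STRING',
                'content' => $token,
                'line' => $line,
            );
        }

        // advance the line counter past any newlines in this token
        $line = $record['line'] + substr_count($record['content'], "\n");
        $records[] = $record;
    }

    return $records;
}

$records = describeTokens("<?php\nif (true) {\n}\n");
foreach ($records as $record) {
    echo $record['line'] . ' ' . $record['type'] . "\n";
}
```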
We have already seen that we can include sniffs from other coding standards in our own. However, we can take it a step further and make an existing sniff do all the work while still implementing our own standard. For example, the "Generic" coding standard includes a sniff to check for maximum line length. As it happens, the suggested maximum line length is 80 characters, the same as in our own standard. However, its absolute maximum line length is 100, whereas our standard allows for up to 120 characters per line. Therefore, all we have to do is extend the existing sniff and override the protected property $absoluteLineLimit as in the following listing.
<?php
if (class_exists('Generic_Sniffs_Files_LineLengthSniff', true) === false) {
    throw new PHP_CodeSniffer_Exception(
        'Class Generic_Sniffs_Files_LineLengthSniff not found'
    );
}

// class to check line length in number of characters
// note: we're overriding an existing sniff from the generic coding standard
class Zend_Sniffs_Files_LineLengthSniff extends Generic_Sniffs_Files_LineLengthSniff
{
    // we generate an error when exceeding the absolute
    // maximum line length
    protected $absoluteLineLimit = 120;
}
?>
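The effect of overriding a single protected property can be tried out without PHP_CodeSniffer. In the sketch below, BaseLineLengthCheck is a hypothetical stand-in that only models the two limits of the existing sniff. It is not the real Generic sniff, and its check() method is invented for this example.

```php
<?php
// Hypothetical stand-in for an existing sniff; it is not the real
// Generic_Sniffs_Files_LineLengthSniff, only a model of its two limits.
class BaseLineLengthCheck
{
    // lines longer than this produce a warning
    protected $lineLimit = 80;

    // lines longer than this produce an error
    protected $absoluteLineLimit = 100;

    public function check($line)
    {
        $length = strlen($line);
        if ($length > $this->absoluteLineLimit) {
            return 'error';
        }
        if ($length > $this->lineLimit) {
            return 'warning';
        }
        return 'ok';
    }
}

// the subclass inherits check() and only redefines one property
class ProjectLineLengthCheck extends BaseLineLengthCheck
{
    protected $absoluteLineLimit = 120;
}

// a line of 110 characters sits between the two absolute limits
$line = str_repeat('x', 110);
$base = new BaseLineLengthCheck();
$project = new ProjectLineLengthCheck();

echo $base->check($line) . "\n";
echo $project->check($line) . "\n";
```

The 110-character line is an error for the base class but only a warning for the subclass, because the inherited check() method reads the redefined limit of 120.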
Even though PHP_CodeSniffer is available, there is no guarantee that individual developers will actually take advantage of it. However, in a team environment, the lead developer can take several steps to make sure the team members adhere to the chosen common standard. First, the code base should be scheduled for an automated check once a day during active development. A simple (insert your favorite scheduler utility here) job can process all the source files and send an email to everybody on the team.
However, it is possible to take things a step further. Assuming that you are using a source code control system, most of these systems provide hooks at various stages of checking out or committing source code. The most commonly used hook is the pre-commit hook. In other words, the source code control system executes any number of user-configurable steps before committing the code. The outcome of these steps determines whether the user is allowed to commit the code or not. In the case of our coding standard, we can configure the pre-commit hook to run any PHP source files being committed through PHP_CodeSniffer and only proceed if no errors and/or warnings are generated. In essence, this is a way to ensure that your team only accepts contributions from individual developers if they adhere to the team's coding standard.
For a detailed example of how to configure the Subversion source code control system with a PHP_CodeSniffer pre-commit hook, please consult the chapter on source code and version control.
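Independent of any particular source code control system, the decision a pre-commit hook has to make can be sketched as follows. This is a minimal illustration under two assumptions: that phpcs is on the PATH and that it exits with a non-zero status when it reports problems. The names runPhpcs(), commitAllowed(), and fakeChecker() are hypothetical. The stand-in checker exists only so that the example runs without PHP_CodeSniffer installed.

```php
<?php
// Sketch of the decision a pre-commit hook has to make.
// Assumptions: phpcs is on the PATH and exits with a non-zero status when
// it reports problems; commitAllowed() is a hypothetical helper name.
function runPhpcs($file)
{
    exec('phpcs ' . escapeshellarg($file), $output, $exitCode);
    return $exitCode === 0;
}

function commitAllowed(array $files, $checker = 'runPhpcs')
{
    foreach ($files as $file) {
        // only PHP source files are subject to the coding standard
        if (strtolower(pathinfo($file, PATHINFO_EXTENSION)) !== 'php') {
            continue;
        }
        // a single failing file is enough to reject the commit
        if (!call_user_func($checker, $file)) {
            return false;
        }
    }
    return true;
}

// stand-in checker so the example runs without phpcs installed:
// it rejects any file whose name contains "Broken"
function fakeChecker($file)
{
    return strpos($file, 'Broken') === false;
}

var_dump(commitAllowed(array('README.txt', 'Good.php'), 'fakeChecker'));
var_dump(commitAllowed(array('Good.php', 'Broken.php'), 'fakeChecker'));
```

A real hook script would pass the list of files being committed to commitAllowed() and end with a non-zero exit status when it returns false, which is how hooks typically signal that a commit should be rejected.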