PHP is an extremely versatile language; however, there are times when it is useful to extend the runtime itself by interfacing with native libraries, either to do tasks that are outside PHP’s existing capabilities or to perform a task at speeds that an interpreted language generally cannot reach (e.g., highly optimized numerical/scientific code). Fortunately, the PHP ecosystem has resources in place that make this possible for the suitably intrepid developer. Unfortunately, this is not a particularly well-documented or common field of endeavor. Having recently worked on a project enhancing some PHP extensions, I thought it would be helpful to round up the most useful resources I found while embarking on this journey, and I hope this will help shed light on the process for others who are interested.

Prerequisite knowledge

The ecosystem of PHP extensions is heavily centered around C, so you will need a solid grasp of the language. The canonical reference here would of course be the K&R C book (ISBN: 978-0131103627), but there are many other references and tutorials available. You will also likely need to be very familiar with your development environment’s debugger (see the “GDB and .gdbinit” section below for gdb-specific tips).

Extending and Embedding PHP by Sara Golemon

Sara Golemon’s book (ISBN-13: 978-0672327049) is probably the single best printed resource available for modern (PHP 5+) extension development. This is not to say that it is a book without flaws, but it is something that I found myself referencing quite a bit during a recent PECL project. At seven years old as of this writing, it does miss some of the most recent developments in the PHP internals space (e.g., the mechanisms for calling user-defined PHP functions from C, as you would need to invoke a user-provided callback, changed in 5.3).

Marcus Börger’s Extension Talks

http://talks.somabo.de/ contains an archive of various PHP conference talks given by Marcus Börger; the extension-related ones are good summaries of the topics at hand and may serve as a useful adjunct to the other resources listed here. In particular, his coverage of PHP 5 object/class support from within a C extension is a useful supplement to the coverage in Extending and Embedding PHP.

Kristina Chodorow’s PHP Lovecraftian Blog Posts

http://www.kchodorow.com/blog/2011/08/11/php-extensions-made-eldrich-installing-php/ is the first article in a series of several blog posts that introduce writing PHP extensions; all of them are very worthwhile reads. Heed her advice, particularly in the first post, on how to set up PHP for extension development and debugging, because it will make your life much easier. Honestly, it is worth spinning up a VM to isolate all this from your regular environments for this reason alone. (VirtualBox, Vagrant, etc., are your friends here, but I imagine that as a working developer in 2013 you are probably already familiar with using VMs in some fashion.)

LXR: sometimes the source is the only documentation you’ll have

Sadly, there will be parts of the PHP extension process that are completely undocumented in any form. When you run into something like this, the best resource is http://lxr.php.net, which lets you conveniently read the source of PHP itself. In particular, you will be making heavy use of sometimes opaque C macros (e.g., MAKE_STD_ZVAL), and LXR is the best place to turn to find out what they actually do behind the scenes.
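
To give a feel for what such a macro tends to hide, here is a toy sketch in plain C. It assumes nothing from the Zend headers: toy_val, TOY_MAKE_STD, and toy_new_long are hypothetical names invented for illustration, and the real definition of MAKE_STD_ZVAL should be read in the PHP source for your version. The point is only that one innocuous-looking macro call can both allocate a value and initialize its bookkeeping fields.

```c
#include <stdlib.h>

/* Toy stand-in for a reference-counted value.
 * This is NOT the real zval layout; it exists only for illustration. */
typedef struct {
    long lval;              /* payload */
    unsigned int refcount;  /* how many owners the value has */
    unsigned char is_ref;   /* whether it is a PHP-style reference */
} toy_val;

/* Hypothetical macro in the spirit of MAKE_STD_ZVAL: allocate the value,
 * then set its bookkeeping fields to their "fresh value" defaults. */
#define TOY_MAKE_STD(v)                        \
    do {                                       \
        (v) = malloc(sizeof(toy_val));         \
        if ((v) != NULL) {                     \
            (v)->refcount = 1;                 \
            (v)->is_ref = 0;                   \
        }                                      \
    } while (0)

/* Wrapping the macro in a function gives the debugger a frame to step into. */
static toy_val *toy_new_long(long n)
{
    toy_val *v;

    TOY_MAKE_STD(v);
    if (v != NULL) {
        v->lval = n;
    }
    return v;
}
```

A caller would write something like toy_val *v = toy_new_long(42); and later release it with free(v). Reading the real macro definitions in the same spirit usually answers the question of who allocates and who initializes.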

GDB and .gdbinit

If you spend time writing PHP extension code, it is a virtual certainty that you will spend time using gdb to understand what’s happening inside the PHP interpreter and your extension code. If you are not familiar with gdb, there are a large number of tutorials available, and I wouldn’t hazard a guess as to the best one. That said, what I can highly recommend is the use (or inclusion, if you already have a .gdbinit of your own) of: https://github.com/php/php-src/blob/master/.gdbinit

If you put that file in your home directory as .gdbinit, then the next time you start gdb you’ll have all manner of extremely handy debugging macros at your fingertips. “printzv” in particular, which displays a ZVAL, will save you from having to type things like “print *((zval *)(*((*((*((zval *)(*(thing_ctx*)foo)->extended_value)).value.ht)).pListHead)).pDataPtr)”.

Be wary of Zend’s built-in data structures

Just a general warning: it might seem like the path of least resistance to use the pre-existing Zend data structures to pass data around within the domain of your extension (“meh, it’s a hashtable, beats having to write one of my own. . .”). In my experience, however, this way lies more headaches than you would think (e.g., add_assoc_* will NUL-terminate your keys behind the scenes, which may or may not be what you want). It’s easier in the long run to route around this for your own internal C-land use and rely on more transparent mechanisms for data storage (i.e., use your own structs/arrays/pointers/hashtables/linked lists/etc. and don’t lean too heavily on Zend’s), converting back into whatever type of ZVAL is appropriate only when needed to give values back to the PHP side of the fence.
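
As a rough sketch of what “your own structures” could look like, the following plain C keeps keys with an explicit length in a small linked list, so keys containing embedded NUL bytes stay distinct. All names here (kv_entry, kv_list, kv_put, kv_get, kv_free) are hypothetical and not part of any Zend API, and a real extension would likely want something faster than a linear scan.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical internal record: the key carries an explicit length,
 * so it is never assumed to be NUL-terminated. */
typedef struct kv_entry {
    char *key;
    size_t key_len;
    long value;
    struct kv_entry *next;
} kv_entry;

typedef struct {
    kv_entry *head;
    size_t count;
} kv_list;

/* Linear search comparing length first, then raw bytes. */
static kv_entry *kv_find(const kv_list *list, const char *key, size_t key_len)
{
    kv_entry *e;

    for (e = list->head; e != NULL; e = e->next) {
        if (e->key_len == key_len && memcmp(e->key, key, key_len) == 0) {
            return e;
        }
    }
    return NULL;
}

/* Insert or overwrite; returns 0 on success, -1 if allocation fails. */
static int kv_put(kv_list *list, const char *key, size_t key_len, long value)
{
    kv_entry *e = kv_find(list, key, key_len);

    if (e != NULL) {
        e->value = value;
        return 0;
    }
    e = malloc(sizeof(*e));
    if (e == NULL) {
        return -1;
    }
    e->key = malloc(key_len ? key_len : 1); /* avoid a zero-size malloc */
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->key, key, key_len);
    e->key_len = key_len;
    e->value = value;
    e->next = list->head;
    list->head = e;
    list->count++;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static const long *kv_get(const kv_list *list, const char *key, size_t key_len)
{
    const kv_entry *e = kv_find(list, key, key_len);

    return e != NULL ? &e->value : NULL;
}

/* Releases every entry and resets the list to empty. */
static void kv_free(kv_list *list)
{
    kv_entry *e = list->head;

    while (e != NULL) {
        kv_entry *next = e->next;
        free(e->key);
        free(e);
        e = next;
    }
    list->head = NULL;
    list->count = 0;
}
```

With this in place, the three-byte keys { 'a', '\0', 'x' } and { 'a', '\0', 'y' } are stored as two separate entries, and nothing is copied into a ZVAL until a result actually has to cross back into PHP.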

If you do spend any significant amount of time interacting with HashTable ZVALs in particular, you will probably want to write utility functions along the lines of add_assoc_* for the other HashTable operations (get current key/value, etc.), simply for the sake of saving keystrokes. You will likely need to spend time getting familiar with HashTables, as they are the underlying data structure for all arrays in PHP, both numerically indexed and associative, and are common return values.

Testing

PHPT tests are by far the easiest to distribute with your PECL extension, as they integrate natively with “make test”. http://qa.php.net/write-test.php gives an overview of the file format and suggestions for how to write good ones.

C++

As mentioned earlier, the PHP extension ecosystem is heavily centered around C. There is some, but not much, information available online for interfacing PHP and C++. “Wrapping C++ classes in a PHP extension” by Paul Osman: http://devzone.zend.com/1435/wrapping-c-classes-in-a-php-extension/ appears to be one of the better resources.

PECL

Once you’ve gone through the trouble of writing and debugging your extension, you may want to share the fruits of your labor with the rest of the world. PECL (http://pecl.php.net/) is the extension repository companion to PEAR, which you may already be familiar with in the context of pure PHP packages, and would be a good place to look for information on how best to do this.

Good luck and Godspeed

Writing a PHP extension can be intensely frustrating and confusing at times, but it’s magical once you’ve gotten things working for the first time. Becoming familiar with the internals of the language can also give you a much better understanding of how PHP works, which is a worthy goal in its own right. I hope the resources above help get you through the rough patches along the way.