How to create PHP extensions – part II: Introduction to PHP variables

In one of my previous articles I explained how to build a PHP extension, and how to declare a simple function in C. I demonstrated how to return data from the C function to the script in PHP. But I have not explained how to pass the arguments to the function, nor how to change the values of existing PHP variables.

In order to do so, I have to introduce you to the internal ZVAL data structure used by the PHP engine to store PHP variables.

What should you know about weak data types?

One of the most serious drawbacks of PHP, according to developers who use other, “enterprise” programming languages, is the lack of strong data typing. A weakly typed language does not necessarily enforce types, and it usually converts variables to the appropriate type automatically.

For some, this is a drawback; for others (like me) it’s an advantage. When writing PHP extensions, however, it is… a nuisance.

The C language (as opposed to PHP) uses strong data types, which is why PHP variables must be stored in a special ZVAL data structure, and why operations on these variables must be performed through a set of dedicated functions and macros provided by the PHP engine.

ZVAL variables are represented by the following C structure:

struct _zval_struct {
    /* Variable information */
    zvalue_value value; /* value of the PHP variable (C union) */
    zend_uint refcount; /* reference counter of the value associated with this variable */
    zend_uchar type; /* current type of this PHP variable */
    zend_uchar is_ref; /* 1 if this value is a PHP reference (bound with &), 0 otherwise */
};

…where zvalue_value is represented by the following C union:

typedef union _zvalue_value {
    long lval; /* long value */
    double dval; /* double value */
    struct {
        char *val; /* string value  */
        int len; /* length of the string (because it must be binary safe) */
    } str; /* string value */
    HashTable *ht; /* hashtable value  */
    zend_object_value obj; /* handle to an object  */
} zvalue_value;

Looks scary? Fortunately it isn’t! In everyday programming, you do not have to worry about these structures at all; most low-level operations on them are hidden behind various built-in macros and functions provided by the PHP engine itself.

Description of _zval_struct structure

For your information, here is a description of all fields in the above data structure:

  1. _zval_struct.value – stores the current value of the variable. This is a union of all the possible base types for a variable in PHP: integers, doubles, strings, arrays, and object handles. PHP resources have no member of their own; their handle is kept in the integer member (read the description of the zvalue_value union below for further info).
  2. _zval_struct.refcount – stores the reference counter for the value associated with this variable.

    As you may already know, if you create a copy of a PHP variable, the zval structure containing its value has its reference count incremented, rather than its value being physically copied into a second structure:

    $firstVar = "test";
    // now the zval of $firstVar has refcount = 1 and is_ref = 0
    $secondVar = $firstVar;
    // now $firstVar and $secondVar share one zval
    // with refcount = 2 and is_ref = 0
    $secondVar = "test2";
    // now $secondVar has its own zval with refcount = 1 and is_ref = 0,
    // and the zval of $firstVar is back to refcount = 1 and is_ref = 0

    As you can see, any newly created PHP variable has its refcount set to 1, which is then increased by one on every “copy” and decreased by one on every unset or out-of-scope event.

    The refcount is also decreased when it is larger than 1 and one of the variables sharing the value is assigned something different: that variable gets a zval of its own (copy on write).

    When the refcount reaches 0, the engine automatically frees the value from memory.

  3. _zval_struct.type – stores the current type of the PHP variable (whether it is a string, object, array, etc). This flag indicates which part of the zvalue_value union should be looked at for its value (for example, in the case of strings PHP will operate on the value stored in zvalue_value.str.val, while for float/double PHP numbers it will use zvalue_value.dval).
  4. _zval_struct.is_ref – indicates whether this value is a PHP reference, that is, whether variables were bound to it with the & operator (as in $b = &$a), so that a write through any of them is seen by all of them. A plain copy such as $b = $a does not set this flag; it only increases the refcount (see the refcount property above).

    0 = this is an ordinary value (possibly shared by several variables until one of them is written to),
    1 = this value is a reference shared by all variables bound to it.
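
To make the refcount rules above more concrete, here is a small stand-alone sketch in plain C. It is only a toy model and does not use the PHP headers: the names toy_zval, toy_new, toy_share, toy_assign and toy_release are made up for this illustration, and the real engine handles many more cases (types, references, garbage collection of cycles).

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a zval holding a string: NOT the real engine structure. */
typedef struct toy_zval {
    char *str;             /* owned copy of the string value */
    unsigned int refcount; /* how many variables point at this value */
} toy_zval;

/* Duplicate a C string with malloc. */
static char *toy_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* $var = "text";  creates a fresh value with refcount = 1. */
toy_zval *toy_new(const char *text)
{
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) {
        return NULL;
    }
    z->str = toy_strdup(text);
    if (z->str == NULL) {
        free(z);
        return NULL;
    }
    z->refcount = 1;
    return z;
}

/* $b = $a;  nothing is copied, both variables share one value. */
toy_zval *toy_share(toy_zval *z)
{
    z->refcount++;
    return z;
}

/* unset($var) or end of scope: drop one reference, free at zero. */
void toy_release(toy_zval *z)
{
    if (--z->refcount == 0) {
        free(z->str);
        free(z);
    }
}

/* $var = "other";  detach from the shared value and start a new one. */
toy_zval *toy_assign(toy_zval *old, const char *text)
{
    toy_zval *fresh = toy_new(text);
    if (fresh == NULL) {
        return old; /* keep the old value if allocation failed */
    }
    toy_release(old);
    return fresh;
}
```

Replaying the PHP snippet from the list with these helpers (toy_new("test"), then toy_share(), then toy_assign() with "test2" on the second variable) takes the first value from refcount 1 to 2 and back to 1, while the second variable ends up with its own value at refcount 1.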

Description of zvalue_value union

As I mentioned before, the zvalue_value union is where the data for a ZVAL is actually stored:

  1. zvalue_value.lval – stores the long integer representation of this PHP variable.
  2. zvalue_value.dval – stores the double representation of this PHP variable.
  3. zvalue_value.str.val – stores the string representation of this PHP variable.

    This string is binary safe, which means that it may contain NUL bytes inside (so it cannot be safely processed by C functions like strlen() or strdup()).

  4. zvalue_value.str.len – stores the length of the string value.

    It is a very important value for PHP, because it is used in binary-safe operations, where C functions like strlen() would stop at the first NUL byte inside the PHP string.

  5. zvalue_value.ht – stores the hashtable / array representation of this PHP variable.
  6. zvalue_value.obj – stores a handle to the object representation of this PHP variable.
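
The point about binary safety is easy to see in a few lines of plain C. The sketch below does not use the PHP headers; toy_string, toy_c_length and toy_equals are hypothetical names chosen for this example, and the structure only mimics the val/len pair described above.

```c
#include <string.h>

/* Toy stand-in for the str member of zvalue_value (not the engine type). */
typedef struct toy_string {
    const char *val; /* bytes, may contain embedded NUL bytes */
    int len;         /* number of bytes, stored separately */
} toy_string;

/* Length as a plain C function sees it: stops at the first NUL byte. */
size_t toy_c_length(const toy_string *s)
{
    return strlen(s->val);
}

/* Binary-safe comparison: compares exactly len bytes, NULs included. */
int toy_equals(const toy_string *a, const toy_string *b)
{
    return a->len == b->len
        && memcmp(a->val, b->val, (size_t)a->len) == 0;
}
```

For a five-byte value such as "ab\0cd", toy_c_length() reports 2 while len is 5, and two values that differ only after the NUL byte look identical to strcmp() but not to toy_equals(). This is why the length has to travel together with the pointer.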

How to create a PHP variable in a PHP extension?

When you want to create a variable that will be manipulated within PHP, that variable needs to be a ZVAL structure. In order to do so, you must declare it and allocate it with a built-in macro, as in the following example:

zval *variable;
MAKE_STD_ZVAL(variable); /* allocates the zval and sets refcount = 1, is_ref = 0 */

After the ZVAL variable has been created, you can assign a value to it. For scalar data types like booleans, numbers, and strings you can use the following set of PHP macros:

ZVAL_NULL(zval *variable)
ZVAL_BOOL(zval *variable, zend_bool value)
ZVAL_LONG(zval *variable, long value)
ZVAL_DOUBLE(zval *variable, double value)
ZVAL_EMPTY_STRING(zval *variable)
ZVAL_STRING(zval *variable, char *string, int copy)
ZVAL_STRINGL(zval *variable, char *string, int length, int copy)

As you may have already noticed, there are no macros to create arrays, objects, or PHP resources (I will describe the last two data types in another article). In order to create an array, you have to use the array_init(variable) function, like so:

zval *array_variable;
MAKE_STD_ZVAL(array_variable);
array_init(array_variable);

How to convert between different PHP data types?

Because zvalue_value is a union, only a single representation of its value is valid at any one time. Fortunately, the PHP engine provides a set of built-in functions to convert any ZVAL structure to a different data type:

convert_to_null(zval *variable);
convert_to_boolean(zval *variable);
convert_to_long(zval *variable);
convert_to_double(zval *variable);
convert_to_string(zval *variable);
convert_to_array(zval *variable);
convert_to_object(zval *variable);

Here is an example:

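// ... PHP extension's framework goes here...
// ...
// ... PHP extension's framework ends here...
    zval *test;
    long number = 10;
    MAKE_STD_ZVAL(test);      /* allocate the zval before using it */
    ZVAL_LONG(test, number);  /* test holds the integer 10 */
    convert_to_string(test);  /* test now holds the string "10" */
    RETURN_ZVAL(test, 1, 1);  /* copy into return_value, then release test */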
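
If you wonder what such a conversion has to do internally, the following stand-alone sketch may help. It is a toy model in plain C, not the engine code: tagged_value, the TAG_* constants and tagged_convert_to_long are names invented for this example, and the real conversion rules of PHP are considerably richer. The idea is simply that the value is rewritten in place and the type tag is updated to match.

```c
#include <stdlib.h>

/* Toy type tags, loosely modelled on the type field described earlier. */
enum tagged_type { TAG_LONG, TAG_DOUBLE, TAG_STRING };

/* Toy value: a union plus a tag saying which member is currently valid. */
typedef struct tagged_value {
    union {
        long lval;
        double dval;
        const char *sval; /* NUL-terminated here, for simplicity */
    } value;
    enum tagged_type type;
} tagged_value;

/* Rewrite the value in place as a long and update the type tag. */
void tagged_convert_to_long(tagged_value *v)
{
    long result;

    switch (v->type) {
    case TAG_DOUBLE:
        result = (long)v->value.dval;             /* truncates toward zero */
        break;
    case TAG_STRING:
        result = strtol(v->value.sval, NULL, 10); /* leading digits only */
        break;
    default:
        return;                                   /* already a long */
    }
    v->value.lval = result;
    v->type = TAG_LONG;
}
```

After the call only the lval member may be read; the old dval or sval bytes have been overwritten, which is exactly why the type tag and the union must always be changed together.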

How to convert a ZVAL to a C data type?

To access the value of a PHP variable, you can use the following macros:

Z_BVAL(boolean_zval_variable)
Z_LVAL(long_zval_variable)
Z_DVAL(double_zval_variable)
Z_STRVAL(string_zval_variable)
Z_STRLEN(string_zval_variable)
Z_ARRVAL(array_zval_variable)
Z_RESVAL(resource_variable)

Additionally, there are two other sets of similar macros. They are named identically to the ones above, but with an appended _P or _PP (for instance: Z_STRVAL_P or Z_STRVAL_PP). The difference is the argument they accept: the plain macros take a zval, the _P macros take a zval * pointer, and the _PP macros take a zval ** pointer.

I do not describe these macros in detail because I assume that their definitions are self-explanatory.
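
Still, the layering of the three macro families is worth a quick look. The sketch below imitates it in plain C without the PHP headers; toy_val and the TOY_LVAL* macros are hypothetical names for this illustration, and the toy structure carries only a long member.

```c
/* Toy value type; only the long member is modelled here. */
typedef struct toy_val {
    long lval;
} toy_val;

/* Each level adds one dereference on top of the previous macro. */
#define TOY_LVAL(z)      ((z).lval)        /* takes a toy_val    */
#define TOY_LVAL_P(zp)   TOY_LVAL(*(zp))   /* takes a toy_val *  */
#define TOY_LVAL_PP(zpp) TOY_LVAL(**(zpp)) /* takes a toy_val ** */

/* Write through the double pointer, read back through the other two. */
long toy_demo(void)
{
    toy_val z = { 10 };
    toy_val *zp = &z;
    toy_val **zpp = &zp;

    /* The macros expand to lvalues, so they can be assigned to. */
    TOY_LVAL_PP(zpp) = 25;

    /* Both reads see the same storage: 25 + 25. */
    return TOY_LVAL(z) + TOY_LVAL_P(zp);
}
```

Because all three macros resolve to the same member, picking the right one is only a matter of matching the suffix to the kind of pointer you hold.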

How to check the ZVAL’s current data type?

You can use the Z_TYPE_P() macro to check the ZVAL’s current type, like so:

// ... PHP extension's framework goes here...
// ...
// ... PHP extension's framework ends here...
    zval *test;
    long number = 10;
    MAKE_STD_ZVAL(test);      /* allocate the zval before using it */
    ZVAL_LONG(test, number);
    if (Z_TYPE_P(test) == IS_LONG) {
        php_printf("test variable is a long data type");
    }
    zval_ptr_dtor(&test);     /* release the zval when done */

Possible data types include:

IS_NULL
IS_BOOL
IS_LONG
IS_DOUBLE
IS_STRING
IS_ARRAY
IS_OBJECT
IS_RESOURCE

How to allocate memory in a PHP extension?

Occasionally, you might need to allocate memory that is needed to process various operations in your PHP extension. To do this, there is a list of built-in PHP functions:

void *emalloc(size_t size)
void *pemalloc(size_t size, int persistent)
void efree(void *ptr)
void pefree(void *ptr, int persistent)
void *erealloc(void *ptr, size_t size)
void *perealloc(void *ptr, size_t size, int persistent)
char *estrdup(const char *str)
char *estrndup(const char *str, unsigned int length)
char *pestrdup(const char *str, int persistent)

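It is important to use these functions (instead of malloc, free, etc.), because they go through the engine’s memory manager, which releases all non-persistent allocations at the end of every request. Memory allocated with the persistent flag set survives the request and must be freed explicitly.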
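
The idea of request-scoped memory can be sketched in plain C as follows. This is only an illustration of the principle, not how the engine's memory manager is actually implemented: toy_emalloc, toy_request_shutdown and the tracking list are names and design choices made up for this example, and there is no counterpart of efree() in it.

```c
#include <stdlib.h>

/* One tracked allocation made during the current request. */
typedef struct toy_node {
    void *ptr;
    struct toy_node *next;
} toy_node;

static toy_node *toy_request_allocs = NULL; /* allocations of this request */

/* Allocate memory that is released automatically at request shutdown. */
void *toy_emalloc(size_t size)
{
    toy_node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->ptr = malloc(size);
    if (n->ptr == NULL) {
        free(n);
        return NULL;
    }
    n->next = toy_request_allocs;
    toy_request_allocs = n;
    return n->ptr;
}

/* Free everything allocated during the request; returns the block count. */
int toy_request_shutdown(void)
{
    int freed = 0;

    while (toy_request_allocs != NULL) {
        toy_node *next = toy_request_allocs->next;
        free(toy_request_allocs->ptr);
        free(toy_request_allocs);
        toy_request_allocs = next;
        freed++;
    }
    return freed;
}
```

Code that allocates through toy_emalloc() cannot leak past the end of the request even if it forgets to clean up, which is the safety net that plain malloc() does not give you.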

What’s next?

In the next articles I will give you the (still incomplete) documentation of some useful PHP macros, and demonstrate how to extract the PHP arguments passed to C functions, how to use PHP variables, and how to return their values back to the PHP script.

