Input validation with filter functions

Although PHP has a lot of filter functions available, I found that still too many people are using (often incorrect) regular expressions to validate user input. The filter extension is simple, available by default and covers the most common validations. Below are some practical examples and things to consider when working with the PHP filter functions.

Which functions are available?
Below is a shameless copy-paste from the PHP documentation.

  • filter_has_var — Checks if variable of specified type exists
  • filter_id — Returns the filter ID belonging to a named filter
  • filter_input_array — Gets external variables and optionally filters them
  • filter_input — Gets a specific external variable by name and optionally filters it
  • filter_list — Returns a list of all supported filters
  • filter_var_array — Gets multiple variables and optionally filters them
  • filter_var — Filters a variable with a specified filter
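As a quick illustration of the two lookup helpers in this list, the sketch below asks the extension which filters it knows and resolves a filter name to its ID. The variable names are my own, and I am assuming a standard build where the 'int' and 'validate_email' filters are present.

```php
<?php
// List every filter name the extension supports on this installation.
$aFilters = filter_list();

// Resolve filter names to the IDs that filter_var() expects.
$iIntId   = filter_id('int');
$iEmailId = filter_id('validate_email');

var_dump(in_array('validate_email', $aFilters, true)); // bool(true)
var_dump($iIntId === FILTER_VALIDATE_INT);             // bool(true)
var_dump($iEmailId === FILTER_VALIDATE_EMAIL);         // bool(true)
```

This is mostly useful when filter names come from configuration rather than from constants in code.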

Practical use

“Filter input, escape output”: every developer knows this, but it is a repetitive job. With the filter extension, filtering input has become a lot easier. When you filter input correctly you drastically lower the chance of application vulnerabilities.

Sanitizing a single variable

    $sText = 'This is a comment from a <script type="text/javascript">alert("scriptkiddie");</script>';
    $sText = filter_var($sText, FILTER_SANITIZE_STRING); // strips the tags and encodes the quotes
    echo $sText; // This is a comment from a alert(&#34;scriptkiddie&#34;);
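One caveat for readers on current PHP: FILTER_SANITIZE_STRING is deprecated as of PHP 8.1. A minimal sketch of the "escape output" half of the mantra, using htmlspecialchars() instead of stripping tags, could look like this. The variable names and the UTF-8 charset are my assumptions.

```php
<?php
// Hypothetical user comment containing markup.
$sText = 'This is a comment from a <script>alert("scriptkiddie");</script>';

// Escape rather than strip: the markup survives as harmless text.
$sSafe = htmlspecialchars($sText, ENT_QUOTES, 'UTF-8');

echo $sSafe, PHP_EOL;
// This is a comment from a &lt;script&gt;alert(&quot;scriptkiddie&quot;);&lt;/script&gt;
```

Unlike the sanitize filter, this keeps everything the user typed, which is usually what you want for comments.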

Sanitizing multiple variables works on the same principle as above, but with an array: the filter sanitizes all values inside the array.

    $aClean = filter_var_array($_POST, FILTER_SANITIZE_STRING); // returns a sanitized copy; $_POST itself is unchanged

Validating an email address

    if (filter_var($sEmail, FILTER_VALIDATE_EMAIL) === false) {
        $this->addError('Invalid email address', $sEmail);
    }
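The snippet above lives inside a class with an addError() method. As a standalone sketch of the same check, the hypothetical helper below (the name validEmailOrNull and the PHP 7.1+ type hints are mine) returns the address when it validates and null otherwise.

```php
<?php
/**
 * Hypothetical helper: returns the address when it passes
 * FILTER_VALIDATE_EMAIL, or null when it does not.
 */
function validEmailOrNull(string $sEmail): ?string
{
    $mResult = filter_var($sEmail, FILTER_VALIDATE_EMAIL);
    return $mResult === false ? null : $mResult;
}

var_dump(validEmailOrNull('student@example.com')); // string(19) "student@example.com"
var_dump(validEmailOrNull('not-an-email'));        // NULL
```

Note the strict comparison with false: filter_var() returns the filtered value on success, so a loose check could misjudge values such as '0' with other filters.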

Validating a complete array
Validating all your data at once with a single filter definition keeps your code clear, keeps the rules in one place and makes them easier to maintain. An example is shown below.

    $aData = array(
        'student' => 'Sjoerd Maessen',
        'class'   => '21',
        'grades'  => array(
            'math'       => 9,
            'geography'  => 66,
            'gymnastics' => 7.5
        )
    );

    // One filter definition per key; 'grades' is filtered element by element.
    $aValidation = array(
        'student' => FILTER_SANITIZE_STRING,
        'class'   => FILTER_VALIDATE_INT,
        'grades'  => array(
            'filter'  => FILTER_VALIDATE_INT,
            'flags'   => FILTER_FORCE_ARRAY,
            'options' => array('min_range' => 0, 'max_range' => 10)
        )
    );

    echo '<pre>';
    var_dump(filter_var_array($aData, $aValidation));

    /*array(3) {
      ["student"]=>
      string(14) "Sjoerd Maessen"
      ["class"]=>
      int(21) // That's strange, my string is converted
      ["grades"]=>
      array(3) {
        ["math"]=>
        int(9)
        ["geography"]=>
        bool(false) // 66 is > 10
        ["gymnastics"]=>
        bool(false) // 7.5 is not an int
      }
    }*/

Note: I did not expect the string ’21′ to validate against FILTER_VALIDATE_INT and come back as an integer. After some more testing I also noticed that min_range and max_range only work with FILTER_VALIDATE_INT; with the float filter or other scalar values the options were silently ignored in the version I tested, so be aware!
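To make the note concrete, here is a small sketch. The helper validateFloatRange() is hypothetical and checks the range by hand, so it does not rely on the float filter honouring min_range and max_range. As far as I can tell from the manual's changelog, PHP 7.4 and later do accept those options for FILTER_VALIDATE_FLOAT, so treat the manual check as the version-independent fallback.

```php
<?php
/**
 * Hypothetical helper: validates a float and checks the range by hand.
 * Returns the float on success, false otherwise.
 */
function validateFloatRange($mValue, float $fMin, float $fMax)
{
    $mFloat = filter_var($mValue, FILTER_VALIDATE_FLOAT);
    if ($mFloat === false || $mFloat < $fMin || $mFloat > $fMax) {
        return false;
    }
    return $mFloat;
}

var_dump(filter_var('21', FILTER_VALIDATE_INT)); // int(21): numeric strings are accepted
var_dump(validateFloatRange(7.5, 0, 10));        // float(7.5)
var_dump(validateFloatRange(66, 0, 10));         // bool(false): out of range
var_dump(validateFloatRange('abc', 0, 10));      // bool(false): not a number
```

With a helper like this, the 'gymnastics' grade of 7.5 from the example above would pass while 66 would still be rejected.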

The sanitizing examples above can easily be made more restrictive by adding flags to the sanitize filter. FILTER_FLAG_STRIP_LOW, for example, strips all characters with a numerical value below 32.
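A minimal sketch of FILTER_FLAG_STRIP_LOW in action follows. I use FILTER_UNSAFE_RAW here so that the flag is the only thing changing the string; the input value is made up for the example.

```php
<?php
// Input with a NUL byte and a tab, both below ASCII 32.
$sRaw = "Sjoerd\x00 Maes\tsen";

// FILTER_UNSAFE_RAW leaves the string alone except for the flags we pass.
$sClean = filter_var($sRaw, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

var_dump($sClean); // string(14) "Sjoerd Maessen"
```

Keep in mind that newlines and tabs are also below 32, so this flag is a poor fit for multi-line text fields.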

Things to consider
Although the filter functions have been available for some time, some of them aren’t flawless, and in places the documentation is missing or very unclear. One example is the filter_var validation for IPv6 addresses (see bug report #50117). So it is always a good idea to check that a filter really does what you expect it to do: write test cases before relying on it. If you use the extension correctly you can write your validations in the blink of an eye, and it will be your new best friend.
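As a sketch of what "write test cases first" can look like without any test framework, the snippet below runs a small expectation table against the IPv6 filter. The table contents and variable names are my own; extend them with the addresses your application actually has to accept.

```php
<?php
// Hypothetical expectation table: input => should it validate as IPv6?
$aCases = array(
    '::1'         => true,
    '2001:db8::1' => true,
    '127.0.0.1'   => false, // valid IPv4, but not IPv6
    'not-an-ip'   => false,
);

// Collect every input where the filter disagrees with the expectation.
$aFailures = array();
foreach ($aCases as $sInput => $bExpected) {
    $bActual = filter_var($sInput, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;
    if ($bActual !== $bExpected) {
        $aFailures[] = $sInput;
    }
}

echo empty($aFailures) ? 'all checks passed' : 'unexpected: ' . implode(', ', $aFailures), PHP_EOL;
```

Running a table like this on every PHP version you deploy to is a cheap way to catch filter behaviour that differs between releases.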

June 3rd, 2010 at 8:27 am

Posted in Security

  1. These filter functions are definitely a step forward! But they should not be misused.

    Don’t use filters to check whether URLs are valid, and don’t use them to validate e-mail addresses. Those entities just shouldn’t be validated that harshly; it’s pointless.

    But as you already pointed out correctly, it’s odd that FILTER_VALIDATE_INT also allows numbers as strings. Maybe strict type checking is out of the scope of the filter function set: values from $_GET are mostly strings, so if FILTER_VALIDATE_INT only allowed real integers, $_GET variables would fail this validation, and that would be odd too…

    iwan Luijks

    7 Jun 10 at 10:47 pm

  2. Beware that the filters are not bug-free. For example, FILTER_VALIDATE_URL has a huge bug (it says “-” is not a valid character)…


    14 Jun 10 at 8:52 am

