Hi Padraic,
This looks good. I think it will be faster than what we are currently
using in-house.
Your exception message on line 59 seems wrong -- the check appears to
allow null, just not an empty string.
I'm curious why you are using sprintf; simple concatenation seems like
it would work here. I'm also curious why you are using substr in that
manner instead of str_pad.
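To illustrate what I mean, here is a made-up example rather than your actual code. I'm assuming the goal is to zero-pad a hex value to a fixed width and prefix it; the variable names and the `\x` prefix are only for illustration.

```php
<?php
// Hypothetical input: the hex form of a character's ordinal value.
$ord = 10;                        // assumed value, e.g. ord("\n")
$hex = strtoupper(dechex($ord));  // "A"

// substr-style padding: prepend zeros, then keep the last two characters.
$viaSubstr = substr('00' . $hex, -2);

// str_pad states the intent directly.
$viaStrPad = str_pad($hex, 2, '0', STR_PAD_LEFT);

// Plain concatenation in place of something like sprintf('\\x%s', $hex).
$escaped = '\\x' . $viaStrPad;
```

One difference to keep in mind: the substr form truncates a value longer than two digits, while str_pad leaves it untouched, so the two are only interchangeable if the input is known to fit.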
For detecting PCRE unicode support I found this, but there were some
reports of it not working 100% of the time:
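<?php
// preg_match() returns false (and raises a warning, hence the @) when the
// pattern cannot be compiled, e.g. if PCRE lacks Unicode support.
if (@preg_match('/\pL/u', 'a') === 1) {
    echo "PCRE unicode support is turned on.\n";
} else {
    echo "PCRE unicode support is turned off.\n";
}
?>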
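If you want an exception rather than an echo, the same check could be wrapped up along these lines. This is only a sketch: the function names and the choice of RuntimeException are my own assumptions, not anything from the RFC, and it inherits the same reliability caveat as the snippet above.

```php
<?php
/**
 * Hypothetical helper: returns true when the '/\pL/u' probe compiles and
 * matches, which suggests PCRE was built with Unicode support.
 */
function pcreHasUnicodeSupport()
{
    // Without that support the pattern fails to compile and preg_match()
    // returns false; the @ suppresses the accompanying warning.
    return @preg_match('/\pL/u', 'a') === 1;
}

/**
 * Hypothetical guard: throws instead of letting a later regex fail
 * with a less obvious compile error.
 */
function assertPcreUnicodeSupport()
{
    if (!pcreHasUnicodeSupport()) {
        throw new RuntimeException(
            'PCRE appears to have been compiled without Unicode support.'
        );
    }
}
```

Calling assertPcreUnicodeSupport() once, for example in the constructor, would surface the problem early instead of at the first escaping call.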
I will be interested to see how you tweak the regexes.
Stew
On 3/12/12 4:47 AM, Pádraic Brady wrote:
> Hi all,
>
> Escaper RFC:
> http://framework.zend.com/wiki/display/ZFDEV2/RFC+-+Escaper+Class
>
> I've implemented a minimal Zend\Escaper\Escaper class in line with the
> RFC at https://github.com/padraic/zf2/tree/rfc/escaper. One small task
> remaining for basic escaping is to swap the regexes used with the
> correct hex value ranges (where necessary) so we're not too greedy
> about what needs escaping.
>
> We had a few discussions once the RFC went up as to dependency
> requirements so I've settled on the following pending debate here.
>
> 1. It will require PHP to have access to a PCRE version compiled with
> UTF-8 support.
> 2. For all character encodings except UTF-8, an exception will be
> thrown where there is no access to either iconv or mbstring.
>
> Personally, I'd expect a production server to easily meet both of
> these requirements for PHP 5.3 but there are always exceptions to the
> rule. Rather than attempt to create a fallback position which may or
> may not have implications on the escaping security, I've opted to just
> throw an exception and let users arrange their own backup plan. The
> alternative is to replicate iconv's functionality which will likely be
> a) a huge amount of code, b) prone to errors, and c) very very slow.
> We could also make the component explicitly UTF-8 only however that
> seems a bit silly and we have no means to enforce it.
>
> Besides that, I am curious whether anyone has a good method of
> detecting when PCRE has no unicode support? It would be nice to throw
> an exception for that instead of relying on whatever happens otherwise
> (presumably a pcre compile error or something).
>
> Any other feedback welcome too.
>
> Paddy
>