Ampersand bug in Zend\Uri\Uri::normalize()?


Ampersand bug in Zend\Uri\Uri::normalize()?

demiankatz

Hello,

 

There seems to be a bug in Zend\Uri\Uri::normalize() (as of ZF 2.2.4), but I want to make sure I’m not misunderstanding something before I dig deeper…

 

Here’s the problem behavior I’m seeing:

 

$uri = new \Zend\Uri\Uri();
$uri->setQuery(array('q' => 'this & that'));
echo $uri->normalize()->toString();

 

This outputs “?q=this%20&%20that” which is of course not valid – that ampersand should be encoded, and without encoding, it causes my parameter to get split in half!

 

If I skip the normalize() step, I do get a valid result: “?q=this%20%26%20that”
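To make the "split in half" effect concrete, here is a small sketch that uses only PHP's built-in parse_str() (not Zend\Uri) to show how a receiving script would read the two query strings quoted above. The variable names are just for illustration.

```php
<?php
// Query string as produced after normalize(): the ampersand is literal.
parse_str('q=this%20&%20that', $broken);

// Query string as produced without normalize(): the ampersand is %26.
parse_str('q=this%20%26%20that', $intact);

// The literal "&" is treated as a parameter separator, so "q" is truncated.
var_dump($broken['q']); // string(5) "this "

// With "%26" the value survives the round trip.
var_dump($intact['q']); // string(11) "this & that"
```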

 

The reason I ran into this is that I’m using $this->redirect()->toRoute() in a controller and utilizing the ‘query’ parameter of the $options array…  needless to say, this is causing corruption of my parameters!

 

Would everyone agree that this is a problem?  Any suggestions/insights?  I’ll be happy to spend a little more time digging into this and possibly putting together a PR to fix it, but I would appreciate a little input from the community before I go down the wrong path.

 

thanks,

Demian


Re: Ampersand bug in Zend\Uri\Uri::normalize()?

Shahar Evron-2
On 9/3/13 4:34 PM, Demian Katz wrote:

[...]

Hi,

I agree it's a problem for the common use case, although RFC purists might disagree. According to the URI RFC, part of the normalization process includes decoding any unnecessarily encoded characters. In the query part, an ampersand is allowed literally (and of course it is used extensively). The basic syntax does not include, IIRC, the standard for ampersand-separated form data encoding which is typically used in HTTP, and which requires further encoding of ampersands (and maybe equals signs) when they appear inside values rather than as separators.
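As a rough illustration of that reading (not Zend\Uri's actual code), the sketch below decodes every percent-encoded octet whose character may appear literally in a generic URI query. The helper name decodeAllowedQueryChars() is hypothetical, and the set of "allowed" characters is an assumption based on the generic query syntax (unreserved characters, sub-delimiters, and ":", "@", "/", "?").

```php
<?php
// Hypothetical helper, not part of Zend\Uri: decodes percent-encoded octets
// whose character is allowed to appear literally in a generic URI query.
function decodeAllowedQueryChars(string $query): string
{
    // unreserved + sub-delims + ":" "@" "/" "?"
    $allowed = '/^[A-Za-z0-9\-._~!$&\'()*+,;=:@\/?]$/';

    return preg_replace_callback(
        '/%([0-9A-Fa-f]{2})/',
        function (array $m) use ($allowed): string {
            $char = chr(hexdec($m[1]));
            // Decode if the character is legal as-is; otherwise keep the
            // escape and only uppercase its hex digits.
            return preg_match($allowed, $char) ? $char : '%' . strtoupper($m[1]);
        },
        $query
    );
}

// "%26" is decoded because "&" is legal in a generic query; "%20" is not.
echo decodeAllowedQueryChars('q=this%20%26%20that'), "\n"; // q=this%20&%20that
```

Under that assumption the result is a valid generic URI, but the form-data meaning of the value is lost, which matches the behaviour reported at the start of the thread.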

That said, clearly the component is not behaving as expected. What I can suggest is one of two things:

1. Extend normalize() in Zend\Uri\Http so that it does not decode ampersands
2. Add an optional $flags parameter to normalize() to support this use case.

I am not entirely sure which approach is better.

Shahar.

RE: Ampersand bug in Zend\Uri\Uri::normalize()?

demiankatz

That makes sense – an uglier and more complex situation than I had realized (though I suspected it was something along these lines).

 

I’ve simply worked around the problem by doing my query encoding outside of the framework – that seems easier than potentially getting involved in a philosophical debate about RFCs – but I remain willing to help if a consensus emerges about the best path forward.
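For anyone who wants to do the same, one possible way to encode the query outside the framework is PHP's built-in http_build_query(). This is a sketch only: the thread does not say how the workaround was actually written, and the '/search' path is a made-up placeholder.

```php
<?php
$params = array('q' => 'this & that');

// PHP_QUERY_RFC3986 encodes spaces as %20 (not "+") and "&" as %26.
$query = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

echo '?' . $query, "\n"; // ?q=this%20%26%20that

// Hypothetical target path; append the pre-encoded query by hand so it is
// never passed through normalize().
$url = '/search?' . $query;
```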

 

Regarding your proposed solutions, it seems that the two could be complementary – implement flags in the base class, and set different default flag values for Zend\Uri\Http.
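To sketch what I mean: the classes, the constant and the method below are all hypothetical and only illustrate the shape of the idea; none of them exist in Zend\Uri, and the choice of which characters the flag controls is an assumption. (Written for PHP 7.1+ for brevity.)

```php
<?php
// Hypothetical classes for illustration only; this is not Zend\Uri's API.
class SketchUri
{
    // Hypothetical flag: also decode escaped sub-delimiters such as "&", "=".
    const NORMALIZE_DECODE_SUBDELIMS = 1;

    // The generic URI class decodes sub-delimiters by default.
    protected $defaultFlags = self::NORMALIZE_DECODE_SUBDELIMS;
    protected $query;

    public function __construct(string $query)
    {
        $this->query = $query;
    }

    public function normalizeQuery(?int $flags = null): string
    {
        $flags = $flags ?? $this->defaultFlags;

        // Unreserved characters are always safe to decode.
        $literal = 'A-Za-z0-9\-._~';
        if ($flags & self::NORMALIZE_DECODE_SUBDELIMS) {
            $literal .= '!$&\'()*+,;=';
        }
        $pattern = '/^[' . $literal . ']$/';

        return preg_replace_callback(
            '/%([0-9A-Fa-f]{2})/',
            function (array $m) use ($pattern): string {
                $char = chr(hexdec($m[1]));
                return preg_match($pattern, $char) ? $char : '%' . strtoupper($m[1]);
            },
            $this->query
        );
    }
}

// The HTTP flavour only changes the default, so "%26" stays encoded.
class SketchHttpUri extends SketchUri
{
    protected $defaultFlags = 0;
}

echo (new SketchUri('q=this%20%26%20that'))->normalizeQuery(), "\n";     // q=this%20&%20that
echo (new SketchHttpUri('q=this%20%26%20that'))->normalizeQuery(), "\n"; // q=this%20%26%20that
```

Callers who really want the generic behaviour on an HTTP URI could still pass the flag explicitly.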

 

- Demian

 

From: Shahar Evron [mailto:[hidden email]]
Sent: Tuesday, September 03, 2013 12:43 PM
To: [hidden email]
Subject: Re: [zf-contributors] Ampersand bug in Zend\Uri\Uri::normalize()?

 

[...]