|
Hi,

today I added some new functions to the repository (concat, equals, lastIndexOf, replace, substring) and I came up with a new idea. How about adding UTF8 support for numbers?

It would look like this:

$number1 = Zend_Locale_UTF8::integer('۱۲'); // should equal 12 in Arabic
echo $number1->add(5); // outputs: ۱۷ (17), depending on the selected locale

// other possible functions might be: toInt(), divideBy(), etc.

That way Zend_Currency could output numbers locale aware. Maybe we could also add a hook into Zend_Translate.

Any thoughts? Thomas?

PS: I don't know if it makes sense to separate integer and float/double (probably not), so Zend_Locale_UTF8::number() might be a better choice.

--
best regards,
André Hoffmann
ZF-Blog: http://andrehoffmann.wordpress.com/
|
Hi Andre,

good idea, I think something like this will be so useful.
+1 for Zend_Locale_UTF8::number()

Best Regards,

--
Ahmed Shreef
Web Developer
Egypt
|
Hi Ahmed,

are you aware of any problems that I might come across?

After taking a look at it I think it won't be a problem at all to translate every numeric character into the ASCII digits that PHP5 needs to work with. What I haven't found yet is a way to determine which locale uses which set of digits. But I don't think that the UCD missed out on that, so we'll see about that after some investigation.

For now I'd just be happy about some general feedback on usability.

Ahmed, I know it might be clear to me and others, but maybe you, as a person who actually needs this feature (I don't see myself needing it yet, but who knows), could state where you think this might be useful and what problems you have had so far where you wished this were integrated in PHP. I'm just curious to know if there might be some stuff that I'm not even aware of yet.

Pardon me if everything I want to know is obvious, but maybe it's not; I just want to double-check to get things right.

So for example:
You don't write your numbers backwards, do you?
What do Arabic websites (especially shops) look like today? Do they use the standard ASCII digits, or are prices, weights, etc. already converted into Arabic digits?
Might there even be solutions out there that actually do exactly that?

I think it's just nice to have these sorts of questions answered by a person who has been facing these kinds of problems.

Thanks.

--
best regards,
André Hoffmann
ZF-Blog: http://andrehoffmann.wordpress.com/
|
Though I'm strictly an English speaker, I have spent a lot of time with Unicode over the last couple of years. I'll take a pass at the questions I can answer from a technical perspective. Hopefully Ahmed can shed some more light from a practical sense...

On Oct 16, 2006, at 2:31 PM, André Hoffmann wrote:

> After taking a look at it I think it won't be a problem at all to
> translate every numeric character into the ASCII digits that PHP5
> needs to work with. What I didn't find yet is a way to determine
> what locale uses what set of digits. But I don't think that the UCD
> missed out on that, so we'll see about that after some investigation.

Converting Unicode strings to numbers is a (relatively) straightforward process, and you don't need to know anything about the locale. Every character that represents a number is identified in the UCD, specifically the UnicodeData.txt file, fields 6-8:

http://www.unicode.org/Public/UNIDATA/UCD.html#UnicodeData.txt
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

There are three kinds of numeric characters. Decimal digits (0-9) are the easiest to convert and their semantic meaning is exactly the same as the Arabic numerals (U+0030 - U+0039).

Digit characters are still numbers, but they're usually used for special purposes. For example, the circled number 14 (U+246D). It should still be converted to the number 14, but it usually stands alone.

Finally, there are the numeric characters, which are not composed of single digits. For example, the 3/8 fraction character (U+215C). These will need special handling, as they may need to be combined with previous digits.

You will also need to watch for certain combining characters such as the fraction slash (U+2044).
Instead of using the 3/8 fraction character as above, somebody may have directly used the digits and the slash instead: U+0033 U+2044 U+0038

Some examples of Unicode numeric characters: subscript digits (U+2080 - U+2089), circled, parenthesized, and dotted digits (U+2460 - U+249B, U+2776 - U+2793), CJK ideographs (such as U+2F0B, the Kangxi radical eight), full-width digits (U+FF10 - U+FF19).

> You don't write your numbers backwards, do you?

No. Numbers may be displayed from right-to-left (or top-to-bottom or bottom-to-top for that matter) but they are always stored in logical order in the Unicode string.

> Might there even be solutions out there that actually do exactly that?

I don't know of any, but here is everything you need to know, straight from the Unicode Consortium:

Unicode Standard (Section 4.6 - Numeric Value)
<http://www.unicode.org/versions/Unicode4.0.0/ch04.pdf>

Unicode Standard (Section 5.5 - Handling Numbers)
<http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf>

TR-25: Unicode and Mathematics
<http://unicode.org/reports/tr25/tr25-8.html>

(The comprehensive nature of Unicode is probably part of the reason PHP6 is still a ways off...)

> How about adding UTF8 support for numbers?
>
> It would look like this:
> $number1 = Zend_Locale_UTF8::integer('۱۲'); //should equal 12 in
> Arabic
> echo $number1->add(5); //outputs: ۱۷ (17), depending on the
> selected locale
>
> //other possible functions might be: toInt(), divideBy(), etc.

I would recommend against implementing mathematical functions. PHP's built-in numeric functions work perfectly. I'd suggest just implementing equivalents of the conversion functions intval() and floatval(), which would convert a UTF-8 string to a number, and strval(), which would convert a number to a UTF-8 string. I think they should be named the same as the PHP functions for consistency.
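As an illustration of the intval()/floatval() equivalents suggested here, the following is a minimal sketch and not code from Zend_Locale_UTF8. The function names (codepoint_to_utf8, utf8_digit_map, utf8_to_ascii_digits, utf8_intval, utf8_floatval) and the short hand-picked list of scripts are assumptions made for this example; a complete version would presumably derive its table from the decimal digit field of UnicodeData.txt.

```php
<?php
// Illustrative sketch only; all function names here are hypothetical.

// Encode a single Unicode code point as a UTF-8 byte sequence.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Build a "UTF-8 digit => ASCII digit" table for a few scripts.
// Assumption: this list is a small sample, not the full set in the UCD.
function utf8_digit_map(): array
{
    $zeros = [
        0x0660, // ARABIC-INDIC DIGIT ZERO
        0x06F0, // EXTENDED ARABIC-INDIC DIGIT ZERO
        0x0966, // DEVANAGARI DIGIT ZERO
        0xFF10, // FULLWIDTH DIGIT ZERO
    ];
    $map = [];
    foreach ($zeros as $zero) {
        for ($d = 0; $d <= 9; $d++) {
            $map[codepoint_to_utf8($zero + $d)] = (string) $d;
        }
    }
    return $map;
}

// Replace known decimal digits with ASCII digits; everything else is untouched.
function utf8_to_ascii_digits(string $text): string
{
    return strtr($text, utf8_digit_map());
}

// Counterparts of intval() and floatval() for UTF-8 digit strings.
function utf8_intval(string $text): int
{
    return intval(utf8_to_ascii_digits($text));
}

function utf8_floatval(string $text): float
{
    return floatval(utf8_to_ascii_digits($text));
}
```

No locale is consulted anywhere in this sketch, which matches the point made above that decimal digits can be converted without locale knowledge. Separators are deliberately left alone.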
> PS: I don't know if it makes sense to separate integer and float/
> double(probably not), so Zend_Locale_UTF8::number() might be a
> better choice.

I would like to see them separated as mentioned above.

--
Willie Alberty, Owner
Spenlen Media
[hidden email]
http://www.spenlen.com/
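The "numeric characters" case from this message (a single fraction character such as U+215C, or ASCII digits joined by the fraction slash U+2044) needs more than a digit-by-digit table. A rough sketch of one way to handle it follows. The function name utf8_fraction_to_float and the small table of fraction characters are assumptions for this example only, and it accepts just ASCII digits around the fraction.

```php
<?php
// Illustrative sketch only; utf8_fraction_to_float() is a hypothetical helper.
// Returns the numeric value of a fraction string, or null if it is not understood.
function utf8_fraction_to_float(string $text): ?float
{
    // A few single-character vulgar fractions and their values.
    $vulgar = [
        "\u{00BC}" => 0.25,  // VULGAR FRACTION ONE QUARTER
        "\u{00BD}" => 0.5,   // VULGAR FRACTION ONE HALF
        "\u{00BE}" => 0.75,  // VULGAR FRACTION THREE QUARTERS
        "\u{215B}" => 0.125, // VULGAR FRACTION ONE EIGHTH
        "\u{215C}" => 0.375, // VULGAR FRACTION THREE EIGHTHS
        "\u{215D}" => 0.625, // VULGAR FRACTION FIVE EIGHTHS
        "\u{215E}" => 0.875, // VULGAR FRACTION SEVEN EIGHTHS
    ];

    // Case 1: numerator, FRACTION SLASH (U+2044), denominator.
    if (preg_match('/^([0-9]+)\x{2044}([0-9]+)$/u', $text, $m)) {
        $denominator = (int) $m[2];
        if ($denominator === 0) {
            return null; // avoid division by zero
        }
        return (int) $m[1] / $denominator;
    }

    // Case 2: optional whole number followed by one fraction character.
    if (preg_match('/^([0-9]*)(.)$/u', $text, $m) && isset($vulgar[$m[2]])) {
        return (int) $m[1] + $vulgar[$m[2]];
    }

    return null;
}
```

Combining the whole number with the fraction character is the "combined with previous digits" handling mentioned above. Anything the sketch does not recognize yields null instead of a guess.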
|
In reply to this post by André Hoffmann
Hi Andre,

I don't think that you will face any problems.

> You don't write your numbers backwards, do you?

No, ( ٠١٢٣٤٥٦٧٨٩ = 0123456789 ) where ( ٠ = 0 ) and ( ٩ = 9 )

> How do Arabic websites(especially shops) look like today, do they use the standard ASCII digits or are prices, weights, etc. already converted into Arabic digits?

Right now the standard ASCII digits are the most used. For example, check this shop: http://samirandaly.com/?&lang=1
This website uses the ASCII digits, but it would be better if it gave the user the option to change the type of digits, as a lot of people are more used to reading the Arabic digits than the ASCII digits.

Another scenario is an educational website for kids; in a website like that we will need a solution for using the Arabic digits.

Also, when using the Hijri calendar (a.k.a. the Islamic calendar), we will need to display it to the user using Arabic digits.

I'm interested in your idea; if you have any questions feel free to mail me.

Best regards,

--
Ahmed Shreef
Web Developer
Egypt
|
In reply to this post by André Hoffmann
Hi André,

> How about adding UTF8 support for numbers?
>
> It would look like this:
> $number1 = Zend_Locale_UTF8::integer('۱۲'); //should equal 12 in Arabic
> echo $number1->add(5); //outputs: ۱۷ (17), depending on the selected locale

In general I like this idea...

Issues:
.) UTF8 should only handle normalization and localization of UTF8 numbers to numbers (0-9).
.) There is no need for operations other than UTF8 -> number and number -> UTF8.
.) Numbers can exceed the integer range... so it should not return an integer but a string.
.) UTF8 numbers can also have signs integrated, such as separators or text... so it should only convert the digits and leave the other characters as they are.
> //other possible functions might be: toInt(), divideBy(), etc.
>
> That way Zend_Currency could output numbers locale aware. Maybe we could also add a
> hook into Zend_Translate.

NO...
There should be no toInt/toFloat/divideBy/add/sub...
These are functions which should not be handled by UTF8!

Zend_Currency should depend on Zend_Locale_Format...
Zend_Locale_Format depends on Zend_Locale_UTF8.
Otherwise Zend_Currency would also have to handle the proper currency string for each locale and currency...
This information is included in CLDR and will be implemented in Zend_Locale_Format as soon as we code this class.

Zend_Translate has no need to change numbers.
These are issues for the translators of the source files.
Zend_Translate itself should never ever change translated information from a source file.
> PS: I don't know if it makes sense to separate integer and float/double (probably
> not), so Zend_Locale_UTF8::number() might be a better choice.

NO...
This makes no sense...
We only have to support NORMALIZATION (toNumber)
and LOCALIZATION (toxxxxx)

Greetings
Thomas
|
On 10/17/06, Thomas Weidner <[hidden email]> wrote:
Could you explain why there isn't?
Of course, I never stated the opposite(I hope ;-) )
Why should a number contain text, could you give me an example to understand where we'd need text in a number?
Well, maybe UTF8 should contain an API to calculate with numbers, but it should definitely include toInt/toFloat, to help you calculate with numbers. The best would be __toInt(), but as far as I know they decided against implementing that in PHP 6 :-(
Agreed.
Shouldn't it output the current locale instead? Would make more sense to me as you could simply change your locale and all numbers would be output in whatever you chose. At least that'd make more sense to me.
-- best regards, André Hoffmann ZF-Blog: http://andrehoffmann.wordpress.com/ |
|
Hi,

We only need normalization and localization.

Localized strings also include separators and other signs which are not digits...
See the Arabic decimal separator for example.
Or the German 12.000, which equals the integer 12000.
Your UTF8 number function would not handle this issue...
12.000 in German is 12000
12.000 in English is 12

Zend_Locale_Format, on the other hand, is coded to handle this problem.
A "toNumber" also already exists there.
So UTF8's number conversion would be used by Zend_Locale_Format to produce a localized string.

I would not include mathematical operations, because UTF8 itself should only handle UTF8 issues...
And I'm also sure that when someone needs math functions on UTF8 strings, he also needs square root, multiplication and other complex functions...

So I would prefer to let the user normalize his number information (which normally includes separators within the number string), then do his operations, and then localize the result to the expected format.
UTF8 is then faster (as the class is smaller).

12.000 in German = 12000 integer
12.000 in English = 12 integer
!!!
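To make the "12.000" example concrete, here is a small sketch of locale-dependent normalization. It does not use the real Zend_Locale_Format API; the function normalize_localized_number() and its two-entry separator table are assumptions for illustration, where a real component would read the separators from CLDR.

```php
<?php
// Illustrative sketch only; not the Zend_Locale_Format implementation.
// Turns a localized number string into a plain "1234.56"-style string.
function normalize_localized_number(string $input, string $locale): string
{
    // Hypothetical mini-table; real data would come from CLDR.
    $separators = [
        'de' => ['group' => '.', 'decimal' => ','],
        'en' => ['group' => ',', 'decimal' => '.'],
    ];
    if (!isset($separators[$locale])) {
        throw new InvalidArgumentException("No separator data for locale '$locale'");
    }
    $s = $separators[$locale];

    // Drop grouping separators first, then switch the decimal separator to ".".
    $withoutGroups = str_replace($s['group'], '', $input);
    return str_replace($s['decimal'], '.', $withoutGroups);
}
```

With this table, the same input '12.000' normalizes to '12000' for 'de' but stays '12.000' (the value 12) for 'en', which is exactly the ambiguity a pure digit converter cannot resolve on its own.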
UTF8 should only convert the UTF8 number characters to standard number characters.

An API will
.) slow down UTF8
.) fail on real mathematical problems (square root, logarithm...)
Only implementing add, sub, mul, div is in my eyes not enough for mathematical functions.

So I would let the user do this...
1.) Normalize the string with Zend_Locale_Format, which depends on Zend_Locale_UTF8
2.) Do mathematics
3.) Localize the number
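The three steps can be sketched roughly as follows. The helpers normalize_german() and localize_german() are hypothetical stand-ins for whatever Zend_Locale_Format would provide, hard-wired to German separators to keep the example short; only number_format() and the arithmetic are plain PHP.

```php
<?php
// Illustrative sketch only; both helpers are hypothetical stand-ins.

// Step 1: normalize a German-formatted string ("12.000,50") to a float.
function normalize_german(string $input): float
{
    $withoutGroups = str_replace('.', '', $input);          // drop grouping dots
    return (float) str_replace(',', '.', $withoutGroups);   // decimal comma -> point
}

// Step 3: localize a float back to German formatting.
function localize_german(float $number, int $decimals = 2): string
{
    return number_format($number, $decimals, ',', '.');
}

// Step 2 happens in between, using PHP's own arithmetic.
$value  = normalize_german('12.000,50'); // 12000.5
$result = $value + 5;                    // 12005.5
$output = localize_german($result);      // "12.005,50"
```

The arithmetic in the middle never touches localized strings, so nothing like add() or divideBy() is needed on the string side.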
And what if you have a text with English numbers and added Arabic numbers?
"1234 equals ****"... (**** should be thought of as Arabic numbers ;-) )

And... working with locales is part of Zend_Locale itself... Zend_Locale_UTF8 should not care about which locale to output.
Formatting of numbers (normalization and localization) is primarily done by Zend_Locale_Format, which is also the user interface for normalization and localization.
It would not make sense to make a second user interface for number normalization....

Greetings
Thomas
|
Hi,

first of all, this:
"Well, maybe UTF8 should contain an API to calculate with numbers, but it should definitely include toInt/toFloat, to help you calculate with numbers."
was a typo and 'should' should have been 'shouldn't'. Sorry about that.

> 12.000 in German is 12000
> 12.000 in English is 12

Sure it is, but neither commas nor dots are considered text; they are separators. And there might be a way to get the needed information using Zend_Locale.

I just don't think a one-way implementation makes sense. If Zend_Locale_UTF8 can output UTF-8 it should also understand it.

How about adding both functions, toXxx() and toInt(), where the latter outputs the number in the standard locale and the former in Xxx?

--
best regards,
André Hoffmann
ZF-Blog: http://andrehoffmann.wordpress.com/
|
Hi,

> first of all, this:
> "Well, maybe UTF8 should contain an API to calculate with numbers, but it should definitely include toInt/toFloat, to help you calculate with numbers."
> was a typo and 'should' should have been 'shouldn't'. Sorry about that.

Handling of the different floating point separators and number separators is just one point... (there are several different separators, not only 2 of them)
In Indian notation you have 12,34,456.78 for 1234456.78... so 1,234,456.78 is not parsed as an Indian number, and this is correct for an Indian.
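As a small illustration of the Indian grouping mentioned here (last three digits, then groups of two), the following sketch groups a string of ASCII integer digits. The function name group_indian() is an assumption for this example and says nothing about how Zend_Locale_Format does it internally.

```php
<?php
// Illustrative sketch only; group_indian() is a hypothetical helper.
// Expects a string of ASCII digits without sign or fraction part.
function group_indian(string $integerDigits): string
{
    if (strlen($integerDigits) <= 3) {
        return $integerDigits; // nothing to group
    }

    $tail = substr($integerDigits, -3);    // last three digits stay together
    $head = substr($integerDigits, 0, -3); // the rest is grouped in pairs

    // Split the head into pairs counted from the right-hand side.
    $pairs = array_reverse(array_map('strrev', str_split(strrev($head), 2)));

    return implode(',', $pairs) . ',' . $tail;
}
```

For example, group_indian('1234456') gives '12,34,456', whereas Western grouping of the same digits would be '1,234,456'.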
All this handling is already coded in Zend_Locale_Format.
Why duplicate the code???
This is nonsense in my eyes.

Normalization and localization should be done by Zend_Locale_Format.
Zend_Locale_UTF8 should only convert e.g. Arabic digits to computer (English) digits.
And it should convert English digits to Arabic digits.
And also do this for all other number formats such as Chinese, Japanese, Indian and all others which do not use the old Arabic 0-9 characters.

But when Zend_Locale_Format gives a string like "12.345,678", then Zend_Locale_UTF8::toArabic should convert it to "**.***,***"... so leave the other parts of the string as they are and only convert the digits to Arabic... Zend_Locale_Format already handles the separators (localization) of numbers.
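A digit-only conversion of this kind could look like the sketch below. It is a free function named to_arabic_indic_digits() for illustration, not the proposed Zend_Locale_UTF8::toArabic itself, and it assumes the target digits are the Arabic-Indic ones (U+0660 to U+0669).

```php
<?php
// Illustrative sketch only; to_arabic_indic_digits() is a hypothetical helper.
// Replaces ASCII digits with Arabic-Indic digits and leaves every other
// character (separators, text, whitespace) exactly as it was.
function to_arabic_indic_digits(string $text): string
{
    $ascii  = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'];
    $arabic = [
        "\u{0660}", "\u{0661}", "\u{0662}", "\u{0663}", "\u{0664}",
        "\u{0665}", "\u{0666}", "\u{0667}", "\u{0668}", "\u{0669}",
    ];
    // Safe to replace one digit after another: none of the replacements
    // contains an ASCII digit byte.
    return str_replace($ascii, $arabic, $text);
}
```

Applied to "12.345,678" this keeps the dot and the comma in place and changes only the digits, so placing the separators remains the caller's job.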
Zend_Locale_Format also already has a toInteger() and toFloat()...
Implementing this functionality in Zend_Locale_UTF8 would also lead to duplicated code.
Nonsense in my eyes...

The user should only use Zend_Locale_Format.
How should he otherwise know which class to use???
Today Zend_Locale_Format and tomorrow Zend_Locale_UTF8...

Greetings
Thomas
|
On Oct 17, 2006, at 12:18 PM, Thomas Weidner wrote:

> All this handling is already coded in Zend_Locale_Format.
> Why double the code ???
> This is nonsense in my eyes.
>
> ...
>
> Zend_Locale_Format also already has a toInteger() and toFloat()...
> Implementing this functionality in Zend_Locale_UTF8 would also lead
> to an doubling of the code.
> Nonsense in my eyes...

I got the impression that Zend_Locale_UTF8 was evolving into a more general-purpose Unicode utility class. You clearly see it as a subordinate of Zend_Locale_Format. Some of what you propose is nonsense taken in context of a general utility class: relying upon functionality in Zend_Locale_Format when I have been given an arbitrary UTF-8-encoded string which I may know nothing about is impossible.

I think a decision should be made regarding the intended scope of this class... Should it be general-purpose, used by Zend_Locale but available for other uses? Or should it be written to address the specific needs of Zend_Locale only? Either way is valid, and the direction chosen will help to answer these questions more easily.

Personally, I will have a need for some general-purpose Unicode classes in the near future. I would like to see much of the functionality André has proposed, but perhaps it should be in a separate proposal?

--
Willie Alberty, Owner
Spenlen Media
[hidden email]
http://www.spenlen.com/
|
In reply to this post by Thomas Weidner
On 10/17/06, Thomas Weidner <[hidden email]> wrote:
This would be a really strange behaviour, don't you think? Maybe we should move some code instead. As Zend_Locale_Format uses Zend_Locale_UTF8 this shouldn't be a problem, should it?
Same functionality doesn't necessarily mean duplicated code. We could still go with something like:

function Zend_Locale_Function::toInt() { return Zend_Locale_UTF8::number(..)->toInt() }

Just to show you an example. So if this feature is to be implemented in Zend_Locale_UTF8, it should be done in a correct and logical manner. But implementing some half functions that return half a result but are named toArabic() would seem really strange to me. As we're in the incubator we still have the ability to think about possibilities and may also change stuff. The fact that the first thought went one way doesn't mean it can't be otherwise. I think that's part of what the incubator is for.
-- best regards, André Hoffmann ZF-Blog: http://andrehoffmann.wordpress.com/ |
|
Hey guys,

125.34 in English = ١٢٥,۳٤ in Arabic
1- replaced the en digits with ar digits.
2- replaced ( . ) with ( , )
I think this can be done by Zend_Locale_UTF8::toArabic() alone, without needing to pass ١٢٥.۳٤ to Zend_Locale_Format to get it as ١٢٥,۳٤

If I have a string like "12.345,678" [English formatted], I will pass it to Zend_Locale_Format, which returns "12.345678". Then I will pass this result to Zend_Locale_UTF8::toArabic(), which will return "١٢,٣٤٥٦٧٨".

Hmm, I think that I'm saying what Thomas said in another way!!

--
Ahmed Shreef
Web Developer
Egypt
|
Ahmed, I think you missed the point here. You shouldn't use Zend_Locale_UTF8 directly, as it's not a user-land class (yet). It just provides the UTF-8 functionality for other components.
Once it's finished we might make a user-land class out of it.

What I was trying to say was: We shouldn't split up stuff that belongs together, i.e. Zend_Locale_UTF8 shouldn't return weird stuff that can only be used by Zend_Locale_Format. The API should follow a logical scheme even though it might be internal for now and the result might be the same.

Maybe someone else has something to say about this. I mean, we could also go back to the proposal phase, but that's going to kill any chance of getting it done by 0.7 (or whatever it'll be named). ;-)

--
best regards,
André Hoffmann
ZF-Blog: http://andrehoffmann.wordpress.com/
|
In reply to this post by André Hoffmann
Hi,

No, I don't think this is strange behavior...
Zend_Locale_Format is the user's interface for normalization and localization.
We have a
getNumber
toNumber
isNumber
implementation for Numbers, Integers, Floats, Dates and Times.
So splitting the functions between UTF8 and Format makes no sense.

Also, porting the code makes no sense in my eyes, as UTF8 should only handle UTF8 strings...
It should not handle normalization or localization... it should only handle UTF8 issues.
When someone needs Arabic signs he should rely on Zend_Locale_Format...
If he wants to convert these to English he should also rely on Zend_Locale_Format.

So in my eyes Zend_Locale_UTF8 should only convert Arabic number symbols to English number symbols without changing other string content.
"12 cars": 12 would be ١٢ in symbols...
The function should return "١٢ cars"...
What to do with the returned value should be handled by the calling function...
When only numbers are needed, the calling function itself has to do the stripping.
So if stripping is not wanted, it is not done.
As already mentioned...
NORMALIZATION is not part of Zend_Locale_UTF8!!!
All functions within Zend_Locale_Format need UTF8 or number functions...
So you would port all code from Zend_Locale_Format to Zend_Locale_UTF8???
Makes no sense in my eyes!!!!

Why "half functions"???
Porting "12" to "١٢" is not a half function... If the calling function wants to have an integer, a date or a string, it should get it... this should be chosen by the calling function and not fixed by Zend_Locale_UTF8.

Btw:
Zend_Measure already implements a way to convert standard numbers to Roman numerals or binary numbers.
Naturally, I would say Arabic and other number formats should be included there, as they are a way of converting one number format to another.
See Zend_Measure_Number for further information.

Greetings
Thomas
|
In reply to this post by André Hoffmann
Hi,

As you already said... "do not split up stuff that belongs together"...

UTF8 should only handle UTF8 issues.
It should not handle locales itself.
It should not do normalization.
It should not do localization.

It only has to handle UTF8 issues...
working with UTF8
converting UTF8 chars
This is what it is made for...

The proposal is already accepted.
What we need internally is not part of the proposal.
If we decide to include additional functions such as converting to new number formats (English -> Arabic), it's a good add-on and could easily be stripped off if our Zend boys decide to do so.

So don't make things more complicated than they are.

I would say yes to a number format conversion facility.
I would include this in Zend_Measure first.
If we decide that this should also be included in Zend_Locale_Format, we would have to decide how to automate the decision on which format to use, as this information is not present in CLDR.

Greetings
Thomas
|
In reply to this post by Willie Alberty
> I think a decision should be made regarding the intended scope of this class...
> Should it be general-purpose, used by Zend_Locale but available for other uses?
> Or should it be written to address the specific needs of Zend_Locale only?
> Either way is valid, and the direction chosen will help to answer these questions more easily.

http://www.nabble.com/Zend_Seach_Lucene-tf2315524s16154.html#a6490854
http://www.mail-archive.com/fw-general@.../msg00231.html
http://framework.zend.com/wiki/display/ZFMLGEN/mail/5208
http://www.zend.com/lists/fw-general/200609/msg00641.html

Localizing and translating numbers from one locale to another belongs to the components responsible for l10n and i18n. If transformations of strings (containing locale-formatted numbers) are needed or useful to support Zend_Locale* or Zend_Search* components, and Zend_Locale_Utf8 can do the transformation without using any knowledge of number formats (e.g. separators), then I think these transformations might fit within the scope of Zend_Locale_Utf8. In general, Zend_Locale_Utf8 uses only knowledge of the UTF8 format, but not of locales (no CLDR), to provide low-level functions to process and analyze UTF8 strings.

If I have a web form asking for a currency amount, and someone types in the amount using Arabic characters and Arabic digit and fraction separator characters, then I would expect Zend_Locale* to provide some mechanism to convert that UTF8 string to a normalized form and return the resulting value using built-in PHP data types (e.g. either a pair of integers or a float).

If a userland application needs to convert a number embedded in a locale-specific string to a float, then I could imagine something like the following:

Zend_Locale_Function::toFloat() uses Zend_Locale_Utf8 to help trim away superfluous UTF8 characters (not related to the number) and convert the UTF8 characters to digits. Next, Zend_Locale_Function::toFloat() uses the CLDR to properly interpret the input and normalize it, returning a PHP float value.

To keep this problem and solution simple, I would be tempted to impose constraints that make the code simpler. For example, "toFloat() only works on numbers in the range between X and Y", where X and Y are small enough to avoid complications that would make the code more complex. Another possible constraint: "Only integer numbers are supported", i.e. skip toFloat() and only provide toInt().

At this time in the life of ZF, I favor extreme simplicity, and I favor either eliminating or delaying more complex behaviors and functions until after ZF 1.0.

Cheers,
Gavin

P.S. I am curious, which website for ZF mail archives (see above) do you prefer? I find Lucene (both Nabble and ZF) works better for my search needs. I often post links to past posts as a somewhat clunky way to recall past information and save everyone the effort of searching for it, since our posts are not automatically sorted into topical categories other than by subject threads on these archival sites.
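The "integers only, within a fixed range" constraint suggested in this message might look something like the sketch below. Everything in it is an assumption for illustration: the function name constrained_to_int(), the default limits standing in for the unspecified X and Y, the restriction to Arabic-Indic digits, and the choice to strip only surrounding whitespace instead of arbitrary surrounding text.

```php
<?php
// Illustrative sketch only; constrained_to_int() is a hypothetical helper.
// Returns an integer, or null when the input is not a plain integer in range.
function constrained_to_int(string $input, int $min = -999999, int $max = 999999): ?int
{
    // Convert Arabic-Indic digits (U+0660..U+0669) to ASCII digits.
    $digits = strtr($input, [
        "\u{0660}" => '0', "\u{0661}" => '1', "\u{0662}" => '2',
        "\u{0663}" => '3', "\u{0664}" => '4', "\u{0665}" => '5',
        "\u{0666}" => '6', "\u{0667}" => '7', "\u{0668}" => '8',
        "\u{0669}" => '9',
    ]);

    // Simplification: only surrounding ASCII whitespace is trimmed.
    $clean = trim($digits);

    // Integers only: optional minus sign and at most nine digits, so the
    // cast below cannot overflow.
    if (!preg_match('/^-?[0-9]{1,9}$/D', $clean)) {
        return null;
    }

    $value = (int) $clean;
    return ($value < $min || $value > $max) ? null : $value;
}
```

Rejecting input such as '12.5' outright, instead of guessing which separator was meant, is what keeps this version free of any locale knowledge.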
|
In reply to this post by Thomas Weidner
Hi,

after taking a second look at the numbers Zend_Locale_UTF8 will have to convert, I realized that there will be no way around implementing separators in Zend_Locale_UTF8, because other languages don't only have numeric characters that represent 0 to 9 in English, but also ones that represent fractions, values greater than 9, or even negative values. 1000000000000, 3.5, 30 to name some. Therefore separators will be needed.

Also keep in mind that Zend_Locale_UTF8 might not stay internal forever.

So we really need to change something here if we want to include number conversion!

So long.

--
best regards,
André Hoffmann
ZF-Blog: http://andrehoffmann.wordpress.com/
|
Hi,

Why include number conversion with separator, negative and fraction handling in Zend_Locale_UTF8????
I'm missing something here in your argumentation.

Zend_Measure already includes a class for number conversion which only has to be extended.
Zend_Locale_Format already implements the handling of separators, negative signs and fractions for all locales.

I don't see the point of, or the problem with, only implementing a sign conversion within Zend_Locale_UTF8 and not a complete number localization as already included in Zend_Locale_Format...

1.) Zend_Locale_Format handles the input string, stripping separators, changing fraction and negative sign.
So our input string is normalized. This is already implemented.

2.) Zend_Locale_Format calls Zend_Locale_UTF8 for converting the normalized value to local signs.
So we have a normalized string with local signs.

3.) Zend_Locale_Format localizes the returned string, adding separators, negative and fraction signs.
This is also already implemented.

4.) In Zend_Measure_Numbers some functions will be added, such as toArabic, fromArabic, toChinese, fromChinese and so on...
So we could convert numbers locale-aware to other number formats.
Conversions for the Roman, binary, octal, hexadecimal, decimal and some other number formats are already implemented there.

Greetings
Thomas
|
I've been following this thread with great interest, and even attempted to join in the conversation a couple of times. However, it seems I am completely missing the point of the discussion...

André, Ahmed, and myself would like to see Zend_Locale_UTF8 do more Unicode-aware things than it does now. This would make it useful outside of a Locale-only context. However, Gavin reminded us of the prior direction from Zend that said Zend_Locale_UTF8 is to be essentially a private helper class, to be used only for the explicit needs of Zend_Locale. Fine. But in reading your last response, it seems as though you either don't see the need for Zend_Locale_UTF8 or don't want it:

On Oct 18, 2006, at 11:36 AM, Thomas Weidner wrote:

> 1.) Zend_Locale_Format handles the input string, stripping
> separators, changing fraction and negative sign.
> So our input string is normalized. This is already implemented.

So you already have a comprehensive table of Unicode characters that represent the decimal and thousands separators, as well as the fraction and negative signs for every language supported by Zend_Locale_Format?

> 2.) Zend_Locale_Format calls Zend_Locale_UTF8 for converting the
> normalized value to local signs.
> So we have a normalized string with local signs.

So you already have a comprehensive table of Unicode characters that are numeric digits? How are you able to identify which characters are digits, which are delimiters, and which are white space? If you already know what characters are digits, why would you need Zend_Locale_UTF8 at all? Just use the same tables for conversion that you're using for parsing.

> 3.) Zend_Locale_Format localizes the returned string adding
> separators, negative and fraction signs.
> This is also already implemented.

Again, this implies in-depth knowledge of the character sets involved for every language, including knowledge of which characters are encoded in one, two, and three bytes. Otherwise, you would not be able to reliably insert a decimal separator at the correct location in the byte stream.

> 4.) In Zend_Measure_Numbers there will be added some functions as
> toArabic, fromArabic, toChinese, fromChinese and so on...
> So we could convert numbers locale aware to other number formats.
> A conversion for the roman, binary, octal, hexadecimal, decimal and
> some other number formats are already implemented there.

Again, it sounds like all of the functionality you need is already implemented elsewhere. Can you be more specific with the functions you *do* need Zend_Locale_UTF8 to perform? After reading through this thread again, and factoring in the Zend direction from Gavin, I think having this class around is unnecessary.

(André - If this turns out to be true, don't despair... I think there is a great need for Unicode manipulation classes in PHP 5. In fact, I have an explicit need in some of the work I'm planning for Zend_Pdf. They might just need to live outside of Zend_Locale to survive. If the adoption rate of PHP 5 by hosting providers is any indication, PHP 6 is still several years away from being practical, which means Unicode classes in the framework are unquestionably valuable.)

--
Willie Alberty, Owner
Spenlen Media
[hidden email]
http://www.spenlen.com/
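The point above about one-, two-, and three-byte characters can be illustrated with a short sketch: a separator has to be inserted at a character boundary, not at a byte offset. The helper names utf8_chars() and utf8_insert_from_end() are assumptions for this example; the only library feature relied on is PCRE's /u modifier in preg_split().

```php
<?php
// Illustrative sketch only; both helpers are hypothetical.

// Split a UTF-8 string into an array of characters (not bytes).
function utf8_chars(string $text): array
{
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $chars;
}

// Insert $separator so that $charsFromEnd characters follow it.
function utf8_insert_from_end(string $text, string $separator, int $charsFromEnd): string
{
    $chars    = utf8_chars($text);
    $position = count($chars) - $charsFromEnd;
    if ($position <= 0) {
        return $text; // string too short, nothing to insert
    }
    array_splice($chars, $position, 0, [$separator]);
    return implode('', $chars);
}
```

Five Arabic-Indic digits occupy ten bytes, so a byte-based substr($text, -3) would cut a digit in half, whereas counting characters as above keeps every digit intact.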
