I18N

Thomas Weidner
Hi,

As I mentioned before, an I18N or L10N core is essential for building large web apps.
So let's talk a little to get this baby running :)

I see it from the practical side, as I developed a corporate application which we are selling. There are many points we have to cover with I18N, and each ZF user has different needs.

I read the thread Johannes mailed me... but the topic seems to have fallen asleep already, as there have been no mails for over a week.

Let's summarize what I18N in our ZF should do:
 
 
The following seem essential for I18N:
- Wrapper
- Lightweight/Fast
- Different Sources
- Different Targets
 
Let's go into detail:
- Wrapper:
The framework MUST be a wrapper for the different localization mechanisms we need.
Our approach has to be that changing the mechanism is as easy as possible.

For example:
$I18N = new Zend_Locale();
$I18N->Local('GetText', 'de_DE');

The translation mechanism could also be changed afterwards:
$I18N->Translate('write something');          // everything is translated to German
$I18N->Local('SQL', 'en_EN');                 // switch the mechanism to an SQL database and the default language to English
$I18N->Translate('write something', 'de_DE'); // override the default language for this call only

The module also has to accept other character sets as input:
echo $I18N->Translate('Daß ist mein öffentlicher Text');
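
The wrapper idea above could be sketched roughly as follows. This is only an illustration under assumptions: `TranslationSource`, `ArraySource`, `LocaleWrapper` and `useSource()` are hypothetical names, not Zend Framework classes, and the in-memory array stands in for a gettext, XML or SQL backend.

```php
<?php
// Hypothetical sketch: none of these names are Zend Framework classes.

interface TranslationSource
{
    // Returns the translated string, or null when the source has no entry.
    public function lookup(string $messageId, string $locale): ?string;
}

// In-memory stand-in for a gettext, XML or SQL backed source.
final class ArraySource implements TranslationSource
{
    private array $messages;

    public function __construct(array $messages)
    {
        $this->messages = $messages;
    }

    public function lookup(string $messageId, string $locale): ?string
    {
        return $this->messages[$locale][$messageId] ?? null;
    }
}

final class LocaleWrapper
{
    private TranslationSource $source;
    private string $defaultLocale;

    public function __construct(TranslationSource $source, string $defaultLocale)
    {
        $this->useSource($source, $defaultLocale);
    }

    // Swaps the mechanism and the default language at runtime.
    public function useSource(TranslationSource $source, string $defaultLocale): void
    {
        $this->source = $source;
        $this->defaultLocale = $defaultLocale;
    }

    // Falls back to the untranslated message id when nothing is found.
    public function translate(string $messageId, ?string $locale = null): string
    {
        return $this->source->lookup($messageId, $locale ?? $this->defaultLocale) ?? $messageId;
    }
}

$german = ['de_DE' => ['write something' => 'schreibe etwas']];
$i18n = new LocaleWrapper(new ArraySource($german), 'de_DE');
$first = $i18n->translate('write something');          // uses the default, de_DE

$i18n->useSource(new ArraySource($german), 'en_US');    // new source, new default
$second = $i18n->translate('write something');         // no en_US entry: id returned as-is
$third = $i18n->translate('write something', 'de_DE'); // per-call override
```

Keeping the lookup behind one small interface is what would make the mechanism exchangeable: the calling code only ever sees `translate()`.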
 
- Lightweight/Fast
From the practical side...
Our application has 0.5 million different strings. And when a web server has more than 1000 people requesting pages in different languages, speed is essential.
Lightweight also means that it has to be as simple as possible for the ZF user to use in their scripts.

- Different Sources
As mentioned before, it is essential to support different translation resources.
There should be:
gettext, XML, SQLite, MySQL, MSSQL, and maybe several others I don't know of yet. :)
Even though gettext is not thread-safe, as I have read, it is used very often.

- Different Targets/Goals
The CLDR project seems to me a good approach for defining a complete locale.
Take a look here: http://www.unicode.org/reports/tr35/
Zend_Locale should define not only a translation language, but also date, time, measurement, collation and so on...

The definition should be done in one class (Zend_Locale).
The other classes could derive from it.
For example:
Zend_Locale_DateFormat
Zend_Locale_MeasurementFormat

When defining a date class, the usage could be:
$I18N = new Zend_Locale();
$I18N->Local('SqLite', 'de_DE');
$date = $I18N->Date($inputdate);          // default date format for de_DE
$time = $I18N->Time($inputtime, 'de_CH'); // temporary time format for locale de_CH

or

$I18N = new Zend_Locale();
$I18N->Local('SqLite', 'de_DE');
$date = new Zend_DateTime();
$time = $date->Time($inputtime, 'de_CH'); // temporary time format for locale de_CH
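
A default locale with a per-call override, as in the usage above, might look like the sketch below. `LocaleFormatter` and its pattern table are assumptions made for illustration; the patterns are hand-written rather than taken from CLDR, and en_US is used for the override (instead of de_CH) only because its format differs visibly.

```php
<?php
// Hypothetical sketch. The pattern table is hand-written for illustration;
// a real component would presumably derive such patterns from CLDR data.
final class LocaleFormatter
{
    private const PATTERNS = [
        'de_DE' => ['date' => 'd.m.Y', 'time' => 'H:i:s'],
        'en_US' => ['date' => 'm/d/Y', 'time' => 'g:i:s A'],
    ];

    private string $defaultLocale;

    public function __construct(string $defaultLocale)
    {
        $this->defaultLocale = $defaultLocale;
    }

    public function date(int $timestamp, ?string $locale = null): string
    {
        return gmdate($this->pattern('date', $locale), $timestamp);
    }

    public function time(int $timestamp, ?string $locale = null): string
    {
        return gmdate($this->pattern('time', $locale), $timestamp);
    }

    // Picks the pattern for the given locale, or for the default locale.
    private function pattern(string $kind, ?string $locale): string
    {
        $locale = $locale ?? $this->defaultLocale;
        if (!isset(self::PATTERNS[$locale])) {
            throw new InvalidArgumentException("No patterns for locale $locale");
        }
        return self::PATTERNS[$locale][$kind];
    }
}

$formatter = new LocaleFormatter('de_DE');
$moment = gmmktime(13, 5, 0, 6, 18, 2006);   // 18 June 2006, 13:05:00 UTC
$date = $formatter->date($moment);           // default locale, de_DE
$time = $formatter->time($moment, 'en_US');  // per-call override
```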
 
Those are my thoughts so far...
So let's discuss them.
 
Greetings
Thomas

Re: I18N

Steven Van Poeck
Hi Thomas,

I very much like the idea of the basic structure you're proposing for
i18N :)

I believe Jayson called upon people interested in working together on a
i18N project for the ZF:
http://www.zend.com/lists/fw-general/200605/msg00787.html. I definitely
think you should apply :)

Just one small note on the wrapper and sources: for database access, it
should use PDO.

Best regards,
Steven Van Poeck
http://poekie.free.fr

Thomas Weidner wrote:

> [...]
> The CLDR Project seems to me a good approach for defining a complete LOCALE.
> Take a look here http://www.unicode.org/reports/tr35/
> [...]


Re: I18N

Thomas Weidner
Hi Steven,

When using database access for I18N, our Zend_Locale should of course use
Zend_Db and not hand-code its own database access.
That's the idea of having several modules for several purposes.

Additional ideas which we use in our I18N are, for example:

- Automatic language recognition
We recognize the language set by the browser and can display the right
language on the fly.
For example: in the browser the setting is 'fr; q=1.0, en; q=0.5'.

When we have a French (fr) translation, we set the default to French, as it
is weighted 100%.
Otherwise we use English (en), which is weighted 50%.
When we have neither French nor English, we display the language
our application was written in (German).

A nice feature which could be added to Zend_Locale or Zend_Client.
Of course our user can choose another language at any time, anywhere
in the application.
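
The negotiation described above could be sketched like this. The function names are hypothetical, and the sketch makes simplifying assumptions: it ignores the `*` wildcard and falls back from a regional tag such as `de-ch` to its primary language `de`.

```php
<?php
// Parses an Accept-Language header into [languageTag => q], highest q first.
function parseAcceptLanguage(string $header): array
{
    $preferences = [];
    foreach (explode(',', $header) as $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '') {
            continue;
        }
        $q = 1.0; // a missing q parameter means full preference
        foreach (array_slice($pieces, 1) as $param) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $param, $m)) {
                $q = (float) $m[1];
            }
        }
        $preferences[$tag] = $q;
    }
    arsort($preferences);
    return $preferences;
}

// Returns the best available language, or $fallback when nothing matches.
function chooseLanguage(string $header, array $available, string $fallback): string
{
    foreach (parseAcceptLanguage($header) as $tag => $q) {
        if ($q <= 0.0) {
            continue; // q=0 marks a language as not acceptable
        }
        $tag = (string) $tag;
        $primary = explode('-', $tag)[0];
        if (in_array($tag, $available, true)) {
            return $tag;
        }
        if (in_array($primary, $available, true)) {
            return $primary;
        }
    }
    return $fallback;
}

$header = 'fr; q=1.0, en; q=0.5';
$withFrench = chooseLanguage($header, ['fr', 'en', 'de'], 'de');
$withoutFrench = chooseLanguage($header, ['en', 'de'], 'de');
$neither = chooseLanguage($header, ['de'], 'de');
```

In a web request the header would come from `$_SERVER['HTTP_ACCEPT_LANGUAGE']`; an explicit choice by the user would simply take precedence over the negotiated result.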

- Text source
A text-based (text database) or CSV-based source for language files could be
good for users who don't have access to
PDO or gettext. Text/CSV files are good for translation teams, as they often
have problems with XML files.

- Conversion
Convert from one locale to another.
For example, a given input date/time into the format of another language:
$input = '18.06.2006 13:05:00.204';
$output = $I18N->ConvertDate($input, 'fr-FR'); // Output would be, I think,
                                               // '2006/06/18 01:05:00PM' for example
This is a nice feature when, for example, you always have to store values in
one defined language or format.
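
As a rough sketch of such a conversion, PHP's built-in `DateTimeImmutable::createFromFormat()` can parse the German-style input and re-emit it in another pattern. The helper name `convertDate` and both patterns are assumptions; the target pattern follows the French form given in a later reply in this thread ('18/06/2006 13:05:00').

```php
<?php
// Hypothetical helper: parses $input with one pattern and re-formats it with another.
function convertDate(string $input, string $fromPattern, string $toPattern): string
{
    $parsed = DateTimeImmutable::createFromFormat($fromPattern, $input, new DateTimeZone('UTC'));
    if ($parsed === false) {
        throw new InvalidArgumentException("Cannot parse '$input' as '$fromPattern'");
    }
    return $parsed->format($toPattern);
}

// 'u' accepts the fractional seconds of the input (up to six digits).
$output = convertDate('18.06.2006 13:05:00.204', 'd.m.Y H:i:s.u', 'd/m/Y H:i:s');
```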

- Unicode
Of course we should take care to use Unicode source files, as otherwise we
would have problems with
our friends from Asia, as they use extended character sets.

@Jason:
I didn't find the FAQ I read about from you. Where is the button to apply?
:)

OK, that's it for now...
Maybe more ideas later.

Greetings
Thomas


----- Original Message -----
From: "Steven Van Poeck" <[hidden email]>
To: "Thomas Weidner" <[hidden email]>
Cc: <[hidden email]>
Sent: Thursday, May 25, 2006 2:41 PM
Subject: Re: [fw-general] I18N


> Hi Thomas,
>
> I very much like the idea of the basic structure you're proposing for i18N
> :)
>
> I believe Jayson called upon people interested in working together on a
> i18N project for the ZF:
> http://www.zend.com/lists/fw-general/200605/msg00787.html. I definitely
> think you should apply :)
>
> Just one small note on the wrapper and sources: for database access, it
> should use PDO.
>
> Best regards,
> Steven Van Poeck
> http://poekie.free.fr


Re: I18N

Sergej-2
A caching mechanism would be a good idea. Whenever you are parsing XML or getting translations from a database, the result could be cached in a *.php file as an array, with the last modification date as one of the array's elements. Then every time you just compare modification dates and decide whether you need to re-cache or not.
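
A minimal sketch of that idea, assuming a file-based source and a hypothetical helper `loadCachedTranslations()`; the `$loader` callback stands for the expensive step (XML parsing or a database query), and a JSON file is used here only to keep the example short.

```php
<?php
// Hypothetical helper: returns cached messages while the source file is unchanged.
function loadCachedTranslations(string $sourceFile, string $cacheFile, callable $loader): array
{
    clearstatcache();
    $sourceMtime = filemtime($sourceFile);

    if (is_file($cacheFile)) {
        $cached = include $cacheFile;
        if (is_array($cached) && ($cached['mtime'] ?? null) === $sourceMtime) {
            return $cached['messages']; // cache is still fresh
        }
    }

    // Cache missing or stale: run the expensive load and rewrite the cache file.
    $messages = $loader($sourceFile);
    $payload = ['mtime' => $sourceMtime, 'messages' => $messages];
    file_put_contents($cacheFile, '<?php return ' . var_export($payload, true) . ';', LOCK_EX);
    return $messages;
}

$source = tempnam(sys_get_temp_dir(), 'i18n');
file_put_contents($source, json_encode(['hello' => 'Hallo']));
$cache = $source . '.cache.php';

$loads = 0;
$loader = function (string $file) use (&$loads): array {
    $loads++;
    return json_decode(file_get_contents($file), true);
};

$first = loadCachedTranslations($source, $cache, $loader);  // runs the loader
$second = loadCachedTranslations($source, $cache, $loader); // served from the cache file
```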




Re: I18N

Thomas Weidner
In reply to this post by Sergej-2
Sergio,
> A caching mechanism would be a good idea. Whenever you are parsing XML or getting translations from a database, the result could be cached in a *.php file as an array, with the last modification date as one of the array's elements. Then every time you just compare modification dates and decide whether you need to re-cache or not.

Of course we will include some kind of caching, but using a separate PHP file is nonsense...
and I'm no friend of arrays.
Think of a huge translation table like the one we are using, with 0.5 million translation lines.
An array with 0.5 million entries will crash almost every server :)

2nd:
Why should someone use modification dates?
A translation table will never be edited several times a day.
Normally you will change it once a day or even less often.
And modification dates will double our cache memory usage.

Greets
Thomas

Re[2]: I18N

Alex Yuzhakoff
Hello Thomas,

Friday, May 26, 2006, 1:24:09 PM, you wrote:

TW> Of course we will include some kind of caching, but using a
TW> separate PHP file is nonsense...
My two cents :)) Not just one big PHP file, but several small PHP files. For
example, all localization data is separated into sections (for
this page, for another page). Each section is small, so the PHP files for
the cache are small too.

TW> and I'm no friend of arrays.
In the end you really are working with an array of key-value pairs :) So arrays are
often used for this task.


--
Best regards,
  Alexei "SibProgrammer" Yuzhakov,
  Developer, SiteBuilder for Unix,
  SWsoft, Inc.


Re: Re[2]: I18N

Thomas Weidner
Hi,

> TW> Of course we will include some kind of caching, but using a
> TW> separate PHP file is nonsense...
> My two cents :)) Not just one big PHP file, but several small PHP files. For
> example, all localization data is separated into sections (for
> this page, for another page). Each section is small, so the PHP files for
> the cache are small too.

Then we will have thousands of cache files,
each with about 100-500 entries.

But:
The same phrases are often used on several pages.
So our cache would grow. In my case it would not hold only
0.5 million entries; it would grow to about 1.5 million entries or even
more.

A page-separated cache does not seem so good to me.
Maybe an alphabetical one would be better:
a,b,c,d,e
When there are more than 100 'e's it could be split into
a,b,c,d,ea,ee,et,f,g...
for example, so that it adapts dynamically.

We could also use a pre-compiled translation table, as
gettext does with its *.mo files.

> TW> and I'm no friend of arrays.
> In the end you really are working with an array of key-value pairs :) So arrays are
> often used for this task.

Let me correct myself and say:
I'm no friend of an array with 0.5 million entries... :)

And I think that just because arrays are often used doesn't mean that they are
the fastest option for us.

Greets
Thomas


Re: I18N

Thomas Weidner
In reply to this post by Thomas Weidner
Hi Steven,

>> - automatic language recognition

> I'm sorry but I have absolutely no idea what you're talking about:
> languages and percentages? Where's the link between the two?

In every browser you can set your preferred languages,
and you can give each one a weight.

When you set

German = 100%
English = 80%
French = 30%

and the page you are requesting only supports
English and French,
the page automatically knows which language it should display, as the
browser sends this in its header.

This is a feature which is often used by good multi-language sites.
Look at the language settings in your browser and the header the browser sends
to the web server.

>> - Text-Source
> Tell me about it :p I think the text database is a good idea, using
> SQLite or ini files for instance. I rather would not see plain text
> nor CSV files coming through, as I think using those would be very
> error prone. In that case, I'd prefer XML.

I would prefer gettext :)
But the point is that we have several source types.
We could include almost any source type, I think.
Plain text, CSV, INI, XML or database... it's our decision.

And XML is even more error-prone...
when you have to edit it by hand without knowing it.

>> - Conversion
> That would actually be '18/06/2006 13:05:00' in French :)

OK... I've had no French for the last 30 years :))

> This point
> makes me think that if we're heading for conversion possibilities
> -which I think is a very good idea- we'll need to store a "main" value
> which can then be converted to any language.

That was my thought.
Like the Unix timestamp for date/time, for example.

>> - Unicode
> Indeed you should be using unicode source files, but I can't see how
> you would be able to impose that on people. As far as I can see, you
> can only issue that as a recommendation. Or do you know of ways of
> telling which characterset a source file is using ?

Of course only a recommendation...
By the time PHP 6 arrives, at the latest, we should have it :)

Yes, there are possibilities...
XML, for example, defines it in its first line, where you can set the
character set of the XML file.

It's just a convention which we have to define. But there are several
possibilities...
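
Reading the character set from the first line of an XML file could be sketched as below. `declaredXmlEncoding()` is a hypothetical helper; it assumes the file starts with an ASCII-compatible XML declaration (so it would not recognize a UTF-16 encoded file) and returns a caller-supplied default otherwise.

```php
<?php
// Hypothetical helper: returns the encoding named in the XML declaration,
// or $default when the document does not declare one.
function declaredXmlEncoding(string $xml, string $default = 'UTF-8'): string
{
    // Optional UTF-8 byte order mark, then the declaration with its encoding attribute.
    $pattern = '/^(?:\xEF\xBB\xBF)?<\?xml\b[^>]*?\bencoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\1/';
    if (preg_match($pattern, $xml, $m)) {
        return strtoupper($m[2]);
    }
    return $default;
}

$declared = declaredXmlEncoding('<?xml version="1.0" encoding="iso-8859-1"?><messages/>');
$undeclared = declaredXmlEncoding('<messages/>');
```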

Greets
Thomas


Re[4]: I18N

Alex Yuzhakoff
In reply to this post by Thomas Weidner
Hello Thomas,

Friday, May 26, 2006, 2:27:02 PM, you wrote:

TW> Then we will have thousands of cache files,
TW> each with about 100-500 entries.

TW> But:
TW> The same phrases are often used on several pages.
TW> So our cache would grow. In my case it would not hold only
TW> 0.5 million entries; it would grow to about 1.5 million entries or even
TW> more.
That is only disk usage. During a request we load only the sections we need and
nothing more. This approach really is very fast.

TW> A page-separated cache does not seem so good to me.
The cache is based on separation into sections. A section of localization data
representing one page is only one example.

TW> Maybe an alphabetical one would be better:
TW> a,b,c,d,e
TW> When there are more than 100 'e's it could be split into
TW> a,b,c,d,ea,ee,et,f,g...
TW> for example, so that it adapts dynamically.

TW> We could also use a pre-compiled translation table, as
TW> gettext does with its *.mo files.
I personally didn't like gettext because it is harder to maintain than
other sources (XML, DB, INI files).

--
Best regards,
  Alexei "SibProgrammer" Yuzhakov,
  Developer, SiteBuilder for Unix,
  SWsoft, Inc.


Re: I18N

Steven Van Poeck-2
In reply to this post by Thomas Weidner
2006/5/26, Thomas Weidner <[hidden email]>:

> Hi Steven,
>
> >> - automatic language recognition
>
> > I'm sorry but I have absolutely no idea what you're talking about:
> > languages and percentages? Where's the link between the two?
>
> In every browser you can set your preferred languages,
> and you can give each one a weight.
>
> When you set
>
> German = 100%
> English = 80%
> French = 30%
>
> and the page you are requesting only supports
> English and French,
> the page automatically knows which language it should display, as the
> browser sends this in its header.
>
> This is a feature which is often used by good multi-language sites.
> Look at the language settings in your browser and the header the browser sends
> to the web server.

I was completely unaware of this. Thanks for explaining :)

>
> >> - Text-Source
> > Tell me about it :p I think the text database is a good idea, using
> > SQLite or ini files for instance. I rather would not see plain text
> > nor CSV files coming through, as I think using those would be very
> > error prone. In that case, I'd prefer XML.
>
> I would prefer gettext :)
> But the point is that we have several source types.
> We could include almost any source type, I think.
> Plain text, CSV, INI, XML or database... it's our decision.
>

It actually should be the end-user's decision. I do think we should
narrow down the initial choice to the sources that seem the most
convenient to us. If the user wants to use another source, he can
still extend the base class.

> And XML is even more error-prone...
> when you have to edit it by hand without knowing it.
>

But at least it will complain about structural errors, as will the
parsing of an ini file. Plain text or CSV files will not. That's the
reason I would rather not see the latter being used.

> >> - Conversion
> > That would actually be '18/06/2006 13:05:00' in French :)
>
> OK... I've had no French for the last 30 years :))
>
> > This point
> > makes me think that if we're heading for conversion possibilities
> > -which I think is a very good idea- we'll need to store a "main" value
> > which can then be converted to any language.
>
> That was my thought.
> Like the unix timestamp for datetime for example.
>
> >> - Unicode
> > Indeed you should be using unicode source files, but I can't see how
> > you would be able to impose that on people. As far as I can see, you
> > can only issue that as a recommendation. Or do you know of ways of
> > telling which characterset a source file is using ?
>
> Of course only a recommendation...
> By the time PHP 6 arrives, at the latest, we should have it :)
>

Yes I know but for the moment, the ZF is PHP5-based. I don't know if
it will be PHP 6 compatible. And although PHP 6 will natively support
Unicode, I don't know if it will have a mechanism of determining the
encoding of a source file through code.

> Yes, there are possibilities...
> XML, for example, defines it in its first line, where you can set the
> character set of the XML file.
>
> It's just a convention which we have to define. But there are several
> possibilities...

OK. Say we'd oblige any source file to "announce" its encoding on the
first line using a predefined, preferably web-standard-compliant,
format. What about DB sources then? How would we figure out what
character encoding these sources have? I don't think there is any way
in which we can *ensure* the sources are Unicode. So it will remain no
more than a recommendation...

Best regards,

Steven Van Poeck
http://poekie.free.fr

Re: Re[4]: I18N

Thomas Weidner
In reply to this post by Alex Yuzhakoff
Hi,

> That is only disk usage. During a request we load only the sections we need and
> nothing more. This approach really is very fast.

OK...
I just didn't want to declare "we use a disk cache" without other opinions and
tests.

> TW> A page-separated cache does not seem so good to me.
> The cache is based on separation into sections. A section of localization
> data representing one page is only one example.

But also consider that there are several languages... so the cache will grow
with each language.

> TW> We could also use a pre-compiled translation table, as
> TW> gettext does with its *.mo files.
> I personally didn't like gettext because it is harder to maintain than
> other sources (XML, DB, INI files).

Harder to maintain???
Ever used Poedit?

Our project is about 7 million lines of PHP code with 0.5 million output strings.
Just one click in Poedit and the translation table is updated and I can send
it to the translators.
Changes are highlighted and always on top.

For huge sites like mine, XML or INI would be horrible to maintain. :)

Greets
Thomas


Re: I18N

Thomas Weidner
In reply to this post by Steven Van Poeck-2
Hi,

>> But the point is that we have several source types.
>> We could include almost any source type, I think.
>> Plain text, CSV, INI, XML or database... it's our decision.
>
>It actually should be the end-user's decision. I do think we should
>narrow down the initial choice to the sources that seem the most
>convenient to us. If the user wants to use another source, he can
>still extend the base class.

By "our decision" I meant that we declare that the framework will
initially accept the source formats xxx, yyy, zzz.
And the user can decide which one to use.
Of course they can extend it... I even hope that someone will extend it :)

>> And XML is even more error-prone...
>> when you have to edit it by hand without knowing it.
>
> But at least it will complain about structural errors, as will the
> parsing of an ini file. Plain text or CSV files will not. That's the
> reason I would rather not see the latter being used.

As you said... it's the end user's decision. I prefer gettext for huge
projects.
Maybe we will switch to a text DB... the future is open :)

>> >> - Unicode
>> Of course only a recommendation...
>> By the time PHP 6 arrives, at the latest, we should have it :)

> Yes I know but for the moment, the ZF is PHP5-based. I don't know if
> it will be PHP 6 compatible. And although PHP 6 will natively support
> Unicode, I don't know if it will have a mechanism of determining the
> encoding of a source file through code.

OK... it was just a wish to have Unicode, as I see the problems with
multi-language support in Asia.

> OK. Say we'd oblige any source file to "announce" its encoding on the
> first line using a predefined, preferably web-standard-compliant,
> format. What about DB sources then? How would we figure out what
> character encoding these sources have? I don't think there is any way
> in which we can *ensure* the sources are Unicode. So it will remain no
> more than a recommendation...

Of course we cannot ensure that the sources are Unicode.
But when the user declares Unicode, we will handle it as Unicode.
When nothing is declared, we will use the default charset (UTF-8?) or try to
detect it.

For a database we could do something like this:
$I18N = new Zend_Locale();
$I18N->SetLocale('MSSQL', 'de_DE', 'UTF-16');
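
The "declared charset, else default" rule could be sketched with PHP's `iconv()`. The helper name `toUtf8()` and the choice of UTF-8 as the internal encoding are assumptions; automatic detection is left out.

```php
<?php
// Hypothetical helper: normalizes text from a declared charset to UTF-8.
// When nothing is declared, the text is assumed to be UTF-8 already.
function toUtf8(string $text, ?string $declaredCharset = null): string
{
    $charset = $declaredCharset ?? 'UTF-8';
    if (strcasecmp($charset, 'UTF-8') === 0) {
        return $text;
    }
    $converted = iconv($charset, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert text from $charset to UTF-8");
    }
    return $converted;
}

$latin1 = "Stra\xDFe";                       // "Straße" encoded as ISO-8859-1
$fromLatin1 = toUtf8($latin1, 'ISO-8859-1'); // ß becomes the two bytes C3 9F
$untouched = toUtf8('plain text');           // nothing declared: passed through
```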

Greets
Thomas


Re[6]: I18N

Alex Yuzhakoff
In reply to this post by Thomas Weidner
Hello Thomas,

Friday, May 26, 2006, 3:41:59 PM, you wrote:

>> TW> A page-separated cache does not seem so good to me.
>> The cache is based on separation into sections. A section of localization
>> data representing one page is only one example.

TW> But also consider that there are several languages... so the cache will grow
TW> with each language.
My point of view is that 500 MB of cache data and a very fast application
is better than a very slow application without a cache. And of course we must
have an option to switch caching on and off.

>> TW> We could also use a pre-compiled translation table, as
>> TW> gettext does with its *.mo files.
>> I personally didn't like gettext because it is harder to maintain than
>> other sources (XML, DB, INI files).

TW> Harder to maintain???
TW> Ever used Poedit?
It was not only me; the same came up when working on different projects with
different people. But, as already discussed, ZF must support different sources
for localization data.


--
Best regards,
  Alexei "SibProgrammer" Yuzhakov,
  Developer, SiteBuilder for Unix,
  SWsoft, Inc.


Re[7]: I18N

Thomas Weidner
Hi,

> My point of view is that 500 MB of cache data and a very fast
> application is better than a very slow application without a cache.
> And of course we must
> have an option to switch caching on and off.

OK... that's a fine point: caching as an option.

> TW> Harder to maintain???
> TW> Ever used Poedit?
> It was not only me; the same came up when working on different projects with
> different people. But, as already discussed, ZF must support different sources
> for localization data.

That's all I said in my first post...
We need to support different sources! :)

Greets
Thomas
( I signed an NDA so no company card to send here :( )


Re: I18N

Sergej-2
In reply to this post by Thomas Weidner
Thomas Weidner wrote:
> 2nd:
> Why should someone use modification dates?
> A translation table will never be edited several times a day.
> Normally you will change it once a day or even less often.
> And modification dates will double our cache memory usage.
Because it will be, in the development/testing stage.

A few words about arrays. I'm very skeptical about 0.5mb arrays. Such huge
arrays should definitely be split into smaller ones. Then you could load
just the part you need, depending on the action (for example).

--
http://www.mif.vu.lt/~sean3322/other/signature


Re: I18N

Thomas Weidner
Hi,

>> Why should someone use modification dates?
>> A translation table will never be edited several times a day.
> Because it will be, in the development/testing stage.

But when you are in development/testing you can also drop the cache and
recreate it.
There's no need for modification timestamps.

> A few words about arrays. I'm very skeptical about 0.5mb arrays. Such huge
> arrays should definitely be split into smaller ones. Then you could load
> just the part you need, depending on the action (for example).

Yup, just my thought :)
But with a separated caching mechanism, as discussed with Alex and Steven,
it will work quite well.

The question is how to split the cache into parts.
A per-file split does not seem very effective to me, as you often use the
same phrase
on several pages.
A per-phrase split would be better, I think. Like a dictionary:
book 1 from Aa-Dg, book 2 from Dh-Rt and so on...
And it would be dynamic enough that it can grow very large and of course
still be accessed quickly,
as the caching mechanism chooses how many "books" it
creates.
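
The dictionary-style split could be sketched as follows. All function names are hypothetical, and the books are kept in memory here; in a real cache each book would presumably be its own file, so that only the one book covering the requested phrase has to be loaded.

```php
<?php
// Hypothetical sketch of the "dictionary books" split; names are illustrative.

// Sorts the messages by key and cuts them into books of at most $maxPerBook entries.
function splitIntoBooks(array $messages, int $maxPerBook): array
{
    ksort($messages, SORT_STRING);
    return array_chunk($messages, $maxPerBook, true);
}

// The first key of each book; the first book also covers everything before it.
function bookStarts(array $books): array
{
    $starts = [];
    foreach ($books as $i => $book) {
        $starts[] = $i === 0 ? '' : (string) array_key_first($book);
    }
    return $starts;
}

// Index of the only book that can contain $phrase (linear scan over the starts).
function bookIndexFor(string $phrase, array $starts): int
{
    $index = 0;
    foreach ($starts as $i => $start) {
        if (strcmp($phrase, $start) < 0) {
            break;
        }
        $index = $i;
    }
    return $index;
}

$messages = [
    'house' => 'Haus', 'apple' => 'Apfel', 'zebra' => 'Zebra',
    'door' => 'Tür', 'book' => 'Buch',
];
$books = splitIntoBooks($messages, 2);   // three books: apple-book, door-house, zebra
$starts = bookStarts($books);
$index = bookIndexFor('house', $starts);
$translation = $books[$index]['house'] ?? 'house';
```

Because the books are cut by entry count rather than by page, a phrase used on many pages is stored once, and the number of books grows with the table.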

Greets
Thomas


Re: I18N

GavinZend
What *simple* i18n and l10n capabilities do we *absolutely* need *now*
in the ZF?

Instead of focusing now on the mechanics of how i18n (i.e.
gettext-style) translations might be implemented, I'm concerned about
what features we might want in the ZF.  Regarding very recent posts:
- Support for a simple string translation system, where a given string
is mapped to another, possibly in a different character set, subject to
an interpretation context (Zend_Locale).
- If a browser specifies a set of language preferences, what supporting
features should exist in the ZF (if any)?

For example, given the widespread adoption of gettext and
http://savannah.nongnu.org/projects/php-gettext/  within PHP projects,
what features and capabilities do you *not* need or are *missing* from
php-gettext?

Since I have absolute confidence that our contributors on this list
could skillfully create a superb implementation of a gettext clone in
pure PHP, I'd like to ignore the implementation details like
storage/caching/performance issues for the moment, and arrive at a
common understanding of what should be different in the API exposed by a
ZF locale translation component.

For example, the API around gettext presumes a single active, global
locale (see PHP example usage:
http://cvs.savannah.nongnu.org/viewcvs/php-gettext/examples/pigs_dropin.php?rev=1.3&root=php-gettext&view=auto 
), where we would want all translations to occur within the context of
an instance of Zend_Locale, without limiting the use of different
Zend_Locale instances throughout our application code.  At the same time though,
we shouldn't make it difficult for an application to use localization
and internationalization functions all within the context of a single
Zend_Locale.
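
One way to get both properties, sketched under assumptions: translations always go through an instance, and an optional default instance covers the single-locale case. `Translator` and the helper `t()` are hypothetical names, not part of gettext, php-gettext or the ZF.

```php
<?php
// Hypothetical sketch: instance-scoped translation plus an optional default instance.
final class Translator
{
    private static ?Translator $default = null;

    private string $locale;
    private array $messages;

    public function __construct(string $locale, array $messages)
    {
        $this->locale = $locale;
        $this->messages = $messages;
    }

    public static function setDefault(Translator $translator): void
    {
        self::$default = $translator;
    }

    public static function getDefault(): Translator
    {
        if (self::$default === null) {
            throw new LogicException('No default translator has been set');
        }
        return self::$default;
    }

    public function locale(): string
    {
        return $this->locale;
    }

    // Returns the message id unchanged when no translation exists.
    public function translate(string $messageId): string
    {
        return $this->messages[$messageId] ?? $messageId;
    }
}

// Convenience for applications that only ever need one locale.
function t(string $messageId): string
{
    return Translator::getDefault()->translate($messageId);
}

$german = new Translator('de_DE', ['pig' => 'Schwein']);
$french = new Translator('fr_FR', ['pig' => 'cochon']);
Translator::setDefault($german);

$viaDefault = t('pig');                   // single-locale style
$viaInstance = $french->translate('pig'); // a second locale in the same request
```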

Cheers,
Gavin

P.S.
The i18n mailing list discusses details for upcoming support of i18n
features in PHP 6 (http://marc.theaimsgroup.com/?l=php-i18n).
Also, support for Date objects exists in recent builds of PHP 5, but is
not yet enabled by default at compile time.