How to implement i18n without performance overhead
Internationalization (i18n) is often tedious to implement and can cost noticeable performance at runtime. Implementations typically call gettext() or a custom t() function to translate a string. t() looks up a translation key in an INI or XML file and returns the value. For example, t('setting', 'de') returns the German translation 'Einstellung'.
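As a point of comparison, a runtime t() of the kind described above might look roughly like the following. This is only a sketch: the inlined INI content, the per-language cache and the function body are assumptions for illustration, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical runtime translator, shown only as the baseline being discussed.
// The INI content is inlined so the sketch runs without extra files.
function t(string $key, string $lang): string {
    static $ini = ['de' => "setting = Einstellung\n"];
    static $cache = [];
    if (!isset($cache[$lang])) {
        // Parse once per request and language, then reuse the array.
        $cache[$lang] = parse_ini_string($ini[$lang] ?? '') ?: [];
    }
    // Every translated string still costs a function call and a hash lookup.
    return $cache[$lang][$key] ?? $key;
}

echo t('setting', 'de'), "\n"; // Einstellung
echo t('setting', 'en'), "\n"; // setting (no translation, key is returned)
```

Even with the parsed array cached, each call to t() happens at request time, which is the cost the rest of this post tries to remove.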
Typical optimizations load associative arrays (hash maps) into APC or Memcached. This still needs memory for the arrays and spends CPU cycles on every call to t(). So the question is: can we do better?
Yes! We run a just-in-time compiler over our PHP files that replaces every translatable string with its localized text, and write the compiled PHP files to disk, so APC can cache them like regular PHP files.
An example PHP class looks like this:
The "{t}" and "{/t}" patterns serve as opening and closing tags indicating strings to be translated.
// example.php
<?php
class example {
    public function now() {
        return '{t}Hello World, now it is:{/t} '.date('{t}m/d/Y g:i a{/t}');
    }
}
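To see which strings a file marks for translation, the tags can be pulled out with a regular expression. The helper below is a hypothetical addition (the name extract_strings() is not part of the original code); it reuses the pattern that the compiler in this post applies, so it finds exactly the strings the compiler would replace.

```php
<?php
// Hypothetical helper: list every string wrapped in {t}...{/t}.
function extract_strings(string $source): array {
    // Text between the tags may contain anything except "{".
    preg_match_all('!\{t\}([^\{]+)\{/t\}!', $source, $matches);
    // Drop duplicates and reindex the result.
    return array_values(array_unique($matches[1]));
}

$source = "return '{t}Hello World, now it is:{/t} '.date('{t}m/d/Y g:i a{/t}');";
print_r(extract_strings($source));
```

A list like this can serve as a starting point for a new translation file.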
The translation files look like this:
Each language has one translation file (lang/<language-code>.ini). Each translation item occupies one line: the English string first, followed by " = " and the localized string.
// lang/en.ini (empty)
// lang/de.ini
Hello World, now it is: = Hallo Welt, jetzt ist es:
m/d/Y g:i a = d/m/Y H:i
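The code in this post reads these files with parse_ini_file(). The PHP manual notes that certain characters (for example ?, {, }, !, ( and )) must not appear in INI keys, and that words such as yes, no, true and null are reserved, so English strings containing them may not survive parsing. If that becomes a problem, one option is a small hand-written parser for the "English = Localized" format. The function below is a hypothetical sketch of that idea, not part of the original code.

```php
<?php
// Hypothetical parser for the "English = Localized" line format described above.
function parse_translations(string $contents): array {
    $strings = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        // Split on the first " = " only, so the localized text may contain "=".
        $parts = explode(' = ', $line, 2);
        if (count($parts) === 2 && trim($parts[0]) !== '') {
            $strings[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $strings;
}

$de = parse_translations(
    "Hello World, now it is: = Hallo Welt, jetzt ist es:\n" .
    "m/d/Y g:i a = d/m/Y H:i\n"
);
print_r($de);
```

Lines without " = " are skipped, so an empty file such as lang/en.ini simply yields an empty array.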
Now we need a proxy to translate the PHP class before including it:
Before including example.php, translate() checks whether a translated version already exists in cache/ and builds one if not. Because the cache file name contains the modification time of example.php, a changed example.php triggers a rebuild as well. The build first converts the translation file (.ini) into a .php file, then replaces the tagged strings in example.php using those translations. The output is stored in cache/.
// instead of
require("core/example.php");
echo (new example())->now();

// we write
define('LANG', 'en');
require(translate('core/example.php'));
echo (new example())->now();

// input: example.php
// output: cache/<lang>_example.php_<timestamp>.php
function translate($file) {
    // the mtime of $file is part of the name, so editing $file forces a rebuild
    // (editing only lang/<lang>.ini does not change this name)
    $cache_file = 'cache/'.LANG.'_'.basename($file).'_'.filemtime($file).'.php';
    // (re)build translation?
    if (!file_exists($cache_file)) {
        $lang_file = 'lang/'.LANG.'.ini';
        $lang_file_php = 'cache/'.LANG.'_'.filemtime($lang_file).'.php';
        // convert .ini file into .php file
        if (!file_exists($lang_file_php)) {
            file_put_contents($lang_file_php, '<?php $strings='.
                var_export(parse_ini_file($lang_file), true).';', LOCK_EX);
        }
        // translate .php into localized .php file
        $tr = function($match) use ($lang_file_php) {
            // load the translation array once, on the first match
            static $strings = null;
            if ($strings===null) require($lang_file_php);
            // fall back to the English text if no translation exists
            return isset($strings[ $match[1] ]) ? $strings[ $match[1] ] : $match[1];
        };
        // replace all {t}abc{/t} by tr()
        file_put_contents($cache_file, preg_replace_callback(
            '!\{t\}([^\{]+)\{/t\}!', $tr, file_get_contents($file)), LOCK_EX);
    }
    return $cache_file;
}
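One thing to watch with this approach: the localized text is pasted straight into PHP source, so a translation containing a single quote (for example French "aujourd'hui") would end the surrounding string literal early. The sketch below shows one way to guard against that. It is a hypothetical in-memory variant of the replacement step (the name translate_source() is an assumption), and it assumes every {t}...{/t} tag sits inside a single-quoted PHP string.

```php
<?php
// Hypothetical in-memory variant of the replacement step, with escaping added.
// Assumes every {t}...{/t} tag sits inside a single-quoted PHP string literal.
function translate_source(string $source, array $strings): string {
    return preg_replace_callback(
        '!\{t\}([^\{]+)\{/t\}!',
        function (array $match) use ($strings): string {
            if (!isset($strings[$match[1]])) {
                // Untranslated text was already valid source, leave it alone.
                return $match[1];
            }
            // Escape backslashes and single quotes so the literal stays valid.
            return addcslashes($strings[$match[1]], "'\\");
        },
        $source
    );
}

$source = "echo '{t}What is new today?{/t}';";
$fr = ['What is new today?' => "Quoi de neuf aujourd'hui ?"];
echo translate_source($source, $fr), "\n";
// echo 'Quoi de neuf aujourd\'hui ?';
```

Keeping the replacement as a pure string-to-string function also makes it easy to test without touching the cache directory.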
To make things even faster, we can skip file_exists() and filemtime() by using a small static compiler instead of the just-in-time compiler:
// compiler.php (run once at deploy time, not per request)
$langs = array('en', 'de');
$files = array('core/example.php');
foreach ($langs as $lang) {
    // load translations
    $strings = parse_ini_file('lang/'.$lang.'.ini');
    foreach ($files as $file) {
        // translate .php into localized .php file
        $tr = function($match) use ($strings) {
            // fall back to the English text if no translation exists
            return isset($strings[ $match[1] ]) ? $strings[ $match[1] ] : $match[1];
        };
        // replace all {t}abc{/t} by tr()
        file_put_contents('cache/'.$lang.'_'.basename($file), preg_replace_callback(
            '!\{t\}([^\{]+)\{/t\}!', $tr, file_get_contents($file)), LOCK_EX);
    }
}

// index.php
define('LANG', 'en');
require('cache/'.LANG.'_example.php');
echo (new example())->now();
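Because the static compiler names its output after basename($file), two source files with the same name in different directories would overwrite each other in cache/. If a project has such files, the cache name could be derived from the whole relative path instead. The helper below is a hypothetical alternative naming scheme, not what the code above does.

```php
<?php
// Hypothetical naming helper: keep the directory part in the cache file name.
function cache_path(string $lang, string $file): string {
    // Replace directory separators so "core/a/x.php" and "core/b/x.php"
    // map to different cache files.
    return 'cache/' . $lang . '_' . str_replace(['/', '\\'], '_', $file);
}

echo cache_path('de', 'core/example.php'), "\n"; // cache/de_core_example.php
```

With this scheme, index.php would require cache_path(LANG, 'core/example.php') rather than building the name by hand.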
Sometimes a translated string has to be combined with other values whose position depends on the language, e.g. "10 EUR" versus "USD 10". This is easy to handle with sprintf():
// PHP
$str = sprintf('{t}USD %d{/t}', 10);
// lang/de.ini
USD %d = %d EUR
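When a string contains more than one value and the order differs between languages, sprintf() also supports numbered placeholders such as %1$s and %2$d. The format strings below are hypothetical examples of what the compiled files might contain after the {t}...{/t} tag has been replaced.

```php
<?php
// Hypothetical format strings, one per language. Single quotes keep PHP
// from treating "$s" and "$d" as variables.
$formats = [
    'en' => '%1$s %2$d', // currency first
    'de' => '%2$d %1$s', // amount first
];

echo sprintf($formats['en'], 'USD', 10), "\n"; // USD 10
echo sprintf($formats['de'], 'EUR', 10), "\n"; // 10 EUR
```

The calling code passes the arguments in one fixed order, and each translation decides where they appear.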
By using a compiler for translations, we can make i18n a lot easier and faster!