Localizing Pelican Sites
Localization
As mentioned before, localization is provided via the i18n_subsites plugin. This plugin employs the jinja2.ext.i18n extension to achieve this.
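For orientation, a minimal pelicanconf.py sketch of how the plugin and the Jinja2 extension are typically wired together. The plugin path, theme path, default language, and site name below are assumptions for illustration, not values taken from this project. Older Pelican releases configure Jinja2 extensions through a different setting, so check this against your version.

```python
# pelicanconf.py (fragment): a sketch, not this project's actual configuration

PLUGIN_PATHS = ["plugins"]          # assumption: where the plugin is checked out
PLUGINS = ["i18n_subsites"]

# Enable the Jinja2 i18n extension so {% trans %} blocks work in templates
JINJA_ENVIRONMENT = {"extensions": ["jinja2.ext.i18n"]}

THEME = "theme"                     # assumption: theme lives in ./theme
DEFAULT_LANG = "en"                 # assumption: main site language
I18N_TEMPLATES_LANG = "en"          # language the template strings are written in

# One entry per extra language; each maps to settings overridden for that subsite
I18N_SUBSITES = {
    "zh": {
        "SITENAME": "Example site (zh)",  # hypothetical override
    },
}
```

Each key in I18N_SUBSITES is a language code, and it also becomes the name of the subfolder that the translated copy of the site is written to.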
Localization works at two specific levels:
- Theme templates. Templates often contain hardcoded text that needs to be localized. This is especially the case for our design, as the homepage contains text that is not pulled from any of the site's articles or pages.
- Pages & articles. These are typically written in Markdown and need to be localized as well.
We'll examine the localization process for both below.
Theme templates
Theme files are localized using Babel, which extracts the localizable content wrapped in {% trans %} and {% endtrans %} blocks within the templates. Babel includes a command-line tool, pybabel, that helps you do this.
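To make "localizable content" concrete, here is a rough illustration of which strings end up in the message template. It uses a simplified regular expression, not Babel's actual Jinja2 extractor, and the template fragment and its text are invented for the example.

```python
import re

# Hypothetical template fragment; the wording is made up for illustration.
TEMPLATE = """
<h1>{% trans %}Welcome to our site{% endtrans %}</h1>
<p>{{ article.summary }}</p>
<footer>{% trans %}All rights reserved{% endtrans %}</footer>
"""

# Simplified pattern: handles plain {% trans %}...{% endtrans %} pairs only,
# not variables, pluralization, or other features of the real extractor.
TRANS_BLOCK = re.compile(
    r"\{%-?\s*trans\s*-?%\}(.*?)\{%-?\s*endtrans\s*-?%\}", re.DOTALL
)


def localizable_strings(source):
    """Return the text found inside each trans block, in document order."""
    return [match.strip() for match in TRANS_BLOCK.findall(source)]


print(localizable_strings(TEMPLATE))
# → ['Welcome to our site', 'All rights reserved']
```

Text that comes from template variables, such as the article summary above, is not picked up. That content is localized through the pages and articles themselves.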
There are two distinct scenarios: initially creating the translation files, and subsequently updating them for any changes that occur during routine maintenance.
Initial creation
- Extract the localizable strings to a message template file. This creates a new messages.pot file, replacing any existing template file and its contents.
  $ cd theme
  $ pybabel extract --mapping ../babel.cfg -o messages.pot .
  Note that since we're localizing the theme, we have to work from the theme subfolder.
- Create the translations catalog for each language that you want to support (as specified in pelicanconf.py), based on the generated message template file. In our case, we only have one extra language to support: zh.
  $ cd theme
  $ pybabel init --input-file messages.pot --output-dir translations/ --locale zh
- Compile the created .PO catalog files to their compiled format, a .MO file. This command is to be issued from the project home folder.
  $ cd <project-home>
  $ pybabel compile --directory theme/translations/
This compiles all the .PO files in theme/translations/ into binary .MO files. With this done, when the pelican content command is issued to generate the site, a replica of all the theme HTML files is generated for each supported language and placed in the <lang-code> folder under ./output.
Since we only support zh as an extra language, a zh version of our site is generated under the ./output/zh folder.
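The three initial-creation steps can also be scripted. The sketch below only assembles the commands shown above, with one init step per extra language, and prints them. The helper names are hypothetical, and it assumes the script is started from the project home folder with the theme in ./theme.

```python
import subprocess

THEME_DIR = "theme"   # theme subfolder used in the steps above
LOCALES = ["zh"]      # extra languages configured in pelicanconf.py


def initial_creation_steps(locales):
    """Return (working directory, argv) pairs for extract, init, and compile."""
    steps = [
        (THEME_DIR, ["pybabel", "extract", "--mapping", "../babel.cfg",
                     "-o", "messages.pot", "."]),
    ]
    for locale in locales:
        steps.append(
            (THEME_DIR, ["pybabel", "init", "--input-file", "messages.pot",
                         "--output-dir", "translations/", "--locale", locale])
        )
    # Compile runs from the project home folder (assumed to be the current dir)
    steps.append((".", ["pybabel", "compile", "--directory", "theme/translations/"]))
    return steps


def run_steps(steps):
    """Execute each step in its working directory, stopping on the first failure."""
    for cwd, argv in steps:
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Print the plan only; call run_steps(...) to actually execute it.
    for cwd, argv in initial_creation_steps(LOCALES):
        print(f"(in {cwd}) $ {' '.join(argv)}")
```

Keeping the command construction separate from execution makes it easy to review the plan before anything overwrites messages.pot.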
Updating for changes
There's one additional step involved. As we maintain the site, new strings will be added and existing ones tweaked. In this scenario we need to regenerate the message template file (with the pybabel extract command above) and then update the translations for each supported language. For the latter, use the pybabel update subcommand.
- Regenerate the messages template .POT file to capture all the new strings:
  $ cd theme
  $ pybabel extract --mapping ../babel.cfg -o messages.pot .
- Update the .PO file with the new and modified strings.
  $ cd theme
  $ pybabel update --input-file messages.pot --output-dir translations/ --locale zh
- Translate the strings and compile the PO file to an MO file.
  $ cd <project-home>
  $ pybabel compile --directory theme/translations/
The pybabel update command ensures that existing translations are preserved while new strings are added to the PO file. When an existing string changes, the original translation is commented out and the string is marked as fuzzy. You can disable this with the --no-fuzzy-matching option.
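Fuzzy entries are flagged in the PO file with a "#, fuzzy" comment line, and pybabel compile leaves them out unless --use-fuzzy is given, so it helps to review them before compiling. Below is a minimal sketch that lists the flagged strings. It handles single-line msgid entries only, and the sample catalog, including its translations, is invented for illustration.

```python
# Hypothetical excerpt of a catalog after running pybabel update.
SAMPLE_PO = '''
#: templates/index.html:12
#, fuzzy
msgid "Welcome to our new site"
msgstr "欢迎来到我们的网站"

#: templates/base.html:40
msgid "All rights reserved"
msgstr "版权所有"
'''


def fuzzy_msgids(po_text):
    """Return the msgids of entries carrying the fuzzy flag."""
    flagged = []
    is_fuzzy = False
    for raw_line in po_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#,") and "fuzzy" in line:
            is_fuzzy = True                      # flag applies to the next msgid
        elif line.startswith("msgid "):
            msgid = line[len("msgid "):].strip('"')
            if is_fuzzy and msgid:               # skip the empty header msgid
                flagged.append(msgid)
            is_fuzzy = False
    return flagged


print(fuzzy_msgids(SAMPLE_PO))
# → ['Welcome to our new site']
```

Once a flagged translation has been checked and corrected, removing its "#, fuzzy" line marks it as reviewed.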
Pages & articles
Localization of pages and articles works in a different way. For each article or page that you wish to localize, create a duplicate file. You may wish to use the language code as the filename suffix to identify the file as being the different language version of the same file.
Then, in all the files, specify the same slug value but a different value for lang. The shared slug connects the files as different language versions of the same article or page. For example, if you have a page about.md, you may want to create about_zh.md for its Chinese version and specify the same value for the slug field in both, thereby connecting the two pages as the same page in two different languages.
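The pairing rule can be sketched in a few lines. The example below reads the "Key: value" metadata header of two hypothetical Markdown files and groups them by slug. The file contents are invented, and the grouping function only illustrates the idea; it is not how Pelican or the plugin implements it.

```python
from collections import defaultdict

# Hypothetical content files: same Slug, different Lang.
FILES = {
    "about.md": "Title: About\nSlug: about\nLang: en\n\nAbout us.\n",
    "about_zh.md": "Title: 关于\nSlug: about\nLang: zh\n\n关于我们。\n",
}


def read_metadata(text):
    """Parse the leading 'Key: value' lines, stopping at the first blank line."""
    meta = {}
    for line in text.splitlines():
        if not line.strip():
            break
        key, _, value = line.partition(":")
        meta[key.strip().lower()] = value.strip()
    return meta


def translations_by_slug(files):
    """Map each slug to a {lang: filename} dict of its language versions."""
    groups = defaultdict(dict)
    for name, text in files.items():
        meta = read_metadata(text)
        groups[meta["slug"]][meta["lang"]] = name
    return dict(groups)


print(translations_by_slug(FILES))
# → {'about': {'en': 'about.md', 'zh': 'about_zh.md'}}
```

The filename suffix plays no part in the grouping. It is only a convention for people browsing the content folder.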