How to convert .xml to .po and back

Asked by PeanutBlake on 2010-01-20

Dear all,

Launchpad.net is really excellent for collaborative translation, so I'm going to set up an ADempiere ERP translation project here. (ADempiere ERP is an open source project on sourceforge.net; here is the link: http://sourceforge.net/projects/adempiere)

There is only one problem: I cannot convert .xml files to .po files.
I've tried po4a, but the .po file is not what I want. Here is the command I used:
$ po4a-gettextize -f guide -m origin.xml > untranslate.po

There are about 40 .xml files in total to translate in ADempiere ERP.
How can I convert them into .po files?
And how can I convert .po files back to the .xml?

Any suggestions are appreciated! Thanks!

Example of ADempiere xml file:
--------------------------------
<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
 </row>
</adempiereTrl>
--------------------------------

.po file format that I want:
--------------------------------
#. type: Name
#: id="100"
msgid "Table and Column"
msgstr ""

#. type: Description
#: id="100"
msgid "Maintain Tables and Columns"
msgstr ""

#. type: Help
#: id="100"
msgid "The Table and Column Window defines all tables with their columns"
msgstr ""
--------------------------------

Question information

Language: English
Status: Solved
For: Launchpad itself
Assignee: No assignee
Solved by: PeanutBlake
Solved: 2010-01-29
Last query: 2010-01-29
Last reply: 2010-01-27

Henning Eggers (henninge) said : #1

I do not have an answer, but what you are looking for is *extraction*, not conversion: you want to extract the translatable strings from the XML files. That may be the term you need to search for.

David Planella (dpm) said : #2

You can use the xml2po application for both extraction and merging back to xml files. Quoting the xml2po man page:

"
EXAMPLES:
    To create a POTemplate book.pot from input files chapter1.xml and
    chapter2.xml, run the following:
        /usr/bin/xml2po -o book.pot chapter1.xml chapter2.xml

    After translating book.pot into de.po, merge the translations back,
    using -p option for each XML file:
        /usr/bin/xml2po -p de.po chapter1.xml > chapter1.de.xml
        /usr/bin/xml2po -p de.po chapter2.xml > chapter2.de.xml
"
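
xml2po performs this merge for the XML formats it has a mode for. For the flat ADempiere layout shown in the question, the merge-back step could be sketched roughly as below. This is only an illustration under assumptions: the function name `merge_translations` is hypothetical, the translations are assumed to have already been read from the .po file into a plain dict of msgid to msgstr, and any other changes a translated ADempiere file may need (such as the `language` attribute) are left out.

```python
import xml.etree.ElementTree as ET


def merge_translations(xml_text, translations):
    """Replace the text of each <value> with its translation, if one exists.

    `translations` maps source strings (msgid) to translated strings (msgstr).
    Untranslated or empty entries leave the original text in place.
    """
    root = ET.fromstring(xml_text)
    for value in root.iter("value"):
        source = (value.text or "").strip()
        translated = translations.get(source)
        if translated:
            value.text = translated
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
 </row>
</adempiereTrl>"""

# Illustrative translation table; in practice it would come from the .po file.
merged = merge_translations(SAMPLE, {"Table and Column": "Tabelle und Spalte"})
print(merged)
```

Because the lookup is keyed on the source string alone, two rows with the same English text receive the same translation, which matches how a .po file without message contexts behaves.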

Aaron Bentley (abentley) said : #3

Please try xml2po as David suggested.

Yes, I would suggest using xml2po. If you want to slightly mis-use source references and comments for translators like you suggest, you may need to provide a new mode for xml2po (look at figure extraction in docbook.py mode to see how to achieve something similar).

PeanutBlake (peanutblake) said : #5

Thanks, everyone!
It seems that I have to create a template if I want to use xml2po or po4a.
Another way is to write a script to deal with it.
Henning Eggers is right: it is an extraction rather than a simple conversion.
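
A script of that kind could look roughly like the sketch below, which uses Python's standard `xml.etree.ElementTree` to emit entries in the layout requested in the question. It is an illustration under assumptions, not the script that was eventually written: the names `po_quote`, `extract_entries` and `ENTRY_TEMPLATE` are hypothetical, and only the `row`/`value` structure from the example file is handled.

```python
import xml.etree.ElementTree as ET

# Entry layout taken from the ".po file format that I want" example.
ENTRY_TEMPLATE = '#. type: {column}\n#: id="{row_id}"\nmsgid "{msgid}"\nmsgstr ""\n'


def po_quote(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def extract_entries(xml_text):
    """Return one PO entry per non-empty <value> element inside each <row>."""
    root = ET.fromstring(xml_text)
    entries = []
    for row in root.iter("row"):
        for value in row.iter("value"):
            text = (value.text or "").strip()
            if not text:
                continue  # nothing to translate
            entries.append(ENTRY_TEMPLATE.format(
                column=value.get("column"),
                row_id=row.get("id"),
                msgid=po_quote(text),
            ))
    return entries


SAMPLE = """<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
 </row>
</adempiereTrl>"""

entries = extract_entries(SAMPLE)
print("\n".join(entries))  # entries separated by blank lines
```

Two caveats: a real .pot file also needs a header entry, and gettext tools reject duplicate msgids, so repeated strings across rows would have to be merged into a single entry (or distinguished with msgctxt) before upload.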

PeanutBlake (peanutblake) said : #6

Dear all,

I finally solved the problem with a Python script.
The etree package works perfectly for XML.
And now the POT files are ready to upload to lp.net ^_^

Peanut Blake

David Planella (dpm) said : #7

Thanks for the update.

However, I'd be interested to know: which part of xml2po or po4a did not work for you, so that you had to resort to writing your own script?

PeanutBlake (peanutblake) said : #8

Hi, David Planella

Thank you for your interest! I didn't write any code that duplicates what xml2po or po4a can already do (that would be pointless work ^_^). I just wrote a Python script to convert the original XML format into a new one, so that po4a-gettextize can generate useful information.

The "Original XML format" only generates "POT file -Origin". As you can see, it lacks useful information such as the ID number and column type, which are very important for translating ADempiere ERP.

So I designed a "New XML format" that generates "POT file -New", which shows the ID number and column type, as in "#. type: Content of: <id-100><Name>". That is what the Python script does.

Here are the results (still waiting for approval ^_^):
https://translations.edge.launchpad.net/aderp/+imports
=========================================
EXAMPLE:

------Original XML format-------
-----------------------------------------
<row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
</row>
-----------------------------------------

------POT file -Origin---------------
-----------------------------------------
#. type: Content of: <row><value>
msgid "Table and Column"
msgstr ""

#. type: Content of: <row><value>
msgid "Maintain Tables and Columns"
msgstr ""

#. type: Content of: <row><value>
msgid "The Table and Column Window defines all tables with their columns"
msgstr ""
--------------------------------

------New XML format--------
-----------------------------------------
<id-100>
  <Name>Table and Column</Name>
  <Description>Maintain Tables and Columns</Description>
  <Help>The Table and Column Window defines all tables with their columns</Help>
</id-100>
-----------------------------------------

------POT file -New---------------
-----------------------------------------
#. type: Content of: <id-100><Name>
msgid "Table and Column"
msgstr ""

#. type: Content of: <id-100><Description>
msgid "Maintain Tables and Columns"
msgstr ""

#. type: Content of: <id-100><Help>
msgid "The Table and Column Window defines all tables with their columns"
msgstr ""
--------------------------------
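
The restructuring described above can be sketched with `xml.etree.ElementTree` as follows. This is a guess at the general shape of such a script, not the script actually used: the function name `restructure` is hypothetical, and it assumes every `column` attribute is a valid XML element name (true for Name, Description and Help in the example).

```python
import xml.etree.ElementTree as ET


def restructure(xml_text):
    """Rewrite <row id="N"><value column="C">text</value></row>
    as <id-N><C>text</C></id-N>, keeping the root element and its attributes.
    """
    root = ET.fromstring(xml_text)
    new_root = ET.Element(root.tag, dict(root.attrib))
    for row in root.iter("row"):
        # The row id moves into the element name, e.g. <id-100>.
        new_row = ET.SubElement(new_root, "id-{}".format(row.get("id")))
        for value in row.iter("value"):
            # The column name becomes the child element name, e.g. <Name>.
            child = ET.SubElement(new_row, value.get("column"))
            child.text = value.text
    return ET.tostring(new_root, encoding="unicode")


SAMPLE = """<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
 </row>
</adempiereTrl>"""

new_xml = restructure(SAMPLE)
print(new_xml)
```

The design choice here is to push the row ID and column type into element names, because po4a-gettextize reports the element path in its "Content of:" comment but, as "POT file -Origin" shows, not the attribute values. Converting translated files back to the ADempiere layout would need the inverse mapping, which is not shown.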

David Planella (dpm) said : #9

On Sun, 18 Apr 2010 at 13:05 +0000, PeanutBlake wrote:
> Question #98058 on Launchpad Translations changed:
> https://answers.edge.launchpad.net/rosetta/+question/98058

Thanks for the update. I'm glad you could do the conversion to a format
that meets your needs.