How to convert .xml to .po and back

Asked by PeanutBlake

Dear all,

Launchpad.net is excellent for collaborative translation, so I'm going to set up an ADempiere ERP translation project here. (ADempiere ERP is an open source project on sourceforge.net; here is the link: http://sourceforge.net/projects/adempiere)

There is only one problem: I cannot convert .xml files to .po files.
I've tried po4a, but the .po file is not what I want. Here is the command I used:
$ po4a-gettextize -f guide -m origin.xml > untranslate.po

There are about 40 .xml files in total to translate in ADempiere ERP.
How can I convert them into .po files?
And how can I convert .po files back to the .xml?

Any suggestions are appreciated! Thanks!

Example of ADempiere xml file:
--------------------------------
<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
 </row>
</adempiereTrl>
--------------------------------

.po file format that I want:
--------------------------------
#. type: Name
#: id="100"
msgid "Table and Column"
msgstr ""

#. type: Description
#: id="100"
msgid "Maintain Tables and Columns"
msgstr ""

#. type: Help
#: id="100"
msgid "The Table and Column Window defines all tables with their columns"
msgstr ""
--------------------------------
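One way to get exactly this layout is to skip the generic tools and walk the XML directly. The following is only a minimal sketch using Python's standard xml.etree.ElementTree; the function names `po_escape` and `xml_to_po` and the `SAMPLE` string are made up for illustration and are not part of any existing tool.

```python
import xml.etree.ElementTree as ET


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def xml_to_po(xml_text):
    """Return PO entries for every non-empty <value> in an adempiereTrl document."""
    root = ET.fromstring(xml_text)
    entries = []
    for row in root.iter("row"):
        for value in row.iter("value"):
            text = (value.text or "").strip()
            if not text:
                continue  # nothing to translate
            entries.append(
                '#. type: {col}\n#: id="{rid}"\nmsgid "{msg}"\nmsgstr ""\n'.format(
                    col=value.get("column"),
                    rid=row.get("id"),
                    msg=po_escape(text)))
    # Each entry ends with a newline, so joining with "\n" leaves a blank line between entries.
    return "\n".join(entries)


# Hypothetical sample input, copied from the example above.
SAMPLE = """<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
 </row>
</adempiereTrl>"""

if __name__ == "__main__":
    print(xml_to_po(SAMPLE))
```

A real POT file also needs a header entry, and gettext tools reject repeated msgid values unless they are merged or given a msgctxt; this sketch handles neither.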

Question information

Language: English
Status: Solved
For: Launchpad itself
Assignee: No assignee
Solved by: PeanutBlake

Henning Eggers (henninge) said :
#1

I do not have an answer, but what you are looking for is *extraction*, not conversion. You want to extract the translatable strings from the xml files, so that may be the term you need to search for.

David Planella (dpm) said :
#2

You can use the xml2po application for both extraction and merging back to xml files. Quoting the xml2po man page:

"
EXAMPLES:
    To create a POTemplate book.pot from input files chapter1.xml and
    chapter2.xml, run the following:
        /usr/bin/xml2po -o book.pot chapter1.xml chapter2.xml

    After translating book.pot into de.po, merge the translations back,
    using -p option for each XML file:
        /usr/bin/xml2po -p de.po chapter1.xml > chapter1.de.xml
        /usr/bin/xml2po -p de.po chapter2.xml > chapter2.de.xml
"

Aaron Bentley (abentley) said :
#3

Please try xml2po as David suggested.

Данило Шеган (danilo) said :
#4

Yes, I would suggest using xml2po. If you want to slightly mis-use source references and comments for translators like you suggest, you may need to provide a new mode for xml2po (look at figure extraction in docbook.py mode to see how to achieve something similar).

PeanutBlake (peanutblake) said :
#5

Thanks to everyone!
It seems that I have to create a template if I want to use xml2po or po4a.
Another way is to write a script to deal with it.
Henning Eggers is right: this is extraction rather than simple conversion.

PeanutBlake (peanutblake) said :
#6

Dear all,

I finally solved the problem with some Python programming.
The etree package works perfectly for handling XML.
The POT files are now ready to upload to lp.net ^_^

Peanut Blake
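
For the reverse direction asked about in the original question (putting translations back into the .xml), a rough sketch along the same lines is shown here. It assumes the translated .po file has already been parsed into a plain dict mapping each msgid to its msgstr; the PO parsing itself is left out, and the names `apply_translations` and `SAMPLE` are hypothetical.

```python
import xml.etree.ElementTree as ET


def apply_translations(xml_text, translations):
    """Replace the text of each <value> with its translation, if one exists.

    translations maps a source string (msgid) to its translation (msgstr).
    Strings without a non-empty translation are left unchanged.
    """
    root = ET.fromstring(xml_text)
    for value in root.iter("value"):
        source = (value.text or "").strip()
        translated = translations.get(source)
        if translated:
            value.text = translated
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample input in the ADempiere layout from the question.
SAMPLE = """<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
 </row>
</adempiereTrl>"""

if __name__ == "__main__":
    print(apply_translations(SAMPLE, {"Table and Column": "Tabelle und Spalte"}))
```

Looking translations up by source string matches how a .po file is keyed, but it means two identical source strings always receive the same translation.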

David Planella (dpm) said :
#7

Thanks for the update.

However, I'd be interested to know: which part of xml2po or po4a did not work for you, so that you had to resort to writing your own script?

PeanutBlake (peanutblake) said :
#8

Hi, David Planella

Thank you for your interest! I didn't write any code that duplicates what xml2po or po4a can already do (that would be pointless work ^_^). I just wrote a Python script to convert the original XML format into a new one, so that po4a-gettextize can generate useful information.

The "Original XML format" only produces "POT file -Origin", which, as you can see, does not provide useful information such as the ID number and column type. This information is very important for translating ADempiere ERP.

So I designed a "New XML format", which produces "POT file -New". It shows the ID number and column type, like "#. type: Content of: <id-100><Name>". That is what I did with the Python script.

Here are the results (still waiting for approval ^_^):
https://translations.edge.launchpad.net/aderp/+imports
=========================================
EXAMPLE:

------Original XML format-------
-----------------------------------------
<row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
</row>
-----------------------------------------

------POT file -Origin---------------
-----------------------------------------
#. type: Content of: <row><value>
msgid "Table and Column"
msgstr ""

#. type: Content of: <row><value>
msgid "Maintain Tables and Columns"
msgstr ""

#. type: Content of: <row><value>
msgid "The Table and Column Window defines all tables with their columns"
msgstr ""
--------------------------------

------New XML format--------
-----------------------------------------
<id-100>
  <Name>Table and Column</Name>
  <Description>Maintain Tables and Columns</Description>
  <Help>The Table and Column Window defines all tables with their columns</Help>
</id-100>
-----------------------------------------

------POT file -New---------------
-----------------------------------------
#. type: Content of: <id-100><Name>
msgid "Table and Column"
msgstr ""

#. type: Content of: <id-100><Description>
msgid "Maintain Tables and Columns"
msgstr ""

#. type: Content of: <id-100><Help>
msgid "The Table and Column Window defines all tables with their columns"
msgstr ""
--------------------------------
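
The restructuring step described in this post could look roughly like the following sketch. It is not the script that was actually used; the function name `restructure`, the `SAMPLE` input, and the choice to keep the `adempiereTrl` root element are assumptions. It also relies on every `column` value being usable as an XML element name.

```python
import xml.etree.ElementTree as ET


def restructure(xml_text):
    """Turn <row id="N"><value column="C">text</value></row> into <id-N><C>text</C></id-N>."""
    root = ET.fromstring(xml_text)
    # Assumption: keep the original root element and its attributes.
    new_root = ET.Element(root.tag, dict(root.attrib))
    for row in root.iter("row"):
        new_row = ET.SubElement(new_root, "id-" + row.get("id"))
        for value in row.iter("value"):
            # The column name becomes the tag, so po4a reports it in the "type:" comment.
            child = ET.SubElement(new_row, value.get("column"))
            child.text = value.text
    return ET.tostring(new_root, encoding="unicode")


# Hypothetical sample input in the original ADempiere layout.
SAMPLE = """<adempiereTrl language="en_US" table="AD_Window">
 <row id="100" trl="Y">
  <value column="Name" original="ABC">Table and Column</value>
  <value column="Description" original="123">Maintain Tables and Columns</value>
  <value column="Help" original="TEST">The Table and Column Window defines all tables with their columns</value>
 </row>
</adempiereTrl>"""

if __name__ == "__main__":
    print(restructure(SAMPLE))
```

The `trl` and `original` attributes are dropped here, so merging translations back into the original layout would need the original file as well.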

David Planella (dpm) said :
#9

On Sun, 18 Apr 2010 at 13:05 +0000, PeanutBlake wrote:
> Question #98058 on Launchpad Translations changed:
> https://answers.edge.launchpad.net/rosetta/+question/98058
>
> PeanutBlake posted a new comment:

Thanks for the update. I'm glad you could do the conversion to a format
that meets your needs.