Please import bug tracker data for epics-base

Asked by Andrew Johnson

Ralph Lange has converted all the old bugs from our ancient Mantis bug database to your XML import format, and we'd like to try an import into the project epics-base. We're still awaiting a response to LP Question #96591 which could result in some changes if there are some undocumented tags we could use, but absent that we're ready for a first try. I've read the responses to Question #91604.

The XML file is at http://www.aps.anl.gov/epics/mantis/MantisExport.xml.gz

Thanks!

Question information

Language: English
Status: Solved
For: Launchpad itself
Assignee: Deryck Hodge
Solved by: Deryck Hodge
Revision history for this message
Andrew Johnson (anj) said :
#1

Ping!

Any chance of a response or an ETA for this bug data import?

Thanks...

Revision history for this message
Deryck Hodge (deryck) said :
#2

Hi, Andrew.

We'll have to take a look at it next week. Sorry for the delay in response. Someone from the Launchpad bugs team will reply here Monday of next week with an update.

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#3

Ping ping!

It's been almost a month since I last asked about this bug data import; I never did hear from the Launchpad bugs team.

Thanks...

Revision history for this message
Deryck Hodge (deryck) said :
#4

Hi, Andrew.

I'm very sorry about dropping this. It's a combination of this getting assigned to me, other developers on our team thinking it was handled, and me being a bit busy lately.

Graham Binns is going to work on Question #96591. I'll do a local import and see if there are any initial problems today. I'll let you know if there are problems; otherwise, I'll wait to hear from you on whether what Graham finds out matters to your bug data dump.

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#5

Since Graham's answer to question #96591 was "not supported" please go ahead and try importing the XML data as given above.

Thanks,
- Andrew

Revision history for this message
Deryck Hodge (deryck) said :
#6

Hi,

I spent a good bit of time trying to get the bug dump imported. There were a number of issues with the data. I fixed a few things with xml validation, date format, and status values, but I started to trip over integrity errors where the data was trying to assign to a milestone that didn't exist. There were also errors with trying to set duplicate status on a bug that didn't exist.
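
The "duplicate of a bug that didn't exist" errors can be caught before handing the file over. Below is a minimal sketch, not the importer's own check. It assumes the export uses `bug` elements with an `id` attribute and a `duplicateof` child element; these names and the sample data are assumptions for illustration and should be compared against the import format documentation.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def dangling_duplicates(xml_text):
    """Return (bug_id, missing_target) pairs for duplicates of unknown bugs."""
    root = ET.fromstring(xml_text)
    # Element and attribute names here are assumptions, not confirmed by the thread.
    bugs = [el for el in root.iter() if local_name(el.tag) == "bug"]
    known_ids = {bug.get("id") for bug in bugs}
    problems = []
    for bug in bugs:
        for child in bug:
            if local_name(child.tag) == "duplicateof":
                target = (child.text or "").strip()
                if target not in known_ids:
                    problems.append((bug.get("id"), target))
    return problems

# Hypothetical sample: bug 3 claims to duplicate a bug that is not in the file.
sample = """<launchpad-bugs>
  <bug id="1"><title>first</title></bug>
  <bug id="2"><duplicateof>1</duplicateof></bug>
  <bug id="3"><duplicateof>99</duplicateof></bug>
</launchpad-bugs>"""
print(dangling_duplicates(sample))
```

The same pattern could be extended to milestones by collecting the milestone values in the file and comparing them with the milestones that actually exist in the target project.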

I believe this would be better fixed by you guys, since I don't know what you want to see here.

Here is the cleaned up XML file I was working on, with the changes I could make. Hope it helps some.

http://people.canonical.com/~deryck/mantis-export-reformat.xml.gz

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#7

Thanks Deryck, we'll work on re-doing those. Are the character entity replacements necessary? Your version has all the &quot; instances replaced with double-quote characters, and is missing some &lt; and &gt; entities completely, e.g. in the <text> tag inside the <comment> dated 2002-11-05T03:42:57.

Ralph,

I'm looking at the changes Deryck has made:

* We were missing the leading <?xml version="1.0"?> line
* The <date> and similar tags need to be expressed in Zulu timezone, i.e. replace the "-06:00" with "Z" and add 6 hours to the time/date quoted.
* The <status> tag contents should be all caps with no spaces, i.e. "FIXRELEASED" not "Fix released", "NEW" etc.
* The milestone issue is probably that we had trailing commas on some of the milestone values, e.g. "3.14.10,".
* I've noticed some mal-formed character entities in our XML version, e.g. line 704 has &gt"&gt; in it; that might be in the original Mantis data though.
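
The status and milestone items in the list above are simple string clean-ups. A minimal sketch follows, with hypothetical helper names; it only covers the two cases mentioned here (upper-casing and removing spaces, and dropping trailing commas) and makes no claim about other characters the importer may reject.

```python
def normalize_status(status):
    """Upper-case a status and remove all whitespace, e.g. 'Fix released'."""
    return "".join(status.split()).upper()

def normalize_milestone(milestone):
    """Strip surrounding whitespace and any trailing commas, e.g. '3.14.10,'."""
    return milestone.strip().rstrip(",").strip()

print(normalize_status("Fix released"))
print(normalize_milestone("3.14.10,"))
```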

Unfortunately we can't just fix Deryck's version because of the lost &lt; and &gt; entities in some of the code segments.

Do you have any time to work on this? I don't mind fixing things in a text editor, but adjusting all of the times for the timezone change might be a little hard for me.

- Andrew

Revision history for this message
Andrew Johnson (anj) said :
#8

Ralph: There's also an important typo in the original data; a comment dated 2004-01-15T12:54:11 mentions the release 3.15.5 but should say 3.14.5, which got converted into a milestone 3.15.5 as well.

Revision history for this message
Deryck Hodge (deryck) said :
#9

Hi, Andrew.

I ran your file through xmllint trying to fix the mal-formed entities like &gt". I imagine xmllint probably replaced the quot entities with double quotes. I don't think any of the sed work I did to fix dates or statuses would have done this.
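
To illustrate why replacing the quot entities is harmless while losing the lt entities is not, here is a small sketch using Python's standard XML parser (not the Launchpad importer itself). The `text` element and its contents are made-up sample data.

```python
import xml.etree.ElementTree as ET

# In element content, &quot; and a literal double quote parse to the same text.
escaped = ET.fromstring("<text>say &quot;hi&quot; if a &lt; b</text>")
literal = ET.fromstring('<text>say "hi" if a &lt; b</text>')
print(escaped.text == literal.text)

# A '<' that is no longer escaped is a different matter: the parser rejects it.
try:
    ET.fromstring("<text>if a < b</text>")
    unescaped_ok = True
except ET.ParseError as err:
    unescaped_ok = False
    print("not well-formed:", err)
```

So a tool that rewrites `&quot;` as `"` inside element text leaves the data unchanged, whereas `&lt;` has to survive any clean-up pass.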

Cheers,
deryck

Revision history for this message
Ralph Lange (ralph-lange) said :
#10

Hi,

I'm checking the issues that came up, fixing my Mantis2XML converter....

> * We were missing the leading <?xml version="1.0"?> line

The example on https://help.launchpad.net/Bugs/ImportFormat doesn't have it. I was wondering, but decided to stick with the example rather than use common sense.
Fixed.

> * The <date> and similar tags need to be expressed in Zulu timezone, i.e. replace the "-06:00" with "Z" and add 6 hours to the time/date quoted.

The format spec defines the dates as "element date { xsd:dateTime }" with no hint that time zones are not supported.
Fixed.
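
The offset-to-Zulu conversion described above can be sketched with the Python standard library. This is an illustration only, not the code in the Mantis2XML converter; the function name and the sample timestamp are hypothetical.

```python
from datetime import datetime, timezone

def to_zulu(stamp):
    """Convert an ISO 8601 timestamp with a UTC offset to UTC with a 'Z' suffix."""
    local = datetime.fromisoformat(stamp)
    if local.tzinfo is None:
        raise ValueError("timestamp has no UTC offset: " + stamp)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# Hypothetical late-evening Central time stamp: the date rolls over as well.
print(to_zulu("2009-11-30T22:15:00-06:00"))
```

Letting the library do the arithmetic takes care of day, month and year rollovers, which is the part that is tedious to do by hand in a text editor.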

> * The milestone issue is probably that we had trailing commas on some of the milestone values, e.g. "3.14.10,".

Well spotted. My regexps never seem to be final...
Fixed.

> * I've noticed some mal-formed character entities in our XML version, e.g. line 704 has &gt"&gt; in it; that might be in the original Mantis data though.

That's a bad one. A bug in either some browser versions or our antique Mantis couldn't handle <a href....> tags correctly and screwed up the mysql database entries. Bummer!
More regexps... (btw: http://www.fileformat.info/tool/regex.htm is a *great* tool!!)
Fixed.
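
For anyone hunting the same kind of damage, a rough way to locate suspect ampersands is a regular expression that flags any `&` not followed by a name or numeric reference and a semicolon. This is a sketch under that assumption, not the regexps actually used in the converter; it does not check that a named entity is one XML defines.

```python
import re

# '&' that does not start something shaped like &name; or &#123; or &#x1F;
BAD_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)

def find_bad_ampersands(text):
    """Return (line, column) pairs: 1-based line, 0-based column of each suspect '&'."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in BAD_AMPERSAND.finditer(line):
            hits.append((lineno, match.start()))
    return hits

# Made-up sample: line 2 has an unterminated &gt, line 3 a bare ampersand.
sample = 'ok &lt;tag&gt;\nbroken &gt"&gt; here\nbare & ampersand'
print(find_bad_ampersands(sample))
```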

> *There's also have an important typo in the original data; a comment dated 2004-01-15T12:54:11 mentions the release 3.15.5 but should say 3.14.5, which got converted into a milestone 3.15.5 as well.

Nice one!
Fixed. (Manually, in the mysql table.)

So - I think we are ready for the next round. I put the new XML file on
   http://pubweb.bnl.gov/~rlange/MantisExport.xml.gz

Thanks for your help!
Ralph

Revision history for this message
Andrew Johnson (anj) said :
#11

Deryck, please try again with the file at the URL that Ralph posted in the previous comment.

Some of these issues could be cleaned up for the next user by revising the wiki page describing the import format. I'd be happy to do that if you'd like me to.

- Andrew

Revision history for this message
Deryck Hodge (deryck) said :
#12

I got:

Traceback (most recent call last):
  File "./scripts/bug-import.py", line 70, in <module>
    script.run()
  File "/home/deryck/launchpad/lp-branches/devel/lib/lp/services/scripts/base.py", line 248, in run
    self.main()
  File "./scripts/bug-import.py", line 64, in main
    importer.importBugs(self.txn)
  File "/home/deryck/launchpad/lp-branches/devel/lib/lp/bugs/scripts/bugimport.py", line 260, in importBugs
    tree = ET.parse(self.bugs_filename)
  File "<string>", line 45, in parse
  File "<string>", line 32, in parse
SyntaxError: not well-formed (invalid token): line 7709, column 83

This is an issue with a &quot" and I'm hesitant to xmllint it, due to the above issues with quot entities being messed up. I'll wait on another update from you guys, just so we're sure the data is correct.
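
The same class of error can be reproduced without the Launchpad scripts, since the traceback shows the failure comes from the XML parse step. A minimal sketch using Python's standard `xml.etree.ElementTree` follows; the function name and sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

def first_parse_error(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None

good = "<bug><text>say &quot;hi&quot;</text></bug>"
# Made-up sample with an unterminated entity on its second line.
bad = '<bug>\n<text>say &quot"hi&quot;</text>\n</bug>'
print(first_parse_error(good))
print(first_parse_error(bad))
```

The parser stops at the first problem, so each fix has to be followed by another run until the function returns `None`. For a file on disk, `ET.parse(path)` raises the same `ParseError`.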

Cheers,
deryck

Revision history for this message
Deryck Hodge (deryck) said :
#13

And yes, please feel free to update the wiki. Thanks!

Revision history for this message
Ralph Lange (ralph-lange) said :
#14

That was a similar issue. A buggy browser or our old Mantis was trying to convert everything that looked remotely like an email address into an <a href="mailto....> tag, even where it was not appropriate (in verbatim C code), and did it wrong, screwing up the database entries.

Another regexp...
Fixed.

New version, same place:
      http://pubweb.bnl.gov/~rlange/MantisExport.xml.gz

Thanks again!
Ralph

Revision history for this message
Deryck Hodge (deryck) said :
#15

Hi,

Another one:

SyntaxError: not well-formed (invalid token): line 9428, column 0

Which seems to be tripping over some sort of end of line characters, at least as represented by vim.

Cheers,
deryck

Revision history for this message
Ralph Lange (ralph-lange) said :
#16

Darn... formfeed characters from C sources included in error reports.
Another regexp...
Fixed.
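
Form feeds are one of several control characters that XML 1.0 does not permit in a document at all, which is why the parser stops on them. A sketch of one way to strip them is shown below; it is not the regexp from the converter, and the names are hypothetical.

```python
import re

# Characters below U+0020 that XML 1.0 does not allow: everything except
# tab (U+0009), line feed (U+000A) and carriage return (U+000D).
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def strip_illegal_xml_chars(text, replacement=""):
    """Remove (or replace) control characters that XML 1.0 forbids."""
    return ILLEGAL_XML_CHARS.sub(replacement, text)

# Made-up sample: a form feed pasted in from a C source listing.
sample = "int main(void)\x0c{\n\treturn 0;\n}"
print(repr(strip_illegal_xml_chars(sample)))
```

Passing `replacement="\n"` instead would keep a visible break where the form feed used to be.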

New version, same place:
      http://pubweb.bnl.gov/~rlange/MantisExport.xml.gz

Fingers crossed....

Thanks,
Ralph

Revision history for this message
Deryck Hodge (deryck) said :
#17

This one works. So the next step is that we need to import on our staging site and let you all confirm that the import looks sane before we do it to our production site. We can do this fairly quickly tomorrow, Friday, I imagine. I'll get the import done sometime between 1100 and 1400 UTC and you guys can verify it tomorrow. If it looks good, we'll do the import to the main site.

I'll ping here tomorrow when the staging import is done.

Cheers,
deryck

Revision history for this message
Deryck Hodge (deryck) said :
#18

Just changing the status, since we no longer need a clean xml file.

Revision history for this message
Deryck Hodge (deryck) said :
#19

We've got lots of bugs devs doing updates and testing against staging today, so if I get in line for importing the bugs today it's going to be late UTC. We will likely have the staging DB update this weekend, so the window for viewing imported data would be small.

I will wait until Monday to do the test import to give you all more time to review the bugs.

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#20

Hi Deryck,

Up to you but we're both currently on US time, so unless "late" means close to 11pm UTC (5pm Central) I should still be able to take a look at the results, although I can't speak for Ralph.

Thanks either way,
- Andrew

Revision history for this message
Deryck Hodge (deryck) said :
#21

Bugs are in. See: https://bugs.staging.launchpad.net/epics-base

Let me know here how the import looks.

Cheers,
deryck

Revision history for this message
Ralph Lange (ralph-lange) said :
#22

We had plenty of time to look at the import ... thanks for doing that today.

Question: the <urls> do not show up at any place we can see - are they dropped by the importer?
Never mind - we added the urls to the description text, so the browser will show them as clickable links, anyway.

We added a whole bunch of fixes for different remaining issues.

Could you import the new version from
      http://pubweb.bnl.gov/~rlange/MantisExport.xml.gz
to the refreshed staging server and ping us when it's available? Thank you!

I think we're getting very close now...

Cheers,
Ralph

Revision history for this message
Andrew Johnson (anj) said :
#23

Marking Q as open, new file available...

Revision history for this message
Deryck Hodge (deryck) said :
#24

The bug import is running now and should be available on staging in 20-30 minutes from this message.

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#25

Thanks Deryck.

This looks pretty good now. I've just done a manual pass through with an editor, fixing a few more issues; hopefully that's the last significant tweaking we'll want to do. The new file is at
    http://www.aps.anl.gov/epics/mantis/MantisExport.xml.gz

Thanks,
- Andrew

Revision history for this message
Deryck Hodge (deryck) said :
#26

Hi, Andrew.

So do we feel good enough about these updates to make this the final import? i.e. are they just trivial edits?

Or do we need to look at them on staging again before doing the final import?

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#27

Hi Deryck,

I don't think I've messed anything up, but I was doing hand-edits without checking the result with any XML tools. I don't know how the importer reacts if it finds a fault part-way through, but if it's not too much work it's probably safer to run this on staging first. Unless Ralph comes up with something that I've missed, I should be able to give the go-ahead for the live site in less than an hour after staging is complete.

Thanks,
- Andrew

Revision history for this message
Andrew Johnson (anj) said :
#28

Hi again Deryck,

We've changed our minds, go ahead and import that file from www.aps.anl.gov directly into the live site; we can edit any remaining issues that we find in Launchpad.

Thanks very much for your help and patience.
- Andrew

Revision history for this message
Deryck Hodge (deryck) said (best answer) :
#29

The import is now done for launchpad.net. Thanks for your patience through this and for choosing Launchpad to manage your bugs!

Cheers,
deryck

Revision history for this message
Andrew Johnson (anj) said :
#30

Thanks Deryck Hodge, that solved my question.