Merge lp:~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex into lp:openupgrade-addons

Proposed by Holger Brunn (Therp)
Status: Merged
Merged at revision: 8174
Proposed branch: lp:~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex
Merge into: lp:openupgrade-addons
Diff against target: 55 lines (+32/-8)
1 file modified
document_page/migrations/7.0.1.0.1/post-migration.py (+32/-8)
To merge this branch: bzr merge lp:~therp-nl/openupgrade-addons/7.0-wiki-nobacktracking-regex
Reviewer                                                     Review Type             Date Requested   Status
Ronald Portier (Therp)                                       code review and test                     Approve
Pedro Manuel Baeza                                           code review                              Approve
Sandy Carter (http://www.savoirfairelinux.com) (community)   code review, no test                     Approve
Review via email: mp+238872@code.launchpad.net

Description of the change

This branch avoids using regexes in cases where backtracking can become so expensive that the process appears to hang, e.g. with

re.compile("'''''(([^']|([^']'{0,4}[^']))+)'''''").sub("<b><i>\1</i></b>", "'''''test'''' hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world hello world")

Note that this occurs with all regexes of this kind whenever the wiki syntax is malformed (here, five opening quotes but only four closing ones).
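The slowdown can be observed on much shorter inputs than the one above. The following is a minimal sketch, not part of the branch: it reuses the pattern from the description, while the helper name time_malformed and the "x" padding are assumptions made for illustration. The timings are expected to grow quickly as the trailing text gets longer, because the two alternatives inside the repeated group can split the same run of characters in many different ways.

```python
import re
import time

# Pattern copied from the description above.
RE_B_I = re.compile("'''''(([^']|([^']'{0,4}[^']))+)'''''")


def time_malformed(tail_length):
    """Time one substitution on malformed input (hypothetical helper).

    The input opens with five quotes but closes with only four, so the
    regex can never match and has to exhaust every way of splitting
    the trailing text before giving up.
    """
    text = "'''''test''''" + "x" * tail_length
    start = time.perf_counter()
    # Raw string so that \1 is read as a group reference.
    result = RE_B_I.sub(r"<b><i>\1</i></b>", text)
    elapsed = time.perf_counter() - start
    # No match is possible, so the text must come back unchanged.
    return elapsed, result == text


if __name__ == "__main__":
    # Tail lengths are kept small on purpose; larger values get slow fast.
    for n in (8, 12, 16, 20):
        elapsed, unchanged = time_malformed(n)
        print("tail=%2d unchanged=%s seconds=%.4f" % (n, unchanged, elapsed))
```

The absolute numbers depend on the machine and the Python version; only the growth from one row to the next is of interest.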

Sandy Carter (http://www.savoirfairelinux.com) (sandy-carter) wrote :

LGTM +1

review: Approve (code review, no test)
Pedro Manuel Baeza (pedro.baeza) wrote :

LGTM

review: Approve (code review)
Ronald Portier (Therp) (rportier1962) wrote :

Tested the conversion in a separate function, and did a code review.

review: Approve (code review and test)

Preview Diff

=== modified file 'document_page/migrations/7.0.1.0.1/post-migration.py'
--- document_page/migrations/7.0.1.0.1/post-migration.py 2014-09-15 07:43:23 +0000
+++ document_page/migrations/7.0.1.0.1/post-migration.py 2014-10-20 12:48:46 +0000
@@ -111,9 +111,6 @@
 re_ol_li = re.compile("^(\*\*+|#+):? ")
 re_ul_ol_li = re.compile("^(\*+|#+):? ")
 re_youtube = re.compile("^(https?://)?(www\.)?youtube.com/(watch\?(.*)v=|embed/)([^&]+)")
-re_b_i = re.compile("'''''(([^']|([^']('{1,4})?[^']))+)'''''")
-re_b = re.compile("'''(([^']|([^'](''?)?[^']))+)'''")
-re_i = re.compile("''(([^']|([^']'?[^']))+)''")


 class Wiky:
@@ -316,9 +313,36 @@
                     break

         # Bold, Italics, Emphasis
-        wikitext = re_b_i.sub("<b><i>\1</i></b>", wikitext)
-        wikitext = re_b.sub("<b>\1</b>", wikitext)
-        wikitext = re_i.sub("<i>\1</i>", wikitext)
-
-        return wikitext
+        head = ''
+        tail = wikitext
+        while tail:
+            quote_pos = tail.find("'")
+            if quote_pos > -1:
+                head += tail[:quote_pos]
+                tail = tail[quote_pos:]
+                pos = 0
+                while pos < len(tail) and tail[pos] == "'":
+                    pos += 1
+                if pos == 2 or pos == 3 or pos == 5:
+                    endquote_pos = tail.find("'"*pos, pos)
+                    if endquote_pos > -1:
+                        text = tail[pos:endquote_pos]
+                        if pos == 2:
+                            text = '<i>%s</i>' % text
+                        elif pos == 3:
+                            text = '<b>%s</b>' % text
+                        elif pos == 5:
+                            text = '<b><i>%s</i></b>' % text
+                        head += text
+                        tail = tail[endquote_pos + pos:]
+                    else:
+                        head += tail[:pos]
+                        tail = tail[pos:]
+                else:
+                    head += tail[:pos]
+                    tail = tail[pos:]
+            else:
+                head += tail
+                tail = ''
+        return head
 # vim:expandtab:smartindent:tabstop=4:softtabstop=4:shiftwidth=4:
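
To try the new scanning logic outside the migration script, it can be lifted into a standalone function. The following is a sketch under assumptions, not the merged code: the names convert_quotes and TAGS are hypothetical, and the loop is lightly restructured (a lookup table and a list of output pieces), but it is meant to follow the same steps as the diff: find the next run of quotes, count it, and look for a closing run of the same length only if the run is 2, 3 or 5 quotes long.

```python
# Markup for each supported run length of quotes (same mapping as the diff).
TAGS = {2: "<i>%s</i>", 3: "<b>%s</b>", 5: "<b><i>%s</i></b>"}


def convert_quotes(wikitext):
    """Convert ''italic'', '''bold''' and '''''bold italic''''' markup.

    Hypothetical standalone version of the loop in the diff. It only
    uses str.find, so there is nothing to backtrack over when the
    markup is malformed.
    """
    out = []
    tail = wikitext
    while tail:
        quote_pos = tail.find("'")
        if quote_pos == -1:
            # No quotes left: copy the rest verbatim and stop.
            out.append(tail)
            break
        out.append(tail[:quote_pos])
        tail = tail[quote_pos:]
        # Length of the run of consecutive quotes at the start of tail.
        run = len(tail) - len(tail.lstrip("'"))
        # Only look for a closing run if the opening run is 2, 3 or 5 long.
        end = tail.find("'" * run, run) if run in TAGS else -1
        if end > -1:
            out.append(TAGS[run] % tail[run:end])
            tail = tail[end + run:]
        else:
            # Unsupported run length or no closing run: keep the quotes as-is.
            out.append(tail[:run])
            tail = tail[run:]
    return "".join(out)


if __name__ == "__main__":
    print(convert_quotes("''italic'' and '''bold''' and '''''both'''''"))
    print(convert_quotes("'''''test'''' hello world"))
```

With malformed input such as the string from the description, the five opening quotes find no matching closing run and are copied through literally, so the text comes back unchanged.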