Discussion:
writing a moin parser
jacob martinson
2005-06-14 16:57:21 UTC
are there any docs anywhere on writing a moin parser? i.e. how a
parser fits in with the rest of moin, the interfaces, etc.

our users would like to be able to use a simpler wiki markup language
like in jspwiki.

thanks!

-jacob


-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
Thomas Waldmann
2005-06-15 04:30:58 UTC
Post by jacob martinson
are there any docs anywhere on writing a moin parser? i.e. how a
parser fits in with the rest of moin, the interfaces, etc.
Not really, but there is some example code in MoinMoin/parser/*.py. :)

And it is not THAT complicated.

A parser gets the raw page text and must generate calls to the formatter
to produce output.

The existing wiki parser does this by parsing the raw page line by line
with that big ugly regular expression and doing formatter calls
depending on what matched. It keeps some state in variables like in_pre
or in_table.
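
The approach described above can be sketched as a toy parser. This is only an illustration in Python 3, not MoinMoin code: the class names, the formatter methods (text, strong, preformatted) and the tiny markup subset ({{{ ... }}} and ''') are assumptions picked for the example.

```python
import html
import re


class HtmlFormatter:
    """Hypothetical formatter; not MoinMoin's real formatter interface."""

    def text(self, s):
        return html.escape(s, quote=False)

    def strong(self, on):
        return "<strong>" if on else "</strong>"

    def preformatted(self, on):
        return "<pre>" if on else "</pre>"


class ToyParser:
    # One combined pattern; the named group tells us which rule matched.
    scan_re = re.compile(
        r"(?P<pre_open>\{\{\{)|(?P<pre_close>\}\}\})|(?P<strong>''')"
    )

    def __init__(self, raw, formatter):
        self.raw = raw
        self.formatter = formatter
        self.in_pre = False      # state carried from one line to the next
        self.is_strong = False

    def format(self):
        # Walk the raw page text one line at a time.
        return "\n".join(self._scan(line) for line in self.raw.splitlines())

    def _scan(self, line):
        out, pos = [], 0
        for match in self.scan_re.finditer(line):
            out.append(self.formatter.text(line[pos:match.start()]))
            out.append(self._replace(match))
            pos = match.end()
        out.append(self.formatter.text(line[pos:]))
        return "".join(out)

    def _replace(self, match):
        kind = match.lastgroup
        if self.in_pre:
            if kind == "pre_close":
                self.in_pre = False
                return self.formatter.preformatted(False)
            # Inside a pre block all other markup is literal text.
            return self.formatter.text(match.group())
        if kind == "pre_open":
            self.in_pre = True
            return self.formatter.preformatted(True)
        if kind == "strong":
            self.is_strong = not self.is_strong
            return self.formatter.strong(self.is_strong)
        # A stray closing marker outside a pre block is plain text.
        return self.formatter.text(match.group())


def render(raw):
    return ToyParser(raw, HtmlFormatter()).format()
```

Calling render("'''bold''' text") goes through the formatter for every fragment, so swapping in a different formatter class would change the output format without touching the parser.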
Post by jacob martinson
our users would like to be able to use a simpler wiki markup language
like in jspwiki.
Maybe it is a good idea to make a page on the moinmoin wiki about that
topic. There might be 2 different scenarios:

1) You want to make a jspwiki compatible parser as an additional parser
for moin (because your users are used to that and you have data in that
markup). Good luck in that case! :)

2) You just want to improve the moinmoin default markup and parser.

We also want that, but there definitely needs to be some very concrete
and common plan to make this work. E.g. we want to simplify the link
markup, which is currently too complicated and too irregular. Same thing
applies to "include" long-term. We also want to generate xhtml at some
time, but with that line-by-line parser it won't be possible. We also
want to use DOM to make include work better. Attachments shall be
unified (in storage as well as in linking / including).

Maybe get on #moin irc channel on irc.freenode.net to talk about it, if
you want to help with that.



jacob martinson
2005-06-15 14:53:40 UTC
My immediate goal is to make the moin markup simpler, not to replicate
jspwiki syntax per se, just to make it simple enough to migrate from
jspwiki w/o resistance from our userbase.

I got a bit lost when working with the big ugly re, especially when it
came to it referencing the unicode character map. I'm going to have
to read up a bit on how unicode works in Python.

Also, the standard wiki.py parser does some things with string
substitution I had never seen before (substituting named placeholders
with values from a dictionary, rather than substituting generic
placeholders linearly from a tuple), which kind of confused me at
first, but now that I understand it . . . I see it's a really cool way
to do a large number of substitutions.
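
For readers who have not met the two styles, here is a minimal sketch of the difference. The strings and the dictionary contents are made up for illustration; only the % operator behaviour is standard Python.

```python
import re

# Positional placeholders are filled in order from a tuple.
positional = "%s links to %s" % ("FrontPage", "HelpContents")

# Named placeholders are looked up in a dictionary, so one value can be
# reused several times and the order of the keys does not matter.
chars = {"u": "A-Z", "l": "a-z"}
pattern = "[%(u)s][%(l)s]+[%(u)s][%(l)s]+" % chars

# The resulting string can be used as a regular expression.
is_wiki_word = re.fullmatch(pattern, "WikiName") is not None
```

The named form pays off when the same few values appear many times in a long template, which is the situation in the parser's regular expressions.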

I'll dig around some more and hopefully find a way out to freenode
from our network.

Thanks for all the info!!!

-j
Karl Dubost
2005-06-15 19:51:34 UTC
Post by jacob martinson
I got a bit lost when working with the big ugly re, especially when it
came to it referencing the unicode character map. I'm going to have
to read up a bit on how unicode works in Python.
http://www.jorendorff.com/articles/unicode/python.html
http://effbot.org/zone/unicode-objects.htm
http://diveintopython.org/xml_processing/unicode.html
http://evanjones.ca/python-utf8.html
--
Karl Dubost - http://www.la-grange.net/
Near you, madam, forgetting the heavens,
The astonished astronomer grows troubled;
It was in the caressing gleam of your eyes
That he thought he had found the double star.




jacob martinson
2005-06-15 20:20:30 UTC
Post by Thomas Waldmann
Post by jacob martinson
are there any docs anywhere on writing a moin parser? i.e. how a
parser fits in with the rest of moin, the interfaces, etc.
Not really, but there is some example code in MoinMoin/parser/*.py. :)
And it is not THAT complicated.
I'm sure it's simple once you understand it, but looking at wiki.py
it's pretty easy to get lost on what is actually happening. This has
got to be one of the best candidates for being rewritten with the re.X
option I've ever seen:

word_rule = ur'(?:(?<![%(l)s])|^)%(parent)s(?:%(subpages)s(?:[%(u)s][%(l)s]+){2,})+(?![%(u)s%(l)s]+)' % {
    'u': config.chars_upper,
    'l': config.chars_lower,
    'subpages': config.allow_subpages and (wikiutil.CHILD_PREFIX + '?') or '',
    'parent': config.allow_subpages and (ur'(?:%s)?' % re.escape(PARENT_PREFIX)) or '',
}

And I thought I knew reg expressions pretty well!
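
As a rough idea of what an re.X version could look like, here is a sketch in Python 3. The values for the character classes and prefixes are ASCII stand-ins assumed for the example, not read from a real moin configuration, and the comments are one reading of the rule rather than official documentation.

```python
import re

# Assumed stand-ins for config.chars_upper, config.chars_lower,
# wikiutil.CHILD_PREFIX, PARENT_PREFIX and config.allow_subpages.
CHARS_UPPER = "A-Z"
CHARS_LOWER = "a-z"
CHILD_PREFIX = "/"
PARENT_PREFIX = "../"
ALLOW_SUBPAGES = True

values = {
    "u": CHARS_UPPER,
    "l": CHARS_LOWER,
    "subpages": (CHILD_PREFIX + "?") if ALLOW_SUBPAGES else "",
    "parent": ("(?:%s)?" % re.escape(PARENT_PREFIX)) if ALLOW_SUBPAGES else "",
}

# The rule exactly as quoted, on one line.
compact = (
    r"(?:(?<![%(l)s])|^)%(parent)s"
    r"(?:%(subpages)s(?:[%(u)s][%(l)s]+){2,})+(?![%(u)s%(l)s]+)"
) % values

# The same rule spread out; re.VERBOSE ignores the layout and the comments.
verbose = r"""
    (?: (?<![%(l)s]) | ^ )          # no lowercase character just before, or line start
    %(parent)s                      # optional parent prefix
    (?:
        %(subpages)s                # optional child prefix
        (?:[%(u)s][%(l)s]+){2,}     # at least two capitalized parts
    )+
    (?![%(u)s%(l)s]+)               # no further word characters right after
""" % values

compact_re = re.compile(compact)
word_re = re.compile(verbose, re.VERBOSE)
```

One thing to watch when combining the two techniques: a literal percent sign in a comment would have to be doubled, because the comments pass through the % substitution as well.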

Thanks for everyone's help so far... I'm still chugging at it...

-j


jacob martinson
2005-06-15 20:40:02 UTC
i thought this was going to be a simple change, but it's proving
difficult for me so far. i'm just trying to change wiki.py to
recognize [Pagename] as a wiki word, instead of ["Pagename"].

-j
jacob martinson
2005-06-23 06:11:43 UTC
just wanted to say thanks to everyone for your help.

the changes i made were small/trivial but the only way i could get
them to take effect in a consistent/reliable way was to delete all the
*pyc files under site-packages/MoinMoin and restart twisted/apache
each time i made a change to wiki.py. for some reason just deleting
wiki.pyc didn't do it. 90% of the time i've spent on this little
project has been fighting some sort of importing issue.
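
The clean-up step can be scripted. This is a small sketch in Python 3 that only removes the compiled files; the directory to pass in depends on the installation, and the server still has to be restarted by hand, since a long-running process keeps already imported modules in memory.

```python
from pathlib import Path


def remove_pyc(root):
    """Delete every *.pyc file below root and return the removed paths."""
    # Collect the paths first so the tree is not modified while it is walked.
    targets = sorted(Path(root).rglob("*.pyc"))
    for path in targets:
        path.unlink()
    return targets
```

A call such as remove_pyc("/path/to/site-packages/MoinMoin") would cover the case described above; the path shown is a placeholder.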

anyway, here are my current changes to make [wiki word] be
interpreted the way i wanted:

$ diff wiki.py wiki.py.0
432c432
< wikiname = word[1:-1]
---
> wikiname = word[2:-2]
958c958
< rules = rules + ur'|(?P<wikiname_bracket>\[.*?\])'
---
> rules = rules + ur'|(?P<wikiname_bracket>\[".*?"\])'

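
The effect of the two changed lines can be tried outside of moin. The sketch below is standalone Python 3; the helper name page_name and the sample text are invented for the example, and only the two patterns and the two slices come from the diff.

```python
import re

quoted_rule = re.compile(r'\[".*?"\]')   # stock rule: ["Page name"]
plain_rule = re.compile(r'\[.*?\]')      # modified rule: [Page name]


def page_name(word, quoted):
    """Strip the delimiters: two characters per side for [" "], one for [ ]."""
    return word[2:-2] if quoted else word[1:-1]


text = 'See ["Old Style"] and [New Style].'
old_names = [page_name(w, True) for w in quoted_rule.findall(text)]
new_names = [page_name(w, False) for w in plain_rule.findall(text)]
```

Here old_names holds only the quoted link, while new_names holds both, and the old-style link keeps its quotes in the page name. The looser pattern matches any bracketed text, so existing pages and other bracket markup may need a look after such a change.
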
i hope to move from jspwiki to moin soon so i'm going to go through
the standard parser and make it as clean & easy to use as possible,
taking hints from the other wikis i've used. once i get something
nice i'll post it to the parsermarket in case anyone else might like
it.

on a side note i thought i knew python, but there's been a number of
things i've seen since reading moin code i didn't know you could do.

thanks again for all your help!!!

-jacob

