[13:21] <ion> keybuk: Feel like merging my mountall changes? :-) I could rebase the patch set on to the current mountall source package later today, but there are probably no conflicts.
[13:22] <Keybuk> ion: am going to merge after alpha 6 is out
[13:22] <ion> Alright
[19:58] <pocek> Hi
[19:59] <sadmac2> hello
[20:05] <pocek> How's upstart 1.0 going? I watched fosdem video and it seems really promising :)
[20:05] <sadmac2> it's going
[20:09] <sadmac2> pocek: are you on the mailing list?
[20:12] <pocek> Just subscribed
[20:12] <sadmac2> pocek: well that's the place to be
[20:13] <sadmac2> I'll probably shoot another braindump towards scott tonight
[20:14] <Keybuk> \o/
[20:15] <sadmac2> Keybuk: celebrating my braindump or have you done wonders?
[20:16] <Keybuk> celebrating your brain dumps
[20:16] <Keybuk> I like them
[20:16] <sadmac2> sweet
[20:16] <Keybuk> for a start, they help me actually work out what *I* mean sometimes :p
[20:16] <Keybuk> plus you have lots of ideas
[20:16] <sadmac2> Keybuk: I got a bit sidetracked last night in my coding. Long story short libnih is about to grow a non-deterministic RE language parser to replace nih_config :)
[20:16] <sadmac2> Keybuk: bison was inadequate, so I'm writing us our own.
[20:17] <Keybuk> heh
[20:17] <Keybuk> I'm clearly rubbing off on you ;)
[20:17] <ion> :-)
[20:17] <Keybuk> "why use this pre-canned library when I COULD WRITE MY OWN!" :p
[20:18] <ion> sadmac: Have you looked at PEGs?
[20:18] <ion> http://treetop.rubyforge.org/ as an example project implementing PEG.
[20:18] <ion> They have some pros and some cons.
[20:22] <sadmac2> ion: I spent a good few hours trying to get my hands on any parser generators I could find. Didn't find those.
[20:23] <sadmac2> ion: the big thing we need is a non-deterministic lexer. For example: "start on start". The same token has two different meanings in that stanza.
[20:24] <sadmac2> It's a keyword, and then later it's an unquoted string.
[20:24] <ion> PEG should handle that fine
[20:24] <ion> Without any extra effort.
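A minimal Python sketch of why "start on start" needs no special handling when there is no separate tokenizer. The rule `stanza := 'start' ws 'on' ws name` and the helper names (`literal`, `parse_start_on`) are assumptions made for illustration; they are not libnih or Upstart code.

```python
import re

WS = re.compile(r"[ \t]+")
NAME = re.compile(r"[A-Za-z0-9_-]+")  # assumed shape of an unquoted string


def literal(text, pos, word):
    """Match an exact word at pos; return the new position or None."""
    return pos + len(word) if text.startswith(word, pos) else None


def parse_start_on(text):
    """Parse "start on <name>" and return the name, or None on failure.

    'start' is only a keyword where the rule asks for the literal; where a
    name is expected, the same characters are an ordinary unquoted string.
    """
    pos = 0
    for word in ("start", "on"):
        pos = literal(text, pos, word)
        if pos is None:
            return None
        ws = WS.match(text, pos)
        if ws is None:
            return None
        pos = ws.end()
    name = NAME.match(text, pos)
    if name is None or name.end() != len(text):
        return None
    return name.group()
```

Because each position in the rule decides what it accepts, the second `start` is never classified as a keyword in the first place.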
[20:25] <sadmac2> ion: the other thing I wanted (more of a nice-to-have) is expressions that take arguments. It's good for things like this:
[20:25] <sadmac2> opstring(SEP) ::= event SEP OPERATOR SEP event
[20:26] <sadmac2> on_opstring ::= opstring(NONBREAKING_SPACE)
[20:27] <sadmac2> event ::= event_spec | '(' opstring(BREAKING_SPACE) ')'
[20:27] <ion> Please give an example of a chunk to be parsed by that.
[20:27] <sadmac2> the being-nested-makes-linebreaks-ok thing is really annoying in bison
[20:27] <ion> Ah
[20:28] <sadmac2> ion: surely. "start on a and (b or \n c) \n start on a and b or \n c"
[20:28] <ion> Yeah, got it
[20:28] <sadmac2> ion: the \ns are linebreaks
[20:28] <sadmac2> ion: the second line there should be an error
[20:28] <sadmac2> ion: the first should be fine
[20:29] <sadmac2> it's less of a pain to do conventionally when you don't have big chunks of identical handler code for each version of opstring
[20:30] <sadmac2> that grammar was kinda fudgy
[20:32] <sadmac2> opstring(SEP) ::= event | opstring(SEP) SEP OPERATOR SEP opstring(SEP) | '(' opstring(BREAKING_SPACE) ')'
[20:32] <sadmac2> on_opstring ::= opstring(NONBREAKING_SPACE)
[20:32] <sadmac2> much better.
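A rough Python sketch of the parameterised `opstring(SEP)` idea, with the separator passed as an ordinary function argument. The grammar is rewritten without left recursion as `opstring(SEP) := operand (SEP OPERATOR SEP operand)*`, and the token patterns (`EVENT`, `OPERATOR`) and function names are assumptions for illustration only.

```python
import re

NONBREAKING = re.compile(r"[ \t]+")   # spaces and tabs only
BREAKING = re.compile(r"[ \t\n]+")    # line breaks allowed as well
OPERATOR = re.compile(r"and|or")
EVENT = re.compile(r"[A-Za-z0-9_-]+")  # assumed shape of an event name


def skip(text, pos, pattern):
    """Consume optional whitespace matching pattern; return the new position."""
    m = pattern.match(text, pos)
    return m.end() if m else pos


def operand(text, pos):
    """operand := '(' opstring(BREAKING) ')' | event"""
    if text.startswith("(", pos):
        pos = skip(text, pos + 1, BREAKING)
        pos = opstring(text, pos, BREAKING)
        if pos is None:
            return None
        pos = skip(text, pos, BREAKING)
        return pos + 1 if text.startswith(")", pos) else None
    m = EVENT.match(text, pos)
    return m.end() if m else None


def opstring(text, pos, sep):
    """opstring(sep) := operand (sep OPERATOR sep operand)*"""
    pos = operand(text, pos)
    if pos is None:
        return None
    while True:
        s1 = sep.match(text, pos)
        op = OPERATOR.match(text, s1.end()) if s1 else None
        s2 = sep.match(text, op.end()) if op else None
        nxt = operand(text, s2.end()) if s2 else None
        if nxt is None:
            return pos  # back off to the last complete operand
        pos = nxt


def on_opstring(text):
    """on_opstring := opstring(NONBREAKING); the whole text must match."""
    return opstring(text, 0, NONBREAKING) == len(text)
```

With this, `on_opstring("a and (b or \n c)")` succeeds while `on_opstring("a and b or \n c")` fails, and the handler code exists only once.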
[20:32] <sadmac2> BNF is hard, let's go shopping
[20:41] <sadmac2> ion: implementing these can't be easy, and ruby isn't a good libnih dependency :)
[20:42] <ion> As I said, it’s just an example implementation. One is free to implement PEG without such dependencies.
[20:46] <sadmac2> ion: yeah, I just wonder what it would take to do in C :)
[20:46] <sadmac2> ion: right now the plan is just to use C syntax and express it that way.
[20:49] <sadmac2> with hopes of then bootstrapping the specification of a better grammar through that later on.
[20:50] <ion> Extending e.g. Treetop’s ‘rule’ element with parameters (as you described) would be very natural in fact. If one is to implement PEG in C, passing the separator as a parameter wouldn’t be the hard part. :-)
[20:51] <ion> Pseudo-treetopish-code:
[20:51] <ion> On second thought, I’ll write this to pastebin. Moment.
[20:57] <ion> Something like http://pastebin.com/f79d84635
[20:58] <sadmac2> ion: I'm so used to people bending ruby into dsls that I keep trying to think how this still constitutes valid ruby :)
[20:59] <ion> PEG consumes input greedily, just like a regexp, without a need for separate tokenization. PEG can be thought of as an extended regexp that supports giving names to chunks and calling them by names (and recursion comes with that).
[20:59] <ion> That’s not Ruby. That’s Treetop grammar syntax.
[20:59] <sadmac2> ion: yeah
[21:04] <ion> Say, begin with a plain regexp:
[21:04] <ion> hello\s+[a-zA-Z0-9]+
[21:04] <ion> Give it a name:
[21:04] <ion> hello_stanza := hello\s+[a-zA-Z0-9]+
[21:05] <ion> Split it to parts and give them names:
[21:05] <ion> hello_stanza := 'hello' whitespace name; whitespace := \s+; name := [a-zA-Z0-9]+
[21:05] <ion> And implement support for recursion. There’s your basic PEG parser.
[21:05] <ion> Just a regexp engine extended a bit.
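The steps above (regexp terminals, named rules, calling rules by name) can be sketched in Python roughly as follows. The combinator names (`lit`, `rx`, `ref`, `seq`) and the `RULES` table are hypothetical, chosen only for this illustration.

```python
import re


def lit(word):
    """Terminal that matches an exact string."""
    def parse(text, pos, rules):
        return pos + len(word) if text.startswith(word, pos) else None
    return parse


def rx(pattern):
    """Terminal that matches a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text, pos, rules):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse


def ref(name):
    """Call another rule by name; the lookup is lazy, so recursion works."""
    def parse(text, pos, rules):
        return rules[name](text, pos, rules)
    return parse


def seq(*parts):
    """Match every part in order; fail as soon as one part fails."""
    def parse(text, pos, rules):
        for part in parts:
            pos = part(text, pos, rules)
            if pos is None:
                return None
        return pos
    return parse


# hello_stanza := 'hello' whitespace name; whitespace := \s+; name := [a-zA-Z0-9]+
RULES = {
    "hello_stanza": seq(lit("hello"), ref("whitespace"), ref("name")),
    "whitespace": rx(r"\s+"),
    "name": rx(r"[a-zA-Z0-9]+"),
}


def matches(rule, text):
    """True if the named rule consumes the entire text."""
    return RULES[rule](text, 0, RULES) == len(text)
```

For example, `matches("hello_stanza", "hello world42")` is true, while `matches("hello_stanza", "helloworld")` is false because the whitespace rule fails.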
[21:06] <sadmac2> ion: sounds nice.
[21:06] <sadmac2> ion: I may do that.
[21:06] <sadmac2> ion: I could probably extend glibc regexen to do most of it for me too
[21:07] <ion> And as for precedence, it’s greedy just like regexps: the first matching tree is the one it picks.
[21:08] <ion> Want anything in parentheses to have precedence over an and expression, and that to have precedence over an or expression? Give them in that order:
[21:08] <ion> event_spec := '(' event_spec ')' | event_spec sep 'and' sep event_spec | event_spec sep 'or' sep event_spec | event;
[21:08] <ion> or something along those lines.
[21:09] <ion> The regexp engine looks at the next character. Is it '('? Nope? Then look at the next alternative. Just like a plain regexp such as /\(foo\)|alternative|another/
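A small Python sketch of ordered choice, assuming hypothetical combinators `lit`, `rx`, `ref`, `seq` and `choice`. One caveat: the `event_spec` rule as typed above is left-recursive (it begins by calling itself), which a plain recursive-descent matcher like this one would loop on, so the sketch uses a right-recursive `expr` rule instead.

```python
import re


def lit(word):
    return lambda text, pos, rules: (
        pos + len(word) if text.startswith(word, pos) else None)


def rx(pattern):
    compiled = re.compile(pattern)

    def parse(text, pos, rules):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse


def ref(name):
    return lambda text, pos, rules: rules[name](text, pos, rules)


def seq(*parts):
    def parse(text, pos, rules):
        for part in parts:
            pos = part(text, pos, rules)
            if pos is None:
                return None
        return pos
    return parse


def choice(*alts):
    """Ordered choice: try alternatives in order, commit to the first match."""
    def parse(text, pos, rules):
        for alt in alts:
            end = alt(text, pos, rules)
            if end is not None:
                return end
        return None
    return parse


RULES = {
    # expr := primary sep ('and' | 'or') sep expr | primary
    "expr": choice(
        seq(ref("primary"), ref("sep"), rx(r"and|or"), ref("sep"), ref("expr")),
        ref("primary"),
    ),
    # primary := '(' expr ')' | event
    "primary": choice(seq(lit("("), ref("expr"), lit(")")), ref("event")),
    "event": rx(r"[a-z0-9_-]+"),
    "sep": rx(r"[ \t]+"),
}


def match_len(rule, text):
    """Number of characters the named rule consumes, or None on failure."""
    return RULES[rule](text, 0, RULES)
```

The order of alternatives matters: `choice(lit("a"), lit("ab"))` stops after one character of `"ab"`, whereas `choice(lit("ab"), lit("a"))` consumes both. That is why the longer alternative is listed first in `expr`.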
[21:10] <ion> PEG does have some disadvantages: http://en.wikipedia.org/wiki/Parsing_expression_grammar#Disadvantages
[21:21] <sadmac2> ion: searching the fedora repos for peg gives me every mpeg and jpeg tool or library ever written
[21:21] <ion> http://piumarta.com/software/peg/
[21:21] <sadmac2> ion: I was just about to link you the same thing
[21:21] <ion> Search for \<peg\> or \bpeg\b, depending on the regexp dialect of rpm or whatever. :-P
[21:22] <ion> (No PEG results in Ubuntu, though.)
[21:23] <sadmac2> ion: this looks new, and reeks of next month's abandonware
[21:23] <ion> He has bothered to write an extensive manual, which says something. :-) http://piumarta.com/software/peg/peg.1.html
[21:24] <sadmac2> Keybuk: care to weigh in with preferences?