/srv/irclogs.ubuntu.com/2009/09/15/#upstart.txt

=== h\h is now known as haraldh
[13:21] <ion> keybuk: Feel like merging my mountall changes? :-) I could rebase the patch set on to the current mountall source package later today, but there are probably no conflicts.
[13:22] <Keybuk> ion: am going to merge after alpha 6 is out
[13:22] <ion> Alright
[19:58] <pocek> Hi
[19:59] <sadmac2> hello
[20:05] <pocek> How's upstart 1.0 going? I watched fosdem video and it seems really promising :)
[20:05] <sadmac2> its going'
[20:09] <sadmac2> pocek: are you on the mailing list?
[20:12] <pocek> Just subscribed
[20:12] <sadmac2> pocek: well that's the place to be
[20:13] <sadmac2> I'll probably shoot another braindump towards scott tonight
[20:14] <Keybuk> \o/
[20:15] <sadmac2> Keybuk: celebrating my braindump or have you done wonders?
[20:16] <Keybuk> celebrating your brain dumps
[20:16] <Keybuk> I like them
[20:16] <sadmac2> sweet
[20:16] <Keybuk> for a start, they help me actually work out what *I* mean sometimes :p
[20:16] <Keybuk> plus you have lots of ideas
[20:16] <sadmac2> Keybuk: I got a bit sidetracked last night in my coding. Long story short libnih is about to grow a non-deterministic RE language parser to replace nih_config :)
[20:16] <sadmac2> Keybuk: bison was inadequate, so I'm writing us our own.
[20:17] <Keybuk> heh
[20:17] <Keybuk> I'm clearly rubbing off on you ;)
[20:17] <ion> :-)
[20:17] <Keybuk> "why use this pre-canned library when I COULD WRITE MY OWN!" :p
[20:18] <ion> sadmac: Have you looked at PEGs?
[20:18] <ion> http://treetop.rubyforge.org/ as an example project implementing PEG.
[20:18] <ion> They have some pros and some cons.
[20:22] <sadmac2> ion: I spent a good few hours trying to get my hands on any parser generators I could find. Didn't find those.
[20:23] <sadmac2> ion: the big thing we need is a non-deterministic lexer. For example: "start on start". The same token has two different meanings in that stanza.
[20:24] <sadmac2> Its a keyword, and then later its an unquoted string.
[20:24] <ion> PEG should handle that fine
[20:24] <ion> Without any extra effort.
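
A minimal sketch of why "start on start" is not a problem without a separate lexer, assuming a hypothetical stanza of the form "start on NAME". The parser asks for a keyword or a name at each position, so the same characters are read as a keyword first and as an unquoted string later. The helper names and the NAME pattern are illustrative assumptions, not anything from libnih or Upstart.

```python
import re

# Assumed patterns for this sketch only.
NAME = re.compile(r"[A-Za-z0-9_-]+")
WS = re.compile(r"[ \t]+")


def literal(text, pos, word):
    """Match an exact keyword at pos; return the new position or None."""
    return pos + len(word) if text.startswith(word, pos) else None


def pattern(text, pos, regex):
    """Match a compiled regex at pos; return the new position or None."""
    m = regex.match(text, pos)
    return m.end() if m else None


def parse_start_on(text):
    """Return the event name from a 'start on NAME' stanza, or None."""
    pos = literal(text, 0, "start")      # here "start" is a keyword
    if pos is None:
        return None
    pos = pattern(text, pos, WS)
    if pos is None:
        return None
    pos = literal(text, pos, "on")
    if pos is None:
        return None
    pos = pattern(text, pos, WS)
    if pos is None:
        return None
    end = pattern(text, pos, NAME)       # here "start" is just a name
    if end is None or end != len(text):
        return None
    return text[pos:end]


print(parse_start_on("start on start"))
```

Because the expected construct is decided by the position in the rule rather than by a tokenizer, no token ever needs two meanings at once.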
[20:25] <sadmac2> ion: the other thing I wanted (more of a nice-to-have) is expressions that take arguments. Its good for things like this:
[20:25] <sadmac2> opstring(SEP) ::= event SEP OPERATOR SEP event
[20:26] <sadmac2> on_opstring ::= opstring(NONBREAKING_SPACE)
[20:27] <sadmac2> event ::= event_spec | '(' opstring(BREAKING_SPACE) ')'
[20:27] <ion> Please give an example of a chunk to be parsed by that.
[20:27] <sadmac2> the being-nested-makes-linebreaks-ok thing is really annoying in bison
[20:27] <ion> Ah
[20:28] <sadmac2> ion: surely. "start on a and (b or \n c) \n start on a and b or \n c"
[20:28] <ion> Yeah, got it
[20:28] <sadmac2> ion: the \ns are linebreaks
[20:28] <sadmac2> ion: the second line there should be an error
[20:28] <sadmac2> ion: the first should be fine
[20:29] <sadmac2> its less of a pain to do conventionally when you don't have big chunks of identical handler code for each version of opstring
[20:30] <sadmac2> that grammar was kinda fudgy
[20:32] <sadmac2> opstring(SEP) ::= event | opstring(SEP) SEP OPERATOR SEP opstring(SEP) | '(' opstring(BREAKING_SPACE) ')'
[20:32] <sadmac2> on_opstring ::= opstring(NONBREAKING_SPACE)
[20:32] <sadmac2> much better.
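
A rough sketch of the parameterised opstring(SEP) idea as a hand-written recursive-descent parser, where the separator is simply a function argument. It is an assumption-laden illustration: the rule is rewritten as "operand (SEP OPERATOR SEP operand)*" to avoid the left recursion in the grammar above, the "start on" prefix is left out, and the EVENT and OPERATOR patterns are made up for the example.

```python
import re

NONBREAKING = re.compile(r"[ \t]+")     # spaces and tabs only
BREAKING = re.compile(r"[ \t\n]+")      # line breaks allowed as well
EVENT = re.compile(r"[a-z][a-z0-9_-]*")  # assumed shape of an event name
OPERATOR = re.compile(r"and|or")


def match(regex, text, pos):
    """Return the end of a regex match at pos, or None."""
    m = regex.match(text, pos)
    return m.end() if m else None


def operand(text, pos):
    """operand <- '(' opstring(BREAKING) ')' / event"""
    if text.startswith("(", pos):
        pos = opstring(text, pos + 1, BREAKING)
        if pos is not None and text.startswith(")", pos):
            return pos + 1
        return None
    return match(EVENT, text, pos)


def opstring(text, pos, sep):
    """opstring(SEP) <- operand (SEP OPERATOR SEP operand)*"""
    pos = operand(text, pos)
    while pos is not None:
        nxt = match(sep, text, pos)
        nxt = match(OPERATOR, text, nxt) if nxt is not None else None
        nxt = match(sep, text, nxt) if nxt is not None else None
        nxt = operand(text, nxt) if nxt is not None else None
        if nxt is None:
            break                        # no further "SEP OPERATOR SEP operand"
        pos = nxt
    return pos


def on_opstring(text):
    """on_opstring <- opstring(NONBREAKING), and it must consume everything."""
    return opstring(text, 0, NONBREAKING) == len(text)


print(on_opstring("a and (b or \n c)"))
print(on_opstring("a and b or \n c"))
```

The same opstring function serves both the top level and the parenthesised case, so there is no duplicated handler code per separator.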
[20:32] <sadmac2> BNF is hard, lets go shopping
[20:41] <sadmac2> ion: implementing these can't be easy, and ruby isn't a good libnih dependency :)
[20:42] <ion> As i said, it’s just an example implementation. One is free to implement PEG without such dependencies.
[20:46] <sadmac2> ion: yeah, I just wonder what it would take to do in C :)
[20:46] <sadmac2> ion: right now the plan is just to use C syntax and express it that way.
[20:49] <sadmac2> with hopes of then bootstrapping the specification of a better grammar through that later on.
[20:50] <ion> Extending e.g. Treetop’s ‘rule’ element with parameters (as you described) would be very natural in fact. If one is to implement PEG in C, passing the separator as a parameter wouldn’t be the hard part. :-)
[20:51] <ion> Pseudo-treetopish-code:
[20:51] <ion> On a second thought, i’ll write this to pastebin. Moment.
[20:57] <ion> Something like http://pastebin.com/f79d84635
[20:58] <sadmac2> ion: I'm so used to people bending ruby into dsls that I keep trying to think how this still constitutes valid ruby :)
[20:59] <ion> PEG consumes input greedily, just like a regexp, without a need for separate tokenization. PEG can be thought of as an extended regexp that supports giving names to chunks and calling them by names (and recursion comes with that).
[20:59] <ion> That’s not Ruby. That’s Treetop grammar syntax.
[20:59] <sadmac2> ion: yeah
[21:04] <ion> Say, begin with a plain regexp:
[21:04] <ion> hello\s+[a-zA-Z0-9]+
[21:04] <ion> Give it a name:
[21:04] <ion> hello_stanza := hello\s+[a-zA-Z0-9]+
[21:05] <ion> Split it to parts and give them names:
[21:05] <ion> hello_stanza = 'hello' whitespace name; whitespace := \s+; name := [a-zA-Z0-9]+
[21:05] <ion> And implement support for recursion. There’s your basic PEG parser.
[21:05] <ion> Just a regexp engine extended a bit.
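
The steps above (regexp, named parts, recursion) can be sketched in a few lines. This is only an illustration under assumptions: the combinator names (lit, rx, seq, choice, ref), the GRAMMAR table, and the extra "nested" rule used to show recursion are all invented here and are not Treetop's or any other library's API.

```python
import re


def lit(s):
    """Match a literal string."""
    return lambda g, text, pos: pos + len(s) if text.startswith(s, pos) else None


def rx(p):
    """Match a plain regexp at the current position."""
    compiled = re.compile(p)

    def parse(g, text, pos):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse


def seq(*parts):
    """Match each part in turn; fail if any part fails."""
    def parse(g, text, pos):
        for part in parts:
            pos = part(g, text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*alts):
    """Try alternatives in order; the first that matches wins."""
    def parse(g, text, pos):
        for alt in alts:
            end = alt(g, text, pos)
            if end is not None:
                return end
        return None
    return parse


def ref(name):
    """Call another rule by name; this is what makes recursion possible."""
    return lambda g, text, pos: g[name](g, text, pos)


GRAMMAR = {
    "hello_stanza": seq(lit("hello"), ref("whitespace"), ref("name")),
    "whitespace": rx(r"\s+"),
    "name": rx(r"[a-zA-Z0-9]+"),
    # Hypothetical recursive rule: nested <- '(' nested ')' / name
    "nested": choice(seq(lit("("), ref("nested"), lit(")")), ref("name")),
}


def matches(rule, text):
    """True if the named rule consumes the whole input."""
    return GRAMMAR[rule](GRAMMAR, text, 0) == len(text)


print(matches("hello_stanza", "hello world42"))
print(matches("nested", "((abc))"))
```

Looking rules up by name at parse time, rather than when the grammar is built, is what lets "nested" refer to itself.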
[21:06] <sadmac2> ion: sounds nice.
[21:06] <sadmac2> ion: I may do that.
[21:06] <sadmac2> ion: I could probably extend glibc regexen to do most of it for me too
[21:07] <ion> And as for precedence, it’s greedy just like regexps: the first matching tree is the one it picks.
[21:08] <ion> Want anything in parenthesis to have precedence over an and expression, and that to have precedence over an or expression? Give them in that order:
[21:08] <ion> event_spec := '(' event_spec ')' | event_spec sep 'and' sep event_spec | event_spec sep 'or' sep event_spec | event;
[21:08] <ion> or something along those lines.
[21:09] <ion> The regexp engine looks at the next character. Is it '('? Nope? Then look at the next alternative. Just like a plain regexp such as /\(foo\)|alternative|another/
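
A small sketch of the "first matching alternative wins" behaviour, using literal alternatives only. The function name ordered_choice is an assumption made for this example. A rule that begins by calling itself, like the 'and' and 'or' alternatives above, would recurse forever in a naive recursive-descent implementation, so this sketch deliberately sticks to literals.

```python
import re


def ordered_choice(alternatives, text, pos=0):
    """Return (alternative, end) for the first literal matching at pos."""
    for alt in alternatives:
        if text.startswith(alt, pos):
            return alt, pos + len(alt)
    return None


# Not '(' and not 'and', so the third alternative is the one that is picked.
print(ordered_choice(["(", "and", "or"], "or x"))

# Order matters: an earlier, shorter alternative shadows a later, longer one.
print(ordered_choice(["a", "ab"], "ab"))
print(ordered_choice(["ab", "a"], "ab"))

# Python's re alternation is ordered in the same way.
print(re.match(r"a|ab", "ab").group())
```

This is why the order in which alternatives are listed doubles as a precedence declaration.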
[21:10] <ion> PEG does have some disadvantages: http://en.wikipedia.org/wiki/Parsing_expression_grammar#Disadvantages
[21:21] <sadmac2> ion: searching the fedora repos for peg gives me every mpeg and jpeg tool or library ever written
[21:21] <ion> http://piumarta.com/software/peg/
[21:21] <sadmac2> ion: I was just about to link you the same thing
[21:21] <ion> Search for \<peg\> or \bpeg\b, depending on the regexp dialect of rpm or whatever. :-P
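
For illustration, the word-boundary search looks like this in Python's re dialect, which uses \b. The package names are made-up examples, not actual repository contents.

```python
import re

# \b only matches between a word character and a non-word character,
# so "peg" inside "jpeg" or "mpeg" is not a hit.
PEG_WORD = re.compile(r"\bpeg\b", re.IGNORECASE)


def mentions_peg(package_name):
    """True if 'peg' appears as a whole word in the package name."""
    return PEG_WORD.search(package_name) is not None


for name in ["peg", "peg-devel", "libjpeg-turbo", "mpeg2dec"]:
    print(name, mentions_peg(name))
```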
[21:22] <ion> (No PEG results in Ubuntu, though.)
[21:23] <sadmac2> ion: this looks new, and reeks of next months abandonware
[21:23] <ion> He has bothered to write an extensive manual, that tells something. :-) http://piumarta.com/software/peg/peg.1.html
[21:24] <sadmac2> Keybuk: care to weigh in with preferences?
=== robbiew is now known as robbiew-afk
=== sadmac_ is now known as sadmac
=== robbiew-afk is now known as robbiew
=== robbiew is now known as robbiew-afk
