[16:37] <cid420> anybody around use alpine for their mail services
[16:52] <cid420> nvm i figured it out
[17:16] <dzho> cid420: heh
[17:16] <dzho> I use mutt
[19:30] <paultag> thafreak: some debian people really like salt
[19:40] <jrgifford> My head hurts. Too much XML.
[19:40] <jrgifford> paultag: salt is interesting.
[19:40] <jrgifford> different concept than say, puppet or chef.
[19:40] <paultag> aye
[19:41] <jrgifford> anyone want to parse 300MB XML files?
[19:41] <jrgifford> * 2031
[19:41] <paultag> that's it?
[19:41] <jrgifford> nonono.
[19:42] <paultag> fun fact of the day
[19:42] <jrgifford> the files are 300MB each
[19:42] <paultag> nbd
[19:42] <jrgifford> there are 2031 of them.
[19:42] <paultag> Wikipedia's full multi-gig dump is a single XML file
[19:42] <jrgifford> and 1 additional each week, also the same size.
[19:42] <jrgifford> ew.
[19:42] <paultag> I've parsed that
[19:42] <paultag> I don't think you can scare me
[19:42] <jrgifford> yes i can.
[19:42] <paultag> :)
[19:42] <jrgifford> if i tell you where this data comes from.
[19:42] <jrgifford> USPTO.
[19:42] <jrgifford> i have to parse PATENTS.
[19:43] <paultag> dude
[19:43] <paultag> do you know what I do for work? :)
[19:43] <jrgifford> parse congressional records.
[19:43] <paultag> nonono
[19:43] <paultag> that's another team
[19:43] <jrgifford> oh..
[19:43] <jrgifford> are you patents? :P
[19:43] <paultag> nah, worse
[19:43] <jrgifford> (I haven't followed sunlight lately)
[19:43] <jrgifford> i'm scared.
[19:44] <paultag> I scrape all 50 states + 2 (DC and PR)'s legislative everything
[19:44] <paultag> 250+ scrapers
[19:44] <jrgifford> you aren't openstates, are you?
[19:44] <paultag> pdf scraping
[19:44] <paultag> yeah, I am
[19:44] <jrgifford> hah!
[19:44] <paultag> I do a lot of PDF scraping
[19:44] <paultag> and a lot of OCD
[19:44] <paultag> OCR
[19:44] <paultag> and a lot of hurt
[19:44] <jrgifford> no, ocd is a good way of putting it.
[19:44] <paultag> webpages with 3 body tags
[19:44] <paultag> others with invalid markup
[19:44] <paultag> pages that use a white 'i' rather than a space
[19:45] <jrgifford> that sounds easier than these... but i can't judge.
[19:45] <paultag> We also parse FEC filing data
[19:45] <paultag> which is also insane
[19:45] <Unit193> cid420: Yes I do.
[19:45] <cid420> whats that Unit
[19:46] <cid420> was it about alpine i asked before?
[19:54] <cid420> i got everything i wanted in my servers. now i am bored
[19:57] <cid420> alright where is a good start to learn programming for Linux like tutorials etc.
[21:22] <skellat> .nws 44004
[21:22] <jenni> Lake Effect Snow Advisory issued December 10 at 3:44PM EST until December 11 at 9:00AM EST by NWS
[21:23] <jenni> Complete weather watches, warnings, and advisories for Ashtabula, OH, available here: http://alerts.weather.gov/cap/wwaatmget.php?x=OHC007 -- You may also PM the bot to get the full list.
[21:23] <skellat> Oh, goodie
[21:45] <andygraybeal> nice