=== alpacaherder is now known as skellat | ||
cid420 | anybody around use alpine for their mail services | 16:37 |
---|---|---|
cid420 | nvm i figured it out | 16:52 |
dzho | cid420: heh | 17:16 |
dzho | I use mutt | 17:16 |
paultag | thafreak: some debian people really like salt | 19:30 |
jrgifford | My head hurts. Too much XML. | 19:40 |
jrgifford | paultag: salt is interesting. | 19:40 |
jrgifford | different concept than say, puppet or chef. | 19:40 |
paultag | aye | 19:40 |
jrgifford | anyone want to parse 300MB XML files? | 19:41 |
jrgifford | * 2031 | 19:41 |
paultag | that's it? | 19:41 |
jrgifford | nonono. | 19:41 |
paultag | fun fact of the day | 19:42 |
jrgifford | the files are 300MB each | 19:42 |
paultag | nbd | 19:42 |
jrgifford | there are 2031 of them. | 19:42 |
paultag | Wikipedia's full multi-gig dump is a single XML file | 19:42 |
jrgifford | and 1 additional each week, also the same size. | 19:42 |
jrgifford | ew. | 19:42 |
paultag | I've parsed that | 19:42 |
paultag | I don't think you can scare me | 19:42 |
jrgifford | yes i can. | 19:42 |
paultag | :) | 19:42 |
jrgifford | if i tell you where this data comes from. | 19:42 |
jrgifford | USPTO. | 19:42 |
jrgifford | i have to parse PATENTS. | 19:42 |
paultag | dude | 19:43 |
paultag | do you know what I do for work? :) | 19:43 |
jrgifford | parse congressional records. | 19:43 |
paultag | nonono | 19:43 |
paultag | that's another team | 19:43 |
jrgifford | oh.. | 19:43 |
jrgifford | are you patents? :P | 19:43 |
paultag | nah, worse | 19:43 |
jrgifford | (I haven't followed sunlight lately) | 19:43 |
jrgifford | i'm scared. | 19:43 |
paultag | I scrape all 50 states + 2 (DC and PR)'s legislative everything | 19:44 |
paultag | 250+ scrapers | 19:44 |
jrgifford | you aren't openstates, are you? | 19:44 |
paultag | pdf scraping | 19:44 |
paultag | yeah, I am | 19:44 |
jrgifford | hah! | 19:44 |
paultag | I do a lot of PDF scraping | 19:44 |
paultag | and a lot of OCD | 19:44 |
paultag | OCR | 19:44 |
paultag | and a lot of hurt | 19:44 |
jrgifford | no, ocd is a good way of putting it. | 19:44 |
paultag | webpages with 3 body tags | 19:44 |
paultag | others with invalid markup | 19:44 |
paultag | pages that use a white 'i' rather than a space | 19:44 |
jrgifford | that sounds easier than these... but i can't judge. | 19:45 |
paultag | We also parse FEC filing data | 19:45 |
paultag | which is also insane | 19:45 |
Unit193 | cid420: Yes I do. | 19:45 |
cid420 | whats that Unit | 19:45 |
cid420 | was it about alpine i asked before? | 19:46 |
cid420 | i got everything i wanted in my servers. now i am bored | 19:54 |
cid420 | alright where is a good start to learn programming for Linux like tutorials etc. | 19:57 |
skellat | .nws 44004 | 21:22 |
jenni | Lake Effect Snow Advisory issued December 10 at 3:44PM EST until December 11 at 9:00AM EST by NWS | 21:22 |
jenni | Complete weather watches, warnings, and advisories for Ashtabula, OH, available here: http://alerts.weather.gov/cap/wwaatmget.php?x=OHC007 -- You may also PM the bot to get the full list. | 21:23 |
skellat | Oh, goodie | 21:23 |
andygraybeal | nice | 21:45 |