[13:48] <fahadsadah> Who manages http://irclogs.ubuntu.com?
[13:48] <LjL> fahadsadah: are you going to ask retroactive permission to wget it? :P
[13:49] <fahadsadah> No.
[13:49] <LjL> fahadsadah: /whois ubuntulog
[13:49] <fahadsadah> Actually, was going to ask for monolithic files
[13:49] <fahadsadah> Or at least a tarbal
[13:49] <fahadsadah> *tarball
[13:49] <fahadsadah> rt
[13:49] <fahadsadah> That's something
[13:50] <LjL> fahadsadah: i doubt you will have much luck with that, tbh
[13:50] <fahadsadah> Can't find any rts in here
[13:50] <LjL> fahadsadah: it's not a single person
[13:51] <fahadsadah> A team?
[13:51] <LjL> something like that
[13:51] <LjL> basically it's canonical
[13:51] <Tm_T> what I wonder is why you need such thing?
[13:52] <fahadsadah> Statistics
[13:52] <Tm_T> hmm, what kind of statistics?
[13:52] <fahadsadah> pisg
[13:52] <fahadsadah> Similar to http://digit.cluenet.org/clueirc.html
[13:52] <Tm_T> for some particular channel or all?
[13:52] <LjL> fahadsadah: it doesn't really take too long to wget the whole thing, anyway. i've done it, it's manageable
[13:53] <fahadsadah> #ubuntu
[13:53] <Tm_T> fahadsadah: BTW pisg is known to be heavy, really heavy
[13:53] <Tm_T> compared to many others that is
[13:53] <LjL> also, no fun with a channel like #ubuntu :P
[13:53] <fahadsadah> It's also shiny
[13:53] <Tm_T> indeed
[13:53] <fahadsadah> really shiny
[13:53] <fahadsadah> =p
[13:54] <Tm_T> fahadsadah: not shinier than any others in my experience
[13:55] <fahadsadah> Please can you suggest one?
[13:55] <LjL> actually, /me goes to download the latest to see what the karmic spike was like
[13:56] <fahadsadah> LjL: You say the wget was manageable?
[13:56] <Tm_T> fahadsadah: fisg, irssistats, ircstats ... there's others
[13:56] <fahadsadah> It's been going for two hours, and is still in 2006
[13:56] <fahadsadah> Tm_T: Thanks
[13:56] <LjL> fahadsadah: uh. i don't really remember just how long it took for me, but that seems way too long.
[13:56] <LjL> fahadsadah: you're very sure it's downloading only the .txt, and only for #u?
[13:56] <LjL> (also, what's your connection like?)
[13:57] <fahadsadah> 100Mbps
[13:57] <LjL> well mine is 10...
[13:57] <fahadsadah> And it's downloading everything, then discarding everything that isn't index.html or #ubuntu.txt
[13:57] <LjL> oh. ouch.
[13:57] <fahadsadah> I know
[13:57] <fahadsadah> Stupid wget
[13:57] <Tm_T> :-P
[13:57] <Tm_T> I wouldn't blame wget
[13:58] <fahadsadah> OK, stupid options passed to wget
[13:58] <fahadsadah> Tantamount to stupid user
[13:58] <LjL> eh, i'm pretty sure wget can be made to not download the rest in the first place... also, you could just tell it which files to download in advance
[13:58] <fahadsadah> wget -rA "#ubuntu.txt,index.html" http://irclogs.ubuntu.com
[14:00] <fahadsadah> I Ctrl+Ced it
[14:00] <fahadsadah> Seeing as I know all the filenames I want, I'll make a file containing them, with ruby, then use wget -i
[14:00] <fahadsadah> Thanks for your help
[14:01] <LjL> fahadsadah: yeah, i've done the same thing with php
[14:01] <LjL> for($Date=mktime(12, 0, 0, 2, 16, 2008); $Date<time()-3600*48; $Date+=3600*24) { $Filename="http://irclogs.ubuntu.com/".date("Y", $Date)."/".date("m", $Date)."/".date("d", $Date)."/%23ubuntu.txt";
[14:02] <fahadsadah> That's only since 2008?
[14:02] <LjL> fahadsadah: because i already had the ones before that
[14:03] <fahadsadah> Great, thanks
[14:03]  * fahadsadah rubyfies
[14:03] <LjL> if i actually had the logs i'd just give you a tarball, but my php script processes them and then discards them, so i don't have them
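
[Note: the approach discussed above (build the list of per-day log URLs up front, then hand it to `wget -i`) might look roughly like the Ruby sketch below. This is not fahadsadah's actual script, which is not shown in the log. The helper name `log_urls` is hypothetical, the start date is simply the one from LjL's PHP loop, and the two-day cutoff mirrors that loop's `time()-3600*48`.]

```ruby
require "date"

# Hypothetical helper: one log URL per day, from first_day to last_day inclusive.
# The URL layout (YYYY/MM/DD/%23channel.txt) is taken from LjL's PHP snippet.
def log_urls(first_day, last_day, channel = "ubuntu")
  (first_day..last_day).map do |day|
    # "%%23" becomes "%23", the URL-encoded "#" in front of the channel name.
    format("http://irclogs.ubuntu.com/%s/%%23%s.txt", day.strftime("%Y/%m/%d"), channel)
  end
end

if __FILE__ == $PROGRAM_NAME
  # Assumed range: LjL's start date up to two days ago.
  puts log_urls(Date.new(2008, 2, 16), Date.today - 2)
end
```

[Saved as, say, `urls.rb` (a made-up name), this could be used as `ruby urls.rb > urls.txt` followed by `wget -x -i urls.txt`. The `-x` flag makes wget recreate the date directories, which keeps the identically named daily files from colliding.]
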
[14:09] <fahadsadah> LjL: Fast.
[14:10] <fahadsadah> Hasn't been half a minute, and I'm in 2005
[14:10] <LjL> fahadsadah: well, 2004 has very few things
[14:10] <LjL> but indeed, most of the time spent will be requesting files, rather than actually downloading them... and when you requested them all instead of just the #ubuntu one, that's death
[14:10] <fahadsadah> I wonder how much disk space five years of #ubuntu takes up?
[14:11] <LjL> i don't remember. too much for me to keep on my drive :P but that doesn't mean much, i hardly have one gb free
[14:11] <fahadsadah> I'm on a linode 360
[14:11] <fahadsadah> So 16GB disk
[14:12] <LjL> fahadsadah: i made a quick calculation, it should take about 1.5gb, perhaps less
[14:14] <fahadsadah> In 2006
[14:15] <fahadsadah> I'll probably do #ubuntu-offtopic too
[14:18] <fahadsadah> I'll make a cron job
[14:19] <fahadsadah> Every day, it will download the previous day's
[14:19] <fahadsadah> And do a pisg regeneration
[14:20] <LjL> fahadsadah: -offtopic is not logged
[14:22] <fahadsadah> Oh.
[14:22] <fahadsadah> =(
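
[Note: for the daily cron job mentioned above, the only moving part on the download side is the previous day's URL. A minimal Ruby sketch, assuming the same URL layout as LjL's PHP snippet; the helper name `previous_day_url` is hypothetical, and the pisg regeneration step is left out because its invocation is not given in the log.]

```ruby
require "date"

# Hypothetical helper: URL of the channel log for the day before `today`.
def previous_day_url(today = Date.today, channel = "ubuntu")
  day = today - 1
  # "%%23" becomes "%23", the URL-encoded "#" in front of the channel name.
  format("http://irclogs.ubuntu.com/%s/%%23%s.txt", day.strftime("%Y/%m/%d"), channel)
end

# When run as a script, print the URL so it can be piped to a downloader.
puts previous_day_url if __FILE__ == $PROGRAM_NAME
```

[A crontab entry along the lines of `30 0 * * * ruby /path/to/yesterday.rb | wget -x -i -` could then fetch it once a day. The script path and the schedule are placeholders; `-i -` tells wget to read URLs from standard input.]
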
[14:31] <fahadsadah> Wow
[14:31] <fahadsadah> It's done
[14:31] <fahadsadah> =D
[22:01] <neodirtchief> Enter text here...test
[22:03] <McPeter> oO