[07:53] <guruprasad> RikMills, ahasenack, the PPA publisher appears to be fixed now and I can see it publishing builds. Your builds could be stuck in the backlog and may need some time to be processed.
[07:54] <RikMills> guruprasad: Thanks. I think it already caught up with the ones that mattered, but thanks for the info
[18:35] <toidi> hi, I'm trying to pull a good bit of data out of launchpad and am wondering what the etiquette is. Is it possible to run a local mirror? Should I be rate limiting? What can I do to be well behaved?
[18:37] <sarnold> hey toidi, I'm not on the launchpad team, but a few thoughts .. running a local mirror feels pretty implausible, it manages so much data that getting a mirror in the first place feels impossible. I know some of the security team's scripts have some local caching of information that's been retrieved to avoid round-trips through the APIs where we can
[18:38] <sarnold> rate limiting seems like a very good idea indeed; once in a while I see mentions in internal channels of a client somewhere hammering a service and earning a blackhole route as a prize :)
[18:38] <toidi> Ok, I understand. I'm currently using the launchpadlib package, do you know if there's a way to set that to ratelimit politely?
[18:39] <sarnold> if you 'just' want an ubuntu archive mirror, that's an approachable problem
[18:39] <sarnold> ah sorry, no idea there :(
[18:39] <toidi> Well, what I actually want is *all* the debs
[18:39] <toidi> and ddebs... and dsc... etc
[18:40] <toidi> The mirrors AFAICT only have recent tips, eg libssl3_3.0.2-0ubuntu1.7_amd64.deb
[18:40] <toidi> they do not have libssl3_3.0.2-0ubuntu1.*6*_amd64.deb
[18:40] <toidi> and so on
[18:41] <toidi> Ideally with build info as well
[18:41] <toidi> I understand that's a large volume of data and would be perfectly willing to just mail over a hard drive or whatever since 99% of it will never change again
[18:42] <toidi> but I imagine that's impossible
[18:42] <sarnold> ahhh that's a good challenge :) the mirrors do remove packages that aren't referenced in any of the lists.. you could do the usual rsync mirroring but skip the --delete --delete-after parts ..
[18:42] <sarnold> it'd help collecting new stuff but couldn't help much for old stuff, and doesn't address the ddebs at all :/
[18:43] <toidi> yeah, on a forward moving basis I could pull the archives and the ddeb archives
[18:43] <toidi> but this is mostly looking backwards
[18:44] <sarnold> and with ddebs, that service feels unreliable enough that pulling from launchpad is probably more reliable in the long run
[18:44] <toidi> it certainly is nice to have them all bundled together between source, binary, and debuginfo
[18:44] <toidi> I can recreate those links with buildids and it works, but it's painful
[18:45] <toidi> dpkg -x grep grep grep repeat
[18:46] <toidi> so in the meantime, it sounds like just putting some small pause between requests is the way to go?
[18:46] <toidi> any sense of what's an acceptable rate?
[18:47] <toidi> and does it matter if I'm logged in or not? Not touching any restricted data or trying to write anything, but I feel like it's impolite if the entire pool of anonymous users gets some usage quota
[18:49] <toidi> (thanks for your help, btw. Sorry to pelt you with questions)
[18:55] <sarnold> toidi: yeah, I like the 'delays' option; I don't know quite what to suggest, but my first thought is to measure the time each request takes to execute and sleep twice that? that would scale up and down a bit with the load on the system..
[18:55] <sarnold> toidi: no idea on logged in vs anon
[18:57] <toidi> ok, sounds like an interesting approach. I'll see how it works. Thanks very much for your help
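sarnold's "measure the time and sleep twice that" suggestion could be sketched roughly as below. This is only an illustration under assumptions: `polite_call`, its parameters, and the `fetch` callable are hypothetical names, not part of launchpadlib, and no particular rate is known to be acceptable to Launchpad.

```python
import time


def polite_call(fetch, factor=2.0, min_delay=0.0,
                clock=time.monotonic, sleep=time.sleep):
    """Run fetch(), then pause for factor * (time the call took).

    fetch is any zero-argument callable that performs one request
    (hypothetical; e.g. a lambda wrapping a single API lookup).
    clock and sleep are injectable so the pacing can be tested
    without actually waiting.
    """
    start = clock()
    result = fetch()
    elapsed = clock() - start
    # A slow response suggests a loaded server, so back off for longer.
    delay = max(min_delay, factor * elapsed)
    sleep(delay)
    return result, delay
```

Each request would then be wrapped, for example `result, waited = polite_call(lambda: do_one_request())`, where `do_one_request` stands in for whatever single lookup is being made. Setting `min_delay` keeps a floor on the pause even when responses come back very quickly.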
[18:57] <toidi> I'll reiterate that if anyone knows a way to get this data without hammering launchpad I am totally happy to use it.