/srv/irclogs.ubuntu.com/2020/03/05/#launchpad-dev.txt

[08:38] <tomwardill> cjwatson: https://code.launchpad.net/~twom/launchpad/+git/launchpad/+merge/380273
[15:11] <wgrant> https://code.launchpad.net/~wgrant/launchpad/+git/launchpad/+merge/380297
[15:19] <cjwatson> wgrant: r=me
[15:25] <leni> Hi could we discuss your API ?
[15:26] <cjwatson> Ah yes, sorry, been busy today
[15:27] <cjwatson> So I'm thinking we could have something like what ddeb-retriever uses to get a feed of changes to debug symbol packages, only for bzr branches and git repositories
[15:27] <wgrant> Indeed, I think that makes a lot of sense
[15:28] <cjwatson> Something like lp.git_repositories.getAllRepositories(modified_since_date=...)
[15:29] <leni> That would be great
[15:29] <cjwatson> leni: What's the client code for this?  Are you in a position to use the Python launchpadlib package, or is the client in something other than Python?
[15:29] <cjwatson> If you aren't using launchpadlib (or at least lazr.restfulclient) you'll need to implement iterating over batched collections yourself
[15:29] <cjwatson> Which isn't too hard, just needs a bit of care
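
For readers who are not using launchpadlib, the "iterating over batched collections yourself" that cjwatson mentions could look roughly like the sketch below. It assumes each JSON page carries its items under `entries` and points to the following page with `next_collection_link`, which is how lazr.restful collections are usually serialised; the helper names `fetch_json` and `iter_collection` are made up for illustration and are not part of any Launchpad library.

```python
import json
from typing import Any, Callable, Dict, Iterator
from urllib.request import Request, urlopen


def fetch_json(url: str) -> Dict[str, Any]:
    """Fetch one page of a collection as JSON (hypothetical helper)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return json.load(response)


def iter_collection(
    first_url: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_json,
) -> Iterator[Dict[str, Any]]:
    """Yield every entry of a batched collection, one page at a time.

    Assumes each page is a dict with an "entries" list and, when more
    pages exist, a "next_collection_link" URL.
    """
    url = first_url
    while url is not None:
        page = fetch(url)
        yield from page.get("entries", [])
        # The last page has no next link, which ends the loop.
        url = page.get("next_collection_link")
```

Passing the fetcher in as a parameter keeps the paging logic separate from HTTP, so it can be exercised against canned pages before it is pointed at a real service.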
[15:30] <leni> Is it possible to also have an option to list them by creation date ?
[15:30] <cjwatson> Can you explain why?
[15:30] <leni> We're using python so we can use launchpadlib
[15:30] <leni> Because it would be easier to make with an indexing value
[15:30] <cjwatson> Indexing value?
[15:31] <cjwatson> Creation doesn't seem a very valuable thing to consider; a repository might well be created entirely blank
[15:31] <leni> Like a creation date or a uid
[15:32] <leni> We take everything even blank repos
[15:32] <cjwatson> Right, but you surely want to know when the repository stops being blank
[15:32] <cjwatson> Modification seems much more interesting than creation (creation is just a specific kind of modification)
[15:32] <cjwatson> So you just need an identifier for the repository?
[15:33] <leni> Modification is already known as the content is updated by pulling at a certain interval
[15:33] <cjwatson> Hm, we also need to think a bit about exactly how things work if a repository is modified while somebody is iterating over a date-ordered collection of repositories
[15:34] <cjwatson> We would much rather you only poll when we tell you that the repository has changed, if possible
[15:34] <cjwatson> That's why I'm suggesting giving you a feed ordered by modification date - so that you can pull only the ones that have changed
[15:34] <cjwatson> s/poll/pull
[15:35] <cjwatson> Should be much more efficient for both of us
[15:36] <leni> I'm looking into how that could work
[15:37] <cjwatson> Iterating over ~1000000 public bzr branches and ~17000 public git repositories even though they mostly haven't changed would be pretty inefficient
[15:37] <leni> Of course
[15:44] <leni> So apparently it's not yet doable to only pull on new changes but that's something they're willing to add in the future
[15:44] <leni> But it would still be interesting for the initial listing as we get all the repos and not just the projects
[15:48] <cjwatson> This might be acceptable for git repositories because there are fewer of them, but I think I'm not very willing to do the bzr side until you're only pulling ones we tell you are changed
[15:49] <leni> For now there's no bzr loader so it's only on the git side
[15:49] <leni> We'd be happy to get that for now
[15:49] <cjwatson> If we did ordering by modification date, that ought to still get you everything you need, you might just have to deduplicate slightly
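
The "deduplicate slightly" step could be sketched as follows on the client side. This is only an illustration: the feed is modelled as plain `(name, modified)` pairs, and `latest_by_name` is a hypothetical helper, not something Launchpad or launchpadlib provides. A repository that is modified while the feed is being read may show up twice, so the sketch keeps only its most recent modification time.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple


def latest_by_name(
    feed: Iterable[Tuple[str, datetime]],
) -> Dict[str, datetime]:
    """Collapse a modification-ordered feed to one entry per repository.

    Each feed item is (repository name, modification time). If a name
    appears more than once, the newest modification time wins.
    """
    latest: Dict[str, datetime] = {}
    for name, modified in feed:
        previous = latest.get(name)
        if previous is None or modified > previous:
            latest[name] = modified
    return latest
```

The resulting mapping tells the client which repositories to pull, and its largest value is a natural starting point for the next incremental run.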
[15:50] <cjwatson> But I'm looking into some technical details of ordering
[15:50] <cjwatson> What do you plan to do when repositories are renamed?
[15:52] <leni> It would create a new one
[15:53] <leni> By modification date as you stated if there's a modification while iterating it would skip that one no ?
[15:53] <cjwatson> So you can probably just use repository.unique_name as an identifier
[15:53] <cjwatson> We need to solve that anyway, so I'm thinking about it
[15:53] <leni> If we have that unique_name that's great
[15:53] <cjwatson> https://launchpad.net/+apidoc/devel.html#git_repository
[15:53] <tomwardill> https://code.launchpad.net/~twom/launchpad/+git/launchpad/+merge/380300 - make the default target on a git MP page more friendly
[15:54] <cjwatson> r=me
[15:54] <leni> And that unique_name cannot be changed ?
[15:55] <cjwatson> It's unique at a given point in time, but is mutable
[15:55] <cjwatson> But that should be OK since you said that if a repository is renamed it would create a new one
[15:56] <cjwatson> We could also expose the immutable ID.  We normally prefer not to, but it's possible
[15:56] <leni> Yes it would not be a problem
[16:00] <leni> When you call a function like that does it send back everything or is it paginated ?
[16:07] <cjwatson> You get an initial batch of 75 and it's paginated
[16:07] <cjwatson> https://code.launchpad.net/~cjwatson/lazr.restful/range-factory/+merge/355966 would be needed I think
[16:09] <cjwatson> leni: Is the code here going to be open (or at least in a position where we can review it)?
[16:09] <leni> Yes of course
[16:10] <cjwatson> We have a hacky option that will require a bit more care on your side; but if we can review the code it ought to be possible to make it safe
[16:10] <leni> https://forge.softwareheritage.org/ is where it will be reviewed
[16:10] <cjwatson> (It's quite a bit easier on our side)
[16:13] <leni> In what way would it be hacky ?
[16:16] <wgrant> The LP API's pagination system doesn't quite support the ordering that we need for this to be completely safe. We're devising a solution which will effectively emulate pagination by making a bunch of different getRepository requests.
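
One way to picture "emulating pagination" with repeated requests is a cursor on the modification time: ask for everything modified at or after the last time seen, skip what has already been handled at that exact time, and repeat. The sketch below is a guess at the general shape only, not the solution the Launchpad developers went on to build. The `query(since, limit)` callable is hypothetical and is assumed to return `(name, modified)` pairs with `modified >= since`, oldest first, at most `limit` of them; it is also assumed to accept a larger `limit` when asked.

```python
from typing import Callable, Iterator, List, Set, Tuple, TypeVar

T = TypeVar("T")  # any orderable timestamp type, e.g. datetime


def walk_modified_since(
    query: Callable[[T, int], List[Tuple[str, T]]],
    start: T,
    batch_size: int = 75,
) -> Iterator[Tuple[str, T]]:
    """Walk a modification-ordered feed using repeated bounded queries."""
    cursor = start
    seen_at_cursor: Set[str] = set()  # names already yielded at `cursor`
    limit = batch_size
    while True:
        batch = query(cursor, limit)
        progressed = False
        for name, modified in batch:
            if modified == cursor and name in seen_at_cursor:
                continue  # overlap with the previous request
            if modified > cursor:
                cursor = modified
                seen_at_cursor = set()
            seen_at_cursor.add(name)
            progressed = True
            yield name, modified
        if progressed:
            limit = batch_size
        elif len(batch) < limit:
            return  # nothing new and the feed is exhausted
        else:
            # Every entry shared the cursor time and was already seen;
            # widen the window to get past the run of ties.
            limit *= 2
```

The query is inclusive on purpose, so entries that share a timestamp across a batch boundary are not lost. A repository modified during the walk moves to a later position and is yielded again, which is why the caller still needs to deduplicate.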
[16:19] <tomwardill> cjwatson: https://code.launchpad.net/~twom/launchpad/+git/launchpad/+merge/380302 I did an oops and didn't get it switched back to 'Needs Review' in time, so have a follow on MP
[16:22] <cjwatson> r=me
[16:25] <leni> So if you think that will work we're ok with it
