[00:25] <kb9vqf> I'm attempting to upgrade a Launchpad system (circa early 2010) to the latest code.  The system has been in production all this time; losing any data would not be acceptable.  On my test box I am having problems upgrading the database.
[00:26] <kb9vqf> What is the best way to handle the schema changes?  The database upgrade process keeps dying, and it looks like somehow I need to apply the "new" 2009 schema
[00:26] <kb9vqf> sorry, that should be 2209 schema above
[00:27] <kb9vqf> The database upgrade paches refer to missing functions that are defined in the 2209 schema but nowhere else
[00:27] <kb9vqf> I guess that over time some of the incremental patches were dropped from the source?  Any help would be appreciated here. :-)
[00:33] <wgrant> kb9vqf: Hm, upgrading something that old without good knowledge of the datamodel would be rather challenging.
[00:34] <wgrant> It's not just a matter of applying database patches in all cases.
[00:34] <kb9vqf> Hmm, ok
[00:35] <kb9vqf> We've only been using the PPA features
[00:35] <kb9vqf> How were these upgrades performed on the production systems back whe?
[00:35] <kb9vqf> *when?
[00:35]  * kb9vqf has been having a hard time digging up any information outside of the source itself
[00:36] <kb9vqf> I'm most interested in the missing functions and such--I'm guessing that the functions were added via SQL commands that are not part of the patchset, but don't know for sure
[00:37] <wgrant> A combination of SQL patches and in some cases custom Python, and not all of the Python is in the tree. The 2209 baseline schema was introduced in 2011.
[00:37] <wgrant> All DDL is in the patches.
[00:37] <kb9vqf> OK
[00:37] <wgrant> But the 2208 patches would have to be retrieved from bzr history.
[00:37] <kb9vqf> They are in the archive directory fortunately
[00:37] <wgrant> Oh, I thought I'd deleted them. Must have been saving them.
[00:38] <kb9vqf> I've been able to apply most of the patches, but one in particular is failing that concerns me
[00:38] <kb9vqf> patch-2209-00-1.sql
[00:38] <kb9vqf> sets sourcepackagename NOT NULL constraint
[00:38] <wgrant> Anyway, those will bring a database with no rows up to date. But there is much data migration that may be needed between some of the patches.
[00:38] <wgrant> Right, that one, for example.
[00:39] <kb9vqf> I take it custom python was used there?
[00:39] <wgrant> Between the baseline and patch-2209-00-1.sql, an online data migration script was run to populate that column.
[00:39] <kb9vqf> any chance of obtaining that script or do I have to write something?
[00:39] <wgrant> It's probably possible to struggle through this, but it's probably more trouble than it's worth.
[00:40] <kb9vqf> I'm running into obsolescence issues fast due to the old codebase relying on very old python libraries
[00:40] <kb9vqf> so I need to do something soon, but at the same time we have years of work on the PPAs
[00:40] <wgrant> Does that revno even run on lucid?
[00:40] <kb9vqf> Yes
[00:40] <kb9vqf> In fact I hacked it up to run on Debian Wheezy
[00:40] <kb9vqf> to buy time :)
[00:40] <wgrant> Heh
[00:41] <kb9vqf> So far the only major problem I have run into (well, that I am aware of at the moment) is that sourcepackagename migration
[00:41] <kb9vqf> Is there a better way to save the PPA information and just reload the database?
[00:42] <wgrant> Isn't that the very first patch?
[00:42] <wgrant> There are a lot of others further down the line which will be bigger problems.
[00:42] <kb9vqf> Been applying 2208- patches
[00:42] <kb9vqf> ok
[00:43] <wgrant> Is it really that critical to preserve the PPA history?
[00:43] <kb9vqf> Yeah, in this case it is...some of our software is still in use on the oldest PPAs and if there is a security patch we need to be able to rebuild
[00:43] <kb9vqf> at the same time we can't sanely reupload the sources for all PPAs
[00:44] <kb9vqf> the PPAs *are* the snapshot source repository in many cases
[00:45] <kb9vqf> A related question: with the latest Launchpad is this kind of thing going to continue to to be the case where there is no way to sanely migrate to a new Launchpad version?
[00:45]  * kb9vqf is envisioning a very ugly script to download source from the old Launchpad instance and rebuild on a new Launchpad instance
[00:46]  * kb9vqf is also shuddering at the power bill for the tens of thousands of builds
[00:48] <kb9vqf> wgrant: so if I take this one step at a time, first major problem is the sourcepackagename column.  is there any way to get the original migration script, and if not, what data is supposed to be in that field?
[00:48] <wgrant> It's not fundamentally terribly difficult to make it very upgradable going forward, but there's been no motivation to do so in the past.
[00:49] <kb9vqf> Am I still the only non-Canonical deployment?
[00:49] <wgrant> That particular one is very simple: it's just a copy of SourcePackageRelease.sourcepackagename.
[00:49] <wgrant> There have been a couple of others over the years, but none survived.
[00:49] <kb9vqf> Heh.  This one needs to survive as we use it very heavily :)
[00:50] <kb9vqf> And it's interesting to be the first and only :)
[00:51] <kb9vqf> OK, I'm taking notes.  patch-2209-00-1.sql requires copies of SourcePackageRelease.sourcepackagename to be placed in sourcepackagename.  What other patches should I be concerned about?
[01:00] <wgrant> kb9vqf: There are no bugs, blueprints, translations, etc. IIRC, are there?
[01:00] <kb9vqf> no
[01:01] <kb9vqf> Here is where most (but not all) of our work really is:
[01:01] <kb9vqf> https://quickbuild.pearsoncomputing.net/~trinity
[01:01] <kb9vqf> There are a few other PPAs, and we were going to deploy translations soon so I'm glad we had this chat first :)
[01:04] <kb9vqf> If it makes any difference, the majority of the other work is at https://quickbuild.pearsoncomputing.net/~slavek-banko and https://quickbuild.pearsoncomputing.net/~kb9vqf
[01:05] <kb9vqf> wgrant: hopefully the fact that we're only using PPAs (for now) makes this a bit easier?
[01:10] <wgrant> The really awkward patches were around bugs, branches and blueprints. Translations changes were fairly managable.
[01:10] <wgrant> Since the 2209 baseline, the only patches that should impact you apart from the 2209-00-1 are:
[01:10] <wgrant> patch-2209-01-1.sql
[01:10] <wgrant> patch-2209-40-*.sql (PackageUpload.searchable_*)
[01:10] <wgrant> patch-2209-41-*.sql (BFJ refactor)
[01:10] <wgrant> patch-2209-49-*.sql (ProcessorFamily refactor)
[01:10] <wgrant> patch-2209-51-*.sql (BFJO refactor)
[01:10] <wgrant> patch-2209-58-*.sql (LibraryFileContent.id widening)
[01:10] <kb9vqf> Ah, ok.  No bugs, bzr is completely disabled, translations we want to get started with but currently we have no translations on Launchpad
[01:10] <wgrant> The rest *should* apply cleanly, though there are also two postgres upgrades in the middle there.
[01:11] <kb9vqf> wgrant: OK, thanks.  the two refactors look a bit scary though, any hints?
[01:11] <wgrant> Of those, 2209-01-1.sql, 2209-40-*.sql and 2209-58-*.sql are pretty simple, the others I can dig up the prod migrations for.
[01:11] <wgrant> And before 2209 you should be reasonably safe.
[01:12] <wgrant> We only started doing the extra Python steps when we stopped doing 90 minute database upgrade outages, which was probably late 2010 or early 2011.
[01:12] <kb9vqf> Yeah most of the pre-2209 stuff is just missing functions; I'm fixing up the migration patches for those as I sit here :)
[01:12] <kb9vqf> wgrant: if you don't mind digging up the migration scripts I'd be very grateful
[01:13] <kb9vqf> also if there is any way to register a "vote" for making Launchpad properly upgradable I'd like to see that...I don't think my instance is going anywhere anytime soon :)
[01:17] <wgrant> You will want to check your LaunchpadDatabaseRevision table and make sure you apply any 2208-* patches that are missing.
[01:18] <wgrant> Rather than dig up the incremental migration scripts, probably best to write some SQL that does each of the nasty sequences in one hit.
[01:19] <wgrant> The postgres version upgrades may be more problematic. If you run into issues there, it may be worth hacking the database until it roughly resembles the new schema, then setting up a new instance on precise and postgres 9.3 and pg_restore'ing the data into the relevant tables.
[01:19] <kb9vqf> wgrant: I'm actually on 9.3 now
[01:19] <kb9vqf> pg_dump, g_restore from production to set up the staging box :)
[01:20] <kb9vqf> so at this point I pretty much just need to know what to do to the data for those 5 patches
[01:22] <kb9vqf> I can probably handle the SQL, but any advice you can give about what needs to actually happen is going to shave days/weeks of time off on this end (e.g. if I don't know what it does I get to study all of the bzr revisions around that timeframe...yuck ;-)
[01:23] <wgrant> I'm not that cruel.
[01:23] <kb9vqf> :)
[01:23] <wgrant> The SQL will take me like 10 minutes, 'cause I wrote most of those patches.
[01:23] <wgrant> I'll have a look at it tonight.
[01:23] <kb9vqf> Actually you guys have been most supportive...I do appreciate it
[01:23] <kb9vqf> Gives Canonical a high standing in my book
[01:23] <kb9vqf> do you want my Email address?
[01:24] <wgrant> It's just your nick @pearsoncomputing.net, isn't it?
[01:24] <kb9vqf> correct
[01:24] <kb9vqf> I'll work on getting the database up to a proper 2208 schema; you will handle the rest then or should I be writing some SQL for the first 2209 patch?
[01:25]  * kb9vqf notes wgrant has a very good memory
[01:25] <wgrant> It's only been like four years!
[01:25] <kb9vqf> yeah :-P
[01:25] <wgrant> Three of them are two lines of SQL which I could write in my sleep, the other three will take a few minutes.
[01:25] <wgrant> If you can get to the 2209 baseline, my list should cover everything else.
[01:26] <kb9vqf> ok, thanks again.  it helps to know what needs to be done to the database before coding (I'm still a bit in the dark there but I'll understand once I see the SQL)
[01:26] <wgrant> And the patches from where you are now to 2209 should be OK. If not, they'll be similarly esy.
[01:26] <kb9vqf> Yeah, I can hande the 2209 easily
[01:26] <kb9vqf> *the conversion to 2209
[01:27] <wgrant> Also, modern LP supports PPAs for multiple distros, so you can stop doing the awkward Debian-in-Ubuntu thing if you desire.
[01:28] <kb9vqf> wgrant: that's one of the final straws that made me bite this upgrade bullet...that one along with the Any arch building on machines other than i386 :)
[01:29]  * kb9vqf got a bit tired of watching half the build farm chew on l10n packages while the other half sat idle
[01:30] <kb9vqf> so while you're here...we're looking at managing our translations with our Launchpad instance.  Problem is we use GIT not BZR; can Launchpad properly handle a GIT-based translation module at this point?
[01:31] <kb9vqf> The features for translators on the Launchpad instance are compelling, but we need to stick with GIT for other reasons
[01:31] <wgrant> LP can import git branches into bzr (as long as they don't use submodules or signed tags), but setting up all the codehosting infrastructure for an import of one project is probably excessive. I'd suggest just manually scripting the import.
[01:32] <kb9vqf> Can they then export back to GIT?
[01:32] <wgrant> No, that's not currently supported.
[01:32] <wgrant> Though might be in a few months :)
[01:32] <kb9vqf> Then I'll hold off--once that support is in I think we're good to go on the translation end
[01:32] <kb9vqf> Most of the other open-source translation projects leave a lot to be desired
[01:33] <kb9vqf> *translation infratructure projects
[01:34]  * kb9vqf hopes there are no major database changes in the next 6 months so that he can upgrade his Launchpad instance more easily
[01:36] <kb9vqf> wgrant: one more question: what exactly is the fatsam database anyway?  I'm assuming it's the binary data store but never could find any information on it
[01:37] <wgrant> fatsam is the very old (like, 2004) name for the librarian.
[01:37] <wgrant> The directory in the dev config never got renamed.
[01:37] <wgrant> librarian is just a blob store. The filesystem structure is rather trivial.
[01:38] <wgrant> For LibraryFileContent.id == 0xdeadbeef, the file is at /de/ad/be/ef
[01:38] <kb9vqf> Yeah I saw the trivial structure (filelight FTW) but wanted confirmation
[01:38] <wgrant> All the build logs, package files, etc. will be stored in that tree.
[01:38] <kb9vqf> Doesn't it seem a bit dangerous to have "make schema" wipe it out?
[01:39] <kb9vqf> Given that it's the heart of the librarian I would expect some guards on its destruction :-)
[01:39] <wgrant> make schema hardcodes the path from the dev config.
[01:39] <wgrant> Which is like /var/tmp/fatsam
[01:39] <wgrant> It's in /var/tmp; it's fair game.
[01:39] <kb9vqf> Ah, ok
[01:39]  * kb9vqf goes to move fatsam out of /var/tmp on the upgraded system...
[01:39] <wgrant> Heh
[01:40] <kb9vqf> There's not much of a guide on how to run a properly installed Launchpad instance; I'm just glad the thing didn't blow up fatally in the past four years :)
[01:40]  * kb9vqf also learned a lot
[15:39] <kb9vqf> wgrant: Did you have a chance to assemble those upgrade scripts yet? :-)
[15:48]  * kb9vqf has the database at 2208 on the test system; ready to proceed
[17:27] <kb9vqf> wgrant: ping
[19:18] <kb9vqf> wgrant: ping
[21:12] <kb9vqf> wgrant: If you haven't written the update SQL yet you don't need to do anything for patch-2209-00-1.sql; I got somewhat impatient and banged out the 4 lines of SQL already. ;-)  I don't know what to do for the other patches so I'll leave them up to you.
[22:24] <kb9vqf> wgrant: ping