[01:08] hey all, trying 14.04 MAAS and getting this error when trying a pxe boot; Exception: {'__all__': [u'Node group interface with this Nodegroup and Interface already exists.']}
[01:08] any idea?
[01:09] Kupo24z: which log?
[01:09] maas.log
[01:09] looks like boot images are not importing either, similar error
[01:10] can you paste the log please
[01:10] and the pserv.log
[01:10] and the cluster.log
[01:10] are you on the latest revision 2202?
[01:11] http://paste.ubuntu.com/7219582/
[01:12] beta 2
[01:16] what is the package version?
[01:16] maas package
[01:19] the time stamps do not match up on the logs for the operations you say they are doing
[01:19] I think you have a cluster that is not accepted yet
[01:20] what did you do leading up to this error?
[01:20] upgrade? fresh install?
[01:21] fresh install from beta 2 iso, updating apt right now
[01:21] I think the iso had a problem
[01:21] packaged install from apt is OK
[01:25] on reboot after update http://paste.ubuntu.com/7219609/
[01:25] Version: 1.5+bzr2227-0ubuntu1
[01:27] this is the output on import boot images: http://paste.ubuntu.com/7219621/
[01:31] im assuming 'raised unexpected: AssertionError(u'MAAS_URL is not set.' is the cause
[01:31] however in /etc/maas/maas_cluster.conf i can see it there
[01:33] can you remove maas entirely and re-install from scratch from the archive please
[01:33] apt-get purge '*maas*'
[01:33] I don't trust the iso
[01:33] I get E: Couldn't find any package by regex '*maas*' with that, syntax different?
[01:34] argh
[01:34] apt-get purge maas and then I think apt-get autoremove should get rid of the rest
[01:38] does it require a maas user? django.db.utils.OperationalError: FATAL: password authentication failed for user "maas"
[01:40] http://paste.ubuntu.com/7219650/
[01:44] urrrr
[01:44] is this while purging?
[01:45] roaksoax_: on the offchance you didn't actually go yet, can you look at this? --^
[01:45] Kupo24z: is this while purging?
[01:51] o/
[01:53] Is it possible to use maas for servers that are not in the same subnet?
[01:54] diadistis: yes, you need to install a cluster controller on the subnet
[01:54] each subnet, I mean
[01:55] The problem I'm facing right now is that we have about 20 dedicated servers in 12 subnets. I tried to use ipxe to boot them but no luck...
[01:57] diadistis: http://maas.ubuntu.com/docs/cluster-configuration.html
=== jhobbs_ is now known as jhobbs
=== CyberJacob|Away is now known as CyberJacob
[06:45] rvba: have you seen bug 1304078
[06:45] bug 1304078 in MAAS "Endpoint /MAAS/api/1.0/files/?op=list returns HTTP 500 with juju-core." [Undecided,New] https://launchpad.net/bugs/1304078
[06:45] bigjools: just saw it yeah. I'll have a look now that https://code.launchpad.net/~rvb/maas/migr-bug-1302156/+merge/214667 is up for review.
[07:01] Speaking of reviews: I need a few! One looks huge but is really just moving code.
[07:07] btw folks, we don't need 14.10 milestones that have a 1.5 bugtask on 14.04
[07:07] if it's fixed in 1.5 then it's fixed in trunk
[07:10] Assuming we backport from trunk to 1.5? Because there was talk of doing it the other way around.
[07:14] always backport
[07:14] if we chop and change things will get *very* confusing very quickly
[07:25] OK
[07:26] Anybody free to review https://code.launchpad.net/~jtv/maas/split-boot_resources/+merge/214670 ? It's really just moving code, no other changes.
[07:26] Although one obscure function moved "to nowhere."
[07:28] bigjools: couldn't reproduce the problem from https://bugs.launchpad.net/maas/+bug/1304078. Will follow up when I get more details.
[07:28] Ubuntu bug 1304078 in MAAS "Endpoint /MAAS/api/1.0/files/?op=list returns HTTP 500 with juju-core." [Undecided,Incomplete]
[07:28] rvba: he's using an old version
[07:28] bigjools: no, just the cloud archive.
[07:29] On precise.
[07:29] yes, an old version :)
[07:29] Well, precise is still supported.
[07:29] and old!
[07:29] :)
[07:29] okay :)
[07:29] rvba: don't we have CI for that version in the lab?
[07:30] * rvba checks
[07:30] * bigjools stops to eat, back in a bit
[07:31] bigjools: no, what we test in the lab is whatever is in the daily PPA (i.e. the package built from 1.2)
[07:49] bigjools: btw i can finally try to break maas-on-arm tomorrow i think
[08:00] rvba: urgh, I guess we should fix that
[08:00] mwhudson: tip top!
[08:01] bigjools: we probably should. I'll ask Diogo to do it.
[08:03] rvba: thanks
[09:13] Any reviewers available for https://code.launchpad.net/~jtv/maas/split-boot_resources/+merge/214670 ? It's a large diff, but it's all moving code, not changing it.
[09:22] jtv: Hi Jeroen. What's the idea behind make_image_spec() in tests? What's the problem with let say hardcoded arch and release? Thanks!
[09:23] strikov: it's a slightly controversial issue. On the one hand, it's nice to have concrete human-readable strings in the test, and to have them look realistic. On the other hand, tests should not pass "by accident."
[09:23] For example, if a test says "arch=i386" somewhere, it could be that it's hitting something that's broken for every architecture except i386, because that happens to be default somewhere else.
[09:24] Generally, with the factory style in our tests, we try to show that the behaviour we want has no implicit dependencies on other setup, configuration, defaults, etc.
[09:25] Any two items that a test creates are different and unrelated, unless there needs to be some specific connection — and then the test makes it explicitly.
[09:25] jtv: we're using release-XXXXXX and arch-XXXXXX generators right now, maybe it's better to stick with some more realistic value. But I got your point in general. Thanks!
[09:25] Those names are a compromise, really.
[09:26] You get random, but you also get recognisable.
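The factory style described above can be sketched roughly as follows. This is a minimal illustration, not the MAAS test code: `make_name`, this version of `make_image_spec`, and `describe` are hypothetical stand-ins that only mimic the `arch-XXXXXX` / `release-XXXXXX` naming mentioned in the chat.

```python
import random
import string


def make_name(prefix, size=6):
    """Return a name such as 'arch-k3x9qa': random, but recognisable by its prefix."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(size))
    return "%s-%s" % (prefix, suffix)


def make_image_spec(arch=None, release=None):
    """Hypothetical factory: randomise every field the test did not pin explicitly."""
    return {
        "arch": arch if arch is not None else make_name("arch"),
        "release": release if release is not None else make_name("release"),
    }


def describe(spec):
    """Toy code under test: it should work for any arch, not just a default one."""
    return "%s/%s" % (spec["release"], spec["arch"])


# A test that does not care about the values gets random ones...
spec = make_image_spec()
# ...and a test that does care states the connection explicitly.
other = make_image_spec(arch="i386")
```

Because unpinned fields are random, a test using `spec` cannot pass merely because some other part of the system happens to default to the same architecture.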
[09:26] jtv: True
[10:04] gmb: jtv: The maas-test failure in the lab needs investigation http://paste.ubuntu.com/7220923/
[10:04] "Bad Request" *after* destroying the VM?
[10:05] jtv: the ordering of the messages is wrong.
[10:05] gmb: shouldn't you backport the HWE doc onto 1.5?
[10:06] (Looks like it's only in trunk.)
=== CyberJacob is now known as CyberJacob|Away
[10:08] rvba: I wonder if this means that we broke the interaction between the API client and the API somehow.
[10:09] Because that looks like the first API request that maas-test makes.
[10:09] Or, of course, we're just setting a value that is no longer in the config...
[10:11] Maybe we're setting an unsupported series?
[10:11] Yeah, that would be my guess too.
[10:12] Might be nice to dump the response body at that point in maas-test...
[10:15] jtv: I /think/ the change I just landed will fix the problem…
[10:15] Ah good.
[10:15] I'll pick up one of those other maas-test bugs then.
[10:25] jtv: maas-test CI is still failing :/
[10:25] i.e. my change didn't fix the pb.
[10:30] rvba: what was the fix that didn't work?
[10:31] jtv: it's not that it didn't work. I just didn't fix the CI problem (which I haven't diagnosed properly yet) https://code.launchpad.net/~rvb/maas-test/only-trusty/+merge/214312
[10:31] Oh, it was something you had already and hoped might _also_ fix the problem?
[10:32] AFAICT there is no real validation of the config value in that set_config call, is there?
[10:33] jtv: yes, I hoped it would fix the pb but it didn't. The only validation is that it's a valid series.
[10:33] IIRC
[10:33] We validate that? In the set_config call?
[10:33] I didn't think we did...
[10:34] I do remember making a change: I moved the commissioning series from the "networking" section of the config to the "Ubuntu" section.
[10:34] But that was only in the inline dict in get_default_config, I think — in which case it shouldn't matter, right?
[10:35] No, it shouldn't.
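The idea of dumping the response body when an API request is rejected could look something like the sketch below. It uses Python 3's `urllib` rather than the maas-test client, and `fetch`, `fake_opener`, the example URL and the error text are all hypothetical; the only point is that an `HTTPError` still carries the body the server sent.

```python
import io
import urllib.error
import urllib.request
from email.message import Message


def fetch(url, opener=urllib.request.urlopen):
    """Return (status, body); on an HTTP error status, keep the body instead of losing it."""
    try:
        with opener(url) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # urlopen raises on 4xx/5xx, but the error object is also a readable response.
        return error.code, error.read().decode("utf-8", "replace")


def fake_opener(url):
    """Stand-in for a server that rejects the request, so no network is needed."""
    raise urllib.error.HTTPError(
        url, 400, "Bad Request", Message(),
        io.BytesIO(b"hypothetical validation message"))


# Hypothetical URL; the fake opener never contacts it.
status, body = fetch("http://maas.example/MAAS/api/1.0/maas/", opener=fake_opener)
```

With the body in hand, a test fixture could attach it to the failure details instead of reporting only "Bad Request".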
[10:36] I don't see any kind of validation of the value.
[10:38] The traceback also shows us that urllib2 will raise an exception when it gets an error code from the API... the maas-test code only checks for a non-OK return value.
[10:38] Aren't we adding the server-side logs to the test details though?
[10:39] I don't see the logs anywhere.
[10:40] rvba: maybe that's because the error happens during setUp, and we don't gather the logs yet at that stage. :/
[10:50] jtv: I'm debugging it the problem in the lab manually…
[10:50] s/it//
[10:51] rvba: meanwhile I'll put up a branch that makes the fixture dump logs if it fails at this point. It doesn't look very invasive.
[10:52] rvba: I do wonder if the RPC connection time is an issue for maas-test...
[10:59] rvba: https://code.launchpad.net/~jtv/maas-test/maasfixture-log-earlier/+merge/214718
[10:59] ← should help debug that problem
[11:09] jtv: yep, looks good.
[11:09] jtv: not sure it will help with our immediate pb though.
[11:10] It will activate the logging of server-side information before the fateful API request.
[11:10] So as long as the API logs the failure, we'll get it.
[11:11] But I do wonder: does maas-test wait for the cluster and region to hook up their RPC?
[11:11] I don't think the API will log a failure, it's a validation error.
[11:12] We don't know that.
[11:12] True, let's see.
[11:12] Yeah.
[11:12] Meanwhile, I have to call it a day.
[11:19] jtv: I want to come up with a test that generates pretty complex metadata (multiple versions inside the product, each with a specific label and set of subarches). I started to do it in a 'random fields' fashion (as you did in TestMain) but feel that it's too much (I had to create a bunch of code just to generate these fields the right way). Any ideas which fields should indeed be random and which ones I can hardcode?
[11:23] strikov: put the complexity into the unit tests, where it's still controllable. Otherwise updating the test later becomes a nightmare.
[11:24] Factory methods can help a lot: "create an X for me with all the values randomised."
[11:25] The overall test will show that the parts fit together; the unit tests can put real stress on each of the parts.
[11:26] The trick for the tests is to hate whoever writes the code, and try to prove them Wrong in every way possible. Even when that person is actually you.
[11:26] In other words, schizophrenia is one of the most valuable traits in software development.
[11:27] If you try to put that sort of thing in the big, end-to-end tests, it inevitably becomes a little arbitrary which corner cases the test does or doesn't exercise.
[11:28] With unit tests, it's easier to throw the real nightmares at the code.
[11:31] jtv: Just to make sure that I got you correctly. What do you mean by unit tests -- something which resides in src/provisioningserver/import_image/tests/?
[11:31] Well yes, but that's not the whole story. :)
[11:31] I mean tests that take one small part of the software ("unit"!) and test it in detail, by calling it directly.
[11:49] * jtv has to go now
[11:51] strikov: for examples, have a look at the existing tests in src/provisioningserver/import_image/tests/, but specifically the tests for the lower-level functions, not the test for main().
[11:51] Good night!
[13:39] gmb: allenap: time for a tiny review? https://code.launchpad.net/~rvb/maas-test/maas-test-use-trusty/+merge/214751
[13:40] rvba: otp
[13:40] rvba: Sure
[13:40] That worked out well then :)
[13:51] rvba, gmb: roaksoax just reminded me that we need to ensure that maas-test’s changelog is up to date, and that each change since the last upload has a bug number attached. Do you know if that’s the case?
[13:52] Er, nope.
[13:52] It's probably not the case.
[13:57] gmb: I wonder if you missed my message from earlier… shouldn't you backport the HWE documentation to 1.5?
[13:58] rvba: Yes, I think I missed that; had some connection problems with IRC…
[13:58] rvba: Good point. I’ll do that now.
[13:58] Cool.
[14:16] rvba, gmb: Do either of you fancy doing it? :)
[14:17] allenap: I created a bug for the change I just landed.
[14:19] gmb: is it normal that the HWE doc isn't linked from the main index?
[14:37] rvba: Nope, that’s an oversight. I’ll fix it.
[14:43] gmb: okay. While you're at it, maybe add a note similar to the one we have in docs/networks.rst to state that this feature is new.
[14:43] Yep
[14:55] so... now that things have changed again (since the last time I updated my trust maas server) what do I edit to ONLY download trusty boot images (PXE and Ephemeral)
[14:56] is /etc/maas/import_pxe_files still valid (and /etc/maas/import_ephemerals) or is boot_resources.yaml the file now?
[14:56] err... bootresources.yaml
[15:50] bladernr_: bootresources.yaml from now on
[15:52] roaksoax: yeah, figured that out. the old files shouldn't have been left if they're no longer honored, IMO... or at least should have been renamed to something like import_pxe_files.unused
[15:53] and just to be sure, it's safe now to delete /var/lib/maas/ephemeral? that would free up 20GB of disk space...
[15:53] I'm guessing yes but wanted to confirm to avoid hosing my server
[16:07] Under this new boot-resources scheme, how long are snapshot dirs kept?
[16:08] if I update daily, and pull in, say, 4GB per day of new images, if there's no garbage collection or whatever, I could quickly run out of disk space, depending on my maas server's setup
[16:23] bladernr_: You can delete snapshots as you like, just leave the one that’s pointed to by the current symlink. The snapshots are created by hard-linking to files in the cache directory, so, even if you delete all snapshots, a sync should not need to download much, if anything.
[16:24] It does if I'm pulling in x86 and amd64 dailies for development...
it's at least 12 images a day if trusty images are spun daily
[16:25] 3.5 - 4GB estimated per day, so if I have a cron set to update the images each morning, and no automated garbage collection, and forget cron is running, I could eat up 100GB in 25 days.
[16:26] I must admit though, I REALLY REALLY appreciate that the cluster now tells me exactly what images it has available
[16:26] there are some really great UI changes in the recent updates :D
[16:30] and that leads to another question about how these dirs are handled... "current" points to the latest snapshot dir. So lets say I'm pulling down Precise x86 and amd64 images, and trusty daily x86 and amd64 images. Precise images won't be new each day, so the snapshot dir for today could have only trusty images in it. Will the new method aggregate those dirs to allow me access to all of the available
[16:30] images?
[16:52] ok, curious...
[16:53] I added precise x86 and amd64 to bootresources.yaml and re-ran the import-pxe-files command. I have a second snapshot that is 6.8GB vs 3.5 for trusty only.
[16:53] So, question is: did this copy my existing trusty stuff over, or re-download? That would answer the above question about how existing stuff is handled.
=== roadmr is now known as roadmr_afk
[17:02] bladernr_: It should have hard-linked to the pre-existing images, so 3.5GB of the 6.8GB in the second snapshot should be the same on-disk data as the first.
[17:02] bladernr_: But if you discover otherwise, please file a bug.
[17:03] bladernr_: Btw, glad to hear you like some of the new stuff :) Unfortunately, for now, you’ll have to arrange garbage collection yourself.
[17:09] allenap, are you sure?
[17:09] what did you mean.
[17:09] i think it should clean up after itself.
[17:09] smoser: I looked at the code in MAAS and I didn’t see any clean-up. Is it in simplestreams?
[17:09] bladernr_, "current" is current for everything. they get munged into that dir.
[17:10] i believe the sync is done with max=1
[17:10] er...
wait. thats not relevant.
[17:10] i thought that it only kept 2 things.
[17:10] i was pretty sure that oleg did that.
[17:10] if not then its a critical bug.
[17:10] smoser: There’s only one snapshot in the Garage MAAS, so maybe it is cleaning up after all.
[17:11] Oleg doesn’t seem to be around today to ask.
[17:11] Or he’s EODed.
[17:13] allenap, and du should not be tricked by hard links.
[17:13] it should count correctly.
[17:17] smoser: Aye, but `du snapshot-1` and `du snapshot-2` run separately will sum up to more than `du snapshot-[12]`.
[17:17] and in both cases its counting is correct :)
[17:38] bladernr_: should be. Please do file bugs for all that stuff you are finding
=== vladk is now known as vladk|offline
=== roadmr_afk is now known as roadmr
[18:23] smoser, allenap thanks, just double checked and it is indeed hardlinking to the older items... (the hard links threw me, I'm so used to everyone using symlinks for that type of work)
=== CyberJacob|Away is now known as CyberJacob
=== vladk|offline is now known as vladk
[19:42] hi allenap, so r.e. calling the celery job
[19:43] i am getting a 403, http://paste.ubuntu.com/7223235/
[19:43] looking at the code, the nodegroup is the only one allowed to call that celery task?
[20:07] tych0: That sounds about right. Do you have access to those credentials?
=== vladk is now known as vladk|offline
[22:26] Is there any documentation, beyond a couple VERY brief blog posts I've managed to find, that explains in good detail the various options and ways to modify the fast-path install via curtin_userdata?
[22:27] I have a working d-i preseed in maas that I now want to translate into cloud-init-isms and have found bits related to ec2 cloud-config that I'm not sure work in MAAS...
[22:34] good example I haven't found yet, I can add a PPA (thanks to smoser's blog), but not sure how to do things like update apt cache, then install individual packages after passing things to debconf-set-selections
=== CyberJacob is now known as CyberJacob|Away
[22:43] Oh.. so this seems to be working a lot easier than I was thinking it was going to. Not sure what I was doing wrong before, but adding in things line by line and re-doing the install to see the progress seems to be working well thus far.
[23:54] bladernr_: if you collect useful info can you please let me know and I will add it to the maas docs
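On the earlier snapshot discussion (17:02 to 18:23): the sketch below illustrates why two hard-linked snapshots measured separately add up to more than the pair measured together, and how manual garbage collection could keep only the snapshot that `current` points to. The `snapshot-*` and `current` names come from the chat; the file names, sizes, and both helper functions are assumptions for illustration, not MAAS or simplestreams code, and sizes are summed from `st_size` rather than disk blocks as `du` would.

```python
import os
import shutil
import tempfile


def disk_usage(*dirs):
    """Sum file sizes under dirs, counting each inode only once (as du does)."""
    seen = set()
    total = 0
    for top in dirs:
        for root, _, files in os.walk(top):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_size
    return total


def prune_snapshots(base):
    """Delete every snapshot-* directory in base except the target of 'current'."""
    keep = os.path.realpath(os.path.join(base, "current"))
    removed = []
    for entry in sorted(os.listdir(base)):
        path = os.path.join(base, entry)
        if (entry.startswith("snapshot-") and os.path.isdir(path)
                and os.path.realpath(path) != keep):
            shutil.rmtree(path)
            removed.append(entry)
    return removed


# Demo in a throwaway directory: snapshot-2 hard-links the image from snapshot-1.
base = tempfile.mkdtemp()
snap1 = os.path.join(base, "snapshot-1")
snap2 = os.path.join(base, "snapshot-2")
os.mkdir(snap1)
os.mkdir(snap2)
with open(os.path.join(snap1, "trusty.img"), "wb") as f:
    f.write(b"t" * 3500)
os.link(os.path.join(snap1, "trusty.img"), os.path.join(snap2, "trusty.img"))
with open(os.path.join(snap2, "precise.img"), "wb") as f:
    f.write(b"p" * 3300)
os.symlink("snapshot-2", os.path.join(base, "current"))

separate = disk_usage(snap1) + disk_usage(snap2)  # shared image counted twice
combined = disk_usage(snap1, snap2)               # shared image counted once
removed = prune_snapshots(base)                   # drops snapshot-1, keeps current
```

Removing `snapshot-1` frees no image data here, because `snapshot-2` still holds a link to the same file; only data that no remaining snapshot or cache entry links to is actually released.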