nick | message | time |
---|---|---|
bigjools | hey tych0, there? | 01:16 |
bigjools | would be easier to chat here for a moment instead of your bug | 01:16 |
bigjools | I see others reporting the problem earlier on here | 01:16 |
tych0 | bigjools: here now | 01:43 |
bigjools | tych0: howdy | 01:43 |
bigjools | tych0: was just wondering what you were doing when not seeing the proxy used? | 01:44 |
tych0 | just standard stuff | 01:44 |
tych0 | fastpath with trusty commissioning and precise images | 01:44 |
tych0 | i honestly don't have any idea whether it happens all the time or not | 01:45 |
tych0 | sometimes i am paying attention and sometimes i just let it run :-) | 01:45 |
bigjools | I think it's a bug in the preseed for fastpath | 01:45 |
tych0 | is there a file it drops somewhere or something? | 01:45 |
tych0 | i can check for that | 01:45 |
tych0 | next time i see it | 01:45 |
bigjools | I vaguely remember talking to smoser about this but his suggestion to fix it didn't work and then it fell by the wayside | 01:45 |
bigjools | the proxy config in the preseed is curtin-specific | 01:46 |
tych0 | ah, ok | 01:46 |
bigjools | well the d-i is d-i specific, so ... :) | 01:47 |
tych0 | yep | 01:47 |
tych0 | i assume it edits some apt preferences file or drops something in preferences.d? | 01:47 |
tych0 | oh | 01:48 |
tych0 | this node has 90curtin-aptproxy | 01:48 |
bigjools | look in contrib/preseeds_v2/curtin_userdata | 01:48 |
tych0 | was that the fix? | 01:48 |
bigjools | well apparently it doesn't work :) | 01:48 |
tych0 | well | 01:48 |
tych0 | it might | 01:48 |
tych0 | i didn't file it about this node | 01:48 |
tych0 | i just know i've seen it before | 01:48 |
bigjools | ok | 01:49 |
tych0 | if there is something i should look for | 01:49 |
tych0 | i can look for that next time i know i see it | 01:49 |
tych0 | i guess maybe the absence of this file is a good place to start | 01:49 |
bigjools | I am not sure tbh. If I start to look at this I will first examine the squid log as the node boots | 01:49 |
tych0 | yeah | 01:49 |
bigjools | then delve into the node itself | 01:49 |
tych0 | ok | 01:50 |
bigjools | there's different things that get pulled off the internets | 01:50 |
tych0 | bigjools: do you want my copy of /var/log/maas for the other issue? | 01:52 |
tych0 | i'm not sure if it'll be useful or not | 01:52 |
tych0 | it does sound like a bug in celery | 01:52 |
tych0 | whatever happens, celery shouldn't crash | 01:52 |
bigjools | tych0: it's a celery bug imo | 01:53 |
bigjools | and I agree | 01:53 |
tych0 | yep | 01:53 |
bigjools | we can work around it with a wrapper | 01:53 |
tych0 | yeah, that might be good | 01:53 |
bigjools | well there *is* a wrapper already, we just need to make it restart | 01:53 |
tych0 | i just sat down this morning and nothing would work | 01:53 |
tych0 | made me a sad panda | 01:53 |
bigjools | same happened to me yesterday | 01:53 |
bigjools | restarted the worker and boom, loads of jobs went through that had been queued | 01:54 |
bigjools | which is kinda dangerous | 01:54 |
tych0 | yeah | 01:54 |
tych0 | same here | 01:54 |
tych0 | a couple of extra power on power off cycles to make things interesting :-) | 01:54 |
bigjools | I think we need to put expirations on all the jobs | 01:54 |
bigjools | some have that but not all | 01:54 |
bigjools | but we'll revisit this as part of work to harden maas | 01:55 |
roaksoax | rharper: why is the tgt stuff being dropped? | 02:09 |
roaksoax | err | 02:09 |
roaksoax | jtv: ^^ | 02:09 |
jtv | The new import script doesn't use our pre-existing tgt setup. | 02:10 |
roaksoax | jtv: right, that's why I'm asking... what does the new script do? | 02:11 |
jtv | That link we had in /etc/tgt/conf.d was pointing to a file that no longer existed. | 02:11 |
jtv | It uses tgt-admin to create the targets. | 02:11 |
bigjools | roaksoax: the new script constructs a tgt conf on the fly and inserts it using tgtadmin | 02:11 |
jtv | IIRC it does write a "metadata" file, and a master config, but all in /var/lib/maas. | 02:11 |
roaksoax | bigjools: ok... where do these get stored, any ideas? | 02:12 |
jtv | /var/lib/maas/boot-resources | 02:12 |
bigjools | roaksoax: they are ephemeral | 02:12 |
jtv | No there is a tgt config file. | 02:13 |
bigjools | it is not used after insertion though | 02:13 |
bigjools | you can query tgt using "tgtadmin -s" | 02:13 |
bigjools | with no config | 02:13 |
bigjools | the config is for the admin script, not tgt. | 02:14 |
roaksoax | bigjools: tgt-admin | 02:14 |
roaksoax | bigjools: ok cool then | 02:14 |
bigjools | roaksoax: yeah I forgot which is which, there's a helper wrapper and the main thing | 02:14 |
bigjools | I think tgtadmin is the wrapper | 02:15 |
roaksoax | bigjools: tgtadmin doesn't exist | 02:15 |
bigjools | roaksoax: it does on my box :) | 02:15 |
jtv | tgt-admin, with the dash | 02:16 |
bigjools | there are two scripts | 02:16 |
jtv | Not on my machine. | 02:16 |
roaksoax | bigjools: it does not exist on my system | 02:16 |
roaksoax | after upgrade | 02:17 |
jtv | Not on my Trusty machine. | 02:17 |
* bigjools boggles | 02:17 | |
bigjools | let me boot my server, one sec | 02:17 |
bigjools | I am 100% sure I have seen both scripts | 02:17 |
jtv | From an installed package that isn't a dependency? | 02:19 |
bigjools | ah ok it's tgtadm | 02:20 |
bigjools | my bad | 02:20 |
roaksoax | ok cool | 02:22 |
jtv | Uh-oh. | 02:25 |
bigjools | roaksoax: what did you think about my bug on the packaging? | 02:25 |
bigjools | https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1300507 | 02:25 |
ubot5 | Ubuntu bug 1300507 in maas (Ubuntu) "Rabbit password is reset on every upgrade which forces lockstep cluster restarts" [Undecided,New] | 02:25 |
jtv | Dear friends, who takes care of re-installing the tgt targets after reboot? | 02:25 |
roaksoax | bigjools: i'm looking into it | 02:25 |
bigjools | roaksoax: ok cheers | 02:25 |
roaksoax | bigjools: ^^ | 02:26 |
jtv | I just rebooted my maas server and got no output from "sudo tgt-admin -s" | 02:26 |
bigjools | jtv: the script | 02:26 |
roaksoax | uhmmm | 02:26 |
bigjools | jtv: oh,,,, oh dear! | 02:26 |
bigjools | that's the bug we found then | 02:26 |
roaksoax | then that's why we need config files | 02:26 |
jtv | Could well be. | 02:26 |
bigjools | jtv: which means we need that config back | 02:26 |
bigjools | fuuuuuuuuuuu | 02:26 |
roaksoax | yup | 02:26 |
jtv | New one. Not the old one. | 02:26 |
bigjools | jtv: the new one is still ephemeral IIRC | 02:27 |
bigjools | it gets written once for each target | 02:27 |
jtv | No, there's a single config with all targets. | 02:27 |
jtv | In the "current" snapshot. | 02:27 |
bigjools | jtv: are you *sure*? | 02:27 |
bigjools | anyway I will file a bug | 02:27 |
bigjools | arse | 02:27 |
jtv | I have 4 <target> definitions in my maas.tgt. | 02:27 |
bigjools | jtv: ok I think this code changed since I last looked, it seems ok now | 02:28 |
jtv | What code? | 02:28 |
bigjools | the script | 02:29 |
bigjools | https://bugs.launchpad.net/maas/+bug/1300548 | 02:29 |
ubot5 | Ubuntu bug 1300548 in MAAS "tgt targets do not persist after a reboot" [Critical,Triaged] | 02:29 |
jtv | And now, having smashed everyone's hopes and dreams, I can reboot in peace. | 02:32 |
bigjools | heh | 02:32 |
roaksoax | bigjools: so there's something weird | 02:33 |
roaksoax | bigjools: i have no idea why this happens | 02:33 |
roaksoax | bigjools: i have not made any changes that would cause this in comparison to trusty I think | 02:33 |
jtv | Oh dear. Shouldn't have rebooted. Suddenly I'm back in low-res. | 02:35 |
bigjools | roaksoax: the password reset thing? | 02:36 |
roaksoax | bigjools: yup | 02:36 |
jtv | This is horrible. Big, fuzzy pixels. Luckily I kept my pre-Saucy hack somewhere to get back to a decent resolution. | 02:36 |
bigjools | roaksoax: it has always happened | 02:36 |
roaksoax | bigjools: i've never seen it | 02:36 |
bigjools | we just ignored the problem until now; for some reason the cluster is getting restarted first which has made it apparent | 02:37 |
jtv | We're not going to be limited to 1920×1200 for the release version, are we? | 02:37 |
bigjools | jtv: hi-dpi is a feature of unity now IIRC | 02:37 |
jtv | It worked in Saucy, and until today, Trusty. | 02:37 |
bigjools | roaksoax: but basically every upgrade I ever do resets the rabbit p/w | 02:37 |
roaksoax | bigjools: does that file exist in the initial precise package? | 02:38 |
roaksoax | bigjools: and has it changed since then? | 02:38 |
bigjools | which file? | 02:38 |
roaksoax | maas_local_celeryconfig.py | 02:38 |
* bigjools checks | 02:38 | |
bigjools | roaksoax: it's new after precise | 02:40 |
bigjools | I think | 02:40 |
roaksoax | bigjools: ok so if we upgrade from precise to trusty, then we still need to run that file | 02:40 |
roaksoax | bigjools: err to run that password change | 02:40 |
bigjools | the actual file is not in my tree it gets generated | 02:41 |
bigjools | roaksoax: why do we need to regenerate passwords at all? | 02:41 |
bigjools | we cannot do this, it screws over remote clusters | 02:41 |
roaksoax | bigjools: why? because the upgrade should generate passwords for systems where the file didn't exist before | 02:42 |
bigjools | roaksoax: one sec | 02:42 |
roaksoax | bigjools: there's a reason why it is there. if we upgrade say precise to trusty directly, and we don't have that, then we see failure because the password would never get generated | 02:42 |
bigjools | roaksoax: this is going to break when updating on remote clusters, it can only work when cluster is local | 02:43 |
bigjools | otherwise they will be out of sync | 02:44 |
roaksoax | bigjools: well then we should recommend upgrading regions first and then upgrade clusters | 02:44 |
roaksoax | so the cluster gets regenerated | 02:45 |
roaksoax | get new pass | 02:45 |
roaksoax | bigjools: what should actually happen is the region should tell the clusters to update their password automatically | 02:45 |
bigjools | roaksoax: it still forces a lockstep upgrade. I don't know how we can get around that | 02:45 |
bigjools | roaksoax: agreed | 02:45 |
bigjools | roaksoax: I'll file a bug about that | 02:46 |
bigjools | roaksoax: https://bugs.launchpad.net/maas/+bug/1300554 | 02:48 |
ubot5 | Ubuntu bug 1300554 in MAAS "If the rabbit password changes, clusters are not informed" [High,Triaged] | 02:48 |
roaksoax | bigjools: cool, otherwise we can just note that in a release upgrade | 02:48 |
roaksoax | bigjools: err release note | 02:48 |
bigjools | roaksoax: fine for now | 02:48 |
bigjools | thanks | 02:48 |
roaksoax | bigjools: that for upgrades that bug will happen and the "fix" is to restart the clusters | 02:48 |
roaksoax | anyway i'm off | 02:48 |
roaksoax | night | 02:48 |
bigjools | roaksoax: something must have changed in apt for it to screw the local cluster | 02:48 |
jtv | nn roaksoax | 02:49 |
bigjools | different ordering of installs | 02:49 |
bigjools | ok cheers roaksoax, sleep well | 02:49 |
* bigjools thinks about writing release notes | 02:49 | |
roaksoax | bigjools: btw.. i was thinking that maybe tgtadmin might have a way to export the config into a file for persistence | 02:52 |
roaksoax | would be worth looking into that | 02:52 |
bigjools | roaksoax: yeah, good point, thanks | 02:52 |
bigjools | --dup! | 02:53 |
bigjools | err | 02:53 |
bigjools | --dump | 02:53 |
bigjools | any better jtv? | 02:53 |
jtv | Well this gets me back to the previous bad setting. :( | 02:53 |
bigjools | heh | 02:53 |
roaksoax | bigjools: cool. we should dump each time a new entry gets added | 02:53 |
jtv | And my previous xrandr incantations didn't work. | 02:53 |
roaksoax | anyway... night! | 02:53 |
bigjools | jtv: just noticed tgt-admin has a --dump which might be useful | 02:53 |
jtv | Good night. | 02:53 |
bigjools | cheers roaksoax | 02:53 |
jtv | bigjools: I doubt it. | 02:53 |
jtv | We already write the full config anyway. | 02:54 |
bigjools | just an option | 02:54 |
bigjools | ok | 02:54 |
jtv | Wow, and the letter "b" is broken in my dash. That's a weird one. | 02:58 |
* bigjools eats lunch | 03:09 | |
=== CyberJacob|Away is now known as CyberJacob | ||
dimitern | rvba, ping | 09:01 |
dimitern | rvba, bigjools, allenap, I managed to pin down the probable cause of bug 1299114 | 09:03 |
ubot5 | bug 1299114 in MAAS "'ValidationError' object has no attribute 'error_dict' when creating a network" [Critical,Triaged] https://launchpad.net/bugs/1299114 | 09:03 |
allenap | dimitern: Excellent! However, I have to pop out for a few minutes. I’ll ping when I’m back. | 09:04 |
dimitern | http://paste.ubuntu.com/7188828/ - it seems the error happens only when you try to add a network with an IP (network) address that matches an existing network | 09:04 |
jtv | Which is probably the cause of the error in the first place. | 09:05 |
jtv | There's a separate check for that, and it must be broken. | 09:05 |
dimitern | yep, it just does not have a good error message | 09:05 |
jtv | Not what I meant. :) The ValidationError probably needs to be constructed in some particular way. | 09:06 |
jtv | dimitern: could you update the bug with your new information? | 09:07 |
jtv | Hmm... I wonder why we have a NetworksListingForm and a NetworkListForm. | 09:08 |
jtv | Looks like a leftover from parallel work producing the same form from two people, probably one as just a placeholder. | 09:10 |
dimitern | jtv, just did | 09:10 |
jtv | Thanks. | 09:12 |
jtv | I think this means that the exception at the very end of src/maasserver/models/network.py isn't quite right for what form validation wants. | 09:13 |
jtv | dimitern: any chance we could get the full traceback of the exception? | 09:35 |
jtv | Is that in the logs? | 09:35 |
jtv | Ah, found it. | 09:35 |
jtv | Wow, this could have been clearer. Yes, it's documented all over the place that validate_unique must raise ValidationError if it finds a clash, but not that that ValidationError must be one that was constructed from a dict, not an error message like in the documentation examples! | 09:38 |
allenap | jtv: Django’s ValidationError is a horror-show. Reading its code is not only like seeing inside the sausage factory, but also seeing the animals’ skulls being caved in. | 09:51 |
gmb | jtv: src/provisioningserver/tests/test_maas_import_pxe_files.py | 09:53 |
jtv | Yes, that'd be the one. | 10:00 |
jtv | dimitern, just for you. :) https://code.launchpad.net/~jtv/maas/bug-1299114/+merge/213623 | 10:13 |
AskUbuntu | I get Internal Server Error when trying to connect the MAAS GUI in a fresh installation on 12.04 | http://askubuntu.com/q/441870 | 10:15 |
dimitern | jtv, great, thanks! | 10:25 |
dimitern | allenap, rvba, bigjools, I'd appreciate if someone can spare some time to review this gomaasapi CL https://codereview.appspot.com/82460044/ | 11:25 |
allenap | dimitern: I’ll take a look. | 11:26 |
dimitern | allenap, ta! | 11:26 |
=== fader is now known as fader_ | ||
=== fader_ is now known as fader | ||
=== cmagina-away is now known as cmagina | ||
=== sputnik1_ is now known as sputnik13net | ||
=== kevin is now known as Guest2886 | ||
=== roadmr is now known as roadmr_afk | ||
=== roadmr_afk is now known as roadmr | ||
=== CyberJacob is now known as CyberJacob|Away | ||
bigjools | mwhudson: did you try arm with the latest maas in trusty at all? | 23:13 |
bigjools | or the faily ppa | 23:14 |
bigjools | (sic) | 23:14 |
mwhudson | bigjools: not recently | 23:14 |
mwhudson | i will be soon | 23:14 |
bigjools | mwhudson: ok. I need a 3rd party to verify it works on arm | 23:14 |
mwhudson | once IS do something with networking | 23:14 |
bigjools | since there's a ton of changes in image imports | 23:14 |
mwhudson | bigjools: timeframe? | 23:14 |
bigjools | mwhudson: now? :) | 23:15 |
mwhudson | heh | 23:15 |
mwhudson | there is an rt you can make noise on if you like... | 23:15 |
mwhudson | bigjools: the hyperscale guys might have been doing stuff | 23:15 |
bigjools | ok | 23:16 |
mwhudson | i'll be testing on midway | 23:16 |
bigjools | the more the better | 23:18 |
bigjools | thanks | 23:18 |
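
Note on the job-expiration idea raised around 01:54: the worry was that restarting a stalled worker lets a backlog of stale jobs (such as power on/off requests) run all at once, and the suggestion was to put expirations on all jobs. The sketch below is only an illustration of that idea in plain Python, not MAAS or Celery code; `Job`, `drain`, and the job names are hypothetical. Celery itself accepts an `expires` option when a task is queued, which is presumably what "some have that but not all" refers to.

```python
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Optional


@dataclass
class Job:
    """A queued unit of work (hypothetical stand-in for a task message)."""
    name: str
    action: Callable[[], None]
    # Absolute deadline in seconds on the caller's clock; None means never expires.
    expires_at: Optional[float] = None


def drain(queue: Deque[Job], now: Optional[float] = None) -> List[str]:
    """Run queued jobs in order, silently dropping any whose deadline has passed.

    Returns the names of the jobs that actually ran.
    """
    now = time.monotonic() if now is None else now
    ran: List[str] = []
    while queue:
        job = queue.popleft()
        if job.expires_at is not None and now >= job.expires_at:
            continue  # stale: was queued while the worker was down, so skip it
        job.action()
        ran.append(job.name)
    return ran
```

With this shape, a worker that comes back after a long outage discards an old power-cycle request instead of replaying it, while jobs without a deadline still run.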