/srv/irclogs.ubuntu.com/2012/10/05/#maas.txt

bigjoolshi roaksoax and smoser00:11
smoserbigjools, hey00:11
bigjoolsthis morning would have been better had my wife not woken me up from the deepest sleep ever00:12
smoserbigjools, :-(00:17
smoserlp:~smoser/+junk/backdoor-image/00:17
smoserhas "backdoor"00:17
bigjoolscool00:17
* bigjools grabs00:17
smoserjust point it at an image and it will backdoor it00:17
smoseruser 'backdoor'00:18
bigjoolsI don't normally do backdoor00:18
smoser--user bigjools00:20
bigjools:)00:20
bigjoolsit's the ephemeral image I want in this case, right?00:21
smoserright.00:22
smoser/var/lib/ephemeral/.....disk.img00:22
smoserits probably most proper to 'restart tgt' afterwards00:22
smoserbut i've never actually had to do that.00:22
smoserbut clearly if it had open filehandles, it could be confusing.00:23
bigjoolsyeah00:23
bigjoolsI was going to ask - is it possible to proxy tgt via the clusters?  at the moment ephemerals get pulled from the region00:23
smoseri dont know. i think the cluster would just want to run tgt also.00:24
smoseryou want that to be as close as possible00:24
smoserits block level io00:24
smoserif i understand you correctly00:24
smoserbigjools, oh.00:24
smoseri thought about one thing that might be screwing you.00:25
smoseri dont think proxy settings or mirrors get sent down to commissioning00:25
bigjoolsit's just that it's a bottleneck right now - we might need to start copying ephemerals to clusters00:25
smoserright. thats what i was saying. i think you should plan on copying ephemerals to clusters.00:25
* bigjools tries commissioning again00:25
smoseri have to run00:25
bigjoolsthanks for the script00:26
smoseri've tested it on /00:26
smoserbut not actually on an ephemeral image.00:26
bigjoolsok :)00:26
smoseri've looked at it though (when pointed at an image)00:26
smoserbut on / it added a user that i could ssh in as00:26
smoserso it seemed ok.00:26
smoserlater.00:26
bigjoolscheers00:26
bigjoolsoooo00:27
bigjoolsI saw an error flash by00:27
bigjoolssomething about tty error00:27
* bigjools tries to log in00:27
smosertty error.00:28
smoserthat is strange.00:28
bigjoolsI think it mentioned stderr too00:29
bigjoolsOct  5 00:27:16 10-0-0-100 kernel: [   25.467047] init: cloud-config main process (899) terminated with status 100:29
bigjoolsOct  5 00:27:16 10-0-0-100 kernel: [   25.885676] init: cloud-final main process (1049) terminated with status 100:29
smosercopy off /var/log/cloud-init.log00:30
smoserand /var/log/cloud-init-output.log00:30
smoser(you can probably sudo apt-get install pastebinit)00:30
smoserand do it that way00:30
bigjoolsProcessExecutionError: Unexpected error while running command.00:30
bigjoolsCommand: ['locale-gen', 'en_US.UTF-8']00:30
bigjoolsExit code: 100:30
bigjoolsthere00:30
bigjoolsI can scp it then paste, one sec00:30
bigjoolssmoser: no cloud-init-output file, but here's cloud-init.log: http://pastebin.ubuntu.com/1261098/00:32
bigjoolslook around line 246 onwards00:33
smoserbigjools, /var/lib/cloud/instance/cloud-config.txt and /var/lib/cloud/instance/user-data.txt.i00:36
smoseri really have no idea why 'locale-gen' would fail like that. but its not deadly.00:37
bigjoolscloud-config.txt is empty00:37
bigjoolsuser data has plenty00:38
smoserah. yeah, it would be user-data.00:39
smoserthere is no cloud-config in that.00:39
smoserbut user-data has that script. and that *should* get run.00:39
bigjoolshttp://paste.ubuntu.com/1261109/ is its contents00:41
smoserbigjools, something is stopping you from reaching runlevel 3 i think.00:44
smoserso the cloud-final job is not running00:44
smoserbigjools, what does00:45
smoser'runlevel'00:45
smoserand sudo status rc00:45
smosershow?00:45
bigjoolsN 200:47
bigjoolsand00:47
bigjoolsrc stop/waiting00:47
bigjoolsit must be that tty error00:48
bigjoolssmoser: ^00:48
smoseroh wait. that is weird.00:49
smosercloud-final ran00:49
smosertry running it again00:50
smosersudo start cloud-final00:50
bigjoolsstart: Job failed to start00:50
bigjoolsI can't find any log for it00:56
smoserok. try it more manually. i guess.00:56
smosersudo cloud-init modules --mode=final --debug00:56
smoserneed --debug before modules00:57
bigjoolsboom00:58
bigjoolswell it has a lot of tracebacks but it did complete00:59
bigjoolsmachine powered off00:59
bigjoolssmoser: so something is wrong with the upstart conf perhaps?00:59
bigjoolslog: http://paste.ubuntu.com/126112901:00
smoserhow much memory on the system?01:02
smoserthe FALLBACK stuff is not really a big deal. its operating as mostly designed.01:02
bigjools2Gb01:02
smoserthe issue is that it logs to rsyslog, but then from within it, the userdata its running calls /sbin/poweroff01:03
bigjoolsit's an HP cube01:03
smoserand rsyslog gets killed01:03
smoserso logging breaks.01:03
smoserbut i dont understand why it wouldnt have run01:03
smoseri have no idea really.01:03
bigjoolswhy does it only happen when console=ttyS0 is specified?01:03
smosersomething failed.01:03
smoserit doesn't make any sense to me.01:04
bigjoolsjeez, 29C at 11am, gonna be hot today01:04
bigjoolsI'll retry and see if I can see that error on the console, now I know when to look for it01:06
bigjoolsah it did it on enlistment too01:08
bigjoolsstty: standard-input: input/output error01:08
bigjoolsI am going to file a critical bug01:09
smoserthe stty error is maybe not related.01:10
bigjoolsI am thinking that01:10
smosercan you get dmesg01:10
smoserand basically just tar up01:10
smoser/var/log/01:10
=== matsubara is now known as matsubara-afk
smoserit just doesn't make sense.01:11
smoserbigjools, now that i think about it the tee /dev/tty2 is probably a bad idea.01:11
smoseras maybe the tee is in this situation too.01:11
smosersince we're /sbin/halting01:12
smoserthe tee might get killed and cause unnecessary angst.01:12
smoseri like the log though.01:12
smosermaybe we should have the script do01:12
smosersh -c 'sleep 10 && /bin/poweroff' &01:12
smoseri can come up with something better too, but basically just make sure that cloud-init is done.01:13
bigjoolssmoser: why does "start cloud-final" fail?01:14
bigjoolsisn't that a hint?01:14
bigjoolshttps://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/106197701:14
ubot5Ubuntu bug 1061977 in MAAS "Machine fails to commission when console=ttyS0 is present on kernel opts" [Critical,Triaged]01:14
smosercan you get dmesg01:15
smoserand the /var/log attached there.01:15
bigjoolsyeah01:15
smoserit is /dev/console01:18
smoserit completely is that01:18
smoserthe fact that running it through upstart died01:18
smoserand then manually did not01:18
smoserbecause its output will go to /dev/console when you do it via upstart01:18
bigjoolswhat from /var/log do you want?01:19
smosermaybe the kernel has a buffer and just gets fed up at some point.01:19
smoserjust get it all01:19
bigjoolsk01:19
smoserif you dont mind.01:19
smoseryou didn't have any private data there :)01:19
bigjoolshaha :)01:19
smoseroh. yeah.01:21
smoserwhile you're on that machine01:21
smosercan you just try01:21
smoserecho HI MOM | sudo tee /dev/console01:21
smoserand see what it does01:21
smoseri think we're gonna get an input output error basically.01:22
bigjoolstee: /dev/console: Input/output error01:22
smoseryeah. so it makes sense.01:22
smoserbut i thought the kernel was supposed to be smarter.01:22
smoseryou do have 'console=tty1' on the cmdline also, right?01:22
bigjoolsyes01:22
smoserso there is no physical serial port, right?01:26
smoserjust to be sure.01:26
bigjoolsthere isn't - unless the BMC is providing one surreptitiously01:26
smoserhm..01:28
smoserinteresting.01:28
smoserbigjools, you see kernel output on the monitor, right?01:28
bigjoolsI do01:28
smoserbigjools, the localegen died because it tried to write to its stdout01:30
smoserbigjools, so i've got to go afk.01:39
smoserbut i'd like it if you reviewd my patch for https://code.launchpad.net/~smoser/maas/preserve-sources-list/+merge/12782501:39
bigjoolssmoser: ok01:39
smoserparticularly, i'm not happy about the test, it seems long winded, but didn't know how to make better.01:40
smoseri dont really know what to do for your ttyS0 issue.01:40
smoserother than not specifying console= at all01:40
smoserwhich imo seems busted.01:40
bigjoolssmoser: we have to remove it01:40
smoseras a default.01:40
bigjoolsbusted, but less busted than not booting at all01:40
smoserwell, to be fair, you are the first person to see this.01:40
bigjoolshow much testing has it had though?01:41
bigjoolsnot on VMs01:41
smosermore than a few systems.01:41
bigjoolsI bet the others all have serial console01:41
smoserwell, yes.01:41
smoserand so will the real target audience.01:41
smoserso changing the default (which is hard coded in source code) to accomodate a little toy system01:42
smoserdoesn't make a lot of sense to me.01:42
bigjoolswell hang on01:43
bigjoolsyou don't know that every system will have a serial console01:43
smoseryou're right.01:45
smoserbut i don't think that it generally fails so badly if that is the case.01:45
bigjoolskernel bug?01:46
smoserhttp://www.mjmwired.net/kernel/Documentation/networking/netconsole.txt01:48
smoseri've always wanted to play with that.01:48
bigjoolssmoser: I'll improve your test and land your branch01:54
smoserthanks. i'll poke the kernel guys tomorrow am.01:57
smoserbut i dont have any ideas.01:57
=== jtv1 is now known as jtv
smoserhttp://www.mail-archive.com/linux-kernel@vger.kernel.org/msg246433.html02:11
roaksoaxhowdy02:39
roaksoaxbigjools: howdy02:50
bigjoolsroaksoax: hey02:54
roaksoaxbigjools: so how do you feel about the packaging?02:55
bigjoolsroaksoax: positive02:55
roaksoaxbigjools: alright, i'm gonna try to upload to archive tomorrow so we can test upgrades from precise02:55
bigjoolsroaksoax: \o/02:56
roaksoaxand get community to find more bugs if any02:56
roaksoaxthat would help us too02:56
bigjoolsI did a fix the other day to prevent questions when upgrading, fingers crossed :)02:56
roaksoaxbigjools: which one is it?02:57
bigjoolsroaksoax: config for cluster controller02:58
bigjoolsit looks at DEFAULT_MAAS_URL if it can find it02:59
roaksoaxbigjools: the autodetection of maas-url, yeah I saw02:59
roaksoaxbigjools: it works02:59
bigjoolsI did test it :)02:59
roaksoaxbigjools: yeah me too, it works :)02:59
bigjoolsare you still thinking about changing the sed stuff?02:59
roaksoaxbigjools: i don't know TBH.. doing so will require .d support for the py's03:00
roaksoaxbigjools: and would require a way to provide .d for the yaml conf's too03:01
bigjoolsI fear that is too much work at this late stage :(03:01
roaksoaxbigjools: Indeed!! the easier way right now is simply installing to /usr/share/maas and then symlink everything to /etc/maas/03:01
bigjoolsyeah03:01
roaksoaxbigjools: however, the problem with that is that user settings won't be preserved on upgrade03:02
roaksoaxbigjools: i also found another bug in upgrading but with dbconfig-common, for some reason it was not preserving the password info and was re-writing the config file.03:02
roaksoaxthe latter is expected, but should have preserved the password, but anyways, i worked around it by simply sourcing that file and grabbing the password from there instead of letting dbconfig-common rewrite the config03:03
roaksoaxbigjools: now to continue on the config, I think it would be the best for now... will have to check with Daviey to see what's his appreciation on this03:04
roaksoaxbigjools: maybe we can simply send the configs in /usr/share/maas and add some code in the configs itself that source whatever is in /etc/maas03:04
roaksoaxsmoser: still around?03:05
bigjoolsroaksoax: are you going to land your packaging branch?03:33
roaksoaxbigjools: yeah, i'm just testing one more thing03:33
bigjoolsok I put the bug back to in-progress :)03:34
bigjoolstarmac will set it when the branch lands03:34
roaksoaxcool03:34
bigjoolsroaksoax: I shall force a daily build now04:35
roaksoaxbigjools: sure04:36
bigjoolsbeing a build admin has its bonuses04:37
roaksoaxbigjools: indeed it is04:38
roaksoaxi envy you04:38
roaksoaxhahah04:38
roaksoaxi used to use PPA's extensively04:38
* roaksoax bed04:40
bigjoolswe bumped the prio permanently on the daily build ppa :)04:40
bigjoolsnn roaksoax04:40
roaksoaxnight04:40
jamallenap, rvba: So it turns out that MAASClient doesn't return an object like the Django Client returns. It has an 'object.code' vs 'object.status_code' and 'object.read()' rather than 'object.content'06:23
jamso... mocking/stubbing/ bad for the real world :)06:23
=== lifeless_ is now known as lifeless
jamallenap: you lied to me... :), list data is not supported by MAASClient. When it gets down into 'make_payload' you support bytes/unicode/IOBase and callable.06:50
bigjoolshey lifeless, any idea wtf is going on here? http://paste.ubuntu.com/126141407:03
bigjoolsI think we have a nasty bug with omshell here :/07:21
bigjoolsgood grief, I thought I'd seen bad code but this is hideous. http://ftp.fr.netbsd.org/cvsroot/src/usr.sbin/dhcp/common/Attic/parse.c,v07:24
lifelessbigjools: looks like a url to me07:30
lifelessbigjools: maybe it needs quotes ?07:30
bigjoolslifeless: I think the base64 parser in omshell blows07:30
bigjoolsquotes don't help07:30
lifelessbigjools: that is very odd07:30
bigjoolslooking at the code it treats + and / specially07:30
lifelessoh joy.07:30
bigjoolsbut the code is sooooo bad that it's taking me a while to work out what07:31
allenapjam: Sorry, I didn't realise you were talking about MAASClient. The cli supports multiple values per key, and so does the server side; it's just MAASClient that needs a little love to get it there.08:22
jamallenap: yeah, I got it to work with https://code.launchpad.net/~jameinel/maas/maasclient-multipart/+merge/12817608:31
jamthough it only fixes POST, GET uses urlencode() which just strs the list08:31
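A minimal sketch of the GET-side behaviour described here, assuming Python 3's `urllib.parse.urlencode` (the code under discussion used the Python 2 `urllib.urlencode`, which behaves the same way for lists). The parameter names and values are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical query parameters; "id" carries several values.
params = {"op": "list", "id": ["node-1", "node-2"]}

# Default: the whole list is passed through str() and sent as one value.
flat = urlencode(params)

# doseq=True: each element of the list becomes its own key=value pair.
repeated = urlencode(params, doseq=True)

print(flat)
print(repeated)
```

Whether MAASClient's GET path should switch to `doseq=True` is not settled in the log; this only shows why a plain `urlencode()` call stringifies the list.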
jamallenap: is there a good way to generate a huge number of nodes for testing? Like I want to populate the DB with 1000, 10000, 100,000 nodes.08:32
allenapjam: There's a way. I'm not sure about a good way.08:33
jamallenap: well, pressing 'add node' in the web ui is pretty bad08:33
jamI can do it in SQL surgery, but I need to generate mac addresses, etc. for each.08:33
jamSo it makes me think I should do it in python, but via the API or just direct on the db08:33
jambut how hard is it to grab maasserver.model objects and play with them08:34
jamI imagine 'settings' needs to be set correctly for me to mutate the db08:34
allenapjam: How about: make syncdb && make harness; from maasserver.testing.factory import factory; factory.make_node()08:42
jamallenap: if make harness will do it, that works for me08:42
jamThat was the sort of command I was looking for08:43
jamw7z: mumble?09:08
jamallenap: so what can I do to land the multipart list stuff? I don't actually have a 'mapping' I have an Iterable10:40
jambut string is an Iterable10:40
jamthough apparently getattr(s, '__iter__') fails.10:40
mgzthere's no sane way to do it without isinstance10:42
allenapjam: If it's not a string type, assume it's iterable of string types, and let things blow up if it's not?10:42
jamallenap: well there are layering issues. The part that checks if it is a string type only returns 1 payload to attach, so you need to loop at a higher level.10:42
jamor change the lower thing to always return a list of payloads10:42
jamor..10:42
allenapBlast.10:43
jam(It also accepts files)10:43
jamwhich are iterable, and you want to upload the whole file as one payload10:43
allenapjam: A file isn't a collections.Sequence... but I guess stick with list and put it in the docstring.10:44
jamallenap: I could do "if isinstance(x, collections.Sequence) and not isinstance(x, basestring)"10:45
allenapjam: Yeah. Or change make_payload to gen_payloads, so allowing it to yield multiple things, then all the isSomething checks can be done there.10:47
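A rough sketch of the `gen_payloads` idea, written for Python 3 and not taken from the MAAS source. The `(name, bytes)` return shape is an assumption made for illustration, and the callable case mentioned earlier in the log is left out. The ordering of the checks is the point: strings and files are handled before the generic sequence test.

```python
from collections.abc import Sequence
from io import BytesIO, IOBase


def gen_payloads(name, content):
    """Yield one (name, bytes) pair per value to attach (hypothetical shape)."""
    if isinstance(content, bytes):
        yield name, content
    elif isinstance(content, str):
        yield name, content.encode("utf-8")
    elif isinstance(content, IOBase):
        # A file is iterable too, but it is sent whole as a single payload.
        yield from gen_payloads(name, content.read())
    elif isinstance(content, Sequence):
        # Strings and bytes were handled above, so this is a real list of values.
        for item in content:
            yield from gen_payloads(name, item)
    else:
        raise TypeError("unsupported payload type: %r" % type(content).__name__)


print(list(gen_payloads("tag", ["a", b"b"])))
print(list(gen_payloads("file", BytesIO(b"line1\nline2"))))
```

Because the function is a generator, the caller no longer needs to know whether a key maps to one payload or many.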
rvbajam: as I said on the MP, I don't understand ~jameinel/maas/ignore_results, we have a global celery settings for that.10:49
jamrvba: where? (my system at least was improved by nuking the mnesia schema)10:49
jamrvba: if you can point me to it, I'll happily reject/revert my patch.10:49
rvbajam: CELERY_IGNORE_RESULT is set to True in etc/celeryconfig_common.py10:49
jamas it could be me killing things repeatedly.10:49
jamrvba: so right now, I have to install maas-dns in order to do 'make run' is that true?10:50
jamif I don't the service fails to startup properly10:50
allenapjam: Something like http://paste.ubuntu.com/1261655/10:50
jamand the only log I see is the 'you should install maas-dns'10:50
* allenap looks forward to "yield from"10:51
rvbajam: no, a local named instance is started when you run 'make run'.10:51
jamrvba: well, I can't get 'make run' to talk to me, the last error in the terminal is the 'you should install maas-dns on the region controller'10:51
jamI had this problem for quite a while this week.10:52
jamuntil it magically fixed itself10:52
rvbajam: that message is for the packaged version, I did not bother changing that message in the dev instance.10:52
jam(my best guess at this point, is I got enough queues laying around that rabbit would say 'come back later' when asked to update dns, so it wasn't crashing the startup process.)10:52
jamallenap: I switched to the 'yield' form, care to approve?11:13
allenapjam: Sure.11:13
rvbajam: I'm seeing the dns error too, I'll investigate.  But this is only a task failing, it should not cause trouble for the other services.11:19
jamrvba: so I installed maas-dns, which installed maas, I then uninstalled it, but I now have a twistd process running.11:21
allenapjam: +1 :)11:21
jameverytime I kill it, it comes back with a new pid11:21
jamI'm guessing upstart is keeping it alive?11:21
jamah, maas-txlongpoll11:22
jamwhich didn't get uninstalled11:22
rvbajam: as I said, you don't have to install maas-dns on a dev instance… also, you'll get into trouble if you install the maas package and try to play with a dev instance at the same time :/11:22
allenapGah, so buildout says it's installing testresources 0.2.5, but then uses the system one, 0.2.4. How completely useless.11:22
jamallenap: buildout config as it stands takes system packages over packages it installs.11:23
jamI brought it up earlier11:23
jamas being useless, too.11:23
jam(vs using system packages if there isn't a custom package installed)11:24
jamallenap: so you have to uninstall the system package to have buildout get the right one.11:24
jamand then re-install it in the future when you want to use the system package for some other project.11:24
allenapjam: Ah, right, thanks. At least there's a workaround.11:24
jamrvba: uninstalling package 'maas' is triggering maas-txlongpoll to *start*11:26
rvbajam: !11:26
jam(which naturally prevents uninstalling maas)11:26
rvbajam: I've spotted one problem: looks like the wrong celeryconfig is loaded on a dev instance (by the region worker at least).11:28
rvbaThat's why its trying to write to /etc/bind/maas and failing.11:28
jamrvba: I'm also getting failures in "_write_temp_file" in provisioningserver/utils.py11:42
jam(when I create a node, it triggers writing the dns config again, and I get a failure trying to write the temp file)11:43
jamrvba: so it looks like that is the N^2 behavior I was seeing. Every node added is regenerating a full DNS list.11:44
jam(it is failing, but it still is pulling together the data which is O(N) and when you allocate N nodes == O(N^2) by the end)11:44
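The arithmetic behind the O(N^2) estimate can be sketched as follows, assuming (as jam does at this point in the log) that each insert rebuilds a list covering every node created so far. The function name is hypothetical.

```python
def rows_touched(node_count):
    """Total node records read if every insert rebuilds the full list."""
    # The k-th insert walks k existing nodes, so the sum is 1 + 2 + ... + N.
    return sum(range(1, node_count + 1))


for n in (1000, 10000):
    # The closed form N * (N + 1) / 2 gives the same total.
    print(n, rows_touched(n), n * (n + 1) // 2)
```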
rvbajam: the DNS writing task should only be triggered with the node gets an IP address.  Only then do we need to update the DNS config.11:48
rvbas/with the node/when the node/11:48
jamrvba: well I'm doing factory.make_node() in a loop11:48
jamwhich probably gives addresses?11:48
rvbajam: no, there is no lease creation in there as far as I can tell.11:49
jamrvba: maybe it knows the dns config didn't get written properly yet?11:49
jamit is definitely looping on failing to create it.11:50
jambut it may be unrelated to me creating 1k nodes.11:50
jam"Sending due task 'upload_dhcp_leases'" is in the log file an awful lot.11:51
rvbaThat's a celerybeat task that is run every minute.11:51
jamrvba: I'm seeing it roughly every few seconds11:58
jamhowever, 'make_node' also creates a new nodegroup if you don't pass one in.11:58
rvbajam: ah! right, that's what's triggering all the DNS stuff.11:59
rvbajam: because each nodegroup gets created with a configured interface IIRC.12:00
rvbajam: I've just fixed the "wrong celeryconfig" problem.  A stupid mistake in the startup script.12:13
jamrvba: \o/12:13
jamso creating 1000 nodegroups takes about 15min, but creating 1000 nodes is <1min.12:13
jamthat seems reasonable.12:13
rvbajam: The plan is to remove the DNS config pre-population.  But post 12.10 release.12:14
jamrvba: sure, but you won't be creating 100,000 nodegroups in the immediate term, so 15min for 1000 seems fine.12:15
rvbajam: right.12:16
allenaprvba: Do you know if the [/+] needs to be on both sides for the [/+]no[/+] bug to manifest, or either side?13:20
rvbaallenap: from Julian's investigation (and the mailing list message we found), I think it's on both sides.13:21
allenaprvba: That's one of the weirdest bugs I've ever heard of.13:22
rvbaallenap: yeah :/.13:22
allenaprvba: Do you know anything about the code that submits commissioning results (op=signal)?13:24
rvbaallenap: not really, maybe I help you with something still?13:25
rvbaI can*13:25
allenaprvba: I'm fixing the code that sets the power parameters. Now, it was broken: it was checking for power_type.upper() in map_enum(POWER_TYPES), i.e. checking that the value of power_type given over the wire matched the Python name of the enum item.13:31
rvbaallenap: what's wrong exactly?13:33
allenaprvba: Well, that works for IPMI, because the Python name == uppercase(enum value).13:34
allenaprvba: But not for the others.13:34
allenaprvba: I mean, we should be expecting the enum *values* over the wire, right?13:35
rvbaallenap: right!13:36
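A small sketch of the difference between checking enum names and enum values. The `POWER_TYPE` class, its members, and this `map_enum` are stand-ins written for illustration; the real MAAS definitions are not shown in the log and may differ.

```python
class POWER_TYPE:
    # Hypothetical stand-in enum: Python name on the left, wire value on the right.
    WAKE_ON_LAN = "ether_wake"
    IPMI = "ipmi"


def map_enum(enum_class):
    """Return {python_name: value} for the public attributes of the class."""
    return {
        name: value
        for name, value in vars(enum_class).items()
        if not name.startswith("_")
    }


def is_valid_by_name(power_type):
    # The broken check: compares the wire value against the Python names.
    return power_type.upper() in map_enum(POWER_TYPE)


def is_valid_by_value(power_type):
    # The intended check: compares the wire value against the enum values.
    return power_type in map_enum(POWER_TYPE).values()


for wire_value in ("ipmi", "ether_wake"):
    print(wire_value, is_valid_by_name(wire_value), is_valid_by_value(wire_value))
```

The name-based check only passes when the Python name happens to equal the uppercased value, which is why IPMI slipped through.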
allenaprvba: Okay, I'm not mad then. The reason I want to ask someone about it is to avoid breaking scripts somewhere else. tl;dr this is what tdd looks like without t.13:48
rvbaallenap: haha :).  I think Julian mentioned IDD recently.13:49
=== dpb_ is now known as Guest72485
rvbaallenap: hm,in his comment in the code, Julian said "if '/' or '+' appear either side of the word 'no'"14:02
rvbaallenap: rarg, I've tested it and it's either side indeed, not both sides.14:04
allenaprvba: Gah. Top marks for you though, for confirming.14:05
=== matsubara-afk is now known as matsubara
jammgz: any progress on search?14:10
mgzjam: getting there14:11
allenaprvba: Do you have any time this afternoon to review the extra changes I had to make to https://code.launchpad.net/~allenap/maas/anon-power-setting/+merge/128127?14:13
rvbaallenap: sure, I'll do it right now.14:13
allenaprvba: Thanks.14:13
roaksoaxrvba howdy¡¡ i pushed the branch and selected u as reviewer14:15
roaksoaxdid you see it¿14:15
roaksoax?14:15
rvbaroaksoax: I did :)14:15
roaksoaxrvba cool then :)14:15
roaksoaxthanks14:15
rvbaroaksoax: I'll get to it in ~30 minutes.14:16
roaksoaxrvba awesome thanks14:16
roaksoaxallenap thanks for taking care of the power settings for enlistment14:17
jammgz: so I just did a quick 'wrap process_node_tags in lsprof' and the results are: out of 107s under profiling, 7.8s is parsing and processing the XML, 67s is downloading the hardware details, 32s is uploading the system_ids results, and 2.3s is getting the system_ids list.14:20
jamI think the MAASClient handling of repeated tags is a bit slow (I think it encodes each value as a separate multipart message section).14:20
jamuploading 12,000 values is a lot, but not 32s a lot, given that we can read that many tags in 2.3s.14:20
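A minimal sketch of wrapping a call in a profiler, using the standard library's `cProfile` (the `lsprof` mentioned here is the C profiler that `cProfile` is built on). The helper name is hypothetical, and `sum` stands in for `process_node_tags`, which is not reproduced here.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(sum, [1, 2, 3])
print(result)
print(report)
```

Profiled timings are inflated relative to wall-clock time, so they are best read as proportions, not absolute costs.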
allenaproaksoax: No worries. It hasn't landed yet... when I picked up that rock I found some bugs :)14:21
mgzso, it is the passing data around overhead... still seems very high14:22
jamAnd then getting the hardware details is *way* too expensive as well, given if it was a raw DB query, we could get the results in... checking.14:22
jamI get the first result here in about 10s.14:23
jam32s to get everything.14:23
jammgz: 32s just to get the content out seems a bit too expensive as well.14:23
jamtime xpath_exists is still 6.8s, time in lxml on the current code is only 7.8s.14:24
jambut "select hardware_details from maasserver_node" > /dev/null is 32s.14:25
jammgz: Is there a lot of quoting that Postgres is having to do?14:25
mgz...no, but maybe it's re-serialising? it shouldn't need to, but the 9.1 support does seem a little unpolished.14:26
jammgz: time to select from an xml column into a new text table is 4.1s.14:28
mgzthere might be a magic cast that will help?14:28
mgzwhat does adding ::text do if anything?14:28
jammgz: time select content from alt_hardware_details >/dev/null => 32s.14:29
mgzgr.14:29
mgzmight be sane to just ditch some of the db changes and pull from the original location :)14:29
jammgz: interestingly, using 'bytea' instead of text is *slower*14:31
jam1m24s14:31
mgzheh, I think that's probably django14:33
jammgz: this is in psql14:36
jamno django involved14:36
jampsql -c "select ..."14:36
mgz...that's very suprising then14:36
jammgz: so something very strange when 'select DATA from...' is slower than 'select xpath_exists(DATA)' on the same content.14:36
jammgz: if I alter column set storage plain, it fails because it can't fit the 24kB documents in an 8kb page.14:37
mgzjam: it would make sense were it storing some custom data structure optimised for querying14:37
mgzthat's not actually the case though as I understand it, but it's probably doing more work than it needs to for serialisation14:37
jammgz: maybe, but we've confirmed that lxml can take raw text and parse it with an XPath object in ~the same time.14:37
roaksoaxallenap: so the metadata server now imports the method from the maas API?14:38
jamI think the big cost is postgres turning the data into an SQL result.14:38
allenaproaksoax: Yeah, and the implementation has changed quite a lot.14:39
jammgz: select substring(content from 24000 for 100); is 3.7s14:40
jamso it is all about the number of bytes coming out of psql14:40
jammgz: -A flag14:42
jammgz: time psql -A -c "select content" >/dev/null is 1.5s14:42
jammgz: --no-align :)14:42
mgz:)14:42
jamso psql is iterating all the data, caching it, figuring out how to align it, before outputting it.14:42
Davieyroaksoax: can we remove maas-provision14:43
rvbaallenap: I'd like to check my sanity… can you run 'make lint' on trunk?14:44
jammgz: and potentially psql is operating in 'fixed max mem' mode, and so is spooling to disk to do that work.14:44
rvbaallenap: I'm getting: src/apiclient/multipart.py:74: undefined name 'make_payload'14:45
jambut yeah, 1.6s if not aligning, and piping to 'wc -l -c' (if you leave in -w then wc slows things down to 7s looking for word chunks)14:45
allenaprvba: Quite a bit of lint.14:45
allenaprvba: And I see that too.14:46
allenaprvba: I'm free to clean that up.14:46
rvbaallenap: cool.  Maybe you can clean the regexp in omshell while you're at it.  I know you like regular expressions :)14:47
allenaprvba: As in, the before-or-after thing, or just pretty it up?14:47
jammgz: ts = time(); [node.hardware_details for node in Node.objects.all()]; td = time()14:47
jamis 6.8s.14:47
rvbaallenap: the before-or-after thing :)14:47
jamWhich is a fair amount slower than 1.6, but nothing like the 60s we see14:47
rvbaallenap: I think it just needs a conditional expression based on a backref… but you know that better than me :)14:48
allenaprvba: Can we check for ([/+]no|no[/+])?14:49
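A minimal sketch of the either-side check, assuming the trigger is the literal lowercase word "no" with a '/' or '+' immediately before or after it. Whether omshell matches case-insensitively is not stated in the log, and the function name and sample strings are hypothetical.

```python
import re

# "no" directly preceded or directly followed by '/' or '+'.
SUSPECT_NO = re.compile(r"[/+]no|no[/+]")


def is_suspect_key(key):
    """Return True if a base64 string contains the problematic pattern."""
    return SUSPECT_NO.search(key) is not None


for sample in ("abc/noXYZ", "abcno+xyz", "abcnoxyz", "a+b/c"):
    print(sample, is_suspect_key(sample))
```

A key flagged this way could be regenerated until it no longer matches, which avoids depending on a fix in omshell itself.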
jammgz: get_hardware_details is spending 67s total, 18s is reading the response and json.loads it. However, 48.7s is 'post'14:49
jammgz: so I think... MAASClient.post needs some serious poking.14:49
roaksoaxDaviey: from the archives?14:49
roaksoaxDaviey: i'd say we can14:49
roaksoaxand we should14:49
rbasaksmoser: are you planning to SRU bug 978127?14:50
ubot5Launchpad bug 978127 in cloud-init (Ubuntu) "incorrect time on node causes failed oauth" [High,Fix released] https://launchpad.net/bugs/97812714:50
jam(it is possible that some of the slowdown is the server processing our request, but 24s in dispatch to upload, 58s in 'encode_multipart')14:50
jamallenap: ^^ I'm guessing encoding wasn't tuned for handling 100,000 records, right?14:51
smoserrbasak, i suppose you need it after ephemeral also.14:52
smoserwhy can't you just get an architecture that doesn't suck, rbasak ?14:52
rbasaksmoser: that's the reason I ask, yes. Right now we're still defaulting to a precise ephemeral, and I keep needing to fix the RTC14:52
smoserprecise ephemeral should be fixed14:52
rbasaksmoser: can you add a bug task for precise, please? I don't have permission14:53
smoserif you're still seeing an issue, then the problem is not solved correctly.14:53
rbasaksmoser: seeing an issue out of today's maas daily14:55
smoserrbasak, can you show me?14:55
smoseri basically tested this.14:55
smoserby having cloud-init's upstart job first break the clock14:55
rbasaksmoser: half an hour please? I just worked around it for a juju test and d-i just started14:55
smosersure.14:56
allenapjam: Ha, wow.14:56
Davieyroaksoax: right, i will remove it from the archive... but i want to make sure you have an upgrade path14:58
Davieyroaksoax: can you raise a bug requesting removal as it is deprecated etc.. and i'll process it14:59
mgzhm, I should make fail an alias for tail -f, that was a funny tyop14:59
roaksoaxDaviey: right, so maas Conflicts/Replaces on maas-provision, which obviously causes it to be removed. However, it not being in the archives would make the transition smoother, wouldn't it? because the packages would simply be completely removed15:00
rvbaallenap: I think so yeah.  All I know about this problem is summarized right here ;) : http://paste.ubuntu.com/1262047/15:00
jamallenap: apparently 30s of the 80s is in 'set_type'15:01
allenapjam: That's... weird. Maybe it's recoding things at that point.15:02
jamallenap: well, it is under lsprof, so some things are more expensive than they are in 'real life'. I'm doing a quick test of casting the strings to bytes rather than unicode15:02
jamsee if that has a big difference15:02
jamsince bytes payloads don't have set_type called15:03
rvbaallenap: I've got a tiny branch up for review if you have time: https://code.launchpad.net/~rvb/maas/big-networks/+merge/12826915:03
jtvTrunk seems to be failing tests and have lint in it.  :(15:03
allenapjtv: I'm fixing those now.15:03
jtvAh15:03
rvbaroaksoax: did you put your branch up for review?  I don't see it in the "active code reviews" list.15:04
jamallenap: interestingly, rabbit can take ~1 minute from the time I put something in the queue, before it actually triggers on the provisioning worker15:04
roaksoaxrvba: I change the status to work in progress15:04
rvbaroaksoax: ok, got it.15:05
allenapjam: That's not good.15:05
jamallenap: (the background is, we can run xpath_exists() in the database taking ~7s to run, or we can dump the raw xml out in about 1.5s, but it takes 42s to read the data via APIS and upload the results back)15:06
jamso a 6x overhead15:06
jamand we only spend 7.8s in etree.XPATH()15:06
jamallenap: as for what causes the rabbit slowdown, I'm not sure. It looks a little like it is recovering from trying to write the dns config or something.15:08
jam(waiting for celery-beat?)15:08
smoserrbasak, is there anything you're aware of that you'd like in SRU not on https://bugs.launchpad.net/ubuntu/precise/+source/cloud-init15:11
mgzokay, sorted apart from what to do with InvalidConstraint... does the view somehow need to catch it and do something clever... atm you just get a "we broke" error page which is not acceptable for a typo15:13
flacosteroaksoax: do you have everything you need for IPMI in enlistment?15:15
roaksoaxflacoste: yes, just waiting for it to land15:15
flacosteroaksoax: right, ok15:16
roaksoaxthanks :)15:17
Davieyroaksoax: state of bug 1052056 ?15:17
ubot5Launchpad bug 1052056 in freeipmi (Ubuntu) "[FFe] [MIR] freeipmi" [Undecided,In progress] https://launchpad.net/bugs/105205615:17
roaksoaxDaviey: need to address the unused variable compiler warnings, but haven't really had the time to investigate how to do it since I have been pretty much swamped with the other stuff15:19
Davieyok, thanks15:19
Davieyroaksoax: do you have your latest debdiff?15:19
roaksoaxDaviey: yes, hold on15:20
Davieythanks15:20
roaksoaxDaviey: http://paste.ubuntu.com/1262079/15:21
Davieythanks15:21
jamallenap: well, lsprof says that set_type() spends its time calling get_params set_param, __delitem__15:22
jamwhich appears to have to encode/decode the params15:22
jamhowever, it is an lsprof-ism, real-world time is 42s => 41.5s using 'bytes' instead of 'unicode', but lsprof time is 109 vs 85s15:27
jamso... ignore that one.15:27
rvbaroaksoax: http://paste.ubuntu.com/1262102/.  One question: why use the absolute path '/usr/sbin/ipmi-chassis-config' instead of 'ipmi-chassis-config'?  I thought Scott said it was a bad thing to do…15:34
roaksoaxrvba: following what was done before me15:34
rvbaFair enough :)15:34
roaksoaxrvba: however, i think /usr/sbin is not in the path, and hence there was a problem executing the scripts, I don't know if you recall having us discuss something like that15:35
rvbaroaksoax: yeah, it rings a bell.15:35
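[Editor's note] One way to avoid both the hard-coded absolute path and the "sbin is not in PATH" failure is to look the command up on PATH with the sbin directories appended. This is only a minimal sketch, not what the MAAS commissioning scripts do; `resolve_command` and `SBIN_DIRS` are hypothetical names.

```python
import os
import shutil

# Directories that are often missing from PATH in restricted environments.
SBIN_DIRS = ["/usr/sbin", "/sbin"]


def resolve_command(name, extra_dirs=SBIN_DIRS, env_path=None):
    """Return the full path of `name`, or None if it cannot be found.

    Searches env_path (default: the current PATH) first, then extra_dirs.
    """
    if env_path is None:
        env_path = os.environ.get("PATH", "")
    parts = [p for p in env_path.split(os.pathsep) if p]
    for directory in extra_dirs:
        if directory not in parts:
            parts.append(directory)
    return shutil.which(name, path=os.pathsep.join(parts))
```

A caller could then run `resolve_command("ipmi-chassis-config")` and fail with a clear message when the result is `None`, instead of relying on a fixed location.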
rbasaksmoser: I'm not so bothered about bug 1028501 any more, since MAAS doesn't need it to work any more15:36
ubot5Launchpad bug 1028501 in cloud-init (Ubuntu Precise) "cloud-init selects wrong mirrors for arm" [High,Triaged] https://launchpad.net/bugs/102850115:36
rbasaksmoser: I think that means that bug 978127 is the only one I care about in cloud-init for precise. Though I'm worried that I'm missing something else.15:37
ubot5Launchpad bug 978127 in cloud-init (Ubuntu Precise) "incorrect time on node causes failed oauth" [High,Triaged] https://launchpad.net/bugs/97812715:37
roaksoaxrvba: btw.. celeryconfig.py and celeryconfig_cluster.py are not meant to be modified by the user right?15:37
rvbaroaksoax: no. Only maas_local_celeryconfig_cluster.py and maas_local_celeryconfig.py should be modified by the user.15:38
roaksoaxrvba: not even celeryconfig_common right?15:38
rvbaroaksoax: no.15:38
smoserrbasak, did your install go? can i see failed oauth now?15:45
rbasaksmoser: having trouble getting to the point where I can reproduce. I set the clock back to 1970, and now it can't add my PPA for maas-enlist to work around the SRU not being in yet16:05
rbasakChanging the hardware clock is a little bit tedious16:05
rbasakI have to install some usable OS first, since there's no "recovery disk"16:06
smoserboot the ephemeral image.16:06
smosersilly.16:06
rbasak...which I can't log in to16:06
smoserhttp://bazaar.launchpad.net/~smoser/+junk/backdoor-image/view/head:/backdoor-image16:06
rbasakand yes, there are ways round it16:06
smoseruse that to add a user 'backdoor'.16:06
rbasakThe easiest way round it is to install a usable OS16:06
smoseri think that script is easier.16:07
smoserpersonally16:07
smoser./backdoor-image /var/lib/ephemeral/......./disk.img16:07
smoserthen ssh in16:07
smoseror login on console.16:07
rbasakWhat about disabling poweroff?16:08
rbasakAnother step to do16:08
smoserthis is true.16:08
smosermaas needs a "rescue" environment.16:09
rbasakI'm going to make an armhf recovery initrd when I get round to it16:09
smosercirros is probably really close16:09
smoserto what you need there.16:09
smoserand if you ever do "get round to it" i'd rather you fix cirros to work for you.16:09
rbasakI was going to base it on ubuntu core16:11
rbasakWhich I think might Just Work out of an initrd16:11
rbasakNeed to test it though16:11
smoserwell, for your particular use case here, you can probably just boot the existing initramfs and kernel16:13
smoserand pass 'break' on the cmdline16:13
smoserif you have console access.16:13
smoserif not then you need ssh or the like.16:13
rbasakroaksoax: are you planning on landing a fix for the ipmi_si module parameters today? If not I need to file a bug.16:19
roaksoaxrbasak: that's actually what i'm doing now16:19
rbasakroaksoax: OK no problem. I'll leave it then.16:19
rbasaksmoser: OK, reproduced finally16:30
rbasaksmoser: now to get you in16:30
rbasaksmoser: which I presume will involve running your backdoor script ;-)16:30
roaksoaxrvba: updated the branch, ready for final review. Thanks a lot for working on it16:40
rvbaroaksoax: np.  Branch approved.16:46
roaksoaxallenap: are you gonna make changes to the anonymous power settings for enlistment or will you land it?17:07
roaksoaxrvba: we don't want users to modify the maas-http.conf either right?17:11
rvbaroaksoax: well, if they want to serve the site over HTTPS they will have to change maas-http.conf.17:12
roaksoaxrvba: ok, so i'll leave it as is17:13
roaksoaxrvba: btw.. https://jenkins.qa.ubuntu.com/job/maas-merger-quantal/127/console17:13
rvbaYeah, I've seen.  It seems to be related to the changes jam checked in.17:13
roaksoaxboomer17:14
rvbaBut it's really just a problem in the tests I think.17:14
roaksoaxyeah but that will be holding off all MP's17:14
rvbaLike… the tests were simply not updated.17:14
rvbaGood point.17:15
rvbahm, I wonder how I could even land a fix then :)17:15
rvbaI guess I just merge manually.17:15
roaksoaxrvba: a fix won't make the jenkins job fail, so it would land the branch17:17
rvbaah right.17:17
roaksoaxrbasak:17:25
roaksoaxhttps://code.launchpad.net/~andreserl/maas/maas_commissioning_modprobe_params/+merge/12829417:26
rbasakroaksoax: approved, thanks!17:28
roaksoaxallenap: re bug 105916817:32
ubot5Launchpad bug 1059168 in MAAS "MAAS should tell IPMI to PXE once" [High,Triaged] https://launchpad.net/bugs/105916817:32
roaksoaxallenap: basically, each team we tell a machine to turn on, it will tell it to PXE boot17:33
roaksoaxallenap: we don't have to manually configure the BIOS and tell it to PXE boot17:33
roaksoaxallenap: it will do it on demand17:33
roaksoaxallenap: sabdfl's request17:33
roaksoaxs/team/time17:34
rvbaroaksoax: I'm fixing the build now.  The fix should land shortly.17:34
roaksoaxrvba: awesome thanks17:34
Davieyroaksoax: maas is using squashfs by default now?17:48
roaksoaxDaviey: for quantal yes17:49
Davieyroaksoax: and juju deploy precise uses old method, right?17:50
roaksoaxDaviey: yes17:50
roaksoaxDaviey: there are no squashfs images for precise, are there? if there are we could enable it too17:50
Davieythere are not17:51
Davieyjust wanted to check it worked17:51
roaksoaxalright :)17:51
roaksoaxsmoser: around?17:58
smoserhere17:59
roaksoaxsmoser: so, matsubara just pinged me about the commissioning stuff for IPMI since he mentioned that the cards lost its static address17:59
roaksoaxsmoser: since we are working with the assumption17:59
roaksoaxsmoser: that IPMI will always DHCP by design18:00
roaksoaxsmoser: so i think we need to address the fact that some people don't want to DHCP their IPMI cards, and will pre-configure them18:00
roaksoaxsmoser: what do you think?18:00
roaksoaxmatsubara: ^^18:00
smoserroaksoax, i thought you were working under that assumption.18:01
smoseri thought if the card had an IP you reported it.18:01
smoserright?18:01
roaksoaxsmoser: nope not really. The assumption I was told to work on was that if the IPMI card's network source is set to static, we should change it to DHCP18:02
roaksoaxsmoser: because it could be pre-configured18:02
smoserah.18:03
smoserwell i like your new suggestion better.18:03
smoseri'd say if it is set to static, and appears to be configured that you should leave it as such.18:03
smoserideally such things are exposed in maas configuration18:03
smoserbut... its a bit late for that i think18:04
roaksoaxsmoser: right, so I was thinking on simply passing a parameter such as IPMI_DHCP="yes" in commissioning_user_data18:04
roaksoaxsmoser: and if so, send --use-static regardless of what it is18:04
roaksoaxto the command18:05
roaksoaxrvba: another build error :( https://jenkins.qa.ubuntu.com/job/maas-merger-quantal/129/console18:06
rvbaroaksoax: I've seen.  Looks spurious to me.18:06
matsubararoaksoax, that'd be helpful. I'm setting up dhcp in the lab ipmi network, but it'd be easier to just use the static ip already configured18:07
jamrvba: so I'm pretty sure I did land the change for '?op=' but I'm also surprised that it would have landed if tests would then be failing. I can land a fix if you haven't already18:08
rvbajam: doing it right now: https://code.launchpad.net/~rvb/maas/fix-broken-build/+merge/12829718:09
roaksoaxsmoser: something like this: http://paste.ubuntu.com/1262416/18:12
roaksoaxsmoser: or better yet, enable it by default, so if the card is set to static, just use that IP address18:16
smoserit looks reasonable.18:16
roaksoaxalright, i'll get that done then18:16
smosermaybe IPMI_CHANGE_STATIC_TO_DHCP=false18:17
smoseryou did put the ipmi header there, but the name 'use_static' could mean so many things.18:17
roaksoaxsmoser: right :). Will change that to something more appropriate18:19
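[Editor's note] The behaviour being agreed on here (leave a configured static IPMI address alone unless a flag such as IPMI_CHANGE_STATIC_TO_DHCP says otherwise) could be sketched roughly as below. The function name, the "static"/"dhcp" labels and the handling of an unconfigured static card are assumptions made for illustration; they are not taken from the pastes or from the MAAS code.

```python
def should_switch_to_dhcp(ip_source, ip_address, change_static_to_dhcp=False):
    """Decide whether commissioning should flip the BMC's IP source to DHCP.

    ip_source: "static" or "dhcp" (hypothetical labels, not real tool output).
    ip_address: the address currently reported by the BMC.
    change_static_to_dhcp: mirrors the IPMI_CHANGE_STATIC_TO_DHCP idea.
    """
    if ip_source == "dhcp":
        return False  # already on DHCP, nothing to change
    configured = ip_address not in ("", "0.0.0.0")
    if not configured:
        # Assumption: an unconfigured static BMC is unreachable, so use DHCP.
        return True
    # Static and configured: leave it alone unless explicitly told otherwise.
    return change_static_to_dhcp
```

With the default of `False`, a pre-configured static card such as the ones in matsubara's lab would keep its address, which matches the "enable it by default" suggestion above.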
rvbaroaksoax: the fix has finally landed!18:28
roaksoaxrvba: awesome!18:29
smoserallenap, is this known:18:52
smoserhttp://paste.ubuntu.com/1262473/18:53
smoserhttp://paste.ubuntu.com/1262477/18:55
smoserit was known. fixed in 118519:02
roaksoaxmatsubara: could you please configure one of your IPMI cards to static, and apply this patch to the commissioning_user_data conffile http://paste.ubuntu.com/1262508/19:09
roaksoaxmatsubara: and re-enlist, and re-commission and see if it works (leaves it as DHCP and obtains the right address)19:09
matsubararoaksoax, sure, using the latest package: maas-0.1+bzr1170+dfsg-0+1192+117~ppa0~quantal1, I take?19:12
smoserroaksoax, anyone know anything about "failed to upload?"19:14
smoserwell, never mind. looks like we have one.19:15
roaksoaxsmoser: huh? failed to upload where?19:21
roaksoaxmatsubara: yeah19:21
matsubaraok, waiting it to be published19:23
roaksoaxmatsubara: ok, it doesn't really matter what version of maas as long as it is greater than bzr117019:24
roaksoaxmatsubara: so you can just patch the file19:24
matsubaraah ok19:24
roaksoaxsmoser: for enlistment, do we have a metadata file we can pass same as commissioning?19:32
smoseruser data for enlistment?19:33
smosercontrib/preseeds_v2/enlist_userdata19:33
roaksoaxsmoser: ah right lol, but we are pretty much going to use the same script for enlistment, commissioning19:38
roaksoaxsmoser: so it should probably live in a common place19:38
roaksoaxsmoser: any thoughts? maybe in a package?19:39
roaksoaxcause i was just thinking on shipping it with maas-enlist19:40
smoserroaksoax, thats not a bad idea.19:41
roaksoaxsmoser: i'll make another ipmi package19:43
roaksoaxerr19:43
roaksoaxanother binary package for maas-enlist19:43
smoser?19:43
roaksoaxsmoser: maas-commissioning19:43
smoserwhy separate?19:43
smoserdue to dependencies?19:43
roaksoaxsmoser: because maas-enlist pulls curl, archdetect-deb, libavahi-core7, libavahi-common319:43
roaksoaxyeah19:43
smosergood enough.19:43
matsubararoaksoax, enlisting now with your patch19:45
roaksoaxmatsubara: alright, enlistment wont be affected, only commissioning, but we need to enlist first so that commissioning changes take effect19:46
matsubararoaksoax, of the 4 nodes I booted, 3 are static and 1 is dhcp19:46
roaksoaxsmoser: oh btw... if the node is enlisted, and I make changes to commissioning-user-data they don't take effect19:46
roaksoaxmatsubara: ok cool19:46
roaksoaxmatsubara: that should test it well19:46
roaksoaxsmoser: i have to re-enlist19:47
smoserroaksoax, really?19:47
smoserthat is really bad.19:47
roaksoaxsmoser: yep19:47
smoserplease open a bug for that.19:47
matsubararoaksoax, ok. the nodes are declared19:54
roaksoaxmatsubara: ok, now commission them please19:54
matsubararoaksoax, now I enter the ipmi configuration normally and commission?19:54
roaksoaxmatsubara: no19:55
roaksoaxmatsubara: don't enter IPMI, let it commission19:55
roaksoaxmatsubara: and see if the IPMI gets set19:55
roaksoaxand not changed to DHCP19:55
roaksoaxmatsubara: so power them on manually once you've accepted&commission19:55
matsubaraok. they're on19:56
roaksoaxmatsubara: did you file a bug for the juju issue with the CPU count?19:58
roaksoaxmatsubara: do you know if it's been fixed?19:58
matsubarayes, i filed20:01
matsubaradidn't test yet but i think so20:01
roaksoaxmatsubara: what's the bug number20:01
matsubarabug 106128620:02
ubot5Launchpad bug 1061286 in juju "juju bootstrap returned ERROR Invalid 'cpu_count' constraint '1.0'" [Medium,Fix committed] https://launchpad.net/bugs/106128620:02
matsubararoaksoax, ok, nodes are ready20:02
matsubaraand no IPMI config was set20:02
roaksoaxmatsubara: jum20:03
roaksoaxmatsubara: check in the commissioning-user-data that modprobe ipmi_si has arguments20:03
matsubara   modprobe ipmi_si type=kcs ports=0xca220:04
roaksoaxsmoser: my bad20:04
roaksoaxsmoser: it does work20:04
roaksoaxsmoser: as expected20:04
roaksoaxi wonder why i thought it didn't update20:04
smoserhey all20:53
smoserhttp://bazaar.launchpad.net/~smoser/maas/maas-pkg-test/view/head:/maas-ephemeral-test-quantal.txt20:53
smoserwas mostly functional walk through of ppa test for me20:54
smoseri now do not have to touch the UI by hand.20:54
smoserwhoowhoo20:54
roaksoaxsmoser: what do you think? http://paste.ubuntu.com/1262708/20:56
smoserthat looks reasonable to me, roaksoax20:59
smoseri'm out.21:01
roaksoaxsmoser: hold on21:01
roaksoax:/)21:01
smoserk21:01
roaksoaxsmoser: give me one min21:01
smoserholding21:01
roaksoaxfor you to approve21:01
roaksoaxsmoser: https://code.launchpad.net/~andreserl/maas/commissioning_improvements/+merge/128318 please :)21:02
smoserdid you test it?21:02
roaksoaxsmoser: yes21:03
smoseryou asked for review of "launchpad code reviewers" ?21:04
smoserwhat is doing that.21:04
roaksoaxsmoser: it does that automatically21:05
roaksoaxsmoser: julian (i think) changed it that way21:05
smoserwell that is busted.21:07
smoseranyway.21:07
smoseri'm out.21:07
smoserand i reviewed.21:07
roaksoaxsmoser: awesome, thank you. have a good weekend21:07
roaksoaxjtv: if you are making changes to tx-tftp, please make them in the ubuntu package in the archive as well21:37
