[03:29] Why isn't the packaging building bin/maas-probe-dhcp!?
[03:30] bigjools: any ideas? ^
[04:47] jtv: where should it live do you reckon? cluster?
[04:47] maas-dhcp?
[04:47] I was thinking maas-dhcp.
[04:47] k
[04:47] I already have a branch installing it as part of that package. The problem is getting it built.
[04:47] But feel free to write up your own branch; it's not exactly a lot of diff.
[04:52] jtv: hmmm it needs to go in cluster controller
[04:53] well doesn't *have* to but makes more sense to me
[04:53] ah sod it maas-dhcp
[04:53] Well I had been wondering, but you weren't around: did we have a particular purpose in mind for it?
[04:54] Were we going to run it during install? Or before starting our DHCP server..?
[04:58] jtv: show me your diff
[05:02] I added to debian/maas-dhcp.install:
[05:02] debian/extras/maas-probe-dhcp /usr/sbin
[05:03] (Or various other things I tried).
[05:03] I also had a lintian-overrides, but I reverted that already.
[05:03] well you need a bit more than that
[05:03] MP coming
[05:12] jtv: https://code.launchpad.net/~julian-edwards/maas/packaging-add-maas-probe-dhcp/+merge/201538
[05:12] Thanks.
[05:12] * jtv shall review forthwith
[05:12] buildout scripts are a PITA
[05:12] jtv: it's untested!
[05:12] I figured since you were desperate ...
[05:13] I can do it if you want
[05:13] I see that you managed to include bash. :)
[05:13] sadly
[05:13] cargo culted existing ones
[05:13] What exactly do you mean by untested: "not covered by test suite" or "I haven't tried this yet"?
[05:14] I haven't tried it
[05:14] Ah. Well, it'll sort of come naturally with my manual testing, so don't worry about it.
[05:14] As long as the package builds...
[05:14] ok
[05:14] well let's do a test build first
[05:15] I'll do it on cstack
[05:17] OK
[05:17] * jtv reminds self to write wrapper script for uvt-kvm
[05:23] bigjools: it's surprising to me that you can't just use the maas-probe-dhcp tool that buildout builds for us.
[05:23] jtv: buildout doesn't run when packaging
[05:24] Ahhh.
[05:24] it's a development convenience
[05:24] ...Because package-building prefers setup.py over make.
[05:24] If we didn't have setup.py, it'd have run "make," and then buildout would have run.
[05:25] Silly me.
[05:44] jtv: nearly done testing, so far so good
[05:48] jtv: and it works
[06:01] jtv: branch merged
[06:20] Thanks bigjools
[07:17] hi, does anyone know why, every time I deploy a service, I get "agent-state-info: '(error: cannot run instances: gomaasapi: got error back from server: 409 CONFLICT)"? Looking at maas.log, I got an OAuthUnauthorized exception
[07:20] rawang: the two sound like they're different problems... MAAS returns 409: Conflict in particular if no nodes are available that match your request.
[07:20] Unfortunately older versions of juju don't return the full error message.
[07:21] jtv, there are 3 available servers in maas's pool
[07:21] And you specified no constraints?
[07:21] jtv, and I have used a tag to bootstrap on one of them
[07:21] And the servers are all in Ready state, I guess...
[07:22] jtv, well for juju bootstrap, I have constraints
[07:22] jtv, oh, you remind me, maybe the constraints are still there from when I bootstrapped juju ...
[07:23] You could also try simulating juju's request directly against the MAAS API (e.g. using maas-cli) and seeing what the actual error message is.
[07:24] jtv, oh really, could you please give me an example? :)
[07:25] jtv, after I remove the constraints, it works, thanks :)
[07:42] Phew. :)
[07:47] bigjools: any chance you could re-test my branch lp:~jtv/maas-test/second-dhcp-check against your hardware?
[07:47] I'd be particularly interested in these scenarios:
[07:48] 1. A regular, successful run (to see that nothing is broken).
[07:48] 2. There's a dhcpd running, but on the same interface you're running maas-test against.
[07:49] I'm trying it against scenarios where the network interface has no IP address.
[08:08] bigjools: scenario 2 needs your updated package inside the VM. Easiest way to do that I suppose is to edit maastest/maasfixture.py and replace the apt-get with some other way of installing the package.
=== CyberJacob|Away is now known as CyberJacob
[08:37] jtv: you can test scenario 1 in the lab yourself, that's precisely what the manual job is made for.
[08:47] rvba: what is "the manual job"?
[08:48] jtv: I was referring to the email I sent yesterday "Manual maas-test Jenkins job in the lab"
[08:48] Ah, I'll have a look at that.
[08:49] jtv: with that you can get the lab to test a maas-test branch for you.
[08:49] Great.
[08:49] That will test your scenario 1.
[09:34] I have a problem: I have tried 12.04 and 13.10 and get the same error
[09:34] [21/Dec/2013:00:49:57 +0100] "GET /MAAS/metadata//2012-03-01/user-data HTTP/1.1" 404
=== mwhudson is now known as zz_mwhudson
[10:22] rvba: looks like that lab run failed... Do we get the details that the tests attached anywhere?
[10:31] bigjools: when I comment out the broker's details (in the config), I still get a segfault… so maybe the rabbit connection is not part of the problem after all.
[10:31] rvba: interesting
[10:31] rvba: can you run the celeryd itself up with minimal args?
[10:32] bigjools: that's precisely what I'm doing.
[10:32] cool
[10:32] so this smacks of a build/dependency problem
[10:32] rvba: can I ssh into your machine?
[10:32] But it cannot run without some kind of config…
[10:32] bigjools: sure, hang on.
[10:33] bigjools: ssh ubuntu@10.55.60.167
[10:34] rvba: did you add my key?
[10:34] Yes
[10:34] it won't let me in
[10:35] bigjools: ah, I'm logged in as root, just one sec.
[10:35] My bad
[10:35] bigjools: please try again
[10:35] still can't get in
[10:36] hum…
[10:37] bigjools: can you try one more time (I'm watching the logs)
[10:37] ?
[10:37] there
[10:37] Nothing
[10:38] you gave me the right IP address?
[10:38] ssh ubuntu@10.55.60.167
[10:38] Yes
[10:38] PEBKAC
[10:43] bigjools: when celeryd isn't configured to read MAAS' tasks, it doesn't blow up (!).
[10:43] rvba: If I run "celeryd" on its own it crashes
[10:43] no args
[10:44] rvba: gah remind me how to remove core dump restriction
[10:46] bigjools: ulimit
[10:46] jtv: tried it already
[10:46] -bash: ulimit: core file size: cannot modify limit: Operation not permitted
[10:46] Hnyug
[10:47] bigjools: what core dump restriction
[10:47] Strange thing is, I can do that.
[10:47] ?
[10:47] rvba: size
[10:47] you don't get 'em by default
[10:47] Did you make it too big perhaps? Unit of size is 512KB blocks.
[10:48] jtv: "1" doesn't even work
[10:48] wha
[10:48] I used "unlimited", doesn't work
[10:48] Are you sudo'ing it perhaps? I guess that might refuse.
[10:49] ulimit -S -c unlimited works
[10:49] need the -S
[10:49] gah no core file still
[10:49] wtf
[10:51] grah it's sending it to apport
[10:51] /o\
[10:51] bigjools: I was wrong, it's just that sometimes it takes a long time to crash.
[10:51] rvba: writing the core file no doubt
[10:57] bigjools: I noticed that python-celery (3.1.6-1ubuntu1) is now released in Trusty's cloud archive. I started a new canonistack instance and installed python-celery. When I run celeryd there is doesn't crash. Maybe the package from our ppa is faulty?
[10:57] s/is doesn't/it doesn't/
[10:58] rvba: quite likely, as I said, I suspect build or dependency problems
[10:59] Trying to install MAAS with python-celery from the cloud archive.
[10:59] (https://launchpad.net/ubuntu/trusty/+source/celery/3.1.6-1ubuntu1 says it was published 1 hour ago)
[10:59] rvba: can I upgrade your other instance?
[10:59] bigjools: sure
[11:00] But I tweaked the configs.
[11:00] So it's better to start from a fresh instance.
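Editor's sketch for the core-dump exchange at 10:44-10:51: two things got in the way there, the soft core-file limit and the kernel handing the core to apport. The Python below is only an illustration of checking both from a script, not something used in the channel. The function names are hypothetical, and the apport pattern in the usage comment is an illustrative value, not one taken from the log. Raising the soft limit to the hard limit is roughly what `ulimit -S -c unlimited` achieves when the hard limit is already unlimited.

```python
import resource


def raise_soft_core_limit():
    """Raise the soft core-file size limit to the current hard limit.

    Only the soft limit changes, so no extra privileges are needed.
    Returns the (soft, hard) pair that is in force afterwards.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def core_pattern_is_piped(pattern):
    """Return True if a core_pattern value pipes cores to a helper program.

    On Linux, a pattern starting with "|" means the kernel feeds the core
    to that program (for example apport) instead of writing a core file
    into the crashing process's working directory.
    """
    return pattern.strip().startswith("|")


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read the kernel's current core_pattern setting (Linux only)."""
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    print("core limits (soft, hard):", raise_soft_core_limit())
    pattern = read_core_pattern()
    if core_pattern_is_piped(pattern):
        print("cores are piped to a helper:", pattern)
    else:
        print("cores are written using pattern:", pattern)
```

A pattern such as `|/usr/share/apport/apport ...` (illustrative) would be reported as piped, which matches the "it's sending it to apport" surprise at 10:51.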
=== CyberJacob is now known as CyberJacob|Away
[11:05] bigjools: arg, same crash with the python-celery from the cloud archive :/
[11:06] Why oh why oh why does Canonistack have the shits today?
[11:07] bigjools: it's really weird, look: http://paste.ubuntu.com/6749871/ vs http://paste.ubuntu.com/6749868/
[11:07] rvba: did it work with the one in the main archive?
[11:07] I still see a core
[11:08] bigjools: two similar Trusty machines, on one celeryd crashed, on the other it does not.
[11:08] !
[11:08] this is python crashing
[11:09] * rvba installs maas on the machine where python-celery does not crash…
[11:10] gmb: what do you mean?
[11:11] rvba: Instances keep dying on me. lcy01 must be out of RAM. Current one is staying up though, so far...
[11:11] I gave up with 01
[11:12] gmb: hum, might be related to the problem we're seeing then…
[11:12] bigjools: these are all 01 machines
[11:13] rvba: Hmm, possibly. That said, if your instances are staying up, probably not. Nova is pretty proactive about pausing instances when it runs out of resources.
[11:14] So you'd know if it was affecting you.
[11:14] ls
[11:14] Segmentation fault
[11:14] I think it's affecting me :)
[11:15] rvba: I have a clue
[11:15] it's falling over in librabbitmq
[11:15] #0 0x00007fb4eef716ab in amqp_pool_alloc ()
[11:15] from /usr/lib/x86_64-linux-gnu/librabbitmq.so.1
[11:15] so it's not celery's or python's fault at all
[11:19] bigjools: how did you find out? In the core dump?
[11:19] yep
[11:23] it's rather annoying that someone bumped celery a whole major version
[11:24] and didn't choose a new package name
[11:26] bigjools: did you see that running celeryd with strace?
[11:26] rvba: no I analysed the core
[11:27] Ah okay, I see it now.
[11:30] bigjools: so, technically it blew up in librabbitmq1 but this doesn't tell us why exactly, does it? It might be a wrong interaction with another library.
[11:30] rvba: amqp_pool_alloc
[11:30] I suspect some esoteric part of rabbit changed
[11:32] bigjools: the most recent successful run we had in the lab is this: http://d-jenkins.ubuntu-ci:8080/view/MAAS/job/trusty-adt-maas-manual/4/console
[11:32] The console log gives us all the versions of all the packages used.
[11:34] rvba: old celery?
[11:35] Yes
[11:35] And no trace of librabbitmq.
[11:41] bigjools: celery's changelog says (for the upgrade to 3.1.6-1 from 2.5.3-4ubuntu1): 'Drop depenceny on python-amqplib, it has been replaced by python-amqp/python-librabbitmq in python-kombu.' And of course python-librabbitmq depends on librabbitmq1.
[11:41] right
[11:43] it needs a build with the -dbg library and then we can get which line of code blew up
[11:43] my frantic googling is turning up nothing
[11:43] Same here.
[11:44] I gotta sleep man
[11:44] good luck and good night
[11:44] Yeah, I'll see if Andres can help us. Maybe by creating debugging packages like you said.
[11:45] you'll need andres
[11:45] bigjools: ^
[11:45] just install -dbg ones
[11:45] librabbitmq-dbg
[11:45] and run again
[11:45] Oh, they exist already?
[11:45] then when you get the core file:
[11:46] gdb /usr/bin/python core
[11:46] > bt
[11:46] Yeah, I know that part.
[11:46] ok
[11:46] I'll try that.
[11:46] ok I am going to bed
[11:46] see you
[11:47] night
=== CyberJacob|Away is now known as CyberJacob
=== zz_mwhudson is now known as mwhudson
=== CyberJacob is now known as CyberJacob|Away
=== CyberJacob|Away is now known as CyberJacob
=== mwhudson is now known as zz_mwhudson
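Editor's sketch for the `gdb /usr/bin/python core` then `bt` recipe at 11:45-11:46: the same backtrace can be collected non-interactively, which may help when comparing crashes across several instances. This is a hedged illustration only. The helper names are hypothetical, and the frame parser handles just the common `#N 0xADDR in function (...)` layout seen in the frame pasted at 11:15.

```python
import re
import subprocess


def gdb_backtrace_command(executable, core_file):
    """Build a gdb invocation that prints a backtrace and exits.

    -batch makes gdb quit after processing its commands; -ex runs the
    given gdb command ("bt"), as typed interactively in the log above.
    """
    return ["gdb", "-batch", "-ex", "bt", executable, core_file]


# Matches "#0  0x00007f... in some_function (" and "#1  main (" style frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?(\S+)\s*\(")


def parse_frames(backtrace_text):
    """Extract (frame_number, function_name) pairs from gdb 'bt' output."""
    frames = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def backtrace(executable, core_file):
    """Run gdb on a core file and return the parsed frames."""
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_frames(result.stdout)


if __name__ == "__main__":
    for number, function in backtrace("/usr/bin/python", "core"):
        print(number, function)
```

With the -dbg packages installed, gdb's frames should also carry file and line information; this parser ignores that part and keeps only the function names.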