/srv/irclogs.ubuntu.com/2015/09/30/#juju-dev.txt

wallyworldbeisner: i haven't looked sorry, thumper is on it afaik00:08
thumperwwitzel3: yes00:20
thumperwallyworld, beisner: I'll look this afternoon, just need to talk with menn000:21
thumperand write an email or two00:21
* thumper looks at bug now00:21
thumperto start the thinking process00:21
beisnerthumper, awesome, much thanks.00:21
beisnerthumper, i've got a repro underway, which basically bootstraps and destroys in a loop, 1.24.6 & openstack provider.  --debug enabled, will capture and add to bug if i can catch it that way.00:22
thumperbeisner: awesome00:22
thumperbeisner: does it happen every time?00:23
thumperor just some times?00:23
beisnerthumper, i've seen it > 5 times today in test automation.   that test ran 38 cycles.00:26
thumperhmm... interesting00:26
thumperbeisner: and every time it is deleted at the end even though the warning says it wasn't00:27
thumper?00:27
beisnerthumper, pure conjecture:   it seems that the opportunity for that to race has always been present, and that something got better/faster.00:27
beisnerexposing it more frequently00:27
* thumper nods00:27
thumpersleeps for the win?00:27
beisnerbut yes, 100% of destroys result in the "couldn't delete that thing" msg00:27
thumperoh...00:28
thumperso the warning is always there, what was it that happened > 5 times?00:28
beisneri don't control timing of amulet.  say it gets 10 jobs to run.  it bootstraps, deploys, execs tests, destroys, bootstraps, deploys, rinse and repeat.00:29
beisneroh to clarify:  failed to bootstrap 5.   all 38 complain that they couldn't delete sec groups (but always have)00:29
thumperthe failure to bootstrap is what?00:30
thumperis this the message you are trying to capture?00:30
beisnerno it's what i already logged in the bug00:30
beisnerwhat i'm trying to capture is the --debug output00:30
mgzthis is something of a well-known issue with the destroy code00:32
perrito666zomg how can be local provider so easy to break :(00:32
thumpermgz: well known by whom?00:32
thumpernot me00:32
beisnermgz - yep.  the failing to create sec group is new.00:32
mgzwe have a bunch of mitigation in the form of post-destroy cleanup00:32
mgzthumper: the bug is from 2014-0600:32
beisnermgz, until 1.24.6 it was just an annoyance.  now, it fails to delete Foo, then tries to create Foo, and fails to bootstrap, saying it couldn't create Foo.00:33
thumpermgz: well that isn't particularly useful to us now...00:33
mgzand anyone who does destroy-env immediately followed by bootstrap the same env on openstack will have seen it00:33
thumpernow we just look incompetent00:33
mgzthumper: so, the only real way to fix it is to make destroy-environment take much longer00:34
thumpersure00:34
thumperwhich is the right thing surely00:34
thumpermake sure the freaking thing is dead00:34
mgzcloud providers will frequently refuse to destroy resources that are associated with other resources in the process of being destroyed00:34
* thumper grumbles 00:35
mgzso, kill a machine, you have to wait for some amount of time before it will let you delete the groups that were attached to it00:35
mgzlikewise block devices and so on00:35
mgzone thing that is possible with openstack, and I think the new ec2 vpc sec groups, is remove the groups from the machines before killing the machines00:36
mgzthat way you can reliably wipe them straight away00:36
mgzis a bunch more api calls though00:36
mgzthe other option is something more like what CI does to get juju reliable, which is before bootstrap, basically destroy-environment --force00:37
mgzthat's less elegant00:37
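
The ordering mgz describes (remove the groups from the machines, then kill the machines, then delete the groups) can be sketched roughly as below. This is only an illustration under stated assumptions: fakeCloud, its methods, and the group and machine names are hypothetical in-memory stand-ins, not the OpenStack or EC2 client and not juju's provider code. The fake cloud refuses to delete a group that is still attached to a machine, and termination only starts, so attachments linger.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeCloud is a hypothetical in-memory model of a cloud API.
type fakeCloud struct {
	attached    map[string][]string // machine ID -> attached group names
	groups      map[string]bool     // groups that currently exist
	terminating map[string]bool     // machines whose termination has started
}

// newCloud builds one machine with two attached groups (made-up names).
func newCloud() *fakeCloud {
	return &fakeCloud{
		attached:    map[string][]string{"m0": {"juju-env", "juju-env-0"}},
		groups:      map[string]bool{"juju-env": true, "juju-env-0": true},
		terminating: map[string]bool{},
	}
}

// detachGroups removes every group from the machine (extra API calls).
func (c *fakeCloud) detachGroups(machine string) { c.attached[machine] = nil }

// terminateMachine only starts termination; attachments are left in place
// to model the delay before the cloud releases them.
func (c *fakeCloud) terminateMachine(machine string) { c.terminating[machine] = true }

// deleteGroup fails while any machine still has the group attached.
func (c *fakeCloud) deleteGroup(name string) error {
	for _, names := range c.attached {
		for _, g := range names {
			if g == name {
				return errors.New("group " + name + " in use")
			}
		}
	}
	delete(c.groups, name)
	return nil
}

// destroy kills all machines and then deletes all groups, optionally
// detaching the groups first. It returns the first deletion error seen.
func destroy(c *fakeCloud, detachFirst bool) error {
	for machine := range c.attached {
		if detachFirst {
			c.detachGroups(machine)
		}
		c.terminateMachine(machine)
	}
	var firstErr error
	for name := range c.groups {
		if err := c.deleteGroup(name); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

func main() {
	fmt.Println("kill then delete:", destroy(newCloud(), false))
	fmt.Println("detach, kill, delete:", destroy(newCloud(), true))
}
```

The trade-off is the one raised in the discussion: the detach step costs one more API call per machine, in exchange for not having to wait for the cloud to release the groups.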
mgzthumper: I guess we really want a different bug for beisner's issue, which is certainly a newer thing00:41
beisneroh neat.  my bootstrap/destroy loop yielded something different:  http://paste.ubuntu.com/12620988/00:44
mgzbeisner: I know this is going to be annoying as you need the destroy cleanup race error first, but any idea if this started in a particular 1.24 minor version?00:44
beisnermgz i believe 1.24.5 was solid00:45
beisnerwould have to do some log digging to prove/disprove that observation though00:45
mgzbeisner: bug 1467331 bug 150061300:46
mupBug #1467331: configstore lock should use flock where possible <charmers> <ci> <reliability> <repeatability> <juju-core:Triaged> <https://launchpad.net/bugs/1467331>00:46
mupBug #1500613: configstore should break fslock if time > few seconds <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1500613>00:46
beisnerok so that repro is simple.   loop a deploy/bootstrap.  took 8 iterations to hit that.00:47
beisnererrr em.  rather, a bootstrap/destroy loop00:47
mgzbeisner: http://reports.vapour.ws/releases/rule/34 for us hitting that in ci00:48
* beisner wanders off, to return in a bit00:49
mgzbug 1454323 is marked fixed but that was just to make the error less terrible and the followups are what I linked above00:49
mupBug #1454323: Mysterious env.lock held message <bootstrap> <ci> <destroy-environment> <repeatability> <ui> <juju-core:Fix Released by thumper> <juju-core 1.24:Fix Released by thumper> <https://launchpad.net/bugs/1454323>00:49
mgzthumper: so, I don't think the juju code around adopting existing security groups with the same name has actually changed,00:51
mgzsee ensureGroup in provider/openstack/provider.go00:52
mgzhowever, I think we hit the bad case of trying to create a group which is in the process of being deleted much more often with our storage code, and changes in newer openstacks00:55
beisnermgz, thumper - added accidental findings to bug 1500613.   after hitting that lock issue, my enviro is borked.  how do i unlock? ;-)01:34
mupBug #1500613: configstore should break fslock if time > few seconds <amulet> <openstack-provider> <tech-debt> <uosci> <juju-core:Triaged> <https://launchpad.net/bugs/1500613>01:34
mgzbeisner: just delete the lock01:35
beisneram i supposed to know where it is?01:35
beisneroh look there.  it tells me.  ha01:36
mgz:P01:36
beisnerso, not sure i can reliably catch the secgroup thing with this hopping out front so readily.01:37
mgzbeisner: you can just rm -rf the lock location in between each run01:38
beisnernot if i'm using another runner, such as bundletester or amulet01:38
beisneroh you mean in the repro, yes i can01:38
mgzyup01:39
thumperbeisner: I'm kinda surprised at how often this lock file problem is occurring01:58
thumperit should just work and delete the file01:59
thumperreally weird that it isn't01:59
thumpertime to go make a coffee and look at this bug01:59
beisnermgz, thumper, thanks.  i've got the repro looping for the secgroup race.  must sleep now.02:03
thumperbeisner: ack, thanks02:03
beisnermgz, thumper - successfully repro'd bug 1335885 with the same loop, added new comment.  now i'm really closing my screen.  thx again.02:28
mupBug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider> <security> <uosci> <juju-core:Triaged> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>02:28
thumperbeisner: thanks again02:28
beisnerthumper, yw, happy to help chase it.02:29
beisner\o02:29
thumpero/02:29
mupBug #1303787 changed: hook failures - nil pointer dereference <hooks> <local-provider> <ppc64el> <juju-core:Fix Released by dave-cheney> <https://launchpad.net/bugs/1303787>02:39
* thumper afk for a family thing02:56
thumperwill be back to finish bug02:56
wwitzel3thumper: thanks for that email03:10
wallyworldaxw: small review please https://github.com/juju/charm/pull/15904:04
axwwallyworld: looking04:21
wallyworldaxw: thanks for review, any idea for name? i don't like it either04:37
wallyworldSeriesForCharm maybe04:38
axwwallyworld: *shrug*  SelectSeries? not much more informative04:38
axwwallyworld: sounds fine04:38
wallyworldok, ta04:38
axwwallyworld: BTW, my point (regarding "any", "default", etc.) is that this function is not directly attached to the charm metadata. so the user has to ensure the order of supported series is maintained04:39
axwwallyworld: which is why I'm saying not to use "any" when it's really "the first item"04:40
axw(if that's true)04:40
wallyworldok, i'll reword04:40
wallyworldit is the first04:40
mupBug #1501173 opened: apiserver/common/storagecommon: StorageAttachmentInfo returns without error even if block device doesn't exist <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501173>05:39
mupBug #1501173 changed: apiserver/common/storagecommon: StorageAttachmentInfo returns without error even if block device doesn't exist <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501173>05:51
mupBug #1501173 opened: apiserver/common/storagecommon: StorageAttachmentInfo returns without error even if block device doesn't exist <juju-core:Triaged by axwalk> <juju-core 1.25:Triaged by axwalk> <https://launchpad.net/bugs/1501173>05:54
thumperwallyworld, axw, anastasiamac: http://reviews.vapour.ws/r/2789/05:57
thumperI'd like to build a version to make available for these folks to try with05:58
thumperto see if it does actually help05:58
axwthumper: reviewed06:01
thumperta06:01
thumperaxw: yes, I'm wanting to get it live tested first06:02
thumperthough observing things, it appears that what happens is this:06:02
thumpertry to terminate all the machines06:02
thumperemits warning saying security group in use06:02
thumperfinishes destroy, deletion of group works06:03
thumperso the end result is that the user is warned that it couldn't be deleted, but it has gone06:03
thumperalternatively it warns again, and doesn't delete it, next bootstrap fails06:03
thumperbut yes, I want to test it prior to landing06:03
thumperas I'm taking a wild stab at the numbers06:04
axwthumper: sure, sounds fine06:04
* thumper writes that on review board too :)06:04
thumperok, I'm done06:15
thumperlaters folks06:15
urulamawallyworld: http://www.theguardian.com/travel/2013/may/25/top-10-live-music-venues-seattle :)07:35
wallyworld:-)07:36
mupBug #1501203 opened: apiserver/storage/storagecommon: WatchStorageAttachment should filter block devices <tech-debt> <juju-core:Triaged> <https://launchpad.net/bugs/1501203>07:58
=== akhavr1 is now known as akhavr
voidspacedimitern: ping08:57
dimiternvoidspace, pong08:57
voidspacedimitern: in environments.yaml I have an environment called "amazon-eu" which is type "ec2" and region "eu-central-1"08:59
voidspacedimitern: yet when I bootstrap that environment I get a bootstrap machine in us-east-108:59
voidspacehmmm... it might be a yaml indentation issue08:59
voidspacedammit08:59
dimiternvoidspace, check if you have EC2_REGION set in the env08:59
voidspacedimitern: will do, thanks09:00
dimiternvoidspace, or EC2_URL09:01
TheMuehmm, HO dislikes me09:02
voidspacedimitern: that's set to: https://ec2-lcy01.canonistack.canonical.com:443/services/Cloud09:02
voidspace:-)09:02
frobwarejam, fwereade: joining standup today?09:03
jamomw09:03
voidspacedimitern: dooferlad: gah, Subnets bug is on 1.25 as well as master09:27
voidspacebetter retarget the work I'm doing and fix it in both places09:27
dimiternvoidspace, the addressable containers instId thing?09:28
voidspacedimitern: yeah09:32
voidspaceI assumed it was just master, should have checked...09:32
voidspacehah09:33
voidspacethe bug even says both09:33
voidspaceso it's a reading comprehension failure too... :-)09:33
dimitern:)09:34
axwfwereade: can you please have a glance at https://github.com/juju/juju/compare/master...axw:lp1500769-gce-default-block-source, and let me know if you're ok with this before I go much further?09:35
voidspaceaxw: o/09:36
voidspaceaxw: morning :-)09:36
axwfwereade: basically, I'm sick of using Validate to upgrade config09:36
axwvoidspace: hiya, how's it?09:36
voidspaceaxw: all is well, how's you?09:36
axwvoidspace: not too shabby. furious bug fixing before demo time at the sprint :)09:36
voidspaceaxw: heh, right09:38
voidspaceaxw: pretty much what our team is on as well...09:38
fwereadeaxw, ack09:42
fwereadeaxw, looks eminently sane to me09:46
fwereadeaxw, thanks09:47
natefinchfwereade: got a minute?10:06
axwfwereade: thanks10:18
ashipikajuju bootstrap for amazon reports the following: https://ec2.us-east-1.amazonaws.com?Action=DescribeInstances&Filter.1.Name=instance-state-name&Filter.1.Value.1=pending&Filter.1.Value.2=running&Filter.2.Name=instance.group-id&Filter.2.Value.1=sg-05ae1a61&Timestamp=2015-09-30T10%3A18%3A37Z&Version=2014-10-0110:21
ashipikaany ideas anyone? ^10:21
natefinchfwereade: gonna be out for a bit, but looking for tips on how to run workers during jujuconnsuite tests, since unit assignment is done in a worker now, a ton of tests fail due to units not getting assigned.10:21
=== natefinch is now known as natefinch-afk
axwashipika: is there an error missing from that line?10:22
ashipikasorry.. copy&paste mistake… here's the error message: ERROR failed to bootstrap environment: cannot start bootstrap instance: Get https://ec2.us-east-1.amazonaws.com?Action=DescribeInstances&Filter.1.Name=instance-state-name&Filter.1.Value.1=pending&Filter.1.Value.2=running&Filter.2.Name=instance.group-id&Filter.2.Value.1=sg-05ae1a61&Timestamp=2015-09-30T10%3A18%3A37Z&Version=2014-10-01: dial tcp: lookup ec2.us-east-1.amazonaws.com on 110:22
ashipikaaxw ^10:22
axwhrm10:22
ashipikaaxw: latest master.. go 1.5.110:23
axwashipika: looks like it's due to tagging10:24
axwashipika: that command should be retried though ...10:24
ashipikaaxw: tagging?10:24
axwashipika: we tag the instance and its root disk after it starts10:24
axw(can't do it while starting, which seems a bit brain dead)10:24
ashipikaaxw: https://pastebin.canonical.com/140850/10:25
ashipikaaxw: with --debug: https://pastebin.canonical.com/140853/10:26
axwashipika: erm actually that just looks like a host resolution error. can't tell more than that10:26
ashipikaaxw: rebooting.. who knows.. might help10:27
ashipikaaxw: did not help10:32
axwashipika: don't really know. it's attempting to resolve through DNS on localhost, is that intentional?10:34
axw"on 127.0.1.1:53"10:34
ashipikaaxw: i know… i saw that.. but cannot explain it10:34
axwashipika: don't know, sorry10:36
tasdomasashipika, ping ec2.us-east-1.amazonaws.com10:53
ashipikatasdomas: yes, fails.. switched to eu-west-1 and it seems to be working10:57
tasdomasashipika, but what does it resolve to?10:58
ashipikatasdomas: something must have messed up my resolv.conf, or sth10:58
rogpeppethis PR adds macaroon authorization to the charms endpoint, and continues with some cleanup of the apiserver package too. reviews much appreciated, thanks! http://reviews.vapour.ws/r/2794/11:06
rogpeppewallyworld: i've reviewed https://github.com/juju/charmrepo/pull/3212:15
wallyworldty12:15
wallyworldrogpeppe: i'm tired now and want to keep hacking on the juju side of things for a bit, but will come back to the charmrepo stuff tomorrow, thanks for looking12:20
rogpeppewallyworld: ok, cool12:20
wallyworldrogpeppe: one thing - name in meta doesn't have to be same as directory12:21
wallyworldso i'm not sure about your comment12:21
rogpeppewallyworld: yeah, but it's very confusing if it's not12:21
wallyworldhmm, ok, i have test charms i have written where it doesn't match, so i guess i'm used to it12:21
frobwaredimitern, did you mention this morning that you got the spaces demo to work without having to have a public ip address? (Or perhaps I misheard you.)12:33
dimiternfrobware, yes, eventually - initially the machines in the subnets without auto-public-ip set were "pending", because they didn't manage to download some packages (no outbound access, just dns works)12:34
frobwaredimitern, aha. that's what I see.12:35
dimiternfrobware, so I presume after apt-get retried 30 or so times it gave up and cloud-init finished OK12:35
frobwaredimitern, so did you flick the switch for auto-public-ip on the subnets?12:35
frobwaredimitern, I'm not seeing a timeout though. machine state still in allocating.12:36
dimiternfrobware, no, but even if I did the flag is only honored when starting instances - not after they're running12:36
dimiternfrobware, is the instance running in the EC2 UI?12:36
frobwaredimitern, I was trying on my local account. HO?12:37
frobwaredimitern, yes instance is running12:37
dimiternfrobware, it might take 30m or so for apt-get retry script to give up I guess - I waited at least 30m with no change, but in the morning all machines showed up as started12:38
dimiternfrobware, and it "worked" I guess just because I was deploying the ubuntu charm (which was pre-fetched by the apiserver and then the isolated machine got it from there - as usual), which doesn't need anything from the internet - wordpress I suspect won't work12:39
frobwaredimitern, so in the real world how is this supposed to / going to work?  service in the "private" subnet will need access on provisioning, installing packages, et al12:39
frobwaredimitern, so I was deploying the ubuntu charm, like we were doing yesterday.12:40
dimiternfrobware, in the real world we can do things like setting up squid-deb-proxy for apt + another proxy + nat + forwarding etc. on machine 0 (or another "public" machine)12:41
dimiternfrobware, the ubuntu charm is useful only for really simple tests - for more "real-world-like" tests, we need charms like in that bundle - scalable, with relations, config, etc.12:42
* dimitern needs to eat something - bbiab12:42
* frobware also needs to eat something too.12:44
frobwaredimitern, when bootstrapping a node with two NICs is it possible to configure which NIC gets selected?13:33
voidspacefrobware: no13:58
voidspacefrobware: that's why we need spaces13:58
frobwarevoidspace, :)13:58
voidspaceseriously :-)13:58
frobwarevoidspace, ok ok ook okkkkk13:58
frobwarevoidspace, I'm sold!13:59
voidspacefrobware: hah :-)13:59
frobwarevoidspace, I manually provisioned a machine with two NICs13:59
voidspacefrobware: right13:59
frobwarevoidspace, set 'bootstrap-host: 10.17.17.117' in my environments.yaml13:59
frobwarevoidspace, then bootstrapped. Which indicates that the dns-name=10.17.17.11714:00
voidspacefrobware: so it's at least using the address you gave it14:00
frobwarevoidspace, however, both mongod and jujud are listening on all interfaces14:00
voidspaceright14:00
frobwarevoidspace, http://pastebin.ubuntu.com/12624701/14:00
frobwarevoidspace, whereas I was trying to coerce it to listen on the single NIC only.14:01
voidspacefrobware: yep14:02
frobwarevoidspace, OK answers my questions. thanks14:02
voidspacefrobware: not possible at the moment with a vanilla install14:02
dimiternfrobware, you mean in maas?14:02
frobwaredimitern, yes14:02
voidspaceAFAIK anyway...14:02
frobwaredimitern, well, no just maas14:02
dimiternfrobware, yeah - voidspace is correct actually :)14:05
voidspaceit does happen sometimes14:05
dimiternfrobware, one of the many goals of the model is giving you this sort of flexibility, while hiding the gruesome details :)14:05
voidspacedimitern: frobware: I'm just bootstrapping an EC2 environment with my fix in place (for the ec2 Subnets issue) to see if it actually works...14:05
voidspaceit should do...14:05
TheMuedimitern: btw, just recognized it. we still document the networks constraint as we still support this constraint. but shall I already remove it from the constraints documentation?14:06
cheryljwwitzel3: ping?14:09
wwitzel3cherylj: heya, in standup14:09
cheryljwwitzel3: kk14:09
dimiternTheMue, I think so, it should be dropped from the docs (as we're on that stage) and later from the code as well (I'm not too worried about this now)14:12
voidspacewell, the error is no longer in the logs - but the container still has a 10.0 address14:12
TheMuedimitern: yep, feels better to me so too, thx14:12
dimiternvoidspace, with the address-allocation feature flag set?14:13
voidspaceI thought so...14:13
voidspacegodammit14:14
voidspacemust be a different shell window14:14
voidspace*sigh*14:14
wwitzel3cherylj: ping14:17
cheryljwwitzel3: I heard that you've got experience using virtual MAAS?14:19
wwitzel3cherylj: yep14:19
cheryljwwitzel3: is there documentation somewhere on how to set that up?  What I've found seems to be out of date14:19
wwitzel3cherylj: did you sacrifice a chicken? first step ;)14:20
wwitzel3cherylj: yeah, one sec, I used the videos that Kirkland made, and they worked well for me14:20
cheryljwwitzel3: no, no chicken.  I've got some pigeons around here.  Will that work?14:20
=== natefinch-afk is now known as natefinch
wwitzel3cherylj: sorry, it was beisner who made them14:21
wwitzel3cherylj: https://www.youtube.com/playlist?list=PLvn2jxYHUxFlxNmc1dAbw524aoPmHxNpC14:21
cheryljwwitzel3: yay, thank you!14:22
wwitzel3cherylj: I've referred to them a few times, I just follow his steps and it has always worked14:22
wwitzel3cherylj: gl14:22
cheryljwwitzel3: thank you :)14:22
natefinchcherylj: just remember, wwitzel3 said it'll be really easy with absolutely no problems.14:25
cheryljnatefinch: so long as I remember the chicken14:25
wwitzel3that's the key14:25
natefinchcherylj: that must have been what I forgot when I was trying to do it at the sprint in Germany.14:26
natefinchnever did get it working14:26
dimiternfrobware, voidspace, I have a patched version of the gui which works and deploys the slightly modified bundle and respects spaces constraints!14:29
dimitern(writing down all the steps and will send them later)14:30
voidspacedimitern: awesome14:38
TheMuedimitern: great, sounds cool14:45
voidspacedimitern: yay, it worked this time14:51
voidspacedimitern: it's done properly (subnetIds also honours as well as instId) - just needs some tests14:51
dimiternvoidspace, you're the man! :) great14:55
aisraelHow does one go about getting something backported to 1.24.x? i.e., this fixes juju with osx 10.11, which comes out today: https://github.com/juju/juju/pull/296914:57
jcastrosinzui: heya, El Cap just went gold today, IMO we should probably send a mail to the list telling people they should be fine with 1.25.x14:59
jcastroany issues you think I should bring up?14:59
sinzuijcastro: 1.24.6 in homebrew is fine. I delivered the patch to them personally15:00
jcastroI saw that, that's why I wanted to mention it15:00
sinzuijcastro: 1.25 is a beta15:00
jcastrosinzui: I was meaning more like "this is the last time you'll have to care about this, future juju versions won't break on your beta OS."15:01
sinzuijcastro: I WON'T say that until it is true15:01
jcastroheh15:01
jcastrook, I can not say that then.15:01
sinzuiel capitan is hardcoded in 1.25. I read the code15:01
mupBug #1501381 opened: panic: cannot pass empty version to VersionSeries() <blocker> <ci> <intermittent-failure> <juju-core:Triaged> <https://launchpad.net/bugs/1501381>15:11
alexisbmgz, ^^^ is this bug in master? or all branches15:17
mgzalexisb: master and feature branches off master15:18
alexisbmgz, ok thanks15:18
mgzalexisb: clarified the bug15:22
dimiternvoidspace, dooferlad, TheMue, frobware, you should've all received demo prep instructions15:22
TheMuedimitern: +1, great, thanks15:23
mgzalexisb: I'm not clear if it will only happen on maas, or if it's just our testing on maas that happens to hit this15:23
frobwaredimitern, received, queued (and not quite read). :)15:27
dimiternTheMue, frobware, cheers :)15:31
* dimitern is outta here ;)15:31
frobwaredimitern, thanks; great to see the demo coming along :)15:31
dimiternfrobware, yeah - I'm happy we won't be the only team not showing interesting stuff :D15:32
natefinchkatco: you had mentioned enabling worked for the lease feature tests.... where is that code?  I can't find it15:35
natefinchs/worked/workers/15:35
katconatefinch: let me tal15:36
katconatefinch: err... looks like they were deleted?15:37
katconatefinch: here: https://github.com/juju/juju/blob/1.22/featuretests/leadership_test.go15:37
natefinchkatco: lol, well, that explains why I couldn't find them :)15:37
katconatefinch: don't forget to submit your sick leave15:38
natefinchkatco: oh yeah, I'll do that right now15:38
mupBug #1501398 opened: stateSuite setup fails on windows with WSARecv timeout <blocker> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1501398>15:50
frobwarevoidspace, you still about? Regarding the multi-nic question from above: am I wrong in thinking that spaces should allow for: juju bootstrap --constraints mem,cpu,etc,spaces=my-network-with-nic-192.168.1.12315:58
mgzhm, it's not possible to be in more than one hangout at once16:01
alexisbmgz, ping16:02
mgzalexisb: omw16:02
frobwaremgz, it's odd though - you would think computers should be good at multitasking. :)16:03
mgzapparently not :)16:04
beisnero/ hi mgz -  fyi, i pulled thumper's binaries, re-ran loop, hit that bootstrap fail.  updated @ bug 133588516:30
mupBug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider>16:30
mup<security> <uosci> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>16:30
alexisbbeisner, thanks for the update16:31
mgzbeisner: thanks.16:31
beisneralexisb, mgz - yw.  thx for the focus on this.16:32
voidspacefrobware: still around16:37
voidspacefrobware: that question would be better directed to dimiter I think, but I don't see why that shouldn't work16:37
voidspacefrobware: hmmm... although thinking about it16:43
voidspacefrobware: our implementation of spaces is at the "juju model" level - which requires the state server to be in place16:43
voidspacefrobware: so making it work at bootstrap time will require making the client "spaces aware" (i.e. able to discover spaces and resolve constraints)16:44
voidspacefrobware: so it isn't going to work initially, would require specific work16:44
mupBug #1500843 changed: Windows ftb due to unused import is diskmanager <blocker> <ci> <regression> <windows> <juju-core:Fix Released by gz> <https://launchpad.net/bugs/1500843>17:35
mupBug #1501432 opened: BootstrapSuite tests fail on non-ubuntu platforms with no matching tools <blocker> <centos> <ci> <test-failure> <windows> <juju-core:Triaged> <https://launchpad.net/bugs/1501432>17:35
cheryljthanks for the quick review, cmars18:01
cmarscherylj, thanks for the bug fix18:10
mgzcherylj: where will that error propagate to exactly?18:33
mgzcherylj: I'm wondering if we're still not logging enough information to work out what the bad data actually is18:33
=== cmars is now known as cmars_noodles
mgzcherylj: (code change looks sensible regardless)18:34
cheryljmgz: it will cause the image to be ignored when we update stored image metadata18:35
cheryljmgz: I was thinking I should update the logging to indicate the ID of the ignored image18:35
mgzcherylj: sounds good to me - can be a separate branch18:36
cheryljmgz: I'm going to include it in the branch that updates dependencies.tsv18:36
mgzcherylj: one thing that comes to mind from what you've found so far,18:39
mgzour maas has a windows image which will obviously not have an ubuntu series18:39
mgzhow that would cause panics some of the times but not others though I have no idea, so may be unrelated18:39
cheryljmgz: it shouldn't.  This panic was because we were trying to determine the version from a series of "" (empty string)18:40
cheryljmgz: if there's some data in simple streams that just doesn't make sense, (like having nothing for the version), we should ignore it18:42
cheryljerm, my previous comment should have been that we were trying to determine the series from an empty version18:43
cheryljI had it backwards18:43
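
The approach cherylj describes (return an error instead of panicking when the version is empty, ignore that image, and record its ID) might look roughly like this. It is a minimal sketch, not the juju code: image, seriesForVersion, usableImages, the image IDs, and the small lookup table are all hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// image is a hypothetical, cut-down image metadata record.
type image struct {
	ID      string
	Version string
}

// seriesByVersion is a tiny illustrative table, not the real lookup.
var seriesByVersion = map[string]string{
	"12.04": "precise",
	"14.04": "trusty",
	"15.04": "vivid",
}

// seriesForVersion returns an error for empty or unknown versions
// rather than panicking.
func seriesForVersion(version string) (string, error) {
	if version == "" {
		return "", errors.New("empty version")
	}
	series, ok := seriesByVersion[version]
	if !ok {
		return "", fmt.Errorf("unknown version %q", version)
	}
	return series, nil
}

// usableImages maps image ID to series for images that make sense, and
// returns the IDs of the ignored ones so they can be logged.
func usableImages(images []image) (map[string]string, []string) {
	series := make(map[string]string)
	var ignored []string
	for _, img := range images {
		s, err := seriesForVersion(img.Version)
		if err != nil {
			ignored = append(ignored, img.ID)
			continue
		}
		series[img.ID] = s
	}
	return series, ignored
}

func main() {
	images := []image{
		{ID: "img-1", Version: "14.04"},
		{ID: "img-2", Version: ""},
		{ID: "img-3", Version: "12.04"},
	}
	series, ignored := usableImages(images)
	fmt.Println("usable:", series)
	fmt.Println("ignored:", ignored)
}
```

Returning the ignored IDs keeps the logging concern mgz raised separate from the lookup itself.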
mupBug #918386 opened: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>18:50
mupBug #918386 changed: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>19:00
natefincharg.... I have a feeling the jujuconn tests are somehow mucking with the database in just the right way to break my worker19:02
mupBug #918386 opened: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>19:03
mupBug #918386 changed: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>19:06
mupBug #918386 opened: config.yaml should have enum type  <charmers> <pyjuju:Triaged> <juju-core:New> <https://launchpad.net/bugs/918386>19:09
natefinchmy country for SOME of our code to have unit tests... you know, so they don't break when totally f'ing unrelated code is changed.19:12
marcoceppiwow, mup, calm down, the bug isn't that important19:12
natefinchahh, hmm I think I got it.  Interesting difference between a real environment and the test environment19:18
mupBug #1501475 opened: Status presents unnecessary MAAS API info for machines <juju-core:New> <https://launchpad.net/bugs/1501475>19:30
marcoceppiwhy can't I bootstrap local as root? ERROR failed to bootstrap environment: bootstrapping a local environment must not be done as root19:42
natefinchmarcoceppi: I forget, but it messes up permissions of certain things, and probably puts things in the wrong directories.  Why would you want to, anyway?19:45
perrito666natefinch: its not like local wont do that for you anyway19:45
marcoceppinatefinch: because I'm in an LXC container as root and I want to bootstrap as the root user19:45
perrito666marcoceppi: can you bootstrap local inside an lxc container?19:46
marcoceppiperrito666: well, I was going to find out (it's a LXD container, so should work)19:46
perrito666famous last words19:47
marcoceppiworst case scenario it doesnt' work19:47
marcoceppibut stopping me because I'm root makes me sad19:47
natefinchmarcoceppi: looks like it doesn't ;)19:47
perrito666marcoceppi: just adduser19:47
marcoceppiI get that19:48
marcoceppibut because of the way these mounts outside the system work I need to be root to access them anyways19:48
perrito666marcoceppi: sudo?19:48
perrito666but as a rule of thumb, any question matching with: "why .* local .*?" is answered with: because local provider sucks19:49
natefinch+119:50
marcoceppiperrito666: I know how to work around this, I'm saying it's silly that juju would stop me as the root user, it should try to detect sudo vs root to discourage old local provider behaviour19:50
marcoceppialso "lol its local so deal with it" isn't really a great answer19:50
marcoceppifurthermore, local bootstraps in a LXD container19:50
marcoceppican we get a LXD provider now plz19:51
perrito666marcoceppi: it was more in the tone of an apology than a mockery19:51
=== cmars_noodles is now known as cmars
marcoceppiI see19:51
perrito666local provider is the number one cause of my screwing my work computer during the past year or so19:51
natefinchmarcoceppi: funny you should ask19:53
natefinchmarcoceppi: moonstone started work on an LXD provider as of today19:53
marcoceppiperrito666: which is why I'm running it in a LXD container19:53
marcoceppinatefinch: yes, 1000 times yes, I will happily test anything you throw at me19:54
* natefinch screenshots for later19:54
marcoceppiI stand by my assertion! ;)19:54
perrito666marcoceppi: I could totally use a brief howto for what you are doing19:55
marcoceppiperrito666: well, if I could juju bootstrap local as root the howto would be way easier :P19:56
marcoceppiperrito666: I'll write a blog19:58
perrito666marcoceppi: tx19:58
thumperbeisner: that binary I created for you was me taking wild guesses at times, I'd like to tweak and get you to try again, keen?20:00
mupBug #1501490 opened: juju-local can't bootstrap as root user <juju-core:New> <https://launchpad.net/bugs/1501490>20:00
beisnerthumper, indeed20:01
beisnerthumper, i suspect 2s may not be enough, just based on observing nova compute, et al, after nova deleting an instance.20:01
thumperbeisner: how long do you think we need?20:01
beisnerthumper, i think it's variable, depending on the hardware, and load on that cloud20:02
beisnerthumper, how do we handle similar needs with other providers?20:02
beisnerie. is there an existing max_wait / retry_interval approach in any other provider?20:03
beisnerthumper, i'll do a little ditty on serverstack to see if i can measure timing20:07
thumperbeisner: awesome20:07
* thumper otp20:07
thumperbeisner: we handle similar things in other clouds terribly IO20:12
thumperIMO20:12
thumperwe should be treating many other cloud calls as retryable calls, but in most cases we don't20:13
beisnerthumper, ah i see.  so i think a max_wait and retry_sleep would work well.  it's a matter of how long you're comfortable blocking on destroy.20:17
thumperbeisner: you think having them configurable by config?20:18
beisnerthumper, i'd aim for a resilient default.  ie.  say ...  max_wait 30s, recheck every 1s or 2s.   but hold the line, i'm about to have data.20:20
beisner;-)20:20
beisnerbootstrap: http://paste.ubuntu.com/12626772/20:27
beisnerdestroy: http://paste.ubuntu.com/12626773/20:27
beisnernova instance: http://paste.ubuntu.com/12626774/20:27
beisnernova secgroup: http://paste.ubuntu.com/12626775/20:27
beisnerthumper, ^ checking and timestamping nova secgroups and nova instances as fast as apis will allow, while bootstrapping and destroying20:28
* thumper looks20:28
* beisner too20:28
thumperok, so 2s is no where near enough20:30
thumperbeisner: let me build you one with 30s max :)20:30
beisnerthumper, a-ok.  i'll put together a timeline from those ^20:30
thumpercopying files now20:31
thumperbeisner: it appears to be as small as instant, but as large as 4s20:32
thumperI'm doing 30s max with 1s retry20:32
thumper*should* be solid enough20:32
thumpergetting about  702.1KB/s up to chinstrap20:33
thumperbeisner: the binaries are up, in the same place as before20:34
beisnerthumper, timeline @ https://bugs.launchpad.net/juju-core/+bug/1335885/comments/1720:37
mupBug #1335885: destroy-environment reports WARNING cannot delete security group <amulet> <cloud-installer> <destroy-environment> <landscape> <openstack-provider>20:37
mup<security> <uosci> <juju-core:Triaged> <juju-core 1.24:In Progress by thumper> <juju-core 1.25:Triaged> <https://launchpad.net/bugs/1335885>20:37
beisnerthumper, ack, will pull bins20:37
beisnerthumper, fyi 3 iterations in.  seeing 3s, 11s, 5s between 'terminating instances' and 'command finished'  ... going to let that run.  i'm eod, but will prob check back in late evening.20:59
thumperbeisner: ok, cool21:00
beisnerthumper, thanks again!21:00
thumperwallyworld: before I merge this openstack retry branch22:34
thumperwallyworld: perhaps we should chat about exponential backoff?22:35
wallyworldthumper: ok, give me a minute22:37
thumperwallyworld: although, I'm tempted to land this and discuss the exponential backoff as part of a bigger picture provider retry system22:38
thumperas I'm starting with the 1.24 branch22:38
wallyworldsgtm22:38
thumperk22:38
* thumper does that22:38
wallyworldthumper: storageprovision/schedule.go22:38
wallyworldis the storage solution22:38
wallyworldthat we can discuss moving to utils22:39
wallyworldstorageprovisioner/schedule.go i mean22:39
thumperack22:41
axwfuuuuuuuuuuuuuuuuuu. sick of blocked master23:07
