[00:07] thumper: yup, this environment is now broken [00:07] it doesn't respond to any cli commmands [00:16] fwereade: still there? [00:43] davecheney: hmm... interesting [00:45] perrito666: I'd expect not, fwereade is probably sleeping, and if he isn't, he should be [00:49] thumper: normally i'd be expecting hte unit agents to be freaking out and restarting like crazy [00:49] but they are all connected, and just sitting there [00:49] weird [00:50] thumper: do you want to take a look [00:50] sure [00:51] what is your lp id name [00:51] ie, ssh-copy-id $WHO [00:51] thumper [00:51] :-) [00:51] ubuntu@winton-02:~/charms/trusty$ ssh-copy-id thumper [00:51] /usr/bin/ssh-copy-id: ERROR: No identities found [00:51] maybe i'm doing it wrong [00:52] yup [00:52] i was [00:52] thumper: machine is winton-02 [00:52] Hostname 10.245.67.2 [00:52] copy your .ssh/config stanza and replace the hostname [00:54] * thumper sshes in [00:57] davecheney: something here is lying [00:57] davecheney: I can status [00:57] and it tells me that the machine agent for 0 is down [00:57] which it isn't [00:57] becaues it responded to status [00:57] :-) [00:58] davecheney: you rebooted about an hour ago? [00:58] the three lxc machines are all showing as started [00:59] davecheney: juju ssh 1 works, and both the machine and unit agent are running according to upstart [01:05] thumper: yeah, but they don't do anything [01:05] i did juju remove-unit mysql/1 [01:05] and it's still there [01:05] * thumper is reading logs [01:05] it's liek status is jammed at some point in the past [01:06] 1016 juju status [01:06] 1017 juju remove-unit mysql/1 [01:06] 1018 juju remove-machine 3 [01:06] 1019 juju status [01:06] ^ did jack === marco-traveling is now known as marcoceppi [01:11] thumper: try to do stuff with that enviornment [01:11] hmm... [01:22] davecheney: none of the watchers are firing [01:32] urk [01:32] but they are poll driven, right ? [01:36] kinda [01:42] thumper: being told I have to go to the shops to get food for our family [01:42] afk for a bit [02:04] waigani, wallyworld_: axw has power issues, will be online later [02:04] ok [02:04] power as in electricity, not ppc64 [02:04] davecheney: hey [02:04] what's that concept that's not bundles but is bundles [02:04] but is like bundles [02:05] stacks [02:05] thanks thumper [02:05] np [02:06] marcoceppi: I had to turn aufs off by default with lxc-clone [02:06] marcoceppi: too many weird edge-cases [02:06] marcoceppi: it is however, btrfs aware [02:06] thumper: yeah, saw the release, but it will still use lxc-clone [02:06] so will use fast snapshots [02:06] woo [02:06] yes, still use lxc-clone by default [02:06] but lots of i/o to create a machine as it copies ~800M [02:07] * thumper goes to make a coffee [02:27] agent-state on new machines (except 0) is stuck on pending [02:27] trying to ssh into machine one fails: [02:27] ERROR machine "1" has no public address [02:28] all-machines log logs this error: [02:28] ERROR juju runner.go:220 worker: exited "environ-provisioner": no state server machines with addresses found [02:30] waigani: how long did you wait? [02:30] lxc-ls and "uvt-kvm list" list no containers [02:30] waigani: if you haven't done things before, it takes a while [02:30] waigani: you need to do 'sudo lxc-ls' [02:30] thumper 10min? [02:30] 'sudo lxc-ls --fancy' [02:30] still not up [02:30] ah, I'll try that [02:30] is it downloading the image? 
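A quick aside on what waigani is checking here: the local provider's containers are created by root, so a plain lxc-ls shows nothing, and the very first bootstrap on a machine also has to download and cache an ubuntu cloud image, which can take a long while on a slow link. Something like the following shows both (the cache path is from memory and depends on the lxc template version, so treat it as an assumption):

    sudo lxc-ls --fancy                  # containers only show up with sudo; --fancy adds state and IP columns
    sudo du -sh /var/cache/lxc/cloud-*   # keeps growing while the cloud image download is still in progress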
[02:31] if you haven't run the local provider before, it is downloading the cloud image [02:31] ah, there is a lot of network activity [02:31] tail the log [02:31] all-machines ? [02:31] sure [02:32] thumper: tailing and seeing activity - cheers [02:57] thumper: back [02:57] are you still using winton-02 ? [02:57] davecheney: I have wallyworld_ and waigani looking in to replicating this locally while I finish off a much needed patch [02:57] no, I'm out of winton-02 [02:57] thumper: ok [02:57] can I manually destroy that environmet [02:58] or do you guys stil need it [03:00] davecheney: no, kill it [03:00] I think we have enough info to work from [03:00] kk [03:01] thumper, davecheney: lxc-start/stop machine 1 -agent-status showed started/down as expected. Restarted my machine, machine 0 agent-status: started. [03:01] waigani_: which mongo are you using ? [03:02] waigani_: are you on trusty yet? [03:02] no :( (busted) [03:02] waigani_: recommendation: spin up an environment on ec2 [03:02] waigani_: can I get you to cause a change to the machine that has started up? [03:02] deploy cs:ubuntu/trusty [03:02] and use that to test [03:02] davecheney: MongoDB shell version: 2.4.6 [03:02] waigani_: just to confirm that the bits are hooked up [03:03] waigani_: dpkg -l | grep mongo [03:03] davecheney: 1:2.4.6-0ubuntu5 [03:04] waigani_: wrong version [03:04] you neet juju-mongodb [03:05] ah, how do I get that? [03:05] waigani_: 1. use trusty [03:06] hehe, okay [03:06] 2. sudo apt-get install juju-mongodb [03:06] tried that [03:06] could not find it [03:06] waigani_: it wont if you aren't using trusty [03:06] it may be available in the cloud archive if you are using a cloud image on hp cloud or ec2 [03:06] oh these are steps, not options [03:06] right [03:07] okay, so I need to update to trusty to debug [03:07] waigani_: i'd recommend deploying the ubuntu charm on a cloud [03:07] it's faster and less likely to ruin your afternoon debugging upgrde programs [03:07] problems [03:08] okay, do we have ec2 creds we can use? [03:08] waigani_: i can ask for a new ppc vm for you [03:08] probably take longer than today [03:09] then you can debug the problem at the source [03:09] davecheney: :D [03:09] waigani_: someone (not me) can give you the hp cloud credentials [03:09] that might be another solution [03:09] hp, okay [03:09] thumper: ? [03:09] i only say ec2 because I *KNOW* they have working trusty images [03:09] hp cloud, less certain [03:09] * thumper has no hp clould stuff [03:10] wallyworld_: ? [03:10] yes? [03:10] cummon folks, we're developing a tool to managed public clouds, and nobody has the credentials to test on the clouds ? [03:10] * wallyworld_ reads backscroll [03:10] wallyworld_: do you have HP creds you can share with waigani_ [03:11] waigani_: you need to be on trusty to get the juju-mongodb [03:11] hehe [03:11] i do. [03:11] okay, I need to update to trusty anyway... [03:12] davecheney: except it's my own user name with my password [03:12] best to get a new account added by asking antonio [03:13] he has a master account he can create sub accounts from i believe [03:20] i think antonio is still online [03:36] sorry, I was off downloading trusty, what is antonio's nic? [03:37] arosales: [03:39] msg hi arosales, I'm part of tim's team. I'm told you are the keeper of cloud credentials. Would I be able to get some for ec2 please? 
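For the record, the check davecheney asked for and the fix boil down to this on a trusty box (the mongod path is how the juju-mongodb package lays itself out, so double-check it on your system):

    dpkg -l | grep -E 'juju-mongodb|mongodb-server'   # 1:2.4.6-0ubuntu5 above is the distro mongodb, not juju's
    sudo apt-get install juju-mongodb                 # only packaged from trusty onwards
    /usr/lib/juju/bin/mongod --version                # installed off to the side, so it never collides with a system mongod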
[03:39] * davecheney sadtrombine [03:39] doh forgot the / [03:39] waigani_: good thing you didn't include your payment details [03:39] lol - facepalm === vladk|offline is now known as vladk [03:57] thumper: on trusty, i can restart via juju run reboot and also via lxc-stop machine 1 and it shows and down and then started again and config changes seem to be propagated [03:57] i haven't retried rebooting the host yet [03:57] wallyworld_: the test case is [03:57] 1. use trusty [03:57] tick [03:57] 2. use juju-mongodb [03:57] 3. reboot [03:58] ok, 2 out of 3 ain't bad [03:58] i'll reboot [03:58] good bye cruel world [03:58] * davecheney plays revelry [03:58] wallyworld_: you're going to win! hmph [04:02] davecheney: well that didn't go so well, no bootstrap agents restarted after reboot [04:02] well [04:02] they are running but juju status fails [04:03] can't connect to state api port [04:04] actually, machine 1 agents are running but not machine 0 [04:09] wallyworld: so the containers came back up correctly, but not the host agent. What is "service juju-agent-wallyworld-local status" say? [04:10] as well as "service juju-db-wallyworld-local status" [04:10] jam: i checked those, db was running, agent was not [04:10] i started agent by hand [04:10] but stats still fails, loking into it [04:10] wallyworld: hopefully there is something in machine-0.log ? [04:11] waigani_: hey [04:11] nope :-( [04:11] hello [04:11] waigani_: sorry tab fail [04:11] was curious about wallyworld's status [04:12] wallyworld: http://askubuntu.com/questions/207143/how-to-diagnose-upstart-errors says "/var/log/upstart/JOBNAME.log" [04:12] ah found it [04:12] * thumper goes back to write more tests [04:12] wallyworld: found the failure [04:12] or the log file [04:12] mine has: /bin/sh: 1: /bin/sh: cannot create /var/log/juju-jameinel-local/machine-0.log: Directory nonexistent [04:12] the upstart script has the wrong log dir [04:12] which doesn't bode well, I don't know why it would be trying to log there. [04:12] so the job fails [04:13] WTF? [04:13] mine also says [04:13] /home/ian/jujulocal/tools/machine-0/jujud machine --data-dir '/home/ian/jujulocal' --machine-id 0 --debug >> /var/log/juju-ian-local/machine-0.log [04:13] sad trombone [04:14] and /var/log/juju-ian-local doesn't exist? [04:14] nope [04:14] i have a ~/jujulocal/log [04:15] which is where the logs were written before i rebooted [04:15] so its just the upstart script that is wrong [04:15] ah... [04:15] I know what it is [04:15] the upstart script is being rewritten [04:15] with the wrong log dir [04:15] yep :-) [04:15] i fucking knew it [04:15] it shouldn't use the log dir from the agent [04:15] as that is wrong [04:15] wallyworld, thumper: I see the same thing in my upstart, it is trying to redirect to /var/log/$STUFF but we should only be writing to /home/jameinel/.juju/local/$STUFF [04:15] it isn't what is used by the local provider [04:16] I have to finish this branch [04:16] thumper: you mean it is using the $DIR that is bind mounted inside the LXC ? [04:16] well, machine-0 can't be bind mounted [04:16] jam, wallyworld: hangout? 
and I can explain what I think is the problem [04:16] then I can go back to work [04:17] sure, i think i can find it anyway [04:17] but let's talk to be sure [04:17] thumper: technically, I'm not working and I have to go have breakfast, but feel free to chat with wallyworld [04:17] haha [04:17] kk [04:17] thumper: https://plus.google.com/hangouts/_/calendar/am9obi5tZWluZWxAY2Fub25pY2FsLmNvbQ.mf0d8r5pfb44m16v9b2n5i29ig?authuser=1 [04:17] calendar? [04:25] * thumper puts head back down [04:27] davecheney: sooo, there is one showstopper - juju machine agent update script is wrong for local provider, so after a reboot, no machine 0 agent for you. that doesn't explain everything you are seeing perhaps, but that's the focus right now to be fixed [04:45] wallyworld: https://codereview.appspot.com/78660043 [04:45] looking [04:45] thumper: i've fixed the issue, i think. gotta write a test and test live [04:45] ugh [04:45] I've missed a test [04:45] wallyworld: ack, and very cool, thanks [04:45] * wallyworld will wait for test to be added [04:47] lboxing now [04:47] * thumper waits... [04:48] thumper: did you now destroy env for local doesn't remove the upstrat scripts? [04:48] know [04:48] wallyworld: it should [04:48] well, it didn't just now [04:48] that's a bug... [04:48] yeah [04:49] i'll file 2 bugs [04:49] it should use the same mechanism that manual provider uses [04:49] to remove the script [04:49] it is possible that manual agent removal is broken too [04:49] wallyworld: test updated and pushed [04:49] 1 bug for upstart script creation for local, one for removal [04:49] looking [04:49] wallyworld: the only drive-by there, is a change of an existing job [04:49] from "host units" to "all" [04:49] ok [04:50] as the lock dir is needed on all machines [04:50] * thumper thinks [04:50] oh... ick [04:50] * thumper thinks some more and looks at code [04:51] ugh [04:51] I remember why we did it... but it is icky [04:51] the only machine that is not "host units" is a local provider state machine [04:52] as normal state machines also host units [04:52] so we don't try to run it on machine 0 for local [04:52] which is right [04:53] hmm... [04:53] actually doesn't hurt [04:53] because it uses agent dir [04:54] and it only tries to chown if there exists a /home/ubuntu [04:54] so I'd rather keep it as "all machines" as it better describes what is intended [04:56] sounds reasonable [04:59] thumper: my fix works and we now have a valid all-machines-log symlink :-D [05:00] it wasn't valid before [05:00] so looking good, Vern [05:01] thumper: sadly, i changed the code and no tests failed :-( [05:01] :( [05:02] possible to write a test to save us next time? [05:02] yep, that's the plan [05:02] i always write a test when fixing a bug [05:02] that's because you're AWESOME! [05:02] * wallyworld blushes [05:04] thumper: that destoy env thing - i used --force cause machine agent wasn't running. it left behind upstart scripts as well as mongo process etc. so not really a show stopper i guess [05:04] wallyworld: we already have a bug for making --force clean up more [05:04] lets just make sure we do it next week [05:04] yep [05:05] * thumper has hit EOW [05:05] I've approved that branch and hope it lands [05:06] later folks... === vladk is now known as vladk|away === vladk|away is now known as vladk [05:26] waigani_, hello [05:26] sorry I missed your ping [05:26] arosales: hello :) [05:27] I was looking for some ec2 creds. Are you the right person to talk to? 
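For anyone retracing the trail jam and wallyworld just followed, it amounts to the following on the host (the ian-local names use the juju-agent-&lt;user&gt;-&lt;env&gt; pattern quoted above; substitute your own user and environment):

    sudo status juju-agent-ian-local                     # upstart shows the machine agent job as stopped
    sudo tail /var/log/upstart/juju-agent-ian-local.log  # upstart's capture of the job output, containing the
                                                         #   "cannot create .../machine-0.log: Directory nonexistent" error
    cat /etc/init/juju-agent-ian-local.conf              # the generated job, with the >> redirect into the missing /var/log/juju-ian-local/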
[05:27] waigani_: hp man, hp [05:28] arosales: s/ec2/hp [05:30] waigani_, hp I can do let me get those to you. [05:35] ok, lxc-clone: true is like, the best thing EVAR. You guys have made me a non-trival % more productive. 2 months before review time too! Thanks! :D [05:45] no power and no coffee makes axw go something something === vladk is now known as vladk|away [05:47] sigh, gotta restart - hotplugging display link doesn't seem to work since too well since I upgraded === vladk|away is now known as vladk [05:58] axw: what's going on? [05:58] waigani_: with hotplugging? [05:59] thumper said you lost power or something today? [05:59] waigani_: oh yeah, power outage all morning [05:59] ha, that sucks for you [06:00] pen and paper coding... [06:00] I had like 10 minutes left on my laptop before it came back on [06:00] oh nice [06:00] couldn't do much without coffee though ;p [06:00] haha [06:01] caffeine dependence is the price for flavoursome mornings === vladk is now known as vladk|offline [07:58] fwereade: i'm off to soccer. here's a small mp that fixes a critical 1.17.7 issue to do with local provider logging and upstart config https://codereview.appspot.com/78730043 [08:05] wallyworld: what makes you think it's incorrect? thumper found that rsyslog has an apparmor profile that only allows it to write in /var/log/... [08:05] * axw plays with it [08:05] axw: the symlink was wrong [08:06] the branch fixes it [08:06] ok, I'll take a look. [08:06] the symlink in ~/.juju/local/log didn't point to /var/log/... [08:07] the upstart file was also wrong so that a reboot didn't restart the local machine agent [08:07] * wallyworld -> soccer [08:07] later [08:15] mornin' all [08:20] morning === vladk|offline is now known as vladk [08:34] good morning [08:42] morning vladk [09:05] waigani: I've already done the local provider Destroy vs. broken environments (except for one last thing which I'm doing now), so reassigning the card to myself [09:06] kk [09:35] hello [09:36] * davecheney waves to wwitzel3 [09:38] hey wwitzel3 [09:43] davecheney: did you got to the bottom of why local on ppc doesn't survive reboot? [09:56] axw: not yet [09:56] it's hard to reproduce the problem [09:57] it might be juju-mongodb [10:03] davecheney: ok, just wondering if it was connected to the bug wallyworld raised [10:04] (#1295501) [10:04] <_mup_> Bug #1295501: local provider upstart script broken [10:24] mgz, hey [10:25] mgz, how is it going with the state changes? [10:27] good morning [10:27] fwereade, hey [10:28] dimitern, heyhey [10:28] fwereade, re state changes for vlans [10:28] dimitern, yeah [10:28] fwereade, you're thinking of having 2 new collections - serviceNetworks and machineNetworks? [10:28] fwereade, or the latter will just be a couple of fields in the machine doc? [10:28] dimitern, yeah -- serviceNetworks needs NoNetworks, machineNetworks just needs Networks [10:28] morning perrito666 [10:29] dimitern, I'm generally against extending entity documents [10:29] dimitern, we have a history of screwing up watcher behaviour by doing so [10:29] fwereade, why just networks for machines? 
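The reboot test being asked about is essentially this, run on the host carrying the local provider (environment name is whatever yours is called):

    sudo reboot
    # once the host is back, upstart should have restarted the juju jobs on its own
    sudo initctl list | grep juju    # expect juju-db-<user>-<env> and juju-agent-<user>-<env> to be running
    juju status -e local             # fails with a connection error if machine 0's agent or mongo didn't come back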
[10:29] fwereade: turns out the rsyslog thing has actually changed, apparently restore is broken, but since tests didn't pass before we did not know [10:29] fwereade, we should list both included and excluded ones i think [10:29] axw: thanks for the mail :) [10:29] dimitern, because the machine stuff is the record of reality, while the service stuff is the specification [10:30] dimitern, it's like hardware characteristics vs constraints [10:30] fwereade, so to get both we need to fetch the service's excluded networks and use the machine's included networks [10:31] perrito666: no worries [10:31] dimitern, ah-ha, thank you, something has crystallised in my mind [10:31] fwereade, yeah? [10:31] dimitern, so, looking forward, we'll want to be able to add machines with net/nonet specifications [10:31] fwereade, not so forward even [10:32] fwereade, i thought that was one of the basic features we're aiming to have for maas [10:32] dimitern, at the moment we take that info purely from the assigned units [10:32] dimitern, we kinda elided that in favour of servce-only [10:33] dimitern, but the forces in play are actually the same as for constraints [10:33] dimitern, are you familiar with machine constraints? [10:33] fwereade, hmm.. but how about the networker worker - where will it record what networks it started/not started for a machine? and where to get which ones to process in the first place? the service? [10:33] fwereade, not that much [10:34] dimitern, ok, so when we create a new machine for a unit to live on, we record the constraints in play (env/service combination) and subsequently use those when provisioning the machine [10:35] dimitern, this is a bit different to the model we thought we'd have for networks, but it shouldn't be, I think [10:35] fwereade, yeah - so we compute the effective set and save it with the machine [10:35] dimitern, exactly [10:35] dimitern, same deal [10:35] dimitern, and this means that it's trivial to create a machine without units, but with net/nonet specification, and store that directly [10:35] dimitern, so in fact my "call it serviceNetworks" thing on mgz's review was wrong [10:36] fwereade, why would you do that? [10:36] fwereade, net/nonet spec should always go with a service [10:36] fwereade, well... except in case "i know what i'm doing, just give me a machine like that" [10:36] dimitern, people do sometimes like to create machines ahead of time and leave them idle, even if they know what they will want to do with them in future [10:36] dimitern, yeah [10:37] fwereade, why was your comment about serviceNetworks wrong? [10:37] dimitern, because we need to store net/nonet data for both services and machines [10:38] fwereade, true [10:38] dimitern, (mgz, perrito666): and this means we want globalKey ids, not serviceName keys [10:38] fwereade, what will the key be? either serviceName or machineId ? [10:38] dimitern, (mgz, perrito666): and *then* subsequently we want to store, separately, what-the-machine-actually-got, analogous to HardwareCharacteristics [10:38] dimitern, it should be the entity's globalKey [10:39] dimitern, like constraints/settings/any other collection that assoicates with multiple entity types [10:40] fwereade, i see [10:40] fwereade, sgtm [10:41] fwereade, i'll look some more into constraints / hardwarecharacteristics [10:46] dimitern, fwereade, mgz: standup? [10:46] wallyworld: ^ [10:47] jam, ^^ [10:48] oh, he's probably off today right [10:52] hey, can I bzr switch to a tag? 
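As an aside, the constraints analogy fwereade keeps returning to is already visible from the CLI, and it is roughly what the networks spec is meant to mirror: a specification stored against the service (or against a unit-less machine created ahead of time), with what the machine actually got recorded separately. The flags below are from memory of the 1.17/1.18 CLI, so treat them as a sketch:

    juju set-constraints --service mysql mem=4G           # the specification, stored against the service
    juju add-machine --constraints "mem=8G cpu-cores=2"   # or stored directly against a machine created ahead of time
    juju get-constraints mysql                            # reads the spec back; what the machine actually got is
                                                          #   reported separately as its hardware characteristics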
[11:14] natefinch: I merged in trunk and fixed some conflicts on my copy of 030-MA-HA and I'm pushing that up now [11:15] natefinch: I need a copy of your latest fixes though, I don't have the environment fixes [11:22] rogpeppe: grabbing coffee, stretching my limbs and returning === vladk is now known as vladk|lunch [11:26] natefinch: actually you should probably merge your stuff with trunk before pushing it up, since you will actually have all the changes and I'll just have to merge again when I pull your branch [11:27] wwitzel3: ok === psivaa_ is now known as psivaa [11:54] axw: you still around? [12:05] wallyworld: I am now [12:05] axw: the reason the upstart script failed is because the output file is is being redirected to doesn't exist [12:06] jam had the same problem [12:06] wallyworld: why doesn't it exist? [12:06] cause local provider creates a log file elsewhere [12:06] not in /var/lib/juju [12:06] /var/log [12:07] my changes fix the issue and also repair the brokem symlink [12:07] wallyworld: see my comment in launchpad - it's not broken on my system. so there's something more at play here [12:07] ie all-machines.log -> /var/log/juju-ian-local/all-machines.log [12:07] wallyworld, it /var/log/juju- full path: /log/all-machines.log -> /var/log/juju-ian-local/all-machines.log [12:08] the above was broken [12:08] wallyworld: machine-0.log is written into ~/.juju/local/log. all-machines.log is written into /var/log/juju- [12:08] wallyworld: and symlinks in either direction [12:09] yes [12:09] mgz, i'll take over your lp:~gz/juju-core/networks_state_doc branch to finish it off and land it, cause it's blocking 2 of my other branches [12:09] i agree with the first bit [12:10] rsyslog writes to /var/log/juju-blah/all-machines.log [12:10] dimitern: it's done, I'll land [12:10] the symlink in juju local log points to that [12:10] mgz, ah, good morning :) [12:11] before my changes the symlink pointed somewhere that didn't exist (can't recall where now) [12:11] well that is not the case on my machine [12:11] and my changes also fix the broken upstart script [12:11] I don't know what you mean about the upstart script being broken [12:11] wonder why it's differnet for you? it was broken for me john dave [12:11] that is also not broken on my machine [12:12] dimitern: hm, I need an lgtm still [12:12] the upstart script is broken because it redirects output to a non existant log file [12:12] hence it fails [12:12] and machine agent doesn't start [12:12] mgz, dimitern: can you scroll back? I diuscussed some stuff that needs to be a touch different with dimitern [12:12] I don't know, but we need to get to the bottom of it. because the change you proposed will break in another way when we do the agent.conf LogDir->rsyslog worker change [12:12] mgz, can you propose your changes from last reviews, i'll take a look [12:13] mgz, if you would land that joyent branch instead I would be most grateful [12:13] wallyworld: right, so the same issue in both cases [12:13] wallyworld: i.e. the root cause is that the log file doesn't exist [12:13] mgz, yeah, and look at the scrollback [12:13] wallyworld: sounds suspiciously like permissions. did you change root-dir at all? [12:13] k [12:13] not that i know of [12:14] morning all [12:14] morning [12:14] fwereade: ah, okay [12:14] axw: i'm not recalling the rsyslog worker change you mention [12:14] wallyworld: is this on your machine, or on ppc? [12:14] my machine [12:14] er... 
so, I'm not sure how to untie it from servie [12:14] wallyworld: the bit about propagating MachineConfig.LogDir -> agent.Config -> worker/rsyslog [12:14] i think it failed for dave on ppc, not sure [12:14] wallyworld: atm worker/rsyslog hard codes /var/log/juju- [12:15] ok, i didn't realise we were doing that [12:15] wallyworld: it will need to, to support debug-log on local [12:15] makes sense to change it [12:15] axw, actually it's worse - it hardcodes agent.DefalultLogDir, which is /var/log/juju/ [12:16] dimitern: yeah, I think log/syslog tacks on the namespace [12:16] * axw checks [12:16] so, weird that it works for some but not others it seems [12:17] mgz, i have some ideas and we discussed the way forward with fwereade, so if you're willing, just propose what you have so far and I can take it over and finish it, so you can do the joyent change [12:17] fwereade: in fact, apart from undoing hte renaming I've done, I'm not sure what you want, it already uses the same keys as constraints etc [12:17] axw: so this was the upstart bit that was wrong [12:17] /home/ian/jujulocal/tools/machine-0/jujud machine --data-dir '/home/ian/jujulocal' --machine-id 0 --debug >> /var/log/juju-ian-local/machine-0.log [12:17] machine 0 log should not be in /var/log [12:18] it is in juju local log [12:18] wallyworld, it's not, it's symlinked there from local log dir [12:18] that's what my change does [12:18] wallyworld: right, but there's a symlink... [12:18] not on the broken systems [12:18] the only symlink i had/have is the all machines one [12:18] which was also wrong [12:19] wallyworld, and it really should be in /var/log/juju-/, because rsyslog have access only to /var/log/ [12:19] wallyworld: so you have stuff going to /home/ian/jujulocal rather than /home/ian/.juju/local, which suggests root-dir has been changed [12:19] wallyworld: I'll see if that has anyhting to do with it [12:19] yep [12:19] i did change root dir [12:19] but john didn't [12:19] and it broke for him the same way [12:20] wallyworld, was there 1.16->1.17 upgrade involved with this local env my any chance? [12:20] no [12:20] s/my/by/ [12:20] i just started a local provider from turnk [12:20] ah, ok [12:20] fwereade: the joyent storage branch has conflicts [12:21] wallyworld: still works when I do that... [12:21] hmmm === vladk|lunch is now known as vladk [12:21] * wallyworld shrugs [12:21] dimitern: axw: i just got back from soccer and need to go eat. i'll bbiab [12:21] the bigger issue is I'm not sure if dstroppa wants the move of gojyent in or not [12:21] question [12:21] juju-mongodb is broken on 14.04 [12:21] wallyworld: nps, I'll keep investigating [12:21] I'm thinking of building from source just so I can make use of my workstation [12:22] I guess I just try to land as is, and we can fiddle with deps later [12:22] do I need to remove juju-mongo from my path until things are cleared up? [12:23] rather, juju is broken wrt juju-mongo... [12:24] fwereade: I am unable to replicate bug 1294776 on my local MAAS even after upgrading the provider and node to 14.04 [12:24] <_mup_> Bug #1294776: No debug-log with MAAS, 14.04 and juju-core 1.17.5 [12:24] jamespage: ^ [12:24] dimitern: btw, worker/rsyslog does do the appending of namespace to logdir (at the bottom of newRsyslogConfigHandler - a little bit non-obvious) [12:25] wwitzel3, I see that on openstack as well btw [12:25] just updated that bug [12:27] jamespage: did you see my comment? 
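The checks axw is about to ask for amount to something like this on the affected machine (the exact rsyslog.d file name juju writes is from memory, hence the grep rather than a full path):

    dpkg -l rsyslog-gnutls           # juju forwards agent logs over TLS, so this must be installed
    ls /etc/rsyslog.d/ | grep juju   # the snippet juju writes to ship logs to the state server
    sudo ls -l /var/log/juju/        # on a normal (non-local) provider the aggregated all-machines.log lands here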
[12:27] axw, it does have rsyslog-gnutls installed [12:28] (the paste was from 1.17.6 openstack deploy) [12:28] jamespage: thanks, hadn't seen the paste [12:29] axw, only just did it :-) [12:29] jamespage: and the ls -l please, if you didn't see that [12:29] * jamespage is testing 1.17.6 prior to asking the release team for an ack on an upload [12:30] axw, that paste is foobar [12:30] (the apt-cache policy one) [12:30] wrong box [12:30] heh, so I see [12:30] axw, pasted right one this time! [12:31] hrm, ok, doesn't really shed any light unfortunately [12:33] wallyworld: when you get back, can you please destroy your env, remove any /var/log/juju*, remove ~/jujulocal, and bootstrap with --debug? [12:34] then pastebin that and ~/jujulocal/log/cloud-init-output.log [12:34] wallyworld: just gonna dist-upgrade and see if that's it [12:34] jamespage: thanks for the update [12:35] wwitzel3, np - if you want me to poke anything else just ping me here - don't always look at bugs straight away [12:40] wallyworld: no difference for me after dist-upgrading, so it's not a new policy... [12:40] axw: trying again now, just deleting stuff [12:44] axw: deleting all that stuff and trying again, seems to have fixed it. there's now a /var/log/juju-ian-local/machine-0.log symlink and the all machines one is correct [12:44] must have been left over root stuff [12:44] hmm [12:44] stange [12:45] but bad that we didn't error [12:45] strange even [12:45] i think we need to do a local 1.16 or 1.17 set up and destroy and then try again with 1.18 [12:45] I'd like to repro... there's obviously something that needs to be fixed [12:45] ok [12:46] I'll try that in a bit [12:46] ok, i'm a bot too weary to do much more tonight [12:46] bit [12:46] nps, I will dig in and let you know how I get on [12:47] ok, i'll update the bug [12:52] confirming juju-core works on 14.04 with source build of mongodb [12:57] wallyworld, axw, btw I have this handy cleanup-juju script to obliterate mercilessly any remnants of a local environment: http://paste.ubuntu.com/7130453/ (call as sudo cleanup-juju and change localenv to your envname - preferably unique for pgrep's sake, and obviously dimitern to your username) [12:58] cool, will be handy thanks [12:59] sometime you need to run it twice in a row if the agents/mongodb are doing stuff to clean up all [12:59] thanks dimitern [13:00] * dimitern thinks it's about time for another "snippet" type blog post [13:16] gah, damn criss-crosses [13:17] I think I;m screwed [13:19] lunch [13:21] mgz: if you have a moment at some point, i'd really like to find out why the 'bot keeps getting "lock held" messages, which break the config-changed hook [13:21] mgz: sample message: Unable to obtain lock held by go-bot@bazaar.launchpad.net on taotie (process #21172), acquired 4 seconds ago. [13:21] <_mup_> Bug #21172: libnotify pops balloons on top of fullscreen window (e.g. screensaver ) [13:21] rogpeppe: because it's doing things [13:21] we talked about this the other day, you can manually take the lock on the machine, but it should mostly be harmless [13:24] fwereade: so, I;m being partly hosed on the joyent storage landing because ian/daniele landed some of the history, without the changes, as an unrelated bug fix in r2401 [13:24] which is requiring some major unpicking [13:24] mgz, ouch [13:25] mgz, btw, anything significant in the call yesterday? 
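dimitern's paste isn't reproduced here, but the sort of scorched-earth cleanup it performs looks roughly like the sketch below; it is only a sketch, and as he says it relies on the environment name being unique enough for pkill -f to be safe:

    #!/bin/sh
    ENV=localenv                            # change to your local environment's name
    sudo pkill -f "jujud.*$ENV"             # machine and unit agents (matched via their --data-dir)
    sudo pkill -f "mongod.*$ENV"            # the environment's mongod (matched via its --dbpath)
    for c in /var/lib/lxc/*"$ENV"*; do      # containers are named <user>-<env>-machine-N
        [ -d "$c" ] && sudo lxc-destroy -f -n "$(basename "$c")"
    done
    sudo rm -f /etc/init/juju-*"$ENV"*.conf # leftover upstart jobs
    sudo rm -rf /var/log/juju-*"$ENV"*      # rsyslog's copy of the logs
    rm -rf ~/.juju/"$ENV"                   # the environment's data dir, at its default location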
[13:25] no, didn't happen as neither antonio or the other joyent guy could make it [13:25] wanted to catch up with daniele but missed him === psivaa is now known as psivaa-afk [13:40] jamespage, do you have any advice for Bug #1295609 [13:40] <_mup_> Bug #1295609: Unable to bootstrap local provider, missing juju-mongodb dependency [13:41] I was just hit by that after the last update - had to install juju-mongodb to be able to bootstrap [13:42] dimitern, I was misguided. I restored the previous packaging rule believing juju *preferred* juju-mongodb, but would fall back to mongodb-server [13:42] Without fallback we need to create different packaging rules for the series. [13:43] sinzui, yeah seems so [13:43] well, maybe jamespage has a clever idea [13:59] sinzui, I've switched to have only juju-mongodb in the packaging to enforce the dependency [13:59] but I've not got a release team ack on that yet [14:00] wwitzel3: I made the fix William suggested in the meeting and I merged from main, rerunning the tests now [14:00] jamespage, you did that for just the trusty packaging? [14:00] sinzui, yes [14:00] natefinch: great, is that pushed up? [14:00] sinzui, I've not got an ack for the upload just yet [14:00] hmm, in my case mongosb-server would be listed as removable, but I shouldn't do that if I have an env up [14:01] sinzui, obviously that breaks on older releases so its not a clean backport [14:01] wwitzel3: the fix william suggested is (though there's one syntax error that I missed somehow.... but it'll be obvious). After the merge I'm getting compile errors in the build, so I'm figuring those out now [14:01] sinzui, indeed it would [14:01] jamespage, I can update the packaging in a few hours for trusty users of the ppa. [14:02] sinzui, OK _ hoping for an ack on the 1.17.6 upload [14:02] I tested on local, maas and openstack OK [14:03] jamespage, oh, I was pondering using this on HP cloud to create a 5 node maas for Juju CI http://manage.jujucharms.com/~virtual-maasers/precise/virtual-maas [14:03] jamespage, Do you think it would work? [14:03] sinzui, ah - we might have a good plan on that [14:03] sinzui, rharper in my team has hacked together a openstack integration for MAAS [14:04] sinzui, so he can have MAAS control openstack instances that netboot and install like regular nodes [14:04] jamespage, He has been updating http://manage.jujucharms.com/~canonical-ci/precise/virtual-maas [14:05] sinzui, yeah - he started looking at that stuff [14:05] sinzui, and then he and smoser hacked on this approach as well [14:05] lemme ping him [14:05] sinzui, the nice thing about this is we could just charm it all up and deploy multiple different series on serverstack as test environments [14:06] sinzui, hp cloud doesn't expose nuetron does it? [14:06] the limiting factor for all of this is that you have to get a second network from the primary. [14:06] because you're not going to successfully run a dhcp server on your primary interface. [14:07] rogpeppe: what's with this error? It's not the normal syntax error string... there's no filename or anything: src/launchpad.net/juju-core/cmd/juju$ go test [14:07] # testmain [14:07] launchpad.net/juju-core/testing/testbase.PatchEnvPathPrepend(0): not defined [14:07] type.launchpad.net/juju-core/testing/testbase.Restorer(0): not defined [14:07] FAIL launchpad.net/juju-core/cmd/juju [build failed] [14:07] it will work on canonistack or server stack, but i could'nt find a public cloud that exposed both neutron and used kvm. 
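natefinch's "not defined" failures above are the classic symptom of stale compiled packages under $GOPATH/pkg after a big merge — the "delete the pkg dir" fix axw passes on. The blunt version:

    # remove the stale archives and rebuild from source; the pkg subdirectory name depends on your platform
    rm -rf "$GOPATH"/pkg/*/launchpad.net/juju-core
    go install launchpad.net/juju-core/...
    go test launchpad.net/juju-core/cmd/juju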
[14:07] perhaps you could make this work on xen hvm (rackspace). [14:07] smoser, yeah, the network has been a factor in other things I need to test. [14:08] natefinch: waigani was getting that, had to delete the pkg dir [14:08] axw: I thought it might be something like that. Thanks. [14:08] nps [14:09] but what rharper has done is really REALLY cool. [14:09] if we could get a public cloud that had kvm and neutron, then anyone with a credit card could trivially deploy maas [14:09] and it tests pretty much all of maas (from pxe boot to power control) [14:11] smoser, :/ so I probably need to wait, but with some research I might find a cloud that will work [14:11] smoser, I am not familiar with joyent's setup, might that be viable when they are available? [14:12] i went looking a few days ago. [14:12] and didn't find anything really. [14:12] the closest to having hte necessary pieces are EC2 and rackspace that i know of [14:12] and both of those are xen. [14:12] which adds a wrinkle. [14:12] and you can't specify '--kernel' on EC2, so thats another wrinkel [14:13] thank you very much smoser. At least know the criteria to pull this off. Not next week though as I hoped [14:13] * sinzui copies conversation [14:14] on a "full openstack" like you get on canonistack or server stack, its doable. [14:14] and really very cool to see. [14:24] mgz, you've got a review on https://codereview.appspot.com/77270046/ [14:24] jamespage: sinzui re:virtual-maas charm; I'm currently using the non-juju mode of it; I've got an update to setup-maas and pulled in a few scripts (partially based on the maas-to-libvirt-tools) to automate the creation/registration of the nodes [14:25] rharper, great news. That would be more helpful to me [14:26] rharper, I hesitate to run a 5 node vmaas on canonistack. I moved most of juju-ci off that cloud because we were using most of the resources [14:28] sinzui: I've got a deadline I'm working toward next week, but after that, I can push a branch against virtual-maas charm with the changes for review -- I have a branch, but not pushed against upstream maas; though I'm currently applying the nova power type patch in-line in virtual-maas charm since we're working with 1.4.X maas out of the cloud-archive === psivaa-afk is now known as psivaa [14:29] * sinzui nods [14:31] wwitzel3: well, frig, the fakeuploadtools thing was working before I merged from main :/ [14:33] natefinch: I'm in a similar boat, I can't bootstrap a 14.04 node on maas with trunk .. it was working before I pulled [14:33] wwitzel3: I just pushed up a merge from main. [14:33] wwitzel3: it doesn't fix anything, but at least we'll both be broken with the same code [14:34] natefinch: k, will pull it down .. is the fakeuploadtools broken in a test as well? or just when trying to use local provider? [14:39] wwitzel3: the fake upload tools stuff is not related to the local provider. There seemed to be some confusion about that in the standup. The local provider is broken. What was fixed yesterday was just being able to run the bootstrap tests we'd been working on. Of course, now those tests are broken again [14:41] natefinch: ok, that was my fault [14:42] ah I finally caught the bug :D [14:42] rogpeppe: have a sec? [14:42] natefinch: I will take a peak at these tests, see what I see [14:49] wwitzel3: there's a panic in bootstrapsuite not being able to find the tools , even though the first thing we do is upload the tools. I tested that before I merged from main and it worked. 
I don't know why it's different now [14:49] rogpeppe: part one of removing things we don't need to do now that we have synchronous bootstrap :) [15:39] natefinch: tried to do a bisect for more details, but sadly it doesn't know how to step through the commits that make up the merge, so it just tells me that r2288 is bad and gives me the diff between that and r2286 .. lol, not helpful [15:46] dimitern: on your review, I can revert the naming easily enough, but I think I need your help understanding some state sublties [15:46] if you have a mo to walk me through some things [15:52] dimitern: er, you porbably missed that, I'll requery [15:53] mgz, yep, my internet started acting funny [16:04] natefinch: fixed the BootstrapSuite [16:04] wwitzel3: wow sweet. What did you do? [16:05] natefinch: we were uploading the tools to newly created stor and then never doing anything with it. So I changed it to just upalod the tools to env.Storage() [16:06] natefinch: pushed [16:06] natefinch: that was a lie .. I can't type my password right .. pushing now [16:08] natefinch: ok, you can grab it lp:~wwitzel3/juju-core/030-MA-HA [16:08] wwitzel3: you shouldn't need to type your password every time.... I haven't typed mine in months... not even sure what my password is, to be honest [16:08] natefinch: i know, i know, I just haven't setup an agent yet [16:09] natefinch: I type mine every time, default ubuntu server install lacks an agent and I am too lazy to set up one [16:11] can't you just get it to use your launchpad key? maybe I set up an agent and it's been too long, I don't know [16:12] natefinch: I get a decent linux work machine this afternoon :) Ill take the time to set up my workspace correctly [16:13] natefinch: my lp key has a passphrase, I'm not typing my lp password [16:13] wwitzel3: oh, I see. Yeah, I don't think I put a passphrase on mine. Too lazy. [16:14] natefinch: ah, yes, same here than wwitzel3 [16:14] natefinch: we are differents kind of lazy [16:14] I only use my LP key for lp, so no one could do anything with it except commit as me ;) [16:15] wwitzel3: thanks for the fix. I must have messed it up when I converted to using upload tools, I swear it worked at one point, but maybe I'm just crazy :) [16:15] natefinch: could of been an artifact of the merge [16:15] eval `ssh-agent`; ssh-add [16:15] is not very hard... [16:16] * perrito666 crafts a particularly egregious commit that angers all of the most violent devs... using natefinch key [16:16] amazon restored [16:16] restore missing state server [16:16] fwereade: :D it worked [16:17] mgz: thanks, done :) [16:17] * fwereade cheers at perrito666 [16:18] natefinch: ok, so now it is just local provider? [16:31] dimitern, fwereade: state networks branch should be good to go, you may just want to look over last changes/comments [16:42] wwitzel3: sorry, had to go for a bit for some family stuff. back now. Yes, just local provider [16:42] dimitrin: do you have a moment to look over https://codereview.appspot.com/78660045/ ? [16:42] dimitern: do you have a moment to look over https://codereview.appspot.com/78660045/ ? [16:47] wwitzel3: well, I found the source of one bug, possibly several... 
environs/cloudinit had its own constant for the mongo service name [16:48] vladk, reviewed [16:48] wwitzel3: er environs/cloudinit.go [16:48] and this is why you only define constants in *one* place [16:49] wwitzel3: local provider was also creating its own mongo service name [16:50] wwitzel3: in provider/local/environ.go [16:50] wwitzel3: I haven't fixed anything yet, just finding some problems so far [17:01] natefinch: ok wil start poking around from there [17:14] wwitzel3: I'm working on basically removing all reference to MongoServiceName everywhere, no one needs to know it except the thing creating the upstart service. I added a RemoveService() function to the mongo package to replace one place where we were using the name outside that package [17:22] hey, could anyone with bash fu take a look at https://codereview.appspot.com/78870043 ? [17:22] thank you [17:28] perrito666: I do not fulfill the requirement :) [17:28] mgz: my fault, I just committed go code and yet I request a bash dev to check it :p [17:30] perrito666: sorry, my bash is possibly worse than my spanish, and I haven't taken spanish in 20 years. [17:30] we need scott or someone :) [17:30] natefinch: well I havent taken spanish in about 13 yrs and I am pretty good at it :p [17:31] living is south america is cheating [17:31] then again, living in the US should be cheating too [17:31] *in [17:31] perrito666: I can ask where the bathroom is, and that's about it. If bash involves more than calling commands and optionally piping into other commands, I'm out [17:32] well `find . -iname "bathroom"` [17:33] I never understood why they didn't write the command so I could just say find bathroom like a normal person. [17:33] or possibly find bathroom . [17:34] perrito666: looking [17:34] natefinch: normal people dont use find to find out where the bathroom is [17:34] perrito666: I've never been accused of being normal :) [17:52] perrito666: reviewed [17:52] natefinch: bash is a ridiculous shell [17:53] rogpeppe: lovely thank you [17:54] rogpeppe: this is why I don't write shell scripts. Why write in some wacky language when you can write in a real language? [17:54] natefinch: 'cos the basic shell syntax is almost perfect for human use [17:55] natefinch: and pipes are awesome [17:55] rogpeppe: yes, just humans who are not suitable for bash [17:55] rogpeppe: [[ $agent = unit-* ]] && [ -d "$agent/state/relations" ] does not do the same as my test [17:55] rogpeppe: the only part of the shell syntax that seems at all logical is pipes [17:55] some people practically only use bash [17:55] i only use rc [17:55] relations will exist, its the folders under it that wont [17:55] bodie_: some people sleep on beds of nails, too. [17:55] I worked at digitalocean with a guy from IBM who pretty much only knew Bash, but muddled along with other languages when necessary [17:56] i tried to reason with him [17:56] :( [17:56] perrito666: i think my test is equivalent to yours [17:56] perrito666: (your test only checked that ls could read the relations directory, AFAICS) [17:56] pretty much any time I need a conditional statement, I drop into a real language (python or go, pretty much, and my python's getting rusty) [17:57] who needs conditionals when you can have one-line perl maps from hell? heh [17:57] bodie_: I also never write regexes unless someone is twisting my arm [17:58] bodie_: which pretty much puts perl off limits, which is fine with me [17:58] rogpeppe: nope, ls -A folder/ lists all inside it excepting . and .. 
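The point rogpeppe is making is easy to see with an empty directory (unit-mysql-0 below is just an illustration of the agent-dir layout, and the sed expression is a placeholder):

    mkdir -p unit-mysql-0/state/relations
    # ls exits 0 on an empty directory, so "if ls .../relations/" passes even when there is nothing in it
    ls -A unit-mysql-0/state/relations && echo "original test passes here"
    # the suggested alternatives only fire when there really is something to edit
    ls -d unit-mysql-0/state/relations/*/* 2>/dev/null || echo "glob matched nothing"
    find unit-mysql-0/state/relations -type f | xargs -r sed -i 's/old/new/'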
[17:58] haha, avoiding regex is always good. [17:58] perrito666: yes; if the relations directory is empty, your test will succeed [17:59] perrito666: it's possible you want if ls -d $agent/state/relations/*/* 2> /dev/null; then [18:01] perrito666: or maybe go the same route as the other one and do: find $agent/state/relations -type f | xargs sed -i ... [18:01] yeah find is more elegant in that case too, thank you [18:12] voidspace: https://codereview.appspot.com/78890043 === vladk is now known as vladk|offline [18:28] EOW [18:28] g'night all, have a good weekend [18:38] wwitzel3: ug, accidentally deleted one extra line when I was removing references to MongoServiceName, and it took forever to figure out what was making things blow up in weird ways. [18:48] review appreciated, if anyone has some time to spare: https://codereview.appspot.com/78890043/ [18:49] and that's me for the week [18:49] happy weekends all! [18:50] rogpeppe: happy weekend [18:53] natefinch: I just got back from lunch, I dug a little bit on the local provider stuff before but no break throughs [19:04] wwitzel3: yeah, me either. I managed to break a lot of tests by changing the MongoServiceName stuff (mostly I think it's just that I need to mock out the new RemoveService()) method, since I'm getting access denied errors) [19:34] wwitzel3: so, it looks like it's hanging on the line where we create transaction log collection. [19:34] natefinch: ok [19:34] natefinch: we had a problem there before too, before we were initializing the replicaset properly [19:35] natefinch: probably related somehow [19:35] wwitzel3: yeah, that's my thought, I'm looking now to see if maybe the local provider is somehow missing that step === vladk|offline is now known as vladk [19:56] hi natefinch wwitzel3 , last juju devs awake. is rsyslog-gnutls a hard dependency for juju-local? [19:56] I ask because it does not exist for arm64 === vladk is now known as vladk|offline [20:00] wwitzel3: figured out at least part of it. THe logic in ensure mongo server was bailing out before it actually initiated the service [20:01] sinzui: with not great confidence, but based on my initial poking around, it looks like yes, it is a hard dep for juju-local [20:01] natefinch: I thought we told it to wait there? [20:02] natefinch: well, I guess .. can we tell it to wait there? :) [20:02] wwitzel3: if it's already installed, but not running, we just run it, but we don't check to see if it's initiated. I'm guessing the local provider is installing the service early or something [20:02] natefinch: ohhh [20:03] natefinch: good find [20:03] wwitzel3, natefinch I am also looking into packaging/publishing issues. The package might be available but the test images cannot see it [20:04] sinzui: we definitely try to install the gnutils, but I have no idea how critical their use is. Definitely things currently will break if it's not there [20:05] natefinch, I suspect the problem is ec2/ami. I don't think any of the universe packages are being seen [20:06] sinzui: ahh, huh. I have no idea if that's normal or not. [20:06] natefinch, nothing about the arm64 image is normal [20:07] hah [20:19] wwitzel3: I pushed up my code. it doesn't actually fix things yet, but I think it's better. Mongo doesn't seem to like localhost as its hostname in the replicaset, which is something I remember from when I was twiddling with replicasets locally earlier. 
I think it needs to use the local machine's hostname [20:20] natefinch: yeah you can only used localhost/127.0.0.1 if all of the members of the replicaset use localhost [20:21] wwitzel3: well, you'd think a replicaset of one would be ok with localhost then [20:21] natefinch: in theory it should be [20:22] natefinch: http://docs.mongodb.org/manual/reference/replica-configuration/#local.system.replset.members[n].host [20:23] wwitzel3: 2014-03-21 20:00:21 ERROR juju.cmd supercommand.go:300 failed to initiate mongo replicaset: couldn't initiate : can't find self in the replset config my port: 37017 [20:23] wwitzel3: I think I remember there needs to be a command line flag set for it to accept localhost [20:25] natefinch: can you see what host it is trying ti use? is it for sure trying to use localhost? [20:27] wwitzel3: double checked, definitely is "localhost" [20:27] natefinch: ok [20:33] natefinch: looks like the bind_ip would have to be 127.0.0.1 for the localhost in replicaset to work. [20:33] natefinch: so you're right, we need to use the actual machine hostname [20:35] wwitzel3: I tried setting the bind_IP to 127.0.0.1 too [20:36] natefinch: well that *should* have worked [20:37] natefinch: I was able to do it on my local machine that way anyway, which means squat [20:37] wwitzel3: the trick looks to be that you need to include the port in the hostname when you use localhost [20:37] wwitzel3: then bootstrap finishes on locval [20:37] natefinch: ahh, I did do that [20:38] wwitzel3: we were just passing in the hostname to initiate [20:38] natefinch: nice so local is worky now too? [20:38] wwitzel3: sort of, I hard coded adding the port to see if it would work, so I have to find a way in real code to do it. But at least we know what the fix is [20:40] wwitzel3: and bind_ip can stay 0.0.0.0, that works fine [20:41] wwitzel3: actually, not that hard. if address == "localhost", append :port. [20:42] natefinch: :) [20:46] wwitzel3: gah, bootstrap works but then juju can't connect to the API, so like juju status returns connection refused [20:46] wwitzel3: pushed, at least [20:47] natefinch: ok, I will take stab at it for a bit here [20:55] EOD for me. Have a good weekend everyone. [21:00] same here [23:43] Hello everyone, I just uploaded a fix for the password logging bug. I would appreciate if anyone would review it. https://code.launchpad.net/~jwharshaw/juju-core/fixlogbuild/+merge/211655
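The fix natefinch describes can be sanity-checked by hand against the local provider's mongod. Treat the invocation below as a sketch: 37017 and the "juju" replica set name are the local defaults of the time, and depending on how the juju-db job was started you may also need --ssl and admin credentials:

    # a bare "localhost" makes mongod look for itself on the default port 27017, so it refuses:
    mongo --port 37017 --eval 'rs.initiate({_id: "juju", members: [{_id: 0, host: "localhost"}]})'
    #   -> couldn't initiate : can't find self in the replset config
    # spelling out the port, which is what the code change appends, lets initiation succeed:
    mongo --port 37017 --eval 'rs.initiate({_id: "juju", members: [{_id: 0, host: "localhost:37017"}]})'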