/srv/irclogs.ubuntu.com/2017/09/12/#juju.txt

stokachuDweller_: Yea if you can confirm I'd like to know00:12
Dweller_results so far, with lxd/lxd-client purged before snap install lxd / snap install conjure-up / conjure-up kubernetes-core localhost == hang, apt-purge of just lxd-client before same == hang..  apt-purge line commented out = no hang..  currently testing with apt-purge lxd-client moved to after the conjure-up kubernetes-core is complete..00:14
stokachuDweller_: this is on edge channel too?00:33
Dweller_aye.. snap install conjure-up --classic --edge00:33
stokachuIs this through vagrant as well?00:33
stokachuDweller_: and virtual box?00:35
Dweller_still running tests tho.. so bear with me.. it takes quite a while for each conjure-up to complete (especially if it hangs ;p)00:36
stokachuDweller_: np, if it's vagrant and you have a vagrantfile to share I can try to reproduce here as well00:37
Dweller_and contradicting everything I've said so far.. I've got one terminal that I think should have hung, that's currently on the 00_deploy-done step (thats traditionally very quiet for quite a while)00:37
stokachuDweller_: you can do a watch juju status in a another terminal to make sure things are progressing00:37
Dweller_Yeah I did when it had hung.. it said everything was up and active00:38
stokachuAh00:38
Dweller_the "Running step: 00_deploy-done" takes quite the long while00:40
Dweller_I wonder if you could have each sub container output something as they complete their init step or whatever during the 00_deploy_done phase00:51
Dweller_like .. I started this current vm off back at 14mins past, and it's now 52mins past .. ;p00:52
Dweller_it took about 7 mins to do the apt-get upgrade, snap installs, etc.. (I dump date out before launching the conjure-up kubernetes-core localhost)00:52
Dweller_at 21mins past it started the conjure-up kubernetes-core localhost00:53
Dweller_54mins past now, and still on "Running step: 00_deploy-done"00:54
Dweller_ooh.. I dont think its going to complete...00:54
Dweller_from juju status00:55
Dweller_Machine  State    DNS            Inst id        Series  AZ  Message00:55
Dweller_0        down                    pending        xenial      failed to setup authentication: cannot set API password for machine 0: cannot set password of machine 0: read tcp 10.193.182.114:33282->10.193.182.114:37017: i/o timeout00:55
Dweller_1        started  10.193.182.34  juju-46dac4-1  xenial      Running00:55
Dweller_2        pending                 pending        xenial00:55
Dweller_3        pending                 pending        xenial00:55
Dweller_i/o timeout would sound bad..00:56
Dweller_I'll have another look tomorrow01:09
Dweller_k8s-local: [error] cannot add relation "flannel:cni kubernetes-master:cni": read tcp 10.4.78.199:32894->10.4.78.199:37017: i/o timeout01:49
Dweller_(diff attempt)01:49
Dweller_meh.. off to sleep.. will try again tomoz.. feels like sommat isnt waiting long enough anymore01:49
=== skay is now known as Guest64671
=== salmankhan1 is now known as salmankhan
erik_lonroth_rick_h: I've sent you an email with details on our problems connecting to AWS. We have found it relates to a session token which is provided for time-limited API keys. The environment variable used is "AWS_SESSION_TOKEN" and according to AWS documentation site it is used to sign API requests. I've commented in the bug report: https://bugs.launchpad.net/juju/+bug/171402210:17
mupBug #1714022: Juju failed to run on aws with authentication failed <juju:New> <https://launchpad.net/bugs/1714022>10:17
erik_lonroth_rick_h: I'm looking into the code of juju and can't yet see if support for AWS_SESSION_TOKEN/KEY is in juju yet. It prevents us from using AWS in our current federated setup. How would you suggest we proceed?11:48
rick_herik_lonroth_: this is what I need to get to the core folks. I don't believe it's supported and so I want to get engineers looking into what it'll take to support. We need to.11:49
rick_hjam: ^11:49
rick_herik_lonroth_: ty for updating the bug with details.11:49
erik_lonroth_We can start looking into this also from our end, however, we just need to double check that there is indeed a need for this before we start up a pull request.11:50
erik_lonroth_The documentation on AWS is "pretty" clear on how the extra signing of API calls needs to happen, but we are not that experienced as juju developers so we don't want to fuck your code up and waste your time fixing our code. =/11:52
erik_lonroth_*spelling is great*11:52
Dweller_[error] cannot add relation "flannel:cni kubernetes-master:cni": read tcp 10.4.78.199:32894->10.4.78.199:37017: i/o timeout12:01
Dweller_  :(12:01
stokachuDweller_: hmm12:31
stokachuDweller_: are you running this with vagrant+virtualbox?12:32
Dweller_yep =)12:38
Dweller_still on edge, but hadn't seen timeouts until yesterday evening12:38
Dweller_the system is the twin xeon rig, 24g of ram, and the cpu's are barely breaking a sweat running the vagrant box.. plenty of ram left, no swap in use, and the only disk is an ssd12:39
=== Guest64671 is now known as skay
Dweller_hmm.. mebbe networking issues?12:45
Dweller_[error] cannot get resource metadata from the charm store: Get https://api.jujucharms.com/charmstore/v5/~containers/easyrsa-15/meta/resources: dial tcp: lookup api.jujucharms.com on 10.157.242.1:53: read udp 10.157.242.193:41730->10.157.242.1:53: i/o timeout12:45
Zichello here: one of my kubernetes-master units is blocked in "maintenance" in juju status, but in fact it's ok12:52
Zic(was after a juju upgrade-charm kubernetes-master)12:53
ZicI have two other masters in this K8s cluster which are "idle"12:53
Zic"Starting the Kubernetes master services." is the message for "maintenance"12:54
ZicI'm searching for something like "juju resolved kubernetes-master/0" but "resolved" does not work on status "maintenance"13:27
kjackalHi Zic13:35
kjackalZic: is this deployment one that got updated?13:37
kjackalZic: can you show me the output of this: juju run --unit kubernetes-master/0 'charms.reactive --format=yaml get_states'13:39
Zicyeah, it was from 1.6.2 to 1.7.413:48
Zicbut I found something new/weird : all my nodes are in NotReady in kubectl get nodes :/13:48
Zicand their logs say:13:48
Zickubelet_node_status.go:106] Unable to register node "ig1-k8s-01" with API server: the server has asked for the client to provide credentials (post nodes)13:49
Zicseems they lost their certificate13:49
Zichttp://paste.ubuntu.com/25521019/ <= kjackal13:49
kjackal_Zic: I do not see the kube-api-server relation between the master and the workers13:58
kjackal_Zic: is this a production cluster? If not I would remove and re-add the kubernetes-master <-> kubernetes-worker relations13:59
kjackal_Zic: from 1.7 we did harden the auth mechanism between master-workers and admins14:01
kjackal_that means you should also grab the updated config file from the master: juju scp kubernetes-master/0:config ~/.kube/14:02
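A quick sanity check after grabbing that file, assuming it lands as ~/.kube/config and kubectl is installed on the local machine:

    juju scp kubernetes-master/0:config ~/.kube/config
    kubectl get nodes    # workers should move to Ready once the relations are healthy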
Dweller_hmm.. can't bring up vagrant with conjure up kubernetes core since yesterday.. seems something is now taking too long, causing a timeout that leads to failure14:02
stokachuDweller_: can you pastebin your vagrantfile?14:02
stokachui can try to reproduce14:02
Dweller_sure.. give me a mo..14:03
Dweller_it's in github, but our enterprise one .. which wont help you.. lemme paste it =)14:03
Dweller_stokachu: https://pastebin.com/DvuSEWvs14:07
stokachuis bento/ubuntu-16.04 like an official image?14:08
Dweller_aye.. it's ubuntu-16.0414:08
Dweller_http://chef.github.io/bento/14:09
Dweller_but it was all working pretty well until yesterday evening.. but since then I've not been able to bring up a vm14:09
stokachuDweller_: ok ill try with this bento project but you should use https://app.vagrantup.com/ubuntu instead14:10
Dweller_I just tried one reverted to the non edge version (you'll need to edit the vagrant file if you want to try edge.. looks like I lost the --edge too in my hackery)14:10
stokachuthose are the ones we build14:10
Dweller_sure.. will try that one now..14:10
stokachuok i need to install virtualbox and vagrant it'll be a few minutes14:11
Dweller_no probs.. I've got it attempting with ubuntu/xenial64 at the mo14:15
Zickjackal_: was a dev cluster yup, testing if upgrading directly from 1.6.2 to 1.7.4 was possible14:15
Dweller_although given it used to work, and stuff is timing out.. I'm wondering if networking somewhere is causing me grief14:16
Zickjackal_: I think I missed something, I just read the upgrade page from kubernetes.io about Ubuntu/Juju and noted the specific release notes of every release14:16
stokachuDweller_: yea im running it now14:16
Zicdo you have a link to read all upgrade steps between CDK versions or do I need to find old articles on Ubuntu Insights?14:20
ZicI know that these blog posts sometimes have more extra steps than https://kubernetes.io/docs/getting-started-guides/ubuntu/upgrades/14:20
kjackal_Zic: Here is the announcement we had when 1.7 came out: https://insights.ubuntu.com/2017/07/07/kubernetes-1-7-on-ubuntu/14:22
kjackal_Looking for the upgrade and release doc14:22
Zicyup, just found it, just saw the auth/cert part, don't know what happened with my charm relations though :(14:23
Dweller_stokachu: so the official box uses 'ubuntu' as the user, not 'vagrant'.. the file will need changes for that..14:23
ZicI also just noted that my Juju GUI is down on https://<host>:17070/14:23
stokachuDweller_: yea just ran into that :)14:23
Zic< HTTP/1.1 400 Bad Request14:24
Zic* no chunk, no close, no size. Assume close to signal end14:24
stokachuDweller_: it's deploying now14:28
Zic(fixed for the Juju GUI, some part of the full URI to access it was missing)14:28
Zickjackal_: can you confirm the juju remove-relation / juju add-relation ? I fear to do something nasty :)14:30
Ziceven if it's non-prod, I prefer to try to solve it properly in case it happens one day in prod14:31
Dweller_==> k8s-local: error: cannot perform the following tasks:14:31
Dweller_==> k8s-local: - Download snap "core" (2844) from channel "stable" (Get https://068ed04f23.site.internapcdn.net/download-snap/99T7MUlRhtI3U0QFgl5mXXESAiSwt776_2844.snap?t=2017-09-12T16:00:00Z&h=4d4b35a936b3094a2dcbba86a2d9063de4b843ac: dial tcp: lookup 068ed04f23.site.internapcdn.net on 192.168.1.1:53: server misbehaving)14:31
kjackal_Zic: I am not sure why that deployment went into this state. However, I see a state missing indicating this relation is not in place14:32
kjackal_so...14:32
Zicyup, and it seems logical, so that's why the master does not recognize its nodes14:33
stokachuDweller_: what's your bridge defined as?14:35
stokachuDweller_: i picked lxdbr0 for mine14:35
Dweller_enp2s0 .. the adapter with access to my lan14:36
stokachuDweller_: hmm ok so that's one thing i did differently14:38
stokachuDweller_: picked a virtual bridge14:38
stokachuDweller_: oh, are you running out of space on the device?14:40
stokachuDweller_: because that just happened to me14:40
Dweller_184g available14:40
stokachu9.7G for / is not enough14:40
stokachuwhat does `df -h` show14:40
Dweller_oh.. you mean inside the vm ?14:41
stokachuyea14:41
Dweller_ /dev/sda1       9.7G  1.3G  8.4G  14% /14:41
stokachuyea you're going to run out of space14:41
stokachuthat's one issue14:41
stokachuthat's probably why it seemed like it was hanging14:42
Dweller_I wonder how big the bento image was ;p14:43
Dweller_still 8.3g available on / tho.. how much does conjure-up kubernetes-core need?14:44
stokachuwell i was at 00_deploy-done and all 9.7G was used14:45
stokachui dont know how much exactly but i would do at least a 40G /14:45
Dweller_https://github.com/sprotheroe/vagrant-disksize  =)14:47
stokachuDweller_: cool!14:50
stokachuDweller_: yea that gave me a 40GB partition14:54
stokachure-running now14:54
Dweller_same14:54
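A minimal sketch of what that vagrant-disksize approach looks like, assuming the plugin's documented config.disksize.size option and the ubuntu/xenial64 box mentioned above; the 40GB figure follows stokachu's suggestion:

    # once per host: vagrant plugin install vagrant-disksize
    Vagrant.configure("2") do |config|
      config.vm.box = "ubuntu/xenial64"   # official Ubuntu box
      config.disksize.size = "40GB"       # grow / so conjure-up kubernetes-core has room
    end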
kjackal_Zic: did it work?14:55
Zickjackal_: oh oops, I was asking you about the exact juju remove-relation/add-relation commands since I fear doing something nasty; habitually I only use the Juju GUI to prepare new deployments14:58
stokachuDweller_: did you get past that snap download error?14:59
Dweller_not this time..14:59
Dweller_==> k8s-local: error: cannot install "conjure-up": Get15:00
Dweller_==> k8s-local:        https://api.snapcraft.io/api/v1/snaps/details/core?channel=stable&fields=anon_download_url%2Carchitecture%2Cchannel%2Cdownload_sha3_384%2Csummary%2Cdescription%2Cdeltas%2Cbinary_filesize%2Cdownload_url%2Cepoch%2Cicon_url%2Clast_updated%2Cpackage_name%2Cprices%2Cpublisher%2Cratings_average%2Crevision%2Cscreenshot_urls%2Csnap_id%2Csupport_url%2Ccontact%2Ctitle%2Ccontent%2Cversion%2Corigin%2Cdeveloper_id%2Cpri15:00
Dweller_vate%2Cconfinement%2Cchannel_maps_list:15:00
Dweller_==> k8s-local:        net/http: request canceled while waiting for connection (Client.Timeout15:00
Dweller_==> k8s-local:        exceeded while awaiting headers)15:00
stokachuDweller_: im thinking you got some network issues happening15:00
kjackal_Zic: removing and readding relations is safe, should always work15:00
Dweller_yarp.. gonna add some changes to my lan & see if I cant route that box out via a different network provider15:01
Dweller_(I have 3 exits from my lan to the internet, loadbalanced using mwan3 on openwrt)15:02
stokachuDweller_: ok, it all came up for me15:09
stokachuDweller_: oh i also added `apt-get remove -qyf lxd lxd-client`15:10
stokachuso that it doesn't get confused there15:10
Dweller_yeah.. thats what broke me yesterday evening.. although I'm suspecting now thats when my network went nuts, rather than it being the cause15:11
stokachuDweller_: ack15:13
Zickjackal_: "kube-control relation removed between kubernetes-worker and kubernetes-master."15:16
Zicis it good?15:16
kjackal_Zic: sure add it back15:19
kjackal_there is also the relation kube-api-endpoint missing between master and worker15:20
kjackal_Zic: ^15:20
kjackal_actually if you do a juju add-relation kubernetes-master kubernetes-worker you will see the two relations that need to be added between master and worker15:21
Zickjackal_: yup, I did that, now the master is in "blocked / Waiting for workers"15:27
ZicI'm waiting a bit :)15:27
Zicbut nothing much is happening now in "juju debug-log"15:29
kjackal_Zic: did you add the kube-control relation between master and worker? This message is shown when the relation is not there: https://github.com/kubernetes/kubernetes/blob/master/cluster/juju/layers/kubernetes-master/reactive/kubernetes_master.py#L42015:36
Zickjackal_: yup, I re-did the get_states command after: http://paste.ubuntu.com/25521483/15:38
kjackal_Zic: did you also add the kube-api-endpoint relation? https://github.com/kubernetes/kubernetes/blob/master/cluster/juju/layers/kubernetes-master/reactive/kubernetes_master.py#L43215:40
kjackal_Zic: workers need to know where the api-server is15:41
Zic# juju add-relation kubernetes-worker kube-api-endpoint15:44
ZicERROR application "kube-api-endpoint" not found (not found)15:44
Zichmm?15:44
kjackal_wait Zic this is a relation between master and workers15:44
kjackal_should be something like: juju add-relation kubernetes-master:kube-api-endpoint kubernetes-worker:kube-api-endpoint15:45
kjackal_Zic: ^15:45
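For reference, the two relation adds discussed above, with the endpoint names as they appear earlier in the conversation; juju status afterwards should show both relations and the master leaving "Waiting for workers":

    juju add-relation kubernetes-master:kube-control kubernetes-worker:kube-control
    juju add-relation kubernetes-master:kube-api-endpoint kubernetes-worker:kube-api-endpoint
    juju status kubernetes-master kubernetes-worker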
Zicthanks, it works for now15:46
kjackal_Awesome15:46
kjackal_I have to go Zic15:47
Zicthanks anyway for your help kjackal_ ;)15:47
kjackal_Should be back in a few hours, sorry15:47
Fallenour@jamespage hey Im back, survived hurricane minus modest house damage.16:10
FallenourI didnt see any of the last replies though after you went to your hotel. Did you get the ceph output?16:10
Fallenourcan anyone help me out with a ceph issue?16:50
stormmoreo/ juju world... hey rick_h I have started to build my first charm :)16:52
rick_hstormmore: woot woot16:54
rick_hstormmore: whatcha building?16:54
stormmorerick_h: sub-ord charm for etckeeper16:54
Fallenour@rick_h @stormmore hopefully not ceph, talk about a rough start16:54
stormmoreFallenour: thankfully not, I know that charm well though16:55
stormmoreFallenour: well well-ish even16:55
Fallenour@stormmore its giving me nothing short of pure hell. Built an entire openstack charm, got it working EXCEPT for ceph LOL16:56
rick_hFallenour: yay on openstack but :/ on ceph. Sorry, not an expert there.16:57
Fallenour@stormmore so pretty much, I have a fully blown, rocking hard openstack, it just has the memory of a goldfish LOL16:57
stormmoreFallenour: why build when there is a good bundle already available?16:57
Fallenour@stormmore I did use that build, but I also built a separate one with hyperscalability. The current issue with the current trusty charm built in for newton is that it's not ocata, and it's trusty, at least last I checked. As for the openstack base from charmers, it's only for 3, and I need at least 5 ceph-mon boxes to handle all the future storage additions16:58
rick_hstormmore: cool on the sub for etckeeper.16:58
stormmoreFallenour: ah Trusty enough said! As far as being able to scale, I haven't seen any issue with add-unit from the main bundle17:00
Fallenour@stormmore @rick_h Current upgrading is being a huge pain in the ass though, and its holding up my project really bad17:00
Fallenour@stormmore The concern isnt the add to, the issue is drive management requests once you get over several Petabytes. My end objective is currently over 500PB for the current project build as is. 3 Ceph-Mon systems cant manage that many requests.17:01
stormmoreFallenour: oh I get that but you should just have to manipulate the bundle yaml17:02
stormmoreput in the number of units you want and the placements17:03
Fallenour@stormmore yea the overall idea is fix the issues at small scale, then implement to yaml, push to Juju / Salt power combo, and scale like a mad man17:03
stormmoreI have been playing with OS on LXD on my poor laptop17:03
Fallenour@stormmore the current issue is if I cant get 5 ceph-osd / ceph-mon nodes to work, theres no way im gonna get 5000 to play nice17:03
stormmoreFallenour: oh I understand the problem, just don't really see it as a charm problem, more of a charm bundle17:06
Fallenour@stormmore Well it kind of is. The issue is if the basic charms dont deploy correctly, which they didnt, I cant trust them to work at scale17:07
catbusFallenour: how do they not deploy correctly? What's the symptom?17:07
* stormmore is still wondering why Trusty not Xenial17:08
Fallenour@catbus ceph is down, even though it shows 3 of the 5 nodes in openstack horizon, but it's missing two of the larger nodes.17:08
Fallenour@catbus what makes it more confusing is it shows 5/5 and 5/5 respectively17:08
zeestrat@Fallenour is there a copy of the bundle you're deploying?17:09
Fallenour@catbus checks of the /etc/ceph/ceph.conf show all configs configured properly, but health outputs show pgs not building properly, all 196 pgs are stuck for some reason, and its not responding to ceph pg repair commands17:09
Fallenour@zeestrat yea, its the standard that ships with juju, so like millions of copies17:10
Fallenour@zeestrat the only difference is after build failure, I simply ran the upgrade commands and let it upgrade to xenial.17:10
Fallenour@zeestrat my first thought was that it needed to upgrade in order to work, so I pushed the upgrade across all, but it still didnt resolve the issue.17:11
Fallenour@catbus @zerestrat @stormmore I can dump ceph health, ceph tree, and ceph -s if that helps17:12
catbusFallenour: I am no ceph expert but I can look up if I see something suspicious. Can you also show juju status?17:17
catbusin a pastebin.ubuntu.com17:18
Fallenour@catbus @zeestrat @stormmore @rick_h Here is the full paste, includes: juju status, ceph health, ceph health detail, ceph ph dump_stuck unclean17:26
Fallenourhttp://pastebin.ubuntu.com/25522179/17:26
Fallenourmy thoughts on a nuclear option: ceph osd force-create-pg <pgid>17:27
stormmoreFallenour: have you looked at the juju logs for the osds? /var/log/juju/unit-*?17:28
catbusFallenour: You see "No block devices detected using current configuration" for ceph-osd units in juju status?17:28
Fallenour@catbus yea, I saw that. I saw this too: juju run --unit ceph-mon/1 'ceph osd stat'      osdmap e35: 0 osds: 0 up, 0 in             flags sortbitwise,require_jewel_osds17:28
stormmoreFallenour: what catbus said is what I am getting at. I suspect that it can't create the PGs cause it doesn't know what block devices to use17:29
catbusFallenour: thanks for using conjure-up. conjure-up with openstack-base is using this bundle https://api.jujucharms.com/charmstore/v5/openstack-base/archive/bundle.yaml, which specifies /dev/sdb for ceph-osd to use.17:29
Fallenour@catbus @stormmore No, not yet. Power issues kept me from going much further.17:29
Fallenour@catbus yea, conjure up is pretty amazing. I used the current one that ships with the up to date conjure-up. Thats my big question17:30
catbusFallenour: you can modify the ceph-osd config when you select openstack with novaKVM.17:30
Fallenour@catbus why are 3 of the 5 showing, but yet none of them working?17:30
catbusafter you select the spell, it will present all the service configurations, you can select ceph-osd and find the 'osd-devices' to modify according to your environment.17:31
Fallenour@catbus Thats exactly what I did, I did the configure, added 2 machines, added OSD to bare metal, mon to lxd17:31
Fallenour@catbus thats exactly what I did17:31
Fallenour@catbus other than that, and assigning the specific machines, I let it auto deploy the rest of the way17:31
Fallenour@catbus @stormmore do you think I should rebuild?17:33
catbusFallenour: what are the block devices you set for ceph-osd units?17:33
Fallenour@catbus Dell R610 and R710s.17:33
catbusFallenour: how many hard disk drives are in these servers?17:33
catbuseach.17:34
Fallenour@catbus 8, 8, 2, 2, 2. The 2, 2, and 2 are the 610s, and their drives are raided. Should I break the raid?17:34
zeestratAnd what are the device names for the block devices?17:34
Fallenour@zeestrat systems were autonamed with maas, so like fresh-llama and hot-seal (not my idea I swear).17:35
catbusFallenour: you can have raid when the drive count is >=2, but for the servers with only 2 drives, break the raid, so you can have 1 drive for ceph-osd to use.17:35
Fallenour@catbus, soo, break the raid, which blows away OS, let it rebuild those three, or just rerun the charm install?17:36
zeestrat@Fallenour, those sound like the hostnames of the servers. the 'osd-devices' needs a list of block devices17:36
catbusFallenour: i'd start over since the OS will be gone.17:36
Fallenour@catbus #sadpanda :'( its gonna take a while with my current connection at 6/117:37
stormmorecatbus: couldn't Fallenour use a loopback device at least for a PoC before going to the extreme of a rebuild?17:38
stormmoreor resize the OS drive and make a data partition?17:39
catbusstormmore: re-build usually takes ~1 hour for me, so I usually prefer to start over. It's up to Fallenour.17:40
stormmorethose might be faster than a rebuild to confirm the setup but I would definitely do a rebuild17:40
stormmorecatbus: oh I am with you there ;-)17:40
catbusFallenour: and you can put '/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh' for 'osd-devices' in ceph-osd configuration. It will take whatever is available on the host.17:40
Fallenour@catbus If the rebuild is a confirmed kill, Ill take that for sure. Whatever it takes to get it up and running, I have a lot of people depending on me to get this system live, I cant keep pushing back anymore17:41
catbusFallenour: if you need professional service with SLA, Canonical provides that, you know? ;)17:42
Fallenour@catbus I really wish I could, but Im already eating massive losses by giving everything away for free now as is17:43
catbusFallenour: https://jujucharms.com/ceph-osd/245 for ceph-osd charm configuration reference.17:45
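As a sketch, the same osd-devices change can also be applied to a running deployment with juju config (application name as in the bundle above; per catbus, the charm only picks up devices that are actually present and unused):

    juju config ceph-osd osd-devices='/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh'
    juju run --unit ceph-mon/0 'ceph osd stat'   # OSD count should climb as disks are enrolled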
stormmorejust out of curiosity, does anyone have an example of a charm with an action unit test using action_get?17:47
Dweller_stokachu: network gremlins must like me again.. back up and running, and managed to use the registry action on the worker node to add the registry running in the host, and have done a simple test of a custom image pushed / built to that docker from outside the vm, and then deployment/exposing of that image via kubectl from outside.. and it worked (more or less.. I need to sort out my image a little more, but it did send back my17:50
Dweller_ctx root not found error page from liberty running in the container)17:50
Dweller_so at this point.. I now have a kube-in-a-box =)17:50
stokachuDweller_: \o/17:53
stokachuDweller_: if possible please post a blog post or something on your setup17:55
stokachuso we can share that out17:55
Dweller_yep, there'll be one via the gameontext blog site .. I'll ping you a preview if possible to give you a chance to point out if I've been idiotic somewhere ;p17:56
Dweller_gameontext.org is a microservice based text adventure that a bunch of us contribute to as a way to learn about microservices, and related technologies.. each room in the game is its own microservice, and the core is built with a 12 factor approach.. it's currently running in IBM Bluemix, and we regularly post bits about it =)17:58
Dweller_in the not too distant future, we're planning to move the core from docker-compose to k8s, and part of that story includes figuring out a sensible local development story for that17:58
Fallenour@Dweller_ Once I get all this running, id be more than happy to host your project for free. its right up our alley of projects we support17:59
Dweller_and having kube in a box that we can target for docker builds & k8s deploys, fulfils that pretty neatly17:59
Dweller_Thanks for the offer, but we're not paying for it at the mo either =) (Full disclosure, I work for IBM, in the Cloud Native Java team, so this kinda stuff is important to us too)18:00
Dweller_At some point, I should really look into what it would take to add bluemix to the set of juju supported clouds too =)18:00
Fallenour@Dweller_  LOL! Well thats definitely a good benefit to have XD. You guys ever think of cleaning out your DCs to upgrade, give me a call. We will decom and drag the gear off for free. A lot better than paying 100000k+ a quarter for decom.18:01
stokachuDweller_: really happy conjure-up helps with that18:01
stokachuand by extension juju18:02
Dweller_Yeah.. problem with IBM is its huge.. like 400k employees worldwide huge, which means I have virtually no visibility over that stuff.. I work remote out of Montreal CA, my team is based in UK, Austin, and New York =)18:02
Dweller_stokachu: its a nice solution.. I like it.. it has a lot of scope for expansion and experimentation.. which makes it much better suited to my goals, than say, minikube18:03
Fallenour@dweller_ yea I saw a similar issue with AT&T and Verizon. Nobody knows where the gear comes or goes from or to, just that it does haha18:03
stokachuDweller_: awesome, feature requests welcomed too if you think something conjure-up could do to help out18:03
Dweller_aye.. when I worked out of UK we used to know a few ppl in goods inwards.. and once or twice heard about equipment being skipped that we got to salvage18:04
Dweller_but that was like once or twice in 17 years18:04
Dweller_that said.. I ended up with a large bag of 72pin simms which is still handy for Amiga's and stuff today18:05
Dweller_stokachu: from an education perspective, is there a way to have conjure-up say what it's doing? .. I love that it can do it all for me.. but I also want to know what it did.. I can mostly figure it out now by going and looking at the stuff in github.. conjure up feels like a big macro engine for juju ;p and juju is like a swiss army knife.. where I don't totally understand what the blades are, or how many it has ;p18:11
stokachuDweller_: yea true, cory_fu  and I were kicking around an idea at one point where we basically record what the juju equivalent commands are during each step of the deployment18:12
Dweller_mebbe a variant on headless where it writes out a script with the juju commands it would have run.. then you can edit & run the script18:14
Fallenour@catbus @stormmore @rick_h @Dweller_ The issue is definitely my raid. If someone has the same issue as me in the future, tell them to break the raid on their Nova-Compute nodes, and make sure their dedicated storage nodes have at least 2 PDs per Span, with at least 2 Spans. Otherwise the install will fail, and the rebuild will be a sad drink of coffee18:14
rick_hFallenour: :( sucky18:15
rick_hFallenour: glad you found some root cause to attack18:15
rick_hty catbus stormmore and Dweller_ for helping out18:15
stormmorenot a problem rick_h18:15
Dweller_aye.. fwiw, I stopped using RAID, in favour of file duplicating stuff like snapraid / drivepool etc18:15
stormmoreglad to when I can18:16
Fallenour@rick_h yurp. on the funny note, when I pulled one of the drives out to physically check it, the sled was empty LOL, so I got a great laugh outta that one. Lesson Learned, Dont build boxes at beer:30, it will not end well XD18:16
rick_hFallenour: hah18:16
stormmoreFallenour: lol18:16
rick_hFallenour: #lifelessons18:16
Dweller_have over 70tb running under windows using drivebender pooling.. and about half that again using snap/flex raid on linux18:16
rick_hanyone gotten ssl and haproxy playing nice? I'm trying to get the charm to proxy something with ssl termination on it18:41
bdxrick_h: I've put a bit of time in there18:48
rick_hbdx: I've put a ssl_key and ssl_cert and I see the unit did create a valid .pem file but the config written doesn't do any ssl config18:48
rick_hbdx: I'm missing flipping some bit in the charm I'm thinking18:49
bdxrick_h: https://gist.github.com/jamesbeedy/d587cbf048038fb274ef4cd55c4ee3dd18:49
rick_hbdx: ah, so you setup the services yourself bummer18:50
bdxyeah ...18:50
bdxmy way is the simplemans way18:50
rick_hbdx: heh, I wanted it simpler: for the charm to go "oh I see you like you some ssl here" :P18:51
Fallenour@dweller_ @rick_h @catbus @stormmore @stokachu do any of you happen to know of a guide or collection of guides to publish a charm bundle? Im building a self-configuring, audit compliant cloud environment, and I want to make it available via the charm store, how do I do that?18:51
bdxrick_h: you can provide that info via relation too, instead of making it static in the config18:51
bdxthe reverseproxy relation is difficult because of the formatting18:52
rick_hFallenour: https://jujucharms.com/docs/stable/charms-bundles is the start18:52
rick_hbdx: yea, gotcha. K, I'll poke at it. TY for your sample config18:52
Fallenourok here we go19:13
Fallenourwish me luck. if this all goes well, im live. If not, im probably gonna cry. grown men shouldnt cry. at least not over spilt bits19:13
rick_hmhilton_: interesting, that looks close to what I'm doing. I wonder what I've got off.19:44
rick_hmhilton_: does the controller website charm add the service with https then I wonder?19:44
rick_hmhilton_: oh hmm, that seems to be non-https setup. interesting19:46
xarses_so I changed the password that is in my cloud credentials yaml file, ran juju update-credentials, how can I ensure that these are the credentials the model is using now? P.S. the credential keeps locking out and is shared with some other systems, so I can currently neither prove nor disprove that juju is the problem20:15
rick_hxarses_: hmmm...juju add-unit and check the dashboard?20:19
rick_hxarses_: this is something we're actively working to improve right now as it's come up with folks that need to swap credentials on running models so I admit it's kind of sucky atm20:20
xarses_story of my life ...20:20
rick_hxarses_: we need to get you a better life20:22
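For what it's worth, a rough way to cross-check credentials with a 2.2-era client; command names and output fields vary between releases, so treat this as a sketch rather than the definitive workflow:

    juju credentials --format yaml --show-secrets     # what the client has stored locally
    juju show-model --format yaml                     # should include the credential the model was created with
    juju update-credential <cloud> <credential-name>  # push the edited credential to the controller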
Fallenour@rick_h @stokachu wait..we? Youve both said "we" and "working", are both of you part of the official juju team? o.O20:22
rick_hFallenour: yes, stokachu works on conjure-up and I work on jaas20:22
Fallenour8O20:22
Fallenourso...20:22
rick_hFallenour: so we're canonical folks working around the juju community of projects20:22
Fallenourif plebian is like...120:23
Fallenourand godmode is like a 1020:23
Fallenouryou guys are like...35?20:23
rick_hhah, no. we're like 5 or 620:23
xarses_rick_h: if you are going to keep a bucket list: credential validation, always using env_vars for openstack, and supporting clouds.yaml + secrets.yaml (from openstack_cloud_config)20:27
xarses_autoadd doesn't support the last20:27
rick_hxarses_: interesting on the openstack bits. Can you file a bug on those and I can group them into the credential mgt discussion?20:27
rick_hxarses_: no, but does normal add-credential support a file?20:28
rick_hxarses_: if not that should be the right way to grab a standard file I think20:28
rick_hit's kind of how the gce one works. We just accept the json file it dumps out20:28
xarses_rick_h: file? ya in the juju format, sure but thats not how any of the other providers lead you to storing credentials to use against them20:42
rick_hxarses_: huh? I missed that sorry20:44
rick_hxarses_: you mean the secrets.yaml?20:44
rick_hoh sorry, I thought you meant that openstack_cloud_config would dump a file20:44
xarses_"does normal add-credential support a file" I think so, but not the clouds.yaml format20:45
rick_hxarses_: gotcha20:45
xarses_rick_h: they use different key names20:46
xarses_id hope that auto-add would be able to scan it, but add file, I wasn't holding my breath20:47
Fallenour@rick_h @stokachu @catbus @stormmore so far it looks like exact same issue, even with raids broken.20:56
catbusFallenour: can you show juju status in a pastebin.ubuntu.com?21:00
Fallenour@catbus Give me a bit, it looks like its finalizing now. might take a moment.21:01
Fallenour@catbus It was exact same issue as last time. Same output. This time all im gonna do is conjure-up, novakvm install, assign devices via standard configure, no extra machines, deploy all 16. If this fails, Im not insane, and something via conjure-up simply doesnt work21:41
FallenourAny ideas as to why I keep getting neutron-gateway/0 "hook failed: "config-changed" error? This time it was  totally native, no changes or additions run of conjure-up @stokachu21:56
stokachuFallenour: you need to `juju ssh neutron-gateway/0; cd /var/log/juju; pastebinit unit-neutron-gateway-0.log`21:56
stokachui imagine it is because it can't find the interface21:56
stokachuthat's that whole port mapping thing i pointed you to on sunday21:57
Fallenour@stokachu alright, Ill do that once the install is finished. hopefully that will be the only issue. I imagine if it is, a simple change of the interface, and a reboot will change that?21:57
Fallenour@stokachu I hate to ask, but can you relink me that info? I lost everything when the hurricane hit and wiped out power :(21:58
stokachuFallenour: so you'll want to juju config neutron-gateway <key>=<value>21:58
stokachuthen juju resolved neutron-gateway/021:58
stokachuFallenour: https://jujucharms.com/neutron-gateway/238 look under Port Configuration21:59
stokachuspecifically note:21:59
stokachuIf the device name is not consistent between hosts, you can specify the same21:59
stokachubridge multiple times with MAC addresses instead of interface names. The charm21:59
stokachuwill loop through the list and configure the first matching interface.21:59
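A minimal sketch of that fix, assuming the data-port option described on the linked charm page, a hypothetical br-ex/eth1 layout, and placeholder MAC addresses:

    juju config neutron-gateway data-port='br-ex:eth1'
    # or, when interface names differ between hosts, map by MAC instead:
    juju config neutron-gateway data-port='br-ex:aa:bb:cc:dd:ee:01 br-ex:aa:bb:cc:dd:ee:02'
    juju resolved neutron-gateway/0    # retry the failed config-changed hook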
catbusFallenour: can you confirm 'same output' as "No block devices detected using current configuration" for ceph-osd units in juju status?22:01
Fallenour@catbus yea on the previous build it was the exact same. Im doing a very generic conjure-up this time, no additional machines, no additional services22:12
Fallenour@catbus so far, it looks good, but only time will tell.22:12
stokachuFor ceph do your machines have 2 disks?22:13
catbusok.22:13
xarses_@rick_h: never created the machine, so I'm not sure but I think juju is stuck with my dead credentials22:40
xarses_and now it wont destroy, because a machine is in pending state22:41
stormmoredoes anyone know how to install a trusty charm? I am aware of the Ubuntu charm but I have only managed to get it to install Xenial so far22:43
xarses_just set the series, or declare it explicitly if the charm has multiple22:46
xarses_https://jujucharms.com/docs/2.2/charms-deploying22:46
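For example, either form should get a trusty deployment of the ubuntu charm with a 2.x client:

    juju deploy cs:trusty/ubuntu
    # or
    juju deploy ubuntu --series trusty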
Fallenour@catbus @stokachu @stormmore @rick_h same exact issue, completely native install this time. What gives? Is the conjure-up instance simply not working natively? Should I use a different install bundle?22:48
catbusFallenour: what's the issue exactly? Please show error messages.22:50
Fallenour@catbus hang on. I did a raid 0 thinking it would split the disks, lemme rebuild....again22:50
stormmoreFallenour: I would have to see what you are putting in for the devices for ceph-osd22:50
stormmoreFallenour: it still sounds like ceph-osd is not finding the drives where it is looking to me22:51
catbusstormmore: juju deploy cs:trusty/ubuntu?22:52
stormmorecatbus: that is what I was wondering :) trying it now22:52
Fallenour@stormmore I used a raid 0 instead of simply leaving the drives as is.22:53
stormmoreFallenour: then ceph needs a loopback device of some sort to act as a virtual drive22:54
=== hml_ is now known as hml
