[07:14] <bigjools> jamespage:
[07:14] <bigjools> ah
[07:14] <bigjools> jtv: this one
[07:14] <jtv> OK
[07:15] <bigjools> so let's discuss the actual symptoms
[07:15] <bigjools> rvba has the most experience so far
[07:15] <bigjools> and we can fill out the bug report as it's one of those "report the solution" rather than "report the bug"
[07:15] <jtv> I guess a broken download could easily be hidden behind a retry...
[07:15] <rvba> symptoms: you run the import-pxe-files and you get: http://paste.ubuntu.com/6244329/
[07:17] <jtv> Might be worth keeping the failed download, and seeing what "strings" and "cmp" can tell us.
[07:18] <bigjools> rvba: ok and is the file obviously the wrong size, or just corrupt?
[07:18] <bigjools> all useful info missing from the bug :(
[07:18] <jtv> Because the file itself is missing by the time we see it.
[07:18] <rvba> bigjools: obviously the wrong size, very small whilst the file should be 200M or so
[07:18] <bigjools> ok, so maybe it crapped out early
[07:18] <bigjools> has it ever worked?
[07:19] <rvba> Yes.  It worked at some point.
[07:19] <rvba> Let's see if we can reproduce it in a clean environment like a fresh canonistack instance.
[07:19] <bigjools> there's a start.  we need to find a scenario to recreate first then
[07:19] <bigjools> I'll play there too
[07:19] <rvba> Yes.
[07:20] <rvba> Then we need to report a proper bug with all the relevant info.
[07:20] <jtv> I guess we should also check which seems to be the correct checksum — for all we know we could be getting the wrong one from the index.
[07:21] <bigjools> very very very unlikely :)
[07:22] <jtv> All of this is, isn't it?
[07:22] <jtv> Anyway, we should learn more as soon as we can compare files.
[07:22] <bigjools> I used debtree on the maas package today.  Holy cow.
[07:24] <jtv> Try to compare to other, similar applications before judging.  :)
[07:24] <bigjools> not judging just noting
[07:25] <bigjools> so maas-import-ephemerals - not a sausage of debug output?
[07:25] <bigjools> or any indication it's doing anything
[07:25] <bigjools> IIRC smoser filed a bug about it
[07:26] <jtv> I'm looking for the retry/partial-download logic.  Should find out how to get more information out of it.
[07:26] <bigjools> rvba: this happened for you in the lab, right?
[07:27] <rvba> Correct.
[07:27] <bigjools> rvba: is it still failing there?
[07:27] <rvba> Define "still" :)
[07:27] <rvba> Last time I used the lab, it failed.
[07:27] <rvba> Last time Diogo used the lab, it failed.
[07:28] <bigjools> can you try removing the ephemerals completely and try again?
[07:28] <jtv> Proxy weirdness could be involved...
[07:28] <jtv> That still fails.
[07:28] <bigjools> I still think we need a lock BTW
[07:28] <rvba> bigjools: you mean removing the existing ephemerals?
[07:28] <bigjools> rvba: yes
[07:29] <bigjools> rm -rf /var/lib/maas/ephemerals/*
[07:29] <rvba> In the lab, we create a new VM every time we run tests.
[07:29] <rvba> So this is long gone.
[07:29] <bigjools> oh!
[07:29] <bigjools> ok
[07:29] <bigjools> thought you were playing around on one instance
[07:30] <rvba> No.  Yesterday, I saw it on one instance, then Diogo saw the same problem on a completely different instance.
[07:30] <bigjools> dandy
[07:30] <jtv> Nor is it anything like a single, incidental flipped bit in the proxy — we've seen it on different images.
[07:30] <rvba> jtv: it's clearly related to using a proxy.
[07:30] <rvba> I ran the script without problems on canonistack instances yesterday (without using proxies).
[07:30] <jtv> I believe you — but it's not a single, incidental flipped bit in its cache.
[07:31] <rvba> Now I'm trying again on a canonistack instance but with a proxy this time.
[07:31] <jtv> Now, proxies may easily do weird things with partial downloads, right?
[07:31] <rvba> Maybe.
[07:31] <rvba> I suspect the weirdness is on simplestreams' side.
[07:31] <rvba> But I could be wrong.
[07:32] <jtv> I wonder if it could be something like "proxy may not be sending exactly the same segment that you asked for, but script assumes that it does."
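jtv's hypothesis above (the proxy returns a different byte range than the one requested, and the client appends it anyway) can be checked from the response status and the standard `Content-Range` header. The following is only a sketch of such a check; `range_matches_request` is a hypothetical helper written for illustration, not something taken from simplestreams.

```python
import re

# Matches header values such as "bytes 100-199/1000" or "bytes 100-199/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def range_matches_request(status, content_range, requested_offset):
    """Hypothetical helper: does the response start at the offset we asked for?

    status: HTTP status code of the response.
    content_range: value of the Content-Range header, or None if absent.
    requested_offset: first byte position we asked for in the Range header.
    """
    if status == 200:
        # A 200 carries the whole body, so it is only safe to append
        # if we were asking from the very beginning.
        return requested_offset == 0
    if status != 206 or content_range is None:
        return False
    match = _CONTENT_RANGE.match(content_range.strip())
    if match is None:
        return False
    return int(match.group(1)) == requested_offset
```

A resuming downloader could call this before appending to a partial file and restart from zero when it returns False.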
[07:32] <rvba> Maybe.  Let's first reproduce the problem in a clean environment.  Then we can start reasoning about the problem.
[07:33] <bigjools> rvba: install the same proxy in canonistack and get maas to use it
[07:33] <bigjools> in fact we install a deb proxy by default anyway
[07:34] <rvba> That's precisely what I've done :).
[07:34] <bigjools> ha!
[07:34] <bigjools> also is it my imagination or is the python script really slow?
[07:35] <jtv> It's doing humongous downloads.
[07:35] <rvba> It is painfully slow.  But it has huge files to download.
[07:35] <bigjools> I mean, it's taken 8 minutes to do two distroarchseries so far and the old script took about 1 minute TOTAL
[07:35] <jtv> Quarter-gig files.
[07:35] <bigjools> (on canonistack)
[07:35] <bigjools> I smell a rat
[07:35] <bigjools> I'm going to look at the simplestreams code
[07:36] <rvba> Hum, I got a TCP_MISS_TIMEDOUT in squid and the script failed with unexpected checksum 'sha256'.
[07:36] <rvba> But I don't remember seeing TCP_MISS_TIMEDOUT in the lab…
[07:36] <bigjools> hmmmmm
[07:36] <rvba> I probably need to configure the proxy better.
[07:36] <bigjools> this is all related
[07:36] <rvba> So that it can cope with very long downloads.
[07:36] <jtv> bigjools: look for contentsource.py
[07:37] <jtv> Unfortunately it has a lot of different ways of doing things.
[07:37] <jtv> Fundamentally, it creates a file-like object with the URL, and does a seek() for the offset of the current batch.
[07:38] <jtv> Oh, that's for "file://" URLs.
[07:38] <jtv> There's a lot of different data paths here.
[07:38] <rvba> bigjools: http://paste.ubuntu.com/6249803/ this is squid's logs
[07:38] <rvba> The first request is made by simplestreams.
[07:38] <rvba> The second one by wget.
[07:38] <bigjools> batch?
[07:39] <rvba> And simplestreams explodes with the checksum error right after the download occurs.
[07:39] <bigjools> my conclusion is that it's buggy as hell
[07:40] <bigjools> rvba: the log doesn't say if it's using Range:
[07:40] <jtv> If we're going to look at the source, we'll need to know which method it chooses for reading URLs.
[07:41] <jtv> File size for that first download is rather small, isn't it?
[07:41] <jtv> I'm surprised that gets a 200 response.
[07:41] <bigjools> the fact that this is slow as hell means it's doing something stupid
[07:42] <jtv> And the wget request was faster?
[07:42] <rvba> bigjools: you're right about the download time as well… it takes 11s to download an image with wget (without any proxy).
[07:42] <bigjools> massively
[07:42] <jtv> Actually, I've been assuming that the really huge number is the response size... what's that number right before the IP address?
[07:42] <rvba> So simplestreams is really doing something stupid.
[07:44] <jtv> I'm also a bit concerned about there being a deflate-aware version of the URL reading stuff.
[07:44] <jtv> The combination with batches is one of those things that look like they could easily go wrong
[07:48] <jtv> Ouch.  The code that selects the http download mechanism is nontrivial, but quietly swallows any error and makes assumptions about the reason.
[07:48] <jtv> Not a recipe for failure at all, that.
[07:50] <bigjools> IOError: [Errno 28] No space left on device
[07:50] <bigjools> crap
[07:50] <rvba> bigjools: yeah, you need a large instance, and use /mnt instead of /var/lib/maas
[07:51] <bigjools> what are you putting on /mnt?
[07:51] <rvba> I create a symlink so that the images are stored on /mnt
[07:52] <bigjools> is it already mounted then?
[07:52] <rvba> bigjools: cd /mnt/ ; mv /var/lib/maas .; cd /var/lib/; ln -s /mnt/maas/
[07:52] <rvba> bigjools: yes
[07:52] <bigjools> ok
[07:53] <bigjools> oh I already have a /mnt with lots of space
[07:53] <rvba> bigjools: yes you do :)
[07:53]  * bigjools moves 10Gb over
[07:55] <jtv> We really should sabotage the code that cleans up the broken file.
[07:56] <jtv> I think that's simplestreams/objectstores/__init__.py:
[07:56] <jtv>         if not cksum.check():
[07:56] <jtv>             os.unlink(partfile)
[07:56] <jtv>             if orig_part_size:
[07:56] <jtv>                 LOG.warn("resumed download of '%s' had bad checksum.", path)
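The snippet quoted above deletes the partial file as soon as the checksum fails, which is why nobody has a bad file to inspect. One way to "sabotage" that cleanup is to rename the file instead of unlinking it. This is a minimal sketch under assumptions: `sha256_of` and `quarantine_if_bad` are hypothetical names, and the `.bad` suffix is arbitrary; none of this is the actual simplestreams code.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        # Read in chunks so quarter-gig images do not have to fit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def quarantine_if_bad(partfile, expected_sha256):
    """Hypothetical helper: keep a failed download around for inspection.

    Returns None when the checksum matches; otherwise moves the file to
    `partfile + ".bad"` and returns that new path.
    """
    if sha256_of(partfile) == expected_sha256:
        return None
    bad_path = partfile + ".bad"
    os.replace(partfile, bad_path)  # rename instead of os.unlink
    return bad_path
```

The preserved `.bad` file can then be examined with "strings" and "cmp" against a known-good copy, as suggested earlier in the discussion.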
[07:59] <bigjools> https://bugs.launchpad.net/simplestreams/+bug/1240838
[07:59] <rvba> Yay!
[08:00] <bigjools> ahem
[08:03] <jtv> We've got batching, checksumming, and compression all handled at more or less the same level in the code.  It'd be nice to have a clear indication of what's happening.
[08:03] <jtv> I can think of horror scenarios for any combination.
[08:06] <bigjools> jtv: prefer functions to do one simple thing
[08:08] <jtv> Doesn't that lead to a puzzle of lots of small functions?  :)
[08:09] <bigjools> the puzzle remains the same - the reading of it becomes easier
[08:09] <jtv> In that case I shall give less attention to complaints about my small functions.  :)
[08:10] <bigjools> jtv: I like them :)
[08:10] <bigjools> it's a trade-off of course
[08:11] <jtv> Down with spaghetti code, yay for macaroni code.
[08:11] <bigjools> more like lasagna
[08:11]  * jtv wonders whether it's a good idea to define "pizza code" in any detail
[08:12] <rvba> bigjools:  I really wonder what that script is doing… it takes minutes before I see a single hit in the proxy's cache.
[08:12] <bigjools> !
[08:12] <bigjools> please add that as a data point to the bug
[08:13] <jtv> The proxy only reports at the end of the request, not the start.
[08:13] <rvba> bigjools: of course the log entry is written *after* the download but since we know that the download itself is pretty quick, it does not explain anything…
[08:13] <jtv> Or so it seems to me — Apache does that, plus, it reports a return code doesn't it?
[08:13] <rvba> bigjools: I will, of course.
[08:14] <bigjools> rvba: or is it quick ...
[08:14] <bigjools> we need debugging in the main script
[08:14] <rvba> Yeah, that's the only option left…
[08:14] <rvba> :/
[08:16] <jtv> I think we need debugging code in the simplestreams code... For starters, which "opener" we get.
[08:16] <jtv> Well no, for _starters_, we need that file so we can look at it.
[08:26] <gmb> allenap: When you wrote the comment on line 248 of http://bazaar.launchpad.net/~maas-maintainers/maas/trunk/view/head:/src/maascli/api.py#L248, what were you actually meaning?
[08:27]  * bigjools guesses unicode
[08:28] <bigjools> and print leaves a \n IIRC
[08:28] <bigjools> gmb: he's not around today BTW
[08:28] <gmb> bigjools: Right, hence the "files downloaded with the API get an extra newline"
[08:28] <gmb> OIC.
[08:29] <bigjools> yeah - easy to fix I reckon :)
[08:29] <gmb> bigjools: Oh, definitely :). Question: do we _want_ that extra newline for prettification of textual output to stdout?
[08:29] <gmb> (If so, I'll put it on stderr, which is what we do for headers when printing those out)
[08:29] <bigjools> there's a fun question
[08:30] <bigjools> we probably do
[08:30] <bigjools> and that's a neat solution
[08:30] <gmb> Cool beans.
[08:34] <allenap> gmb: I think it was two things, "trailing newline" and "might encode", but you seem to be there already.
[08:40] <gmb> allenap: Right - I'm on dangling modifier patrol today, so I didn't know if it was two statements or one.
[11:28] <rvba> smoser: ping
[12:33] <smoser> rvba, here.
[12:33] <rvba> Hi smoser, I'd like to talk about bug 1240652.
[12:34] <rvba> smoser: my testing shows that simplestreams is simply stuck in contentsource.py:RequestsUrlReader.read
[12:35] <smoser> rvba, can i see a system where you reproduced ?
[12:35] <rvba> After a while, the proxy times out, closes the request;  then simplestreams tries to checksum a partial file and that fails.
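The failure mode rvba describes (the proxy closes the connection, then a partial file is checksummed) surfaces only as a checksum error. Comparing the file size against the expected size first would separate a short read from genuine corruption. This is a rough sketch, assuming the expected size and SHA-256 are known to the caller; `classify_download` and its return strings are hypothetical.

```python
import hashlib
import os


def classify_download(path, expected_size, expected_sha256):
    """Hypothetical helper: say why a download is bad, not just that it is.

    Returns one of "truncated", "oversized", "corrupt" or "ok".
    """
    actual_size = os.path.getsize(path)
    if actual_size < expected_size:
        # Typical of a connection that was closed early, e.g. a proxy timeout.
        return "truncated"
    if actual_size > expected_size:
        return "oversized"
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(1024 * 1024), b""):
            digest.update(chunk)
    return "ok" if digest.hexdigest() == expected_sha256 else "corrupt"
```

A "truncated" result points at the transport (timeouts, proxies), while "corrupt" at full size points at the data or at the expected checksum itself.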
[12:35] <rvba> smoser: yes, one sec
[12:35] <smoser> i can follow your doc its fine in comment f4 except for squid config.  that'd be the only part dependent on maas.
[12:36] <rvba> smoser: right, I didn't do any modification to the config, so it's whatever maas ships.
[12:37] <rvba> smoser: and we also saw the same problem in the lab, using a different proxy (i.e. not the one shipped with maas).
[12:38] <rvba> smoser: ssh ubuntu@10.55.61.23
[12:38] <rvba> smoser: join the byobu session there, I'll show you around
[12:39] <smoser> rvba, k. there are 2 things.
[12:39] <smoser> which url reader are we using.
[12:40] <rvba> smoser: the default one, python-requests
[12:41] <rvba> smoser: I put lots of debugging output :)
[12:41] <smoser> yeah. so i got rid of that, and now it seems functional.
[12:41] <smoser> this sucks.
[12:43] <smoser> rvba, so, it uses python-requests if it is found with sufficient version.
[12:43] <smoser> the reason for using it is that  it supports compressed encoding
[12:43] <smoser> (which clearly isn't useful on already compressed data, but for metadata it is)
[12:44] <rvba> smoser: I see… well, using a different url reader it seems to work indeed.
[12:44] <rvba> Which, like you said, sucks.
[12:51] <smoser> rvba, i found http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=712915
[12:51] <smoser> earlier in the cycle.
[12:51] <smoser> so its fairly clear that this is unfortunately not really well tested code
[12:51] <smoser> i'll debug, rvba
[12:51] <smoser> thanks for your debugging.
[12:52] <smoser> whats weird is that we could not reproduce in the lab after the first time.
[12:52] <rvba> Thanks for your help smoser.
[12:52] <rvba> I think that's because the partial download stuff hides the problem somehow.
[14:15] <smoser> rvba,
[14:16] <smoser> why can't i reproduce with
[14:16] <smoser> http://paste.ubuntu.com/6251217/
[14:17] <smoser> any ideas? the "this fails" doesn't really fail.
[14:25] <rvba> smoser: let me try and see if I can spot a difference with when we run maas-import-ephemerals…
[14:25] <smoser> found it
[14:26] <smoser> python2 -> python3 fixes
[14:26] <smoser> joy.
[14:26] <rvba> Great joy indeed :)
[14:53] <smoser> rvba, fwiw, https://bugs.launchpad.net/simplestreams/+bug/1240838
[14:54] <smoser> "it takes several minutes before the proxy is actually used" is probably wrong
[14:54] <smoser> i've been confused by this before.
[14:54] <smoser> squid prints to log only after downloading
[14:54] <rvba> smoser: yeah, that's the same issue.
[14:54] <rvba> Yep.
[14:54] <smoser> it is very confusing when you're tailing logs :)
[14:54] <smoser> and downloading large files.
[14:54] <rvba> But since the download was supposed to take 10 seconds, it didn't make sense for the script to be blocked for several minutes.
[14:55] <rvba> Now we know this is all because of python-requests acting up.
[15:01] <jpds> smoser: Hey.
[15:01] <smoser> matsubara, rvba fudge.
[15:01] <smoser> https://bugs.launchpad.net/maas/+bug/1240652
[15:01] <smoser> "
[15:01] <smoser> FWIW, I ran into this bug yesterday while testing the ISO and there was no proxy involved.
[15:01] <smoser> "
[15:01] <smoser> i sort of dont believe that. as i cannot recreate it, and *with* a proxy, its trivial to recreate.
[15:01] <smoser> oh. wait. that is the locking bug.
[15:02] <smoser> that is basically understood.
[15:02] <smoser> never mind.
[15:02] <smoser> jpds, whats up?
[15:02] <jpds> smoser: You heard any reports about pserv failing to come back up on an upgrade from yesterday's package?
[15:02] <rvba> smoser: really?  There is no cron-like system running the script at the moment, so I don't believe in the locking thing :)
[15:03] <rvba> smoser: I don't see why the bug we saw couldn't be triggered from time to time without the usage of a proxy.
[15:03] <jpds> smoser: I just upgraded, and maas-pserv went away: http://paste.ubuntu.com/6251438/
[15:03] <jpds> smoser: But doing: service maas-pserv start --- made it come back.
[15:03] <rvba> We just know it can be reproduced with a proxy.  We're not sure it only happens with a proxy.
[15:05] <smoser> roaksoax, ^ see jpds above, thoughts?
[15:05] <jpds> I can see in line 86 that pserv was supposed to come back, but then it disappeared...
[15:07] <smoser> jpds, /var/log/upstart/maas-pserv.log have anything?
[15:07] <roaksoax> uhmmm
[15:08] <roaksoax> yeah the upstartlog should show more
[15:08] <jpds> http://paste.ubuntu.com/6251469/
[15:10] <smoser> release week is fun.
[15:11] <roaksoax> yeah
[15:13] <roaksoax> uhmm that's a new message i've never seen before
[15:13] <jpds> Of course, there's no timestamp which helps.
[15:16] <smoser> timestamps are for weenies
[15:31] <smoser> hm..
[15:31] <smoser> it would seem that if maas is expecting any value out of squid3 that it needs to configure it.
[15:31] <smoser> #Default:
[15:31] <smoser> # maximum_object_size 4 MB
[15:32] <smoser> anything over that size (from /etc/squid3/squid.conf)  is not getting cached.
[15:32] <smoser> so small debs will get cached. but not images or large debs.
[15:35] <matsubara> smoser, the proxy in the qa lab has maximum_object_size 1000 MB which should be enough to cache the images (not sure if your comments above are related to bug 1240652)
[15:36] <smoser> well, they are related.
[15:36] <smoser> but the point still stands.
[15:36] <smoser> squid3 default config is basically useless to maas.
[15:36] <smoser> so if we're installing it, we need to be configuring it
[15:36] <smoser> (also, default is no on disk cache, memory only)
[15:38] <matsubara> smoser, in any case, when I tested the ISO and got the checksum error, I wasn't using a proxy but my connection to the datacenter is not so great. Could it be that I managed to trigger the bug locally without a proxy because m-i-e was waiting for data, didn't get it for some time, then it retried the download, triggering the bug?
[15:39] <matsubara> smoser, I think squid3 is part of the dependencies for maas because of the s-d-p. I don't think we require the default install of MAAS to have any proxy configured.
[15:40] <smoser> ah. i forgot that we have squid-deb-proxy
[15:40] <smoser> yeah, the default config of that should make more sense.
[15:40] <smoser> but still wont help for images
[16:19] <smoser> i think i have a fix for bug
[16:19] <smoser> http://paste.ubuntu.com/6251802/
[16:20] <smoser> matsubara, where did you do this test ?
[16:20] <smoser> where you saw it from the iso
[16:20] <smoser> i think there is a proxy in your path
[16:20] <smoser> in your network path
[16:21] <matsubara> smoser, locally on my laptop. I have a proxy running on my laptop too but I didn't use it during the install (i.e. the installer asks if I want to use a http proxy for which I answered no)
[16:22] <smoser> you could be proxied along your path
[16:22] <smoser> ?
[16:22] <smoser> ie, another transparent proxy.
[16:24] <roaksoax> Ruetobas: ping
[16:24] <roaksoax> err
[16:24] <roaksoax> sorry
[16:24] <roaksoax> rvba: ping
[16:29] <matsubara> smoser, I might. I'm pretty sure there's no transparent proxy inside my network. If my ISP has a transparent proxy running, then it's beyond my power to change any config (and would make this bug even more critical). Is there any way to check if I'm going through a transparent proxy? I guess not right, that's the point of it being transparent
[16:29] <smoser> matsubara, only to find bugs like this :)
[16:30] <roaksoax> allenap: around?
[16:30] <smoser> you're right. clearly it's not your fault, and you have no control. i was just trying to explain what i think is wrong but didn't make sense for your path.
[17:09] <roaksoax> adam_g: where's the wikipage that you were maintaining showing how to install openstack with maas & juju? https://help.ubuntu.com/community/UbuntuCloudInfrastructure ?
[17:10] <adam_g> roaksoax, here? https://wiki.ubuntu.com/ServerTeam/OpenStackHA
[17:11] <roaksoax> adam_g: that's for the HA stuff.. i mean the one you were maintaining on help.ubuntu.com/community?
[17:14] <adam_g> roaksoax, there was one at the URL you posted, dunno where it got moved to
[17:14] <adam_g> roaksoax, jorge also started https://help.ubuntu.com/community/UbuntuCloudInfrastructure/JujuBundle.
[17:14] <roaksoax> adam_g: thanks!
[19:52] <Spideyman> does anyone here have much experience with MAAS restful API? What authentication header do I need to send with the key?
[19:52] <Spideyman> Authorization: Basic"
[19:52] <Spideyman> ?
[19:53] <Spideyman> X-Auth?
[19:56] <smoser> Spideyman, i'd look at maas-cli
[19:56] <smoser> (as in look there for the details)
[19:58] <Spideyman> smoser got it, thanks
[22:52] <bigjools> smoser: please can you help debug https://bugs.launchpad.net/bugs/1240838, it makes ephemeral download practically unusable