bigjools | jamespage: | 07:14 |
---|---|---|
bigjools | ah | 07:14 |
bigjools | jtv: this one | 07:14 |
jtv | OK | 07:14 |
bigjools | so let's discuss the actual symptoms | 07:15 |
bigjools | rvba has the most experience so far | 07:15 |
bigjools | and we can fill out the bug report as it's one of those "report the solution" rather than "report the bug" | 07:15 |
jtv | I guess a broken download could easily be hidden behind a retry... | 07:15 |
rvba | symptoms: you run the import-pxe-files and you get: http://paste.ubuntu.com/6244329/ | 07:15 |
jtv | Might be worth keeping the failed download, and seeing what "strings" and "cmp" can tell us. | 07:17 |
bigjools | rvba: ok and is the file obviously the wrong size, or just corrupt? | 07:18 |
bigjools | all useful info missing from the bug :( | 07:18 |
jtv | Because the file itself is missing by the time we see it. | 07:18 |
rvba | bigjools: obviously the wrong size, very small whilst the file should be 200M or so | 07:18 |
bigjools | ok, so maybe it crapped out early | 07:18 |
bigjools | has it ever worked? | 07:18 |
rvba | Yes. I worked at some point. | 07:19 |
rvba | It* | 07:19 |
rvba | Let's see if we can reproduce it in a clean environment like a fresh canonistack instance. | 07:19 |
bigjools | there's a start. we need to find a scenario to recreate first then | 07:19 |
bigjools | I'll play there too | 07:19 |
rvba | Yes. | 07:19 |
rvba | Then we need to report a proper bug with all the relevant info. | 07:20 |
jtv | I guess we should also check which seems to be the correct checksum — for all we know we could be getting the wrong one from the index. | 07:20 |
bigjools | very very very unlikely :) | 07:21 |
jtv | All of this is, isn't it? | 07:22 |
jtv | Anyway, we should learn more as soon as we can compare files. | 07:22 |
bigjools | I used debtree on the maas package today. Holy cow. | 07:22 |
jtv | Try to compare to other, similar applications before judging. :) | 07:24 |
bigjools | not judging just noting | 07:24 |
bigjools | so maas-import-ephemerals - not a sausage of debug output? | 07:25 |
bigjools | or any indication it's doing anything | 07:25 |
bigjools | IIRC smoser filed a bug about it | 07:25 |
jtv | I'm looking for the retry/partial-download logic. Should find out how to get more information out of it. | 07:26 |
bigjools | rvba: this happened for you in the lab, right? | 07:26 |
rvba | Correct. | 07:27 |
bigjools | rvba: is it still failing there? | 07:27 |
rvba | Define "still" :) | 07:27 |
rvba | Last time I used the lab, it failed. | 07:27 |
rvba | Last time Diogo used the lab, it failed. | 07:27 |
bigjools | can you try removing the ephemerals completely and try again? | 07:28 |
jtv | Proxy weirdness could be involved... | 07:28 |
jtv | That still fails. | 07:28 |
bigjools | I still think we need a lock BTW | 07:28 |
rvba | bigjools: you mean removing the existing ephemerals? | 07:28 |
bigjools | rvba: yes | 07:28 |
bigjools | rm -rf /var/lib/maas/ephemerals/* | 07:29 |
rvba | In the lab, we create a new VM every time we run tests. | 07:29 |
rvba | So this is long gone. | 07:29 |
bigjools | oh! | 07:29 |
bigjools | ok | 07:29 |
bigjools | thought you were playing around on one instance | 07:29 |
rvba | No. Yesterday, I saw it on one instance, then Diogo saw the same problem on a completely different instance. | 07:30 |
bigjools | dandy | 07:30 |
jtv | Nor is it anything like a single, incidental flipped bit in the proxy — we've seen it on different images. | 07:30 |
rvba | jtv: it's clearly related to using a proxy. | 07:30 |
rvba | I ran the script without problems on canonistack instances yesterday (without using proxies). | 07:30 |
jtv | I believe you — but it's not a single, incidental flipped bit in its cache. | 07:30 |
rvba | Now I'm trying again on a canonistack instance but with a proxy this time. | 07:31 |
jtv | Now, proxies may easily do weird things with partial downloads, right? | 07:31 |
rvba | Maybe. | 07:31 |
rvba | I suspect the weirdness is on simplestreams' side. | 07:31 |
rvba | But I could be wrong. | 07:31 |
jtv | I wonder if it could be something like "proxy may not be sending exactly the same segment that you asked for, but script assumes that it does." | 07:32 |
rvba | Maybe. Let's first reproduce the problem in a clean environment. Then we can start reasoning about the problem. | 07:32 |
bigjools | rvba: install the same proxy in canonistack and get maas to use it | 07:33 |
bigjools | in fact we install a deb proxy by default anyway | 07:33 |
rvba | That's precisely what I've done :). | 07:34 |
bigjools | ha! | 07:34 |
bigjools | also is it my imagination or is the python script really slow? | 07:34 |
jtv | It's doing humongous downloads. | 07:35 |
rvba | It is painfully slow. But it has huge files to download. | 07:35 |
bigjools | I mean, it's taken 8 minutes to do two distroarchseries so far and the old script took about 1 minute TOTAL | 07:35 |
jtv | Quarter-gig files. | 07:35 |
bigjools | (on canonistack) | 07:35 |
bigjools | I smell a rat | 07:35 |
bigjools | I'm going to look at the simplestreams code | 07:35 |
rvba | Hum, I got a TCP_MISS_TIMEDOUT in squid and the script failed with unexpected checksum 'sha256'. | 07:36 |
rvba | But I don't remember seeing TCP_MISS_TIMEDOUT in the lab… | 07:36 |
bigjools | hmmmmm | 07:36 |
rvba | I probably need to configure the proxy better. | 07:36 |
bigjools | this is all related | 07:36 |
rvba | So that it can cope with very long downloads. | 07:36 |
jtv | bigjools: look for contentsource.py | 07:36 |
jtv | Unfortunately it has a lot of different ways of doing things. | 07:37 |
jtv | Fundamentally, it creates a file-like object with the URL, and does a seek() for the offset of the current batch. | 07:37 |
jtv | Oh, that's for "file://" URLs. | 07:38 |
jtv | There's a lot of different data paths here. | 07:38 |
rvba | bigjools: http://paste.ubuntu.com/6249803/ this is squid's logs | 07:38 |
rvba | The first request is made by simplestreams. | 07:38 |
rvba | The second one by wget. | 07:38 |
bigjools | batch? | 07:38 |
rvba | And simplestreams explodes with the checksum error right after the download occurs. | 07:39 |
bigjools | my conclusion is that it's buggy as hell | 07:39 |
bigjools | rvba: the log doesn't say if it's using Range: | 07:40 |
jtv | If we're going to look at the source, we'll need to know which method it chooses for reading URLs. | 07:40 |
jtv | File size for that first download is rather small, isn't it? | 07:41 |
jtv | I'm surprised that gets a 200 response. | 07:41 |
bigjools | the fact that this is slow as hell means it's doing something stupid | 07:41 |
jtv | And the wget request was faster? | 07:42 |
rvba | bigjools: you're right about the download time as well… it takes 11s to download an image with wget (without any proxy). | 07:42 |
bigjools | massively | 07:42 |
jtv | Actually, I've been assuming that the really huge number is the response size... what's that number right before the IP address? | 07:42 |
rvba | So simplestreams is really doing something stupid. | 07:42 |
jtv | I'm also a bit concerned about there being a deflate-aware version of the URL reading stuff. | 07:44 |
jtv | The combination with batches is one of those things that look like they could easily go wrong | 07:44 |
jtv | Ouch. The code that selects the http download mechanism is nontrivial, but quietly swallows any error and makes assumptions about the reason. | 07:48 |
jtv | Not a recipe for failure at all, that. | 07:48 |
bigjools | IOError: [Errno 28] No space left on device | 07:50 |
bigjools | crap | 07:50 |
rvba | bigjools: yeah, you need a large instance, and use /mnt instead of /var/lib/maas | 07:50 |
bigjools | what are you putting on /mnt? | 07:51 |
rvba | I create a symlink so that the images are stored on /mnt | 07:51 |
bigjools | is it already mounted then? | 07:52 |
rvba | bigjools: cd /mnt/ ; mv /var/lib/maas .; cd /var/lib/; ln -s /mnt/maas/ | 07:52 |
rvba | bigjools: yes | 07:52 |
bigjools | ok | 07:52 |
bigjools | oh I already have a /mnt with lots of space | 07:53 |
rvba | bigjools: yes you do :) | 07:53 |
* bigjools moves 10Gb over | 07:53 | |
jtv | We really should sabotage the code that cleans up the broken file. | 07:55 |
jtv | I think that's simplestreams/objectstores/__init__.py: | 07:56 |
jtv | if not cksum.check(): | 07:56 |
jtv | os.unlink(partfile) | 07:56 |
jtv | if orig_part_size: | 07:56 |
jtv | LOG.warn("resumed download of '%s' had bad checksum.", path) | 07:56 |
bigjools | https://bugs.launchpad.net/simplestreams/+bug/1240838 | 07:59 |
ubot5 | Ubuntu bug 1240838 in simplestreams "simplestreams is several orders of magnitude slower at downloading than wget" [Undecided,New] | 07:59 |
rvba | Yay! | 07:59 |
bigjools | ahem | 08:00 |
jtv | We've got batching, checksumming, and compression all handled at more or less the same level in the code. It'd be nice to have a clear indication of what's happening. | 08:03 |
jtv | I can think of horror scenarios for any combination. | 08:03 |
bigjools | jtv: prefer functions to do one simple thing | 08:06 |
jtv | Doesn't that lead to a puzzle of lots of small functions? :) | 08:08 |
bigjools | the puzzle remains the same - the reading it becomes easier | 08:09 |
bigjools | of it* | 08:09 |
jtv | In that case I shall give less attention to complaints about my small functions. :) | 08:09 |
bigjools | jtv: I like them :) | 08:10 |
bigjools | it's a trade-off of course | 08:10 |
jtv | Down with spaghetti code, yay for macaroni code. | 08:11 |
bigjools | more like lasagna | 08:11 |
* jtv wonders whether it's a good idea to define "pizza code" in any detail | 08:11 | |
rvba | bigjools: I really wonder what that script is doing… it takes minutes before I see a single hit in the proxy's cache. | 08:12 |
bigjools | ! | 08:12 |
bigjools | please add that as a data point to the bug | 08:12 |
jtv | The proxy only reports at the end of the request, not the start. | 08:13 |
rvba | bigjools: of course the log entry is written *after* the download but since we know that the download itself is pretty quick, it does not explain anything… | 08:13 |
jtv | Or so it seems to me — Apache does that, plus, it reports a return code doesn't it? | 08:13 |
rvba | bigjools: I will, of course. | 08:13 |
bigjools | rvba: or is it quick ... | 08:14 |
bigjools | we need debugging in the main script | 08:14 |
rvba | Yeah, that's the only option left… | 08:14 |
rvba | :/ | 08:14 |
jtv | I think we need debugging code in the simplestreams code... For starters, which "opener" we get. | 08:16 |
jtv | Well no, for _starters_, we need that file so we can look at it. | 08:16 |
gmb | allenap: When you wrote the comment on line 248 of http://bazaar.launchpad.net/~maas-maintainers/maas/trunk/view/head:/src/maascli/api.py#L248, what were you actually meaning? | 08:26 |
* bigjools guesses unicode | 08:27 | |
bigjools | and print leaves a \n IIRC | 08:28 |
bigjools | gmb: he's not around today BTW | 08:28 |
gmb | bigjools: Right, hence the "files downloaded with the API get an extra newline" | 08:28 |
gmb | OIC. | 08:28 |
bigjools | yeah - easy to fix I reckon :) | 08:29 |
gmb | bigjools: Oh, definitely :). Question: do we _want_ that extra newline for prettification of textual output to stdout? | 08:29 |
gmb | (If so, I'll put it on stderr, which is what we do for headers when printing those out) | 08:29 |
bigjools | there's a fun question | 08:29 |
bigjools | we probably do | 08:30 |
bigjools | and that's a neat solution | 08:30 |
gmb | Cool beans. | 08:30 |
allenap | gmb: I think it was two things, "trailing newline" and "might encode", but you seem to be there already. | 08:34 |
gmb | allenap: Right - I'm on dangling modifier patrol today, so I didn't know if it was two statements or one. | 08:40 |
rvba | smoser: ping | 11:28 |
smoser | rvba, here. | 12:33 |
rvba | Hi smoser, I'd like to talk about bug 1240652. | 12:33 |
ubot5 | bug 1240652 in MAAS "maas-import-ephemerals crashes with "unexpected checksum 'sha256'" when using a proxy" [Critical,Triaged] https://launchpad.net/bugs/1240652 | 12:33 |
rvba | smoser: my testing shows that simplestreams is simply stuck in contentsource.py:RequestsUrlReader.read | 12:34 |
smoser | rvba, can i see a system where you reproduced ? | 12:35 |
rvba | After a while, the proxy times out, closes the request; then simplestreams tries to checksum a partial file and that fails. | 12:35 |
rvba | smoser: yes, one sec | 12:35 |
smoser | i can follow your doc its fine in comment f4 except for squid config. that'd be the only part dependent on maas. | 12:35 |
rvba | smoser: right, I didn't do any modification to the config, so it's whatever maas ships. | 12:36 |
rvba | smoser: and we also saw the same problem in the lab, using a different proxy (i.e. not the one shipped with maas). | 12:37 |
rvba | smoser: ssh ubuntu@10.55.61.23 | 12:38 |
rvba | smoser: join the byobu session there, I'll show you around | 12:38 |
smoser | rvba, k. there are 2 things. | 12:39 |
smoser | which url reader are we using. | 12:39 |
rvba | smoser: the default one, python-requests | 12:40 |
rvba | smoser: I put lots of debugging output :) | 12:41 |
smoser | yeah. so i got rid of that, and now it seems functional. | 12:41 |
smoser | this sucks. | 12:41 |
smoser | rvba, so, it uses python-requests if it is found with sufficient version. | 12:43 |
smoser | the reason for using it is that it supports compressed encoding | 12:43 |
smoser | (which clearly isn't useful on already compressed data, but for metadata it is) | 12:43 |
rvba | smoser: I see… well, using a different url reader it seems to work indeed. | 12:44 |
rvba | Which, like you said, sucks. | 12:44 |
smoser | rvba, i found http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=712915 | 12:51 |
ubot5 | Debian bug 712915 in python-requests "python-requests: adapters.py uses undefined name ProxyManager" [Important,Fixed] | 12:51 |
smoser | earlier in the cycle. | 12:51 |
smoser | so its fairly clear that this is unfortunately not really well tested code | 12:51 |
smoser | i'll debug, rvba | 12:51 |
smoser | thanks for your debugging. | 12:51 |
smoser | whats weird is that we could not reproduce in the lab after the first time. | 12:52 |
rvba | Thanks for your help smoser. | 12:52 |
rvba | I think that's because the partial download stuff hides the problem somehow. | 12:52 |
smoser | rvba, | 14:15 |
smoser | why can't i reproduce with | 14:16 |
smoser | http://paste.ubuntu.com/6251217/ | 14:16 |
smoser | any ideas? the "this fails" doesn't really fail. | 14:17 |
rvba | smoser: let me try and see if I can spot a difference with when we run maas-import-ephemerals… | 14:25 |
smoser | found it | 14:25 |
smoser | python2 -> python3 fixes | 14:26 |
smoser | joy. | 14:26 |
rvba | Great joy indeed :) | 14:26 |
smoser | rvba, fwiw, https://bugs.launchpad.net/simplestreams/+bug/1240838 | 14:53 |
ubot5 | Ubuntu bug 1240838 in simplestreams "simplestreams is slower at downloading than wget" [Undecided,Triaged] | 14:53 |
smoser | "it takes several minutes before the proxy is actually used" is probably wrong | 14:54 |
smoser | i've been confused by this before. | 14:54 |
smoser | squid prints to log only after downloading | 14:54 |
rvba | smoser: yeah, that's the same isse. | 14:54 |
rvba | issue* | 14:54 |
rvba | Yep. | 14:54 |
smoser | it is very confusing when you're tailing logs :) | 14:54 |
smoser | and downloading large files. | 14:54 |
rvba | But since the download was supposed to take 10 seconds, it didn't make sense for the script to be blocked for several minutes. | 14:54 |
rvba | Now we know this is all because of python-requests acting up. | 14:55 |
jpds | smoser: Hey. | 15:01 |
smoser | matsubara, rvba fudge. | 15:01 |
smoser | https://bugs.launchpad.net/maas/+bug/1240652 | 15:01 |
ubot5 | Ubuntu bug 1240652 in MAAS "maas-import-ephemerals crashes with "unexpected checksum 'sha256'" when using a proxy" [Critical,Triaged] | 15:01 |
smoser | " | 15:01 |
smoser | FWIW, I ran into this bug yesterday while testing the ISO and there was no proxy involved. | 15:01 |
smoser | " | 15:01 |
smoser | i sort of dont believe that. as i cannot recreate it, and *with* a proxy, its trivial to recreate. | 15:01 |
smoser | oh. wait. that is the locking bug. | 15:01 |
smoser | that is basically understood. | 15:02 |
smoser | never mind. | 15:02 |
smoser | jpds, whats up? | 15:02 |
jpds | smoser: You heard any reports about pserv failing to come back up on an upgrade from yesterday's package? | 15:02 |
rvba | smoser: really? There is no cron-like system running the script at the moment, so I don't believe in the locking thing :) | 15:02 |
rvba | smoser: I don't see why the bug we saw couldn't be triggered from time to time without the usage of a proxy. | 15:03 |
jpds | smoser: I just upgraded, and maas-pserv went away: http://paste.ubuntu.com/6251438/ | 15:03 |
jpds | smoser: But doing: service maas-pserv start --- made it come back. | 15:03 |
rvba | We just know it can be reproduced with a proxy. We're not sure it only happens with a proxy. | 15:03 |
smoser | roaksoax, ^ see jpds above, thoughts? | 15:05 |
jpds | I can see in line 86 that pserv was supposed to come back, but then it disappeared... | 15:05 |
smoser | jpds, /var/log/upstart/maas-pserv.log have anything? | 15:07 |
roaksoax | uhmmm | 15:07 |
roaksoax | yeah the upstartlog should show more | 15:08 |
jpds | http://paste.ubuntu.com/6251469/ | 15:08 |
smoser | release week is fun. | 15:10 |
roaksoax | yeah | 15:11 |
roaksoax | uhmm that's a new message i've never seen before | 15:13 |
jpds | Of course, there's no timestamp which helps. | 15:13 |
smoser | timestamps are for weenies | 15:16 |
smoser | hm.. | 15:31 |
smoser | it would seem that if maas is expecting any value out of squid3 that it needs to configure it. | 15:31 |
smoser | #Default: | 15:31 |
smoser | # maximum_object_size 4 MB | 15:31 |
smoser | anything over that size (from /etc/squid3/squid.conf) is not getting cached. | 15:32 |
smoser | so small debs will get cached. but images not nor large debs. | 15:32 |
matsubara | smoser, the proxy in the qa lab has maximum_object_size 1000 MB which should be enough to cache the images (not sure if your comments above are related to bug 1240652) | 15:35 |
ubot5 | bug 1240652 in MAAS "maas-import-ephemerals crashes with "unexpected checksum 'sha256'" when using a proxy" [Critical,Triaged] https://launchpad.net/bugs/1240652 | 15:35 |
smoser | well, they are related. | 15:36 |
smoser | but the point still stands. | 15:36 |
smoser | squid3 default config is basically useless to maas. | 15:36 |
smoser | so if we're installing it, we need to be configuring it | 15:36 |
smoser | (also, default is no on disk cache, memory only) | 15:36 |
matsubara | smoser, in any case, when I tested the ISO and got the checksum error, I wasn't using a proxy but my connection to the datacenter is not so great. Could it be that I managed to trigger the bug locally without a proxy because m-i-e was waiting for data, didn't get it for some time, then it retried the download, triggering the bug? | 15:38 |
matsubara | smoser, I think squid3 is part of the dependencies for maas because of the s-d-p. I don't think we require the default install of MAAS to have any proxy configured. | 15:39 |
smoser | ah. i forgot that we have squid-deb-proxy | 15:40 |
smoser | yeah, the default config of that should make more sense. | 15:40 |
smoser | but still wont help for images | 15:40 |
smoser | i think i have a fix for bug | 16:19 |
smoser | http://paste.ubuntu.com/6251802/ | 16:19 |
smoser | matsubara, where did you do this test ? | 16:20 |
smoser | where you saw it from the iso | 16:20 |
smoser | i think there is a proxy in your path | 16:20 |
smoser | in your network path | 16:20 |
matsubara | smoser, locally on my laptop. I have a proxy running on my laptop too but I didn't use it during the install (i.e. the installer asks if I want to use a http proxy for which I answered no) | 16:21 |
smoser | you could be proxied along your path | 16:22 |
smoser | ? | 16:22 |
smoser | ie, another transparent proxy. | 16:22 |
roaksoax | Ruetobas: ping | 16:24 |
roaksoax | err | 16:24 |
roaksoax | sorry | 16:24 |
roaksoax | rvba: ping | 16:24 |
matsubara | smoser, I might. I'm pretty sure there's no transparent proxy inside my network. If my ISP has a transparent proxy running, then it's beyond my power to change any config (and would make this bug even more critical). Is there any way to check if I'm going through a transparent proxy? I guess not right, that's the point of it being transparent | 16:29 |
smoser | matsubara, only to find bugs like this :) | 16:29 |
roaksoax | allenap: around? | 16:30 |
smoser | you're right. clearly its not your fault, and you have no control. i was just trying to explain what i think is wrong but didn't make sense for your path. | 16:30 |
roaksoax | adam_g: where's the wikipage that you were maintaining showing how to install openstack with maas & juju? https://help.ubuntu.com/community/UbuntuCloudInfrastructure ? | 17:09 |
adam_g | roaksoax, here? https://wiki.ubuntu.com/ServerTeam/OpenStackHA | 17:10 |
roaksoax | adam_g: that's for the HA stuff.. i mean the one you were maintaining on help.ubuntu.com/community? | 17:11 |
adam_g | roaksoax, there was one at the URL you posted, dunno where it got moved to | 17:14 |
adam_g | roaksoax, jorge also started https://help.ubuntu.com/community/UbuntuCloudInfrastructure/JujuBundle. | 17:14 |
roaksoax | adam_g: thanks! | 17:14 |
Spideyman | does anyone here have much experience with MAAS restful API? What authentication header do I need to send with the key? | 19:52 |
Spideyman | Authorization: Basic" | 19:52 |
Spideyman | ? | 19:52 |
Spideyman | X-Auth? | 19:53 |
smoser | Spideyman, i'd look at maas-cli | 19:56 |
smoser | (as in look there for the details) | 19:56 |
Spideyman | smoser got it, thanks | 19:58 |
bigjools | smoser: please can you help debug https://bugs.launchpad.net/bugs/1240838, it makes ephemeral download practically unusable | 22:52 |
ubot5 | Ubuntu bug 1240838 in simplestreams "simplestreams is slower at downloading than wget" [Undecided,Triaged] | 22:52 |
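
Note on jtv's 07:55 suggestion to stop the cleanup code from deleting the broken file: a minimal sketch of what that could look like, assuming a helper that checks a SHA-256 checksum and renames a bad download instead of unlinking it. The function name `verify_download`, the `keep_failed` flag and the `.failed` suffix are hypothetical and are not part of simplestreams.

```python
import hashlib
import os


def verify_download(path, expected_sha256, keep_failed=True,
                    chunk_size=1024 * 1024):
    """Return True if the file at `path` has the expected SHA-256 digest.

    On a mismatch the file is renamed to `path + ".failed"` (hypothetical
    naming) when `keep_failed` is set, so that it can be inspected later
    with tools such as "strings" and "cmp". Otherwise it is deleted.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        # Read in chunks so that quarter-gig images do not sit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    if digest.hexdigest() == expected_sha256:
        return True
    if keep_failed:
        os.replace(path, path + ".failed")
    else:
        os.unlink(path)
    return False
```

With the failed file kept around, its size alone would have answered the 07:18 question about whether the download was truncated or merely corrupt.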
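
Note on rvba's 12:35 diagnosis (the proxy times out and closes the request, then a partial file gets checksummed): one way to surface that earlier is to compare the number of bytes received against the size the index advertises, and fail before any checksum is attempted. The sketch below assumes the expected size is known up front; `copy_checked` and `TruncatedDownload` are hypothetical names, and this is not how simplestreams or python-requests handle it.

```python
class TruncatedDownload(IOError):
    """Raised when a stream ends before the expected number of bytes."""


def copy_checked(source, sink, expected_size, chunk_size=64 * 1024):
    """Copy the file-like `source` into `sink`, counting bytes on the way.

    Raises TruncatedDownload if the stream ends with a different number
    of bytes than `expected_size`, which gives a clearer error than a
    later "unexpected checksum" on a partial file.
    """
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        copied += len(chunk)
    if copied != expected_size:
        raise TruncatedDownload(
            "expected %d bytes, got %d" % (expected_size, copied))
    return copied
```

A size check like this does not replace the checksum; it only separates "the connection was cut short" from "the bytes are wrong", which the log above shows were hard to tell apart.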