=== rberger_ is now known as rberger
=== mathiaz_ is now known as mathiaz
[22:31] elmo: I just noticed that us-east-1.ec2.archive.ubuntu.com resolves to four IP addresses now (within EC2).
[22:31] elmo: Are these in different availability zones?
[22:31] erichammond: yep, they are!
[22:31] nice
[22:32] elmo: Are there DNS names for the individual hosts so that I can add failover to my apt.sources?
[22:35] For example, us-east-1.ec2.archive.ubuntu.com would be the round robin for load balancing and the individual hosts might have names like us-east-1-mirror1, -mirror2, -mirror3, -mirror4
[22:35] If I just add us-east-1 to /etc/apt/sources.list (as defaulted in lucid AMI) this provides load balancing, but if the IP address I happen to get is down, then I have no failover.
[22:36] erichammond: hmm, I thought we tested this and if the IP address is down, apt will give up and try the next one - or am I misremembering?
[22:37] elmo: We tested it and it does not retry. In fact, the apt software may never even get the chance to see multiple IP addresses.
[22:37] I'm currently using the RightScale Ubuntu mirrors which have the individual host names as well as the round robin name.
[22:37] really? sorry, can you remind me why it won't see the multiple IP addresses?
[22:38] I might be wrong on that, but I thought it simply asks DNS for an IP address and gets one of them randomly.
[22:39] I do know that I tested this when one of the Canonical archives was down and it did not retry with the archive that was up.
[22:39] With RightScale I list the sequence: roundrobin, mirror1, mirror2, mirror3.
[22:40] This gets load balancing from the "roundrobin" name. If the IP address I happen to request is down, it downloads packages from the next available mirror.
[22:41] it definitely gets all of the IPs back - and I know a web browser will retry the next IP if one IP of a round robin is down
[22:41] There is a slight added expense of having to get the "apt-get update" from all mirrors, but at least the "upgrade" only comes from the first match.
[22:41] I'll check with apt; the reason I'm reluctant is that we use the same DNS round robin for failover for archive.ubuntu.com proper
[22:41] so if it really doesn't work with apt that's a big problem
[22:42] fair 'nuff
[22:42] DNS RR isn't ideal, it doesn't cover the case of a server timing out rather than being completely down, but it definitely should do basic failover
[22:43] It should be easy to test if you have access to a DNS server.
[22:45] sure - it's more that I need to pack and sleep - but I'll open an RT ticket about it and get someone on my (former) team to check into it - do you want to be Cc-ed?
[22:49] elmo: I love being in the loop, thanks :)
[23:13] elmo: Looks like there's no need to create an RT ticket.
[23:13] apt-get in Ubuntu 1.04 Lucid does cycle through the different IP addresses when one or more are down.
[23:13] It even shows you which one it's trying as it tests each one.
[23:13] \o/
[23:14] I'm pretty sure that it didn't do this back in Hardy, so it must have been added in the last two years.
[23:14] Since I'm upgrading everything to Lucid (gradually) I'm not going to worry about it.
[23:14] cool
[23:14] er, Ubuntu "10.04"
[23:26] elmo: Looks like I'm going to have to lose face some more. I just did tests with apt-get on Hardy and it has the same failover behavior with round robin DNS entries. I have no explanation for the failure I remember. Hopefully I remember this test and conversation and don't bother you again in another year.
[23:28] Hm, I wonder if there are different failure modes, some of which retry and some that don't.
[23:28] hehe
[23:28] there could be - in particular, a network failure that doesn't return immediate failure will still have bad behaviour
[23:29] Yes, it's very slow (which allows me to see that it's trying different IP addresses)
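
Note: the failover behaviour discussed above (a client receives every address behind a round robin name and moves on to the next one when a connection fails) can be sketched roughly as follows. This is only an illustration in Python, not apt's actual code; the function names `resolve_all` and `connect_with_failover`, the port, and the timeout value are all assumptions made for the example.

```python
import socket


def resolve_all(host, port=80):
    """Return every distinct IPv4 address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # keep resolver order, drop duplicates
            addresses.append(ip)
    return addresses


def connect_with_failover(addresses, port=80, timeout=5.0,
                          connect=socket.create_connection):
    """Try each address in order and return (ip, connection) for the first success.

    `connect` is injectable (hypothetical parameter) so the logic can be
    exercised without a network. A host that silently drops packets costs
    the full `timeout` before the next address is tried.
    """
    errors = []
    for ip in addresses:
        try:
            conn = connect((ip, port), timeout=timeout)
            return ip, conn
        except OSError as exc:  # refused, unreachable, or timed out
            errors.append((ip, exc))
    raise OSError(f"all {len(addresses)} addresses failed: {errors}")


# Example use (requires network access):
#   ips = resolve_all("us-east-1.ec2.archive.ubuntu.com")
#   ip, conn = connect_with_failover(ips)
```

The timeout is what makes the "very slow" case visible: a server that refuses connections fails immediately, while one that never answers holds the client for the whole timeout before the next address is tried.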