[05:24] <rsalveti> apw: opened bug 1597573 and bug 1597574 for amd overdrive, including the patches
[05:25] <rsalveti> still testing if something else is still missing
[14:13] <lamont> 4.4.0-24-generic kernel (xenial), trying to talk ipv6 through a firewall running 3.13.0-91-generic (trusty), and it unearths issues with ipv6 neighbor discovery??  http://paste.ubuntu.com/18170606/ (on the firewall, I see the icmp6 echo reply come in, which tells me that's why we're doing neighbor solicitation)  the trace is from the xenial host
[14:13] <lamont> apw: ^^ was it you who said you saw something similar?
[14:14] <lamont> and with ipv6 hurting, google is not talking to me.
[14:14] <apw> i have been seeing some oddness, but it all seems to have gone happier for me recently
[14:14] <lamont> just don't reboot the firewall....
[14:15] <apw> my firewall is not ubuntu so i won't see that any more
[14:28] <lamont> ah.. this is one that happens to me every time I reboot the (trusty) firewall, and then it's broken until I start really poking at it to debug it, and then it starts working for reasons unknown until I reboot again
[14:29] <lamont> apw: neighbor discovery, the last 4 of the ff02:: address have bit 3 flipped?
[14:32] <apw> i am not sure what the algorithm for working out the multicast group to broadcast at is
[14:35] <apw> lamont, no that is correct, you just happen to have fexx:yyyy as your address, but the multicast group address is :ffxx:yyyy
[14:36] <apw> it is only dropping in three octets' worth
[14:36] <apw>       Solicited-Node Address:  FF02:0:0:0:0:1:FFXX:XXXX
[14:39] <lamont> ah, ok
[14:39] <lamont> 'twas a headscratcher there.
[14:39] <apw> lamont, had me going for a bit there, especially as you have to follow through like 3 rfcs to find the info
[14:55] <lamont>     inet6 2601:282:8100:3500:24c:40ff:fe1a:c570/64 scope global mngtmpaddr dynamic 
[14:55] <lamont>        valid_lft 299sec preferred_lft 119sec
[14:56] <lamont> apw: ^^ fwiw, that's the address line on the host that's ignoring the solicitations
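The solicited-node mapping apw describes (low three octets of the unicast address dropped into FF02:0:0:0:0:1:FFXX:XXXX) can be sketched in Python for the address lamont pasted. This is an illustration only: the function names are made up for this note, and the 33:33 Ethernet mapping in the second helper is the standard IPv6-over-Ethernet multicast rule rather than something stated in the log.

```python
import ipaddress


def solicited_node(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group for a unicast IPv6 address."""
    ip = ipaddress.IPv6Address(addr)
    low24 = int(ip) & 0xFFFFFF  # the "three octets' worth" taken from the address
    prefix = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(prefix | low24)


def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Map an IPv6 multicast group to its Ethernet address (33:33 + low 32 bits)."""
    low32 = (int(group) & 0xFFFFFFFF).to_bytes(4, "big")
    return "33:33:" + ":".join(f"{b:02x}" for b in low32)


if __name__ == "__main__":
    group = solicited_node("2601:282:8100:3500:24c:40ff:fe1a:c570")
    print(group)                 # ff02::1:ff1a:c570
    print(multicast_mac(group))  # 33:33:ff:1a:c5:70
```

So the host with the address above is expected to be listening on ff02::1:ff1a:c570, and the "fe" versus "ff" difference lamont noticed is just where the fixed FF prefix byte meets the copied 24 bits.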
[14:56] <apw> where are these solicitations passing to from ?
[14:57] <apw> you are implying they are through the firewall i think ?
[15:00] <lamont> apw: the firewall is attempting to solicit the host, so that it can send the icmp6 echo reply
[15:00] <lamont> you ready for how to fix it?
[15:00] <lamont> on the affected host: ip link set promisc on dev br0; sleep 1; ip link set promisc off dev br0
[15:01] <lamont> which is to say, 4.4.0-24-generic (and others, sooo many others) are sometimes failing to correctly set a multicast address on the nic
[15:01] <lamont> 04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
[15:01] <apw> hrm, how can you tell whether the multi-cast address is set ?
[15:02] <lamont> damn fine question there
[15:02] <apw> i wonder if ethtool can tell
[15:02] <lamont> http://paste.ubuntu.com/18173670/
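One partial way to approach apw's question is to list the IPv6 multicast groups the kernel reports as joined on an interface. The sketch below parses text in the format of /proc/net/igmp6, assumed here to be whitespace-separated columns of interface index, interface name, and the group as 32 hex digits, followed by counters. The helper name is hypothetical. It reflects what the IPv6 stack has joined, not what is programmed into the NIC filter or the bridge snooping table, so a group showing up here does not rule out the failure being discussed.

```python
import ipaddress


def joined_groups(igmp6_text: str, ifname: str) -> list:
    """Return the IPv6 multicast groups listed for ifname in igmp6-style text."""
    groups = []
    for line in igmp6_text.splitlines():
        fields = line.split()
        # Assumed columns: index, interface name, group as 32 hex digits, ...
        if len(fields) < 3 or fields[1] != ifname:
            continue
        groups.append(ipaddress.IPv6Address(bytes.fromhex(fields[2])))
    return groups


if __name__ == "__main__":
    with open("/proc/net/igmp6") as f:
        for group in joined_groups(f.read(), "br0"):
            print(group)
```

`ip -6 maddr show dev br0` gives a similar view from the command line.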
[15:03] <lamont> of note: br0 is a bridge containing the eth0.2 interface
[15:03] <lamont> presumably this would be a bug to file against the linux ubuntu source package?
[15:04] <apw> yeah i guess so
[15:05] <lamont> it also fails on another box which has br0 containing eth0
[15:06] <apw> interesting, that must be the trigger somehow
[15:11] <lamont> Setting the bridge to promisc and turning it back off works around the issue.
[15:11] <lamont> Tcpdump on the underlying eth0.2 does not.
[15:12] <apw> lamont, let's get allllll that in a bug
[15:13] <lamont> yep
[15:13] <lamont> that was cut-n-waste from my bug window
[15:17] <lamont> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1597806 <-- apw
[15:18] <lamont> apw: for additional hilarity, I have some windows 10 users on the network, who also experience the issue
[15:23] <apw> sounds like the same problem
[15:24] <lamont> yep
[15:25] <lamont> win 8.1 apparently, not 10
[15:30] <apw> lamont, https://www.v13.gr/blog/?p=378 i wonder if this helps any ... tl;dr disable igmp snooping on the bridge
[15:31] <lamont> ip neigh <-- TIL
[15:31] <lamont> cat /sys/devices/virtual/net/br0/bridge/multicast_snooping
[15:31] <lamont> 1
[15:32] <lamont> iface br0:ipv6 inet6 auto
[15:32] <lamont>   up ip addr add fe80::2/64 dev $IFACE
[15:32] <lamont>   up ip addr add fdd7:308c:d9cf::20/64 dev $IFACE
[15:32] <lamont>   down ip addr del fe80::2/64 dev $IFACE
[15:32] <lamont> ^^ my eni
[15:32] <apw> if it is reproducible, it might be good to try turning her off
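A minimal sketch of reading and flipping the snooping knob lamont showed with cat, assuming the same sysfs path as in the log and that writing 0 to the file turns snooping off (root is needed for the write). The function names and the `root` parameter are illustrative, added so the helpers can be pointed at a scratch directory. The exact commands lamont ran are in the bug comment, not in this log.

```python
from pathlib import Path

# Path taken from the log: /sys/devices/virtual/net/br0/bridge/multicast_snooping
SYSFS_NET = Path("/sys/devices/virtual/net")


def snooping_path(bridge: str, root: Path = SYSFS_NET) -> Path:
    """Build the path to a bridge's multicast_snooping attribute."""
    return Path(root) / bridge / "bridge" / "multicast_snooping"


def snooping_enabled(bridge: str, root: Path = SYSFS_NET) -> bool:
    """True if the bridge currently reports multicast snooping as on."""
    return snooping_path(bridge, root).read_text().strip() != "0"


def set_snooping(bridge: str, enabled: bool, root: Path = SYSFS_NET) -> None:
    """Write 1 or 0 to the attribute; needs root on a real system."""
    snooping_path(bridge, root).write_text("1" if enabled else "0")


if __name__ == "__main__":
    print("br0 snooping:", snooping_enabled("br0"))
```

Calling `set_snooping("br0", False)` is the Python equivalent of echoing 0 into that file. The setting does not persist across reboots unless it is also added to the interface configuration.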
[15:38] <lamont> apw: and if it fixes it, it would be likely a good thing to JFD in the kernel...
[15:44] <apw> lamont, right ... and it is a good starting point to understand as well
[15:45] <lamont> comment added
[15:46] <lamont> apw: bug updated
[15:47] <apw> lamont, nice, is that definitive proof that it helps then, or do we need to wait a bit for confirmation
[15:48] <lamont> positively fixes the issue
[15:48] <lamont> what's missing from the comment is that the ping started working once the neighbor advertisement hit
[15:48] <lamont> (duh)
[15:48] <apw> cool.  then we need to decide if we just turn that off by default on bridges
[15:49] <lamont> I would recommend either turning that off, or ipv6 off. :p
[15:49] <apw> :)
[15:49] <lamont> just sayin'
[19:09] <lamont> apw: I think actually, the preferred answer is (c) make multicast_snooping not break neighbor discovery
[19:10] <lamont> I suspect it's just nomming on too much