/srv/irclogs.ubuntu.com/2021/10/06/#snappy.txt

mborzeckimorning05:54
mardy'morning mborzecki 06:13
zyga-mbpgood morning06:30
mborzeckimardy: zyga-mbp: hey06:32
mardyzyga-mbp: hi!06:41
pstolowskimorning07:02
zyga-mbphej pstolowski 07:02
mardyhi pstolowski 07:33
pstolowskihey mardy 07:35
pstolowskigood morning mvo !07:35
mvogood morning pstolowski 07:36
zyga-mbpgood morning mvo 07:41
zyga-mbplong time no see07:41
mvozyga-mbp: hey, indeed! nice to see you07:42
* zyga-mbp I was busy on my windows system lately, doing some fun and exciting stuff :)07:42
zyga-mbpit's about deploying Raspberry Pi 4 as a LAVA dispatcher for testing zephyr boards07:43
* zyga-mbp did his first cloud-init based nested setup07:43
zyga-mbpthere's ubuntu, snaps and lava in the mix07:43
zyga-mbpwhat's not to like :)07:43
zyga-mbpoh and landscape too07:44
* zyga-mbp mvo check out my detailed instructions and the photos of the assembled set https://git.ostc-eu.org/OSTC/infrastructure/rpi4-metal-setup :)07:45
mvozyga-mbp: oh, nice!07:46
zyga-mbpI will be setting up a few of those tomorrow, should fill one shelf in the rack07:47
zyga-mbpand finally some zephyr :)07:47
mupPR snapd#10866 closed: many: replace state.State restart support with overlord/restart <Created by pedronis> <Merged by pedronis> <https://github.com/snapcore/snapd/pull/10866>08:28
dn___I have a desperate problem with snap (I think). I see `Switch "lxd" snap to cohort "+"` in snap changes and this kills my lxd cluster; I then have to remove lxd, install lxd and restore from a savepoint;09:07
dn___What does the `cohort '+'` mean? And how can I stop it?09:08
dn___I see e.g. 09:19
dn___119  Done    today at 03:34 UTC      today at 03:34 UTC      Switch "lxd" snap to cohort "+"09:20
dn___several times in the log after each other, at some point it stops and lxd is broken and I need to remove/install it again to make it work; 09:20
pstolowskidn___: that "+" looks very weird, i cannot see anything obvious in the code (yet). you haven't played with cohorts before, have you? you can try snap switch --leave-cohort lxd09:31
dn___pstolowski: never - what does leaving mean? (I don't want to wreak havoc in the cluster again)09:32
dn___I've been kinda struggling with snap/lxd for a while; most of the time it works fine - sometimes it starts upgrading, but not all nodes update at the same moment and then the cluster breaks - but doesn't shut down; this cohort thingie shut down lxd and all VMs 09:32
pstolowskidn___: see https://forum.snapcraft.io/t/managing-cohorts/8995 ; leaving a cohort basically means that lxd snap would not be constrained by the given cohort when refreshing. what does 'snap info lxd' show - is there a cohort listed?09:48
dn___pstolowski I ran the command just now - but let me check if I have an old output09:49
dn___http://pastie.org/p/2Dc7yuneXwLZFEpe0nUfjz this was before I ran it - and http://pastie.org/p/31hdtXvVAag2PeUp9ORLJh after09:53
mborzeckimvo: can you land https://github.com/snapcore/snapd/pull/10882 ?09:54
mupPR #10882: tests/main/interfaces-many: run both variants on all possible Ubuntu systems <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10882>09:54
mborzeckimvo: and this one too: https://github.com/snapcore/snapd/pull/10884 09:56
mupPR #10884: tests/main: disable cgroup-devices-v1 and freezer tests on 21.10 <Simple 😃> <Created by bboozzoo> <https://github.com/snapcore/snapd/pull/10884>09:56
pstolowskidn___: that's very weird, there is no trace of cohort in the first output. something tried to switch cohort for lxd snap but I've no idea what. do you use any kind of orchestration for your clusters - something that could automatically do something like this (e.g. snap switch --cohort ... or snap refresh --cohort=... lxd)?10:00
dn___pstolowski: hm - the servers are bare metal, set up via ansible - but that was like 1y ago :)  otherwise I only use the lxc create/destroy via API, nothing else/nothing fancy10:04
dn___pstolowski: the only other thingie I got as piece of info: it seems like the cluster goes out of sync from time to time (lxd) - and this seems to be related to snap updating/refreshing nodes10:05
dn___pstolowski: http://pastie.org/p/0DzGQNrXA1mdwKbjWQRjIY if that is any clue/help - the remove/install and restore is me manually fixing the node10:06
pstolowskidn___: it's a bit mysterious what happened, the "+" isn't even a valid cohort id afaik.10:08
pstolowskidn___: also, "Switch "lxd" snap to cohort ..." task can only be a result of manual invocation of "snap switch.." command (or call to snapd's rest api)10:09
dn___oh, I ran it manually on all instances - but it happened again :/10:10
dn___cluster down - what a joy; anything you want me to check before I do remove lxd, install lxd?:) I'm so glad for any idea10:10
dn___s10:10
pstolowskidn___: is lxd in cohort "+" on these other instances?10:10
dn___http://pastie.org/p/2TrdClykJi6sQUGCQrN4Zw10:11
pstolowskidn___: oh it got switched to "+" again?10:11
dn___pstolowski: I'm not 100% sure how to check; but all throw the same error before they kill LXD in a bad way10:11
dn___yeap :/10:11
dn___also -> this then leads to `-bash: /snap/bin/lxc: No such file or directory`10:12
dn___If I now stop lxd; remove it, install it, restore from saved, start it - it will be fine again10:12
dn___also, removing is buggy then too, and I need to do it manually10:13
dn___http://pastie.org/p/0CGlgR34Fn0onbyoEJ1EvD10:13
dn___ls -al /var/snap/lxd/21497/10:14
dn___drwxr-xr-x 2 root root 4096 Oct  6 09:52 .10:14
dn___drwxr-xr-x 3 root root 4096 Oct  6 10:13 ..10:14
dn___it keeps an empty dir 10:14
dn___also -> `lxd               21624  latest/stable  canonical✓  disabled,broken,in-cohort`10:14
dn___while broken10:15
pstolowskidn___: so I think the magical "Switch "lxd" snap to cohort "+"" is done by something else outside of our control (it's not snapd)10:15
pstolowskiand this breaks everything10:15
ograogra@anubis:~$ grep -r cohort /snap/lxd/current/*10:17
pstolowskiwe shouldn't fall over this for sure so this looks like a problem10:17
dn___hmm, any idea what it could be? It's a bare metal machine and nothing besides LXD/snap runs on it/is installed + the VMs10:17
ogra/snap/lxd/current/commands/daemon.start:    nsenter -t 1 -m snap switch lxd --cohort=+ >/dev/null || true10:17
ograpstolowski, ^^^10:17
pstolowskiouch10:17
dn___right after install snap list shows `lxd     4.19      21654  4.19/candidate  canonical✓  -`10:17
ograyes10:17
dn___but after a moment it shows10:17
dn___`lxd     4.19      21654  4.19/candidate  canonical✓  in-cohort`10:17
mupPR snapd#10874 closed: gadget: mv ensureLayoutCompatibility to gadget proper, add gadgettest pkg <Created by anonymouse64> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/10874>10:18
mupPR snapd#10882 closed: tests/main/interfaces-many: run both variants on all possible Ubuntu systems <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10882>10:18
mupPR snapd#10884 closed: tests/main: disable cgroup-devices-v1 and freezer tests on 21.10 <Simple 😃> <Created by bboozzoo> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10884>10:18
mupPR snapd#10892 opened: tests: add (strict) microk8s smoke test <Created by mvo5> <https://github.com/snapcore/snapd/pull/10892>10:18
pstolowskiok looking at the comment there "+" means something special10:19
dn___thank you very much; will try to get the node up again :/10:19
dn___does it maybe help if I try to downgrade the cluster from 4.19/candidate to 4.18/stable - and is that even possible?:)10:21
ograuh, you are running production from a candidate channel ? 10:22
pstolowskidn___: might be worth filing a bug against snapd+lxd with all the details (and snap changes list) if it is reproducible10:22
dn___@o10:22
dn___@o10:22
dn___ogra: sorry for spam / not on purpose... I think10:22
dn___I might have wanted 4.X when I did the setup, and it was just a candidate then - and I forgot about it :/  10:22
dn___Is there a way of downgrading it to 4.18/stable?10:23
pstolowskisnap refresh --stable lxd10:23
dn___I would need this on all nodes + is there any issue e.g. with sqlite/migration/something? (just wondering - never did that)10:24
pstolowskidn___: i don't know, may make sense to ask this on snapcraft.io forums, or maybe stgraber ^ can advise if he is online10:26
ograand you should probably file a bug about the cohort thing so it does not move from candidate to stable later ... 10:27
dn___thank you - do you know by chance if all 'config/settings/sqlite' are stored  in the /var/snap/lxd folder? If so I would stop it, backup, reinstall stable and just try10:27
dn___ogra: got a url to a tracker/where to file it?10:27
ograogra@anubis:~$ snap info lxd|grep contact 10:28
ogracontact:   https://github.com/lxc/lxd/issues10:28
ogratry there 🙂10:28
dn___thank you10:28
dn___I still don't really understand what is failing/failover to - is there anything I can google/check? 10:31
vidal72[m]for how long are old snap versions available in the snap store? is there a time limit or a version limit?10:32
ogravidal72[m], that differs between user and developer/uploader10:33
ograas developer you have access to all revisions ever uploaded ... 10:34
vidal72[m]and as user?10:34
ograonly what the developer released to a track or channel10:35
vidal72[m]all revisions ever uploaded to track/channel?10:37
ograno, only the ones the developer released 10:38
ograi.e. the current ones10:38
vidal72[m]you mean only the latest one? For sure it's not the case, I can download older versions. My question is if there is some cleanup happening over time or can I download those old revisions infinitely?10:44
pstolowskimardy: hey, can you take a look at https://github.com/snapcore/snapd/pull/10824 again?10:44
mupPR #10824: tests: check that a snap that doesn't have gate-auto-refresh hook can call --proceed <Refresh control> <Created by stolowski> <https://github.com/snapcore/snapd/pull/10824>10:44
pstolowskiwould love to land that one10:45
vidal72[m]ogra: is my understanding correct that if uploader has access to old revisions forever then everyone else can have access too if they know the link to old revision?10:47
dn___I found one more odd thing, but maybe I'm reading it wrong: `lxd     4.19      21624  latest/stable  canonical✓  -` - reads to me like 4.19 is in latest/stable; but snap info lxd shows `  latest/stable:    4.18        2021-09-13 (21497) 75MB -` - am I just reading it wrong?10:51
ogravidal72[m], i don't think so, but a store person would have to answer that ... technically a user should not have any access to versions that are not currently released10:53
vidal72[m]a hidden feature then? :)10:58
ograrather a glaring bug 🙂10:58
vidal72[m]if that's the case then please forget this conversation ;)10:59
ograhaha10:59
mupPR snapd#10886 closed: o/snapstate: test prereq update if started by old version <Simple 😃> <Skip spread> <Created by MiguelPires> <Merged by bboozzoo> <https://github.com/snapcore/snapd/pull/10886>11:13
dn___without any change from me -> http://pastie.org/p/1AhUlvB7bEZyBwMo8ycAZT11:32
pstolowskimvo: could you please land https://github.com/snapcore/snapd/pull/10868 ?11:41
mupPR #10868: o/snapstate: support ignore-validation flag when updating to a specific snap revision <validation-sets :white_check_mark:> <Created by stolowski> <https://github.com/snapcore/snapd/pull/10868>11:41
mardypstolowski: +111:46
pstolowskithx11:47
pstolowskidn___: yeah i see this cohort-switching is also present in older versions of lxd (i checked 4.18). can you paste 'snap changes' again? I think it may be best to take this to the forum to also have input from lxd developers11:50
dn___pstolowski: http://pastie.org/p/3o6pg2XAXSXrUHEmlKrbAy is there any better pastie service these days? it seems to have only a ttl of 24h12:06
pstolowskidn___: pastebin.ubuntu.com ftw!12:07
dn___needs an account :)12:08
pstolowskidn___: just to double-check: right after you notice `Switch "lxd" snap to cohort "+"` task, the lxd snap appears broken right?12:09
pstolowskihmm right it needs an account12:09
dn___pstolowski: yes, but I can't say if it happens after the first switch or the last - because the switches happen so fast12:10
dn___pstolowski: https://pastebin.ubuntu.com/p/cdzRymdnhz/12:10
pstolowskidn___: thanks12:10
dn___pstolowski: thank you! Line: 123/124 - it happens quickly and after that it was broken12:11
pstolowskidn___: silly question, do you know why there are multiple switches? is this related to the cluster configuration?12:13
pstolowskihmm maybe the daemon start simply fails and is retried12:13
pstolowskidn___: one more thing, could you paste the output of systemctl status snap.lxd.daemon.service after it appears broken (and before you remove/reinstall it)?12:15
dn___sorry, lost my connection ... super day :)12:36
dn___pstolowski: regarding multiple switches: no idea, I never knew about cohort before I think or noticed - it also happens 'randomly' - was 12:36
dn___will try to keep an eye on the status & check systemctl when it happens12:36
pstolowskidn___: ok. i think it's just the daemon failing to start and getting re-tried by systemd a couple of times12:42
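[Editor's sketch, not part of the channel conversation.] To see how many cohort switches were recorded in a row, the `snap changes` output can be filtered for the task summary quoted earlier. The Python below is a minimal sketch under stated assumptions: the helper name `cohort_switches` is hypothetical, and the sample text is illustrative (only the row with ID 119 is taken from the log; the header row and the row with ID 120 are assumed for the example).

```python
import re

# Matches a `snap changes` row whose summary is a cohort switch, e.g.
# 119  Done    today at 03:34 UTC      today at 03:34 UTC      Switch "lxd" snap to cohort "+"
SWITCH_RE = re.compile(
    r'^(?P<id>\d+)\s+(?P<status>\S+)\s+.*'
    r'Switch "(?P<snap>[^"]+)" snap to cohort "(?P<cohort>[^"]*)"\s*$'
)


def cohort_switches(changes_text, snap="lxd"):
    """Return (change id, status, cohort) for every cohort switch of `snap`.

    `changes_text` is assumed to be the plain-text output of `snap changes`.
    Rows that are not cohort switches (including the header) are skipped.
    """
    found = []
    for line in changes_text.splitlines():
        match = SWITCH_RE.match(line.strip())
        if match and match.group("snap") == snap:
            found.append(
                (int(match.group("id")), match.group("status"), match.group("cohort"))
            )
    return found


# Illustrative input: row 119 is from the log, the rest is assumed.
sample = '''ID   Status  Spawn                   Ready                   Summary
119  Done    today at 03:34 UTC      today at 03:34 UTC      Switch "lxd" snap to cohort "+"
120  Done    today at 03:34 UTC      today at 03:34 UTC      Switch "lxd" snap to cohort "+"
'''

switches = cohort_switches(sample)
print(len(switches), "cohort switch task(s) found")
```

A burst of such rows with near-identical timestamps would be consistent with the daemon start script being re-run on each systemd retry, though the sketch only counts the rows and cannot confirm the cause.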
mupPR snapd#10868 closed: o/snapstate: support ignore-validation flag when updating to a specific snap revision <validation-sets :white_check_mark:> <Created by stolowski> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10868>12:49
dn___pstolowski: do older log entries help you? I can try to check them from when it happened - let me try to find them12:56
pstolowskidn___: yes, they might. But it would be best if you collected all this under a bug report12:57
dn___you are right, will try12:58
pstolowskidn___: thanks. someone will probably look at this soon, but i'm not sure if it will be me13:01
dn___I'll start collecting - https://pastebin.ubuntu.com/p/f7kP4YJrbg/ and make an issue13:02
dn___that's when the last 'swicherido' happened13:02
dn___pstolowski: happened again -> https://pastebin.ubuntu.com/p/nxysmpc4FT/ (will file an issue, still fighting)13:45
dn___pstolowski: just fyi - this is my full journey & output how I fix it when it happens: https://pastebin.ubuntu.com/p/bHG6cvSqb2/13:50
pstolowskidn___: you can stop doing "snap switch --leave-cohort lxd" because lxd wants to be in this cohort and will just keep switching to it anyway13:51
dn___oki, I see - would have been too easy ;-)13:51
pstolowski(I initially thought this was a wrong cohort coming from somewhere)13:51
dn___will try to debug/keep the cluster going & write the issue - also glad for any other idea;-)13:54
pstolowskidn___: input from lxd guys may help, therefore forum.snapcraft.io may be a good place13:57
pstolowskito discuss13:57
stgraberdn___, ogra, pstolowski: the + cohort is a special cohort that all LXD cluster users must be in14:01
stgraberif some servers aren't in it, they'll get different LXD releases during phased rollout, breaking the cluster14:01
dn___oki, thank you will try (forum/and trying to figure out if same cohort)14:04
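[Editor's sketch, not part of the channel conversation.] Since every cluster member is supposed to be in the same cohort, one rough check is to look at the Notes column of `snap list lxd` on each node for `in-cohort`, as seen in the rows pasted earlier. The Python below is a minimal sketch: the function names and the node names `n01` to `n03` are hypothetical, and the Notes column only says that a snap is in some cohort, not which one, so this cannot prove that all nodes share the "+" cohort.

```python
def snap_list_notes(row):
    """Return the set of notes from one data row of `snap list` output.

    The notes are taken from the last column, because a broken snap can be
    listed with an empty version column (as in the log), which shifts the
    positions of the other fields.
    """
    fields = row.split()
    if len(fields) < 5:
        raise ValueError("not a snap list data row: %r" % row)
    notes = fields[-1]
    return set() if notes == "-" else set(notes.split(","))


def nodes_outside_cohort(rows_by_node):
    """Return the sorted node names whose lxd row lacks the in-cohort note."""
    return sorted(
        node
        for node, row in rows_by_node.items()
        if "in-cohort" not in snap_list_notes(row)
    )


# Rows copied from the log; the node names are made up for illustration.
rows = {
    "n01": "lxd     4.19      21654  4.19/candidate  canonical✓  in-cohort",
    "n02": "lxd     4.19      21654  4.19/candidate  canonical✓  -",
    "n03": "lxd               21624  latest/stable  canonical✓  disabled,broken,in-cohort",
}

print("nodes not in a cohort:", nodes_outside_cohort(rows))
```

Taking the last column rather than a fixed index is a deliberate choice here, so the broken row with the missing version still parses.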
pstolowskimvo: can you also land https://github.com/snapcore/snapd/pull/10824 ?14:14
mupPR #10824: tests: check that a snap that doesn't have gate-auto-refresh hook can call --proceed <Refresh control> <Created by stolowski> <https://github.com/snapcore/snapd/pull/10824>14:14
mvopstolowski: sure14:17
pstolowskity14:18
mupPR snapd#10824 closed: tests: check that a snap that doesn't have gate-auto-refresh hook can call --proceed <Refresh control> <Created by stolowski> <Merged by mvo5> <https://github.com/snapcore/snapd/pull/10824>14:19
mupPR core20#116 opened: hooks: adjtime: add adjtime file to etc <Created by stulluk> <https://github.com/snapcore/core20/pull/116>14:30
ijohnson[m]degville: can you publish https://forum.snapcraft.io/t/quota-groups/25553 on the snapcraft.io/docs site?14:58
mupPR snapd#10893 opened: i/builtin/kubernetes_support: add access to Calico lock file <Created by mardy> <https://github.com/snapcore/snapd/pull/10893>15:04
=== graham1 is now known as degville
degvilleijohnson[m]: sorry for the delay. That doc can be found at https://snapcraft.io/docs/quota-groups, but I'll also add it to the navigation so it's more discoverable.15:48
ijohnson[m]ah perfect, I couldn't find it via searching15:48
ijohnson[m]thanks degville 15:48
degvilleijohnson[m]: good point about the search. That sounds like a bug.15:49
pstolowskimvo: a bit of a confusion: https://github.com/snapcore/snapd/pull/1089416:14
mupPR #10894: [RFC] o/configcore: allow hostnames up to 253 characters, with dot-delimited elements <Created by stolowski> <https://github.com/snapcore/snapd/pull/10894>16:14
mupPR snapd#10894 opened: [RFC] o/configcore: allow hostnames up to 253 characters, with dot-delimited elements <Created by stolowski> <https://github.com/snapcore/snapd/pull/10894>16:14
pstolowskialso ijohnson[m] ^16:20
mupPR snapd#10895 opened: many: wait for up to 10min for NTP syncronization before autorefresh <Created by mvo5> <https://github.com/snapcore/snapd/pull/10895>16:34
dn___Oct  6 18:54:07 n02 snap-failure[331032]: retry.go:49: DEBUG: Retrying https://api.snapcraft.io/api/v1/snaps/names?confinement=strict%2Cclassic, attempt 1, elapsed time=5.781µs18:57
dn___Oct  6 18:54:07 n02 snap-failure[331032]: retry.go:184: DEBUG: Not retrying: &errors.errorString{s:"too many requests"}18:57
dn___does it mean snap does too many requests to the API? (I only got 10 instances, but all use the same external ip)18:57
mupPR snapcraft#3588 opened: snap: patch patchelf on riscv64 (CRAFT-566) <Created by sergiusens> <https://github.com/snapcore/snapcraft/pull/3588>21:15
