[00:11] PR snapd#9745 closed: [RFC] seed: enable uc20 devmode snaps in dangerous models [06:45] morning [06:47] PR snapd#9762 closed: gadget: prepare gadget kernel refs (0/N) [08:02] morning [08:09] pstolowski: mvo: morning guys [08:11] good morning mborzecki and pstolowski [09:54] awfully quiet [10:00] * ogra rattles the chains [10:07] yes i just restarted my irc client, thought it was misbehaving ;) [10:08] haha [10:08] holiday season is clearly upon us [10:10] haha [10:10] yeah [10:18] on that note ... did anyone see my question about https://forum.snapcraft.io/t/scummvm-snap-failing-to-install-on-rpi-4/21394 yesterday? [10:18] (install hook failing because snapctl is not allowed due to "install in progress") [10:23] ogra: is this with stable snapd, or edge? [10:23] that's with stable ... [10:23] armhf, debian buster based OS [10:23] ogra: ok. i'll check this thread later (it's pretty long!), and see if i can reproduce [10:24] you need PiOS and an rpi for it though [10:25] ogra: ah, it's Pi specific? doh [10:26] ogra: anyway, i'll see if i can deduce anything from the forum posts then [10:26] yeah and sadly scummvm is one of the apps heavily promoted by the pi foundation so it could be a very typical target for a "first snap" people install on their new pi400 they got for christmas [10:27] i got HW and an install here, the error is pretty clearly some ordering problem (but obviously only happening on that HW/OS) [10:27] Dec 07 11:54:01 raspberrypi scummvm.daemon[2935]: error: error running snapctl: snap "scummvm" has "install-snap" change in progress [10:28] that's the message i get [10:42] ogra: i can reproduce it also with an x86 vm (on focal, 2.48.1) [10:42] pstolowski: hi, is #9429 now ready for re-review? [10:42] PR #9429: o/daemon: validation sets api and basic spread test [10:42] oh, wow ... [10:42] so that's "good", i can play around with it [10:42] i can't reproduce it on the same system using 20.10 desktop [10:43] interesting [10:43] (and others in the thread see the same) [10:44] perhaps we're just lucky there though [10:44] (race wise) [10:48] It is always fun to prove races are really fixed [10:55] pstolowski: interesting, can you post the snap change? [10:56] mborzecki: https://pastebin.ubuntu.com/p/ktRYVjdndZ/ [10:57] snap.scummvm.daemon.service: Scheduled restart job, restart counter is at 5. [10:57] what is that? [10:59] pstolowski: hmm weird, what is the hook trying to run? [11:05] mborzecki, https://github.com/snapcrafters/scummvm/blob/master/snap/hooks/install [11:05] just "snapctl set" [11:06] (it should admittedly perhaps just call "snapctl stop --disable snap.scummvm.daemon") [11:06] (and start it from the configure hook) [11:14] pedronis: hi, sorry, missed your question, yes [11:23] PR snapd#9771 opened: boot: boot config update & reseal [11:27] need 2nd review for https://github.com/snapcore/snapd/pull/9732 [11:27] PR #9732: asserts: snapasserts method to validate installed snaps against validation sets [11:31] pstolowski: ogra: that would be a configure hook calling snapctl restart? [11:43] i'm unclear what this snap is doing / should do, i will take a closer look later today [11:44] but it clearly shouldn't fail like this [11:46] will also check if edge fixes it [11:57] pstolowski, I wrote the hook scripts. It just checks for and sets a default configuration option that is later read by a launch script.
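For context, a minimal sketch of an install hook in the style described above; the option name `daemon` is hypothetical here (the real hook lives at the snapcrafters/scummvm link):

    #!/bin/sh
    # snap/hooks/install -- runs while the install-snap change is still in progress.
    # Calling snapctl from the hook itself is fine; the failures in this thread come
    # from the *daemon* calling snapctl before that change has finished.
    if [ -z "$(snapctl get daemon)" ]; then
        snapctl set daemon=false   # seed a default read later by the launch script
    fi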
[11:58] alan_g: hi, yes, sorry, i understand that; what i mean is i don't have the big picture wrt services of this snap, i need to take a closer look [12:00] pstolowski: reviewed [12:00] Oh, the launch script just stops the service depending on the configuration option. [12:05] pedronis: ty [12:09] PR snapd#9772 opened: desktop/notification: test against a real session bus and notification server implementation [12:14] PR snapd#9162 closed: gadget: change mountedfilesystemwriter to use resolvedSource (3/N) [12:28] mborzecki, (sorry, was in a meeting) currently it is the install hook calling "snapctl set" to set a parameter that tells the daemon wrapper to start or not start the service [12:29] while i'm sure just moving the hook to be configure instead of install would help with the race, using snapctl from an install hook should indeed still work [13:04] PR snapd#9773 opened: interfaces/apparmor: do not fail during initialization when there is no AppArmor profile for snap-confine [13:19] package dmitri.shuralyov.com/go/generated: unrecognized import path "dmitri.shuralyov.com/go/generated": https fetch: Get "https://dmitri.shuralyov.com/go/generated?go-get=1": dial tcp 172.93.50.41:443: connect: connection refused [13:19] wth? [13:21] why is this package even being pulled? [13:27] I don't see anything that refers to it [13:33] mborzecki, pedronis: it is a dependency of https://github.com/gordonklaus/ineffassign, which is run by the static checks. It looks like the domain in question has expired [13:34] jamesh: hm whois says it expires next year https://paste.ubuntu.com/p/wpvGK4xSQr/ [13:35] mmh [13:35] https://paste.ubuntu.com/p/wpvGK4xSQr/ [13:35] mborzecki: you're right. I put in the wrong query [13:36] anyways, whether shady or not, it is kind of a bummer it's not on github or something [13:40] and maybe it is back now? https://dmitri.shuralyov.com/go/generated [13:48] heh, urls in import paths [13:56] If we switched to modules, I guess we'd avoid this by only depending on the module proxy being up [13:57] we would avoid many things if we could switch to modules :-) [13:58] ah i think i understand the issue with the scummvm snap [14:01] it is the daemon script calling snapctl, which conflicts with the install change that is still running. and i think it might be racy and may sometimes succeed [14:03] ijohnson: hey, did you see my question yesterday about lxd install hook / namespace slowness? i quit irc shortly after so if you answered this, then i missed that [14:10] in this case it's not even us, but rather the ineffassign tool [14:11] Starting a daemon before install completes sounds pretty racy to me! [14:12] Is there a way the daemon script could detect this? [14:25] alan_g, why do you use a script at all? just use the hooks directly ... make the install hook always stop the daemon, put the logic about starting it based on the setting into the configure hook [14:27] It's the first way I found that worked. Stopping in the install hook isn't enough to deal with reboots and restarts. [14:28] alan_g, https://github.com/ogra1/pi-fancontrol-snap/blob/master/snap/hooks/install#L8 similar to this [14:28] And IIRC snapctl doesn't do disable [14:28] alan_g, and this in the configure hook https://github.com/ogra1/pi-fancontrol-snap/blob/master/snap/hooks/configure#L16 [14:29] we use a similar setup in a lot of customer snaps and that works reliably [14:30] just add some extra logic to check for the setting and drop the script altogether [14:30] Ack. I've not seen problems until now.
And didn't see your approach when I first came up with this. [14:31] But how does your approach avoid the service starting after a reboot? [14:32] it is disabled [14:32] the install hook calls: "snapctl stop --disable ${SNAP_NAME}.${SNAP_NAME} 2>&1 || true" [14:33] Oh! When did that become possible? [14:33] the configure hook checks if it is inactive (and can check additionally for the setting) and then calls "snapctl start --enable ${SNAP_NAME}.${SNAP_NAME} 2>&1 || true" [14:33] Or did I just not find the right docs? [14:33] i think that was always there [14:34] it's after all just a frontend to systemd features [14:34] yes it has always been there ;) [14:35] It's a long time ago, but I *wanted* to disable and never figured out how. /o\ [14:35] alan_g, ogra i'm in the standup, give me a moment, i've a suggestion for this snap [14:35] no hurry [14:35] (before christmas is fine i think 😄 ) [14:35] Not my snap anyway [14:36] But I have the same logic in several of mine [14:43] alan_g: so, 1) yes, snapctl stop --disable will be the cleanest [14:45] alan_g: 2) not sure why install hook has logic around snapctl get, install hook is only run once for the first installation of the given snap where by definition there is no configuration... Such logic should live in the configure hook. [14:47] pstolowski, it's just to make it the same as post-refresh. I didn't realise that configure would be run in both cases. [14:48] 3) the error we're seeing here is caused by daemon.sh calling snapctl when (re)starting during install. we currently detect such a situation as a conflict so it fails. The solution to this is to use snapctl get.. to get all the configuration from the configure hook and generate a config file from that (in snap data dir), and the daemon just reads the config file on start. [14:49] anyway, i suppose you won't need a config file after using --disable, but mentioning in case you need something more sophisticated elsewhere [14:53] Thanks. I'll try updating one of my snaps and report back on the forum [14:55] alan_g: that hook will need an update to work on core20 reliably, there's no snap_core, so the check should be modified to `grep -e snap_core= -e snapd_recovery_mode=` [14:57] it would really be nice if we could have a "snapctl is-core" or something in a future release so we don't have to grep /proc/cmdline from packages [14:57] @mborzecki, thanks, but that's already in my snaps. But I can mention it to the author [14:58] alan_g: I think the other way you could get around "snapctl get daemon" from the daemon script is to just run it in a loop until it works, that way it will start working when the install-snap change is finished [14:59] well, the daemon should really only start on core ... the snap is for both, core and desktop ... so you don't want something looping constantly in the background [14:59] I wondered about that. But now that I know how to disable from the install hook I don't think it is needed at all. [14:59] oh I see [15:00] --disable from install, --enable from configure and dropping the wrapper is really the cleanest solution [15:01] I think (conditional) --disable from install, leaving the user to enable it if they want, covers it.
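As a sketch, the hook pair being discussed here, assuming a service named after the snap as in the pi-fancontrol example, with mborzecki's core20-aware cmdline check:

    #!/bin/sh
    # snap/hooks/install -- always ship with the service stopped and disabled
    snapctl stop --disable "${SNAP_NAME}.${SNAP_NAME}" 2>&1 || true

    #!/bin/sh
    # snap/hooks/configure -- enable only on Ubuntu Core; core20 has
    # snapd_recovery_mode= on the kernel cmdline instead of snap_core=
    if grep -q -e snap_core= -e snapd_recovery_mode= /proc/cmdline; then
        snapctl start --enable "${SNAP_NAME}.${SNAP_NAME}" 2>&1 || true
    fi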
[15:02] not sure what you want to make conditional in install here [15:02] i'd just install with it stopped by default and do the conditional stuff from configure [15:03] (there are no conditions to check on install since you can't "snap set" before the snap is installed) [15:04] The condition is "if grep -q -e snap_core= -e snapd_recovery_mode= /proc/cmdline" [15:04] oh, that ... [15:04] i'd still do that from configure .. but yeah indeed [15:04] ... missed that [15:05] Well, configure might run if the user configures something [15:05] I just want it on install [15:06] installation calls configure once [15:06] so you don't need to duplicate code [15:07] What duplicate code? [15:08] you'd still check "snapctl get daemon", no? [15:09] Why? I'd do away with the configuration option and let the user enable the daemon [15:09] so make install just disable it by default and have configure check for both conditions (on core or daemon=true) and enable it if required [15:09] i've summarized what i wrote above in the forum [15:10] well, i thought you want it to start on core in any case ... but also allow the user to start it as daemon on desktop optionally [15:11] so you write a single conditional in configure and have it always come up disabled in install [15:12] if ! grep -q -e snap_core= -e snapd_recovery_mode= /proc/cmdline; then snapctl stop --disable $SNAP_NAME.daemon; fi [15:12] in install. Nothing in post-refresh, nothing in configure [15:13] sure and an additional check for daemon= in configure ... [15:13] Why? [15:13] i'm just proposing to have both conditionals in configure to have a central place [15:13] so you never need to bother about install anymore [15:13] even if conditions change [15:14] but up to you really ... i just find it a lot more elegant .. but that's personal taste 🙂 [15:14] * cachio lunch [15:15] mvo, this is failing in debian https://paste.ubuntu.com/p/y4kCcgyrqT/ [15:16] mvo, any idea about how to fix it [15:16] ? [15:16] cachio: oh, fun - looks like the archive is inconsistent. could you try an "apt full-upgrade -y" before the "apt build-dep"? [15:16] I think I'm missing your point. What condition might change? [15:16] mvo, sure, thanks [15:17] well, you just had one that changed 😉 from snap_core to snapd_recovery_mode ... [15:18] but really, do as you like ... let's not discuss style as long as we get a fix out 🙂 [15:20] AFAICS it's harder in configure as we only want to disable during an initial install [15:20] Not on any random change [15:20] yes, that's why i'd unconditionally always disable it in install [15:20] and have all the enablement logic in configure [15:22] The logic is "if (first time && On desktop) then disable" [15:22] just do as you like, really [15:22] both hooks run in succession anyway [15:23] I still feel that logic is simpler in install as you know it is first time [15:23] well, you still need the daemon= logic in configure in any case [15:24] Why? [15:24] because your user might want to run a kiosk on classic? [15:24] i thought that's the purpose of having daemon= [15:24] so you give additional control [15:24] So the user enables $SNAP_NAME.daemon [15:25] mvo: pedronis: i've updated #9629 to the latest version of license data [15:25] PR #9629: spdx: update to SPDX license list version: 3.10 2020-08-03 <⛔ Blocked> [15:25] ah, so you would ask the user to snap start --enable scummvm.daemon ... instead of snap set ... sure ...
that works but wouldn't be usable from a gadget on classic [15:26] i suppose i can drop the blocked label now too [15:26] "wouldn't be usable from a gadget on classic" is the point I was missing. Thanks! [15:26] not a super common case ... but possible [15:27] (up to now we talked all digital signage users into using core anyway 🙂 ) [15:37] ijohnson: a quick observation, i was able to reproduce the rsa verification error quite reliably every couple of runs when i was building a kernel with yocto in the background [15:39] mborzecki: thank you! [15:47] mborzecki: interesting [15:47] perhaps it is so difficult to reproduce for me because I have so many cores that are not busy :-p [16:09] 2020-12-09T14:35:03.9833068Z Dec 09 14:32:59 ubuntu snapd[1702]: 2020/12/09 14:32:59.119476 stateengine.go:150: state ensure error: devicemgr: cannot mark boot successful: cannot check for fde-setup hook in reseal: cannot get kernel info: no state entry for key [16:09] weird [16:09] mvo: any clues what that might be about? ^^ [16:10] hm found more weird logs: [16:10] 2020-12-09T14:35:03.9588849Z [ 55.176506] snapd[1702]: 2020/12/09 14:33:01.515032 stateengine.go:150: state ensure error: devicemgr: cannot mark boot successful: cannot identify kernel snap with bootloader grub: cannot read dangling symlink kernel.efi [16:13] looks like this happens right after install too, this appears when booting into run mode for the first time: [16:13] 2020-12-09T14:35:03.8893265Z [ 32.599045] snapd[885]: stateengine.go:150: state ensure error: devicemgr: cannot mark boot successful: cannot check for fde-setup hook in reseal: cannot get kernel info: no state entry for key [16:18] Hey guys [16:18] I have been referred to ask a question here [16:18] I am trying to copy a snap to another machine with a different username [16:19] if I just copy the folder over, that doesn't work [16:20] technically you should install the snap anew, take a snapshot on the old host and restore it on the new host ... but that will likely not handle changed user name or changed UID for user data [16:22] perhaps someone with more insight into snapshots can give a hint if it is possible to restore snapshots to a new user [16:23] BTW, snap in question is bluemail [16:23] ogra: tried that but that doesn't work due to username [16:23] :( [16:24] you will definitely need to install the snap anew ... you can surely also restore the system bits from a snapshot ... perhaps then copying the ~/snap/bluemail/current/* content is enough [16:25] pedronis: I have this feeling that 9149 has too much in it, it's a bit messy, should I split it into one PR that does the "$kernel:ref" validation, one PR that implements gadget.ResolveContentPaths() and one that uses ResolveContentPaths()? wdyt? [16:26] mvo: that's fine with me, it's not very large, but that sequence seems easier to review [16:30] PR snapd#9149 closed: gadget: provide new gadget.ResolveContentPaths() (2/N) [16:35] PR snapd#9774 opened: o/snapshotstate: don't set auto flag in the snapshot file [16:39] pedronis: ^ [16:41] pstolowski: thx [16:56] * ijohnson short break [17:00] pstolowski, would there be the same problem with using `snapctl is-connected` in a launch script? [17:07] why would you do that from a launch script instead of a hook? [17:09] alan_g, https://github.com/ogra1/pi-fancontrol-snap/tree/master/snap/hooks ...
see the connect hooks [17:09] (and how the configure hook uses is-connected alongside) [17:10] I've existing scripts that check for the wayland and x11 interfaces to figure out how to launch [17:10] i doubt it makes any difference what comes after snapctl ... the call itself is the issue [17:11] I suspect as much too. But hoped... [17:11] So configure runs on connect/disconnect? [17:12] no, the connect hooks do [17:12] configure just uses is-connected and exits zero if a connection is missing [17:12] before it starts (or restarts) the service .... [17:13] * alan_g is tempted to keep calling snapctl until it works [17:15] in a crazy loop 🙂 [17:17] alan_g: no, that should be fine [17:19] alan_g: but also i was slightly wrong about the source of the conflict, it's actually 'snapctl stop ..' in the daemon-start.sh (not snapctl get) triggering this (it's fine to do this from hooks, but in the daemon it conflicts with install as explained earlier) [17:20] it seems we have tests that generate real notifications? [17:22] pstolowski, that seems less awkward. But that means a daemon can't stop itself in the case of persistent problems? [17:23] it can but needs to deal with conflict errors [17:23] maybe we need those to be more detectable [17:28] until snapctl stop --disable $SNAP_NAME.daemon; do sleep 1; done # Ugh! [17:31] hmm [17:31] why the loop? [17:32] to deal with conflict errors [17:32] Or have I misunderstood the failure mode? [17:33] alan_g: if you do this from a hook then it should just work [17:33] i.e. won't conflict [17:33] But the hook doesn't know that the daemon has encountered a persistent error [17:35] PR snapd#9775 opened: gadget,o/devicestate,tests: drop EffectiveFilesystemLabel and instead set the implicit labels when loading the yaml [17:36] But it could `snapctl set killme=true` and the configure hook would process that? [17:39] sorry, i need to run, need to taxi my daughter, let's talk tomorrow (or maybe ijohnson can help) [17:40] o/ [17:46] * ijohnson is back [17:47] alan_g: I'm a bit confused where you're at right now [17:50] ijohnson, I understand the immediate problem and solution. But I'm just imagining the hypothetical circumstance of a daemon that hits a persistent problem at runtime and elects to stop itself. In that case it is necessary to "deal with conflict errors". I see two ways to do that: [17:50] 1. until snapctl stop --disable $SNAP_NAME.daemon; do sleep 1; done [17:51] 2. `snapctl set killme=true` and the configure hook would process that? [17:52] so just to make sure we are on the same page, `snapctl stop --disable ...` needs to be run in a loop because if the daemon runs very fast, snapctl may fail due to a conflict in progress? [17:52] i.e. install-snap in progress or some such error message [17:53] Yes [17:54] It's not blocking anything right now. Just want to confirm my understanding.
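For reference, the retry loop alan_g is weighing, sketched as it would sit in a daemon wrapper (not a hook), where snapctl can fail with a conflict while install-snap is still running:

    #!/bin/sh
    # a daemon stopping itself must tolerate conflict errors such as:
    #   error running snapctl: snap "scummvm" has "install-snap" change in progress
    until snapctl stop --disable "${SNAP_NAME}.daemon"; do
        sleep 1   # retry until the conflicting change completes
    done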
[17:56] ok, then imho having `snapctl stop --disable` run in a loop until it works is the cleaner solution [17:56] I think there are maybe things we can do in snapd to make `snapctl stop --disable` work when there is a conflict like this, but it's unclear how exactly that would be implemented [17:57] I guess just as a user, running `snap get ` and seeing `killme: true` would be a bit unexpected and confusing [17:57] oh wait actually you can't do that [17:58] because when you run `snapctl set` that does _not_ trigger the configure hook to run [17:58] so `snapctl set killme=true` would be racing with the configure hook itself and could fail anyways [18:01] The problem with the loop is that it isn't obvious it is needed, and things "just work" without it most of the time. (Until a user hits a weird problem on some new device.) [18:02] but why do you need it at all? [18:02] the hooks offer everything you need [18:02] and they save you from having to use a wrapper at all usually [18:03] ogra, I understand the immediate problem and solution. But I'm just imagining the hypothetical circumstance of a daemon that hits a persistent problem at runtime and elects to stop itself. The hooks are not running. [18:04] well sometimes things are not obvious, that's what code comments are for :-P === ijohnson is now known as ijohnson|lunch [18:04] well, that's something you'd probably manage via an additional watcher service then [18:06] * alan_g hits EOD [18:15] PR snapd#9776 opened: gadget: add validation for "$kernel:ref" style content === ijohnson|lunch is now known as ijohnson [18:50] PR snapd#9777 opened: gadget: add gadget.ResolveContentPaths() [19:29] * cachio afk [19:49] pedronis: do we have any current assertions or assertion examples where a list is empty? it seems to me that we have no such example and I can't seem to convince the assertion decoding function to understand what an empty list is, which leads me to believe that the assertion format doesn't support empty lists and only allows fields to be omitted if they are empty [19:49] indeed, if I try signing a system user assertion json with an empty string for serials, serials is just omitted from the produced assertion [20:18] @ogra tried copying the /current/* but still nothing :( [20:18] ijohnson: yes, empty and omitted are equivalent [20:19] ok [20:19] thanks for clarifying [20:19] is there a way I can backup snaps to be deployed on another machine with a different username [20:20] N3bulaK, so you installed a fresh snap from the store, took a snapshot of the system config of the old one and copied the content of current/* (making sure all "dot" dirs are included)? [20:20] N3bulaK, also what's the actual issue you see with that? is it just data missing, does the app not start etc etc [20:28] Hello! I'm having some trouble with my snap install of LXD [20:29] I think the problem is that I configured LXD to set storage pools in /data/lxd (note: /data is a btrfs drive), and snap is not mounting them in /var/lib/snapd/hostfs/data/lxd, which apparently is where lxd expects them [20:30] The strange thing is, this setup worked flawlessly two weeks ago!
Then I shut off the server for a while, powered it on today, and lxd won't even start anymore [20:30] Here's the output of `sudo lxd --debug --group lxd`: https://l.termbin.com/rlaiu [20:36] PR snapd#9479 closed: tests: replace pkgdb.sh (library) with tests.pkgs (program) [20:38] stgraber, ^^^ [20:58] pinusc: your error is about: EROR[12-09|20:11:51] Failed to start the daemon: Failed initializing storage pool "default": Failed to mount '/var/lib/snapd/hostfs/data/lxd/common/lxd/disks/default.img' on '/var/snap/lxd/common/lxd/storage-pools/default': not a directory [20:58] pinusc: oh, I see [20:59] Yes [20:59] pinusc: what the hell is this mess, we don't support disks/ being anywhere other than /var/snap/lxd/common/lxd/disks/ [20:59] I honestly have no idea [21:00] This was my first time installing lxd and I'm very confused about configuring storage [21:00] pinusc: Does /data/lxd/common/lxd/disks/default.img exist on your system? [21:01] Yes [21:02] I'm not sure I exactly understand how snap works, but is /var/lib/snapd/hostfs supposed to contain some sort of bind mount of the root fs? Because right now it's completely empty, which I think is the problem [21:02] what does `sudo nsenter --mount=/run/snapd/ns/lxd.mnt ls -lh /var/lib/snapd/hostfs/data` show you? [21:02] you can't see the content of /var/lib/snapd/hostfs from outside the snap, that's normal [21:04] It shows me the contents of /data [21:05] what does `sudo nsenter --mount=/run/snapd/ns/lxd.mnt readlink -f /var/lib/snapd/hostfs/data/lxd/common/lxd/disks/default.img` show you? [21:06] and `readlink -f /data/lxd/common/lxd/disks/default.img` without the nsenter stuff for good measure [21:06] the nsenter one prints /var/lib/snapd/hostfs/data/lxd/common/lxd/disks/default.img [21:07] Without nsenter it prints nothing and exits with 1 [21:08] what does `sudo nsenter --mount=/run/snapd/ns/lxd.mnt ls -lh /var/lib/snapd/hostfs/data/lxd/common/lxd/disks/default.img` show you? [21:09] -rw------- 1 root root 11G Nov 18 21:19 /var/lib/snapd/hostfs/data/lxd/common/lxd/disks/default.img [21:12] ok, so that environment looks happy enough now, what does `lxc info` show you? [21:13] Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: permission denied [21:13] Which is expected because lxd won't start at all [21:16] PR snapd#9778 opened: asserts/repair.go: add "bases" and "modes" support to the repair assertion [22:33] ogra: App doesn't start, I have tried copying the folders etc but to no avail [22:33] but if I remove the copied folders then it works fine without any previous data [23:06] stgraber: any more ideas on what I could try? Sorry for the insistence, I'm completely lost [23:07] pinusc: tried `sudo lxc info`? [23:07] Yup, same Error: Get "http://unix.socket/1.0": dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused [23:07] pinusc: ah, that's better than the permission denied [23:07] pinusc: try `systemctl restart snap.lxd.daemon` [23:08] well, `sudo systemctl restart snap.lxd.daemon` [23:09] The command exits cleanly, but the service fails soon [23:10] Oh, there's a (new?)
error [23:10] Failed initializing storage pool "default": Source path '/var/lib/snapd/hostfs/data/lxd/common/lxd/disks/default.img' isn't btrfs [23:11] I really wonder how you managed to get yourself into such a broken situation in the first place, LXD shouldn't have ever let you put default.img anywhere other than /var/snap/lxd/common/lxd/disks/ [23:12] That's a very good question [23:12] The weird thing is, it used to work [23:12] the first failure you got was because LXD started before your /data mount was mounted, now you're hitting an error because LXD assumes that any source path outside of /var/snap/lxd/common/lxd/disks refers to a block device or a path, which your setup definitely doesn't match [23:12] I *might* have created the default.img and then moved it somewhere else and then changed the config to reflect that [23:13] hmm, yeah but LXD wouldn't have let you change the source property, it's read-only. The only way you could make such a mess short of us having a bug that let you do it another way is through `lxd sql` by directly updating the DB [23:13] That I am sure I did not do [23:14] anyway, do you have enough space on /var/snap/lxd/common/lxd/disks to store that default.img file? [23:15] I will ask the question again as there are more people active at this very moment :D [23:15] I am trying to copy snap data to another machine with a different username but can't make it work [23:15] stgraber: yes, I do [23:16] snap in question is bluemail [23:16] pinusc: ok, then move it where it should be at /var/snap/lxd/common/lxd/disks/default.img [23:16] pinusc: do you have more than one storage pool configured? [23:16] stgraber: Also, I went through my .bash_history and I did indeed mv the directory under /data/lxd, and then the next relevant command is `sudo lxc storage edit default` [23:17] stgraber: nope [23:17] No idea what I did in storage edit, but I guess just changed the path? [23:18] pinusc: yeah, apparently there's a bug that lets you change it... I'll have someone sort that out tomorrow [23:18] Oh, that would be good... I just assumed that this setting was fine [23:18] Are you an lxd maintainer? [23:20] pinusc: once you have default.img moved back where it belongs, you can create a file at "/var/snap/lxd/common/lxd/database/patch.global.sql" containing "UPDATE storage_pools_config SET value='/var/snap/lxd/common/lxd/disks/default.img' WHERE key='source';". Then restart LXD. The database should get updated with the correct path and hopefully things will start back up. [23:20] pinusc: I'm the LXD project leader. [23:22] Oh wow, thank you for helping [23:22] The lxd daemon now starts up fine! [23:22] Though containers fail to start for some reason... [23:22] I'll see if I can debug that [23:36] stgraber: I'm getting Failed to mount rootfs "/var/snap/lxd/common/lxd/containers/synapse/rootfs" onto "/var/snap/lxd/common/lxc/" with options "(null)" [23:37] When I try to launch an (existing) container [23:37] New containers, however, run fine [23:37] pinusc: what's `ls -lh /var/snap/lxd/common/lxd/containers/` showing you? [23:38] Links to /var/snap/lxd/common/lxd/storage-pools/default/containers/CONTAINERNAME [23:38] ok, that part is good then [23:38] ls -lh /var/snap/lxd/common/mntns/var/snap/lxd/common/lxd/storage-pools/default/ [23:39] Some dirs, including containers/ [23:41] Oooh, inside containers/ is one dir per container, but the owner might be wrong.
I have root:root for everything, except the one I just created (which is the only one which works), which has 1000000:root as owner [23:44] Also, the old ones are empty, except for a backup.yaml, whereas the new one has other stuff, including rootfs [23:46] Can you check if you see anything at `/var/snap/lxd/common/lxd/storage-pools/default`? you shouldn't but given the current mess, it's not impossible that some of the data ended up there somehow? [23:47] Nope, empty [23:52] pinusc: oh, I think I may know what happened but you're not going to like it [23:53] pinusc: were those containers created after default.img got moved but prior to the next system reboot? [23:53] Very likely [23:53] pinusc: and are /data and /var/snap/lxd on different partitions? [23:53] Yes [23:55] right, then I'm afraid that you're screwed. You see, there is no such thing as moving a file between two mounts: when you `mv` between two mounts, the source is copied and then deleted. In your case, the source was still mounted and actively being used. When that happens Linux succeeds in deleting the file but actually keeps it active on disk until such time as the last thing that has it open closes it. [23:55] pinusc: so when you moved default.img, LXD never actually used the moved path in /data, instead it just kept using the now-deleted file under /var/snap/lxd [23:55] Ooh, I see [23:56] pinusc: after a reboot, the data in /var/snap/lxd is gone forever and your data in /data is effectively a copy of a very old state [23:56] So that also explains why it was working before. On a reboot, it tried to actually access /data for the first time, and failed [23:57] I have to say, it sucks that I lost all that was in the containers... but this is a satisfying answer [23:57] I was dumb and I got bitten [23:58] Before I proceed, I'll make sure to actually read the documentation and properly set up a storage pool on external media [23:58] Meanwhile, I guess I'll just have to delete my containers and start from scratch... [23:59] the best is to have a dedicated disk or partition for LXD [23:59] then during LXD init you will be prompted for whether you have one of those for your storage pool [23:59] LXD will then automatically mount it for you on startup all inside its mount namespace [23:59] Yeah, I guess I'll make a btrfs subvolume
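stgraber's explanation of `mv` across mounts is easy to demonstrate from a shell; a sketch using hypothetical mount points /mnt/a and /mnt/b for the two filesystems:

    # mv across filesystems is copy-then-unlink, and an open file survives its unlink
    sleep 1000 < /mnt/a/default.img &   # something holds the source open (as LXD did)
    mv /mnt/a/default.img /mnt/b/       # copies data to /mnt/b, unlinks the /mnt/a name
    ls -l /proc/$!/fd                   # fd 0 still shows "/mnt/a/default.img (deleted)"
    # i/o through that fd keeps hitting the deleted inode, not the copy in /mnt/b;
    # the inode's data is freed only when the last open fd closes, e.g. at reboot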