/srv/irclogs.ubuntu.com/2020/10/09/#cloud-init.txt

[10:10] <catphish> not strictly a cloud-init question, but i'm trying to encrypt my cloud-init data with AES, but a clean ubuntu box doesn't seem to have python module "Crypto", i'm not actually very familiar with python, should it have an AES library built in? or am i not going to be able to do this?
[11:14] <catphish> looks like i can solve this by piping the data to the openssl CLI, it's a little painful but effective
[11:14] <catphish> openssl enc -d -base64 -aes256 -K 4e3077305271704834626963785a62785a567a6245436f546353753145457376 -iv 47a8326ba88ae1eed21fa74266f4363a <<< HY6jY3tYIHFCbvnsZnQBCSjUXg9Jw8c5oS6utnW0H03M6u/E6rclSk1iIqN41byS
=== vrubiolo1 is now known as vrubiolo
[14:52] <Odd_Bloke> catphish: What cloud-init data are you referring to?  (And how does the symmetric key get into the instance?)
[14:54] <AnhVoMSFT> @Odd_Bloke Can you review this PR when you have a chance? https://github.com/canonical/cloud-init/pull/591 there's a comment from Ryan that needs your feedback. I did test upgrade scenario and found no issue (even when not deleting the obj.pkl)
[15:18] <catphish> Odd_Bloke: my network data source sends data AES256 encrypted, the symmetric key is in the BIOS
[15:23] <catphish> this seemed a better idea than any other method of ensuring that only the correct VM can obtain and use its data
[15:23] <catphish> (that i could think of)
[15:25] <catphish> though another reasonable option is to use the PSK in the BIOS as a password, and use that to request the data over TLS, i was just struggling to work out what CA I'd use
[15:28] <catphish> i was trying to keep it largely free from any external requirements, requiring a public certificate signed by my org's domain name, and then just sending a password from the BIOS over the TLS to request the plain data might be sufficient though
[15:28] <catphish> i'm really just trying to ensure it's maintenance free
[15:32] <catphish> tldr: trying to design a secure way that my VM can trust and fetch cloud-init data from its hypervisor securely, with a minimum of need to hardcode any kind of key into the module
[15:34] <AnhVoMSFT> @rharper do you have suggestion for how to write the unit test that you suggested? Generating the list of interfaces for renaming come from stages and not the datasource
[15:37] <AnhVoMSFT> stages do reset the distro object that comes into datasource (this caught me by surprise, which was why i had to move the blacklist instantiation into _get_data and not in datasource's _init)
[15:37] <rharper> AnhVoMSFT: stages does make the change, but it invokes methods on ds object
[15:38] <AnhVoMSFT> right, but how do I make the kind of unit tests that would basically go through the flow of stages invoking DS's init, then reset the distro, then calling _get_data. That would be a more useful test
[15:38] <AnhVoMSFT> because later if someone either changes stages' resetting the distro object in a different place, or moves where the DSAzure is setting the blacklist, we want the test to fail so that it can be paid attention to
[15:40] <AnhVoMSFT> whatever datasource does to distro in __init__ is completely irrelevant as stages will reset distro to None and supply a new distro instance later
[15:41] <rharper> I'm not sure what reset you mean?   the distro object is created and then passed to the Datasource constructor;  on every boot; is that what you meant by reset ?
[15:43] <AnhVoMSFT> https://github.com/canonical/cloud-init/blob/5bf287f430b97860bf746e61b83ff53b834592d0/cloudinit/stages.py#L265
[15:46] <AnhVoMSFT> so the distro object there gets reset to None, a new object will be created when it's accessed again the next time
[15:46] <rharper> that resets the stages object; in particular, it's called without reset_ds=True, so it's only modifying the stage's copy of _distro, _path and _cfg;  the self.datasource is the real DS that's used
[15:46] <AnhVoMSFT> right, but stages.distro is the one that is passed to networking
[15:47] <AnhVoMSFT> so if the datasource is to influence the distro object (by setting the blacklist_drivers), it needs to do it in the right place. It can't assume that the same distro object exists throughout boots
[15:47] <rharper> no, it uses stages.distro, not stages._distro
[15:48] <AnhVoMSFT> stages.distro is a @property that would return stages._distro if instantiated, otherwise it would instantiate a new one, is that not the right assumption?
[15:48] <rharper> AnhVoMSFT: I see your issue;  it makes a new class on each access
[15:48] <rharper> distros.fetch()
[15:49] <AnhVoMSFT> is it possible to write a unit test for such interaction (that would essentially need two boots, I think?)
[15:49] <rharper> so state can't be stored in the instantiated distro ... so how is this working then ?  or does the blacklist get reapplied as well (you moved it to _get_ds())
[15:50] <rharper> yeah, you just call stages.Init() just like it's called in cmd/main.py
[15:50] <AnhVoMSFT> the blacklist gets reapplied every time _get_data is called
[15:50] <rharper> right, to deal with the distro getting recreated each stage
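A toy model of the behaviour discussed above, assuming a lazily created distro that the stages object can drop and rebuild. Every class here (`Init`, `Distro`, `Networking`, `DataSource`) and the driver name `example_driver` are hypothetical simplifications for illustration, not cloud-init's actual code. It shows why state written onto the distro in the datasource constructor is lost after a reset, and why reapplying it in the data-fetch step survives.

```python
class Networking:
    def __init__(self):
        # No blacklist until something sets one.
        self.blacklist_drivers = None


class Distro:
    def __init__(self):
        self.networking = Networking()


class DataSource:
    BLACKLIST = ["example_driver"]  # placeholder value, an assumption

    def __init__(self, distro):
        self.distro = distro
        # State set here lives only on the distro object passed in.
        self.distro.networking.blacklist_drivers = self.BLACKLIST

    def get_data(self):
        # Reapply on every fetch so a freshly created distro gets it too.
        self.distro.networking.blacklist_drivers = self.BLACKLIST
        return True


class Init:
    def __init__(self):
        self._distro = None
        self.datasource = None

    @property
    def distro(self):
        # Lazily build a new distro whenever the cached one is missing,
        # and hand the new object to the active datasource.
        if self._distro is None:
            self._distro = Distro()
            if self.datasource is not None:
                self.datasource.distro = self._distro
        return self._distro

    def reset(self):
        # Drop the cached distro; the next access creates a new one.
        self._distro = None


init = Init()
init.datasource = DataSource(init.distro)
init.reset()
lost = init.distro.networking.blacklist_drivers       # state from __init__ is gone
init.datasource.get_data()
restored = init.distro.networking.blacklist_drivers   # reapplied by get_data
```

In this sketch `lost` is `None` and `restored` holds the blacklist again, which is the same shape as the problem AnhVoMSFT describes.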
[15:52] <AnhVoMSFT> actually as long as I write a test that would invoke stages' distro.networking.wait_for_physdevs(netcfg) and come out with the right netcfg that should be good
[15:53] <AnhVoMSFT> https://github.com/canonical/cloud-init/blob/5bf287f430b97860bf746e61b83ff53b834592d0/cloudinit/stages.py#L699
[15:53] <rharper> you don't have to use stages; but yes;
[15:53] <rharper> you can create a DS, passing it a distro object;  1) confirm that ds.distro.networking.blacklist_drivers matches what's in DatasourceAzure;
[15:53] <AnhVoMSFT> that makes sense
[15:54] <rharper> and 2)  mocking cloudinit.net.get_devices (or whatever the main call is), confirm it is called with the correct blacklist_drivers values
[15:54] <AnhVoMSFT> need to invoke the DS' _get_data, then verify the blacklist
[15:54] <rharper> yep
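The two checks outlined above could be sketched as a `unittest` case along these lines. This is only an illustration under stated assumptions: `Networking`, `Distro`, `DataSource`, the method `get_devices`, and the driver name are hypothetical stand-ins defined inline, not the real cloud-init classes or the real `cloudinit.net` call.

```python
import unittest
from unittest import mock


class Networking:
    def __init__(self):
        self.blacklist_drivers = None

    def get_devices(self, blacklist_drivers=None):
        # Hypothetical device lookup; a real one would inspect the system.
        return []

    def wait_for_physdevs(self):
        # Passes the currently configured blacklist to the lookup.
        return self.get_devices(blacklist_drivers=self.blacklist_drivers)


class Distro:
    def __init__(self):
        self.networking = Networking()


class DataSource:
    BLACKLIST = ["example_driver"]  # placeholder value, an assumption

    def __init__(self, distro):
        self.distro = distro

    def _get_data(self):
        self.distro.networking.blacklist_drivers = self.BLACKLIST
        return True


class TestBlacklist(unittest.TestCase):
    def test_get_data_sets_blacklist(self):
        # Check 1: after _get_data, the distro carries the DS blacklist.
        ds = DataSource(Distro())
        ds._get_data()
        self.assertEqual(
            ds.distro.networking.blacklist_drivers, DataSource.BLACKLIST
        )

    def test_lookup_called_with_blacklist(self):
        # Check 2: the mocked lookup receives the blacklist values.
        ds = DataSource(Distro())
        ds._get_data()
        networking = ds.distro.networking
        with mock.patch.object(
            networking, "get_devices", return_value=[]
        ) as m_get:
            networking.wait_for_physdevs()
        m_get.assert_called_once_with(blacklist_drivers=DataSource.BLACKLIST)
```

Saved to a file, this can be run with `python -m unittest <file>`. A real test would construct the actual datasource and patch the actual lookup function instead of these stand-ins.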
[15:54] <AnhVoMSFT> cool, let me work on that
[15:54] <AnhVoMSFT> thanks Ryan
[15:56] <AnhVoMSFT> would you want me to keep the unit test I wrote for net.get_interfaces or we don't really need that test at all
[15:57] <rharper> I don't think we need it; there's an existing blacklist filter test
[15:57] <AnhVoMSFT> ok
[15:58] <Odd_Bloke> catphish: Aha, OK, I'd forgotten you were working on a datasource.  I'm not aware of any other DS doing something like this, so I don't have any particularly useful advice to give.  Using a key from the BIOS sounds like a reasonable path forward; I would suggest shelling out to `openssl` for now, only because that's going to be more readily available than a Python library, I expect.
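A minimal Python sketch of the shell-out approach suggested above, assuming the `openssl` binary is on `PATH` and that the key and IV are available as hex strings. The helper names `build_openssl_decrypt_cmd` and `decrypt_with_openssl` are hypothetical and not part of cloud-init; the flags mirror the command catphish pasted earlier in the log.

```python
import subprocess


def build_openssl_decrypt_cmd(key_hex, iv_hex):
    """Return the argv for the openssl invocation from the log."""
    return [
        "openssl", "enc", "-d", "-base64", "-aes256",
        "-K", key_hex,
        "-iv", iv_hex,
    ]


def decrypt_with_openssl(ciphertext_b64, key_hex, iv_hex):
    """Decrypt base64 AES-256 data by piping it to the openssl CLI.

    Raises subprocess.CalledProcessError if openssl exits non-zero,
    for example on a wrong key or malformed input.
    """
    result = subprocess.run(
        build_openssl_decrypt_cmd(key_hex, iv_hex),
        # A trailing newline mirrors what the shell here-string (<<<) sends.
        input=ciphertext_b64.encode("ascii") + b"\n",
        capture_output=True,
        check=True,
    )
    return result.stdout
```

Two caveats for anyone adapting this: arguments passed with `-K` are typically visible to other local users in the process list, and base64 input longer than one line may need wrapping or the `-A` flag.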
[16:00] <catphish> Odd_Bloke: that was my conclusion so far, thank you, though i will look into using SSL and doing the encryption at that layer instead, rather than encrypting the data itself
[16:17] <Odd_Bloke> rharper: You are correct that `obj.distro.networking` is not present after upgrade to a cloud-init version where we would expect it to be present; I think making `Distro.networking` a property which instantiates `self.networking_cls` (which is a _class_ attribute, so not stored in the pickle and so read from the class definition in the current code) as needed will do the trick.
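A small sketch of the property idea described above, under stated assumptions: `Distro`, `Networking`, and `restore_like_pickle` are hypothetical stand-ins, and the helper only imitates what pickle does by default when loading an object (create the instance without calling `__init__`, then restore its saved `__dict__`). It is not the actual fix proposed for cloud-init.

```python
class Networking:
    """Hypothetical stand-in for a distro's networking helper."""


class Distro:
    # Class attribute: looked up on the class, never saved in instance state.
    networking_cls = Networking

    def __init__(self, name):
        self.name = name

    @property
    def networking(self):
        # Build on first access, so instances restored from state that
        # predates this attribute still get a working object.
        cached = self.__dict__.get("_networking")
        if cached is None:
            cached = self.__dict__["_networking"] = self.networking_cls()
        return cached


def restore_like_pickle(cls, state):
    """Imitate pickle's default load: skip __init__, restore __dict__."""
    obj = cls.__new__(cls)
    obj.__dict__.update(state)
    return obj


# State as an older version might have saved it: no networking entry at all.
old_state = {"name": "ubuntu"}
restored = restore_like_pickle(Distro, old_state)
net = restored.networking  # works even though old_state never contained it
```

Because the property reads `networking_cls` from the class definition in the running code, the restored object does not depend on anything that the older saved state lacked.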
[16:20] <rharper> Odd_Bloke: ok;  even though azure resets the pkl object on each boot; immediately after upgrade; if users run cloud-init commands which would load the object; I think we'd see issues
[16:23] <Odd_Bloke> Yeah, we'd also see the issue on non-Azure platforms.
[16:23] <AnhVoMSFT> why did I not run into any issue when I tested upgrade (I explicitly removed the code that would delete the obj.pkl)
[16:23] <Odd_Bloke> That's a good question.
[16:25] <AnhVoMSFT> is it because _get_data was actually not called at all, so the code wouldn't run
[16:25] <rharper> you'd need to do something like: cloud-init init  to force it to run again
[16:25] <rharper> we've had users manually run those commands after upgrade (which isn't needed, but shouldn't break things)
[16:26] <AnhVoMSFT> I did upgrade, then remove the systemd unit responsible for deleting the obj.pkl, then reboot
[16:26] <AnhVoMSFT> so in effect that should have called cloud-init init again. I think when obj.pkl existed, the _get_data method isn't invoked on reboot, so the access to .networking object was not executed
[16:28] <rharper> yes
[16:30] <Odd_Bloke> A minimal reproducer using LXD: https://paste.ubuntu.com/p/g5Vp62h2g3/
[16:33] <rharper> Odd_Bloke: yeah, AnhVoMSFT your test upgraded from an instance which already had the refactor done;
[16:34] <AnhVoMSFT> ah I see - so technically this issue already existed before my change?
[16:50] <Odd_Bloke> Yeah, I think we have a latent bug there, though IIUC it's unlikely to be triggered in normal usage ATM.
[17:11] <Odd_Bloke> I have a commit locally that should address the missing .networking issue; I'm off on Monday so I'll do a bit more testing on Tuesday then propose it.
