/srv/irclogs.ubuntu.com/2011/03/24/#ubuntu-classroom.txt

=== bootri is now known as Omega
=== niemeyer_dinner is now known as niemeyer
=== mintos is now known as mintos_afk
=== mintos_afk is now known as mintos_out
=== mintos_out is now known as mintos
=== _LibertyZero is now known as LibertyZero
=== msnsachin12 is now known as msnsachin
=== shrewduh_ is now known as shrewduh
=== daker_ is now known as daker
=== Harry is now known as Guest36736
=== ziviani is now known as JRBeer
=== nigelbabu is now known as nigelb
=== daker_ is now known as daker
=== yofel_ is now known as yofel
nevrax#ubuntu-classroom-chat13:44
=== goom is now known as GooOm
=== daker_ is now known as daker
=== niemeyer is now known as niemeyer_lunch
budiwhello all15:42
MarkAt2od#ubuntu-classroom-chat15:48
* koolhead17 kicks nigelb 15:53
kim0Hi folks .. Ubuntu Cloud Days starting in around 5mins here in #ubuntu-classroom .. You can discuss and ask questions in #ubuntu-classroom-chat .. Please feel free to tweet and share this info with your friends15:54
kim0Those unfamiliar with IRC can simply use this web page http://webchat.freenode.net/?channels=ubuntu-classroom%2Cubuntu-classroom-chat15:56
smoserhi all15:59
smoserok...16:00
smoserso shall i start, mr kim0 ?16:00
kim0You might indeed16:00
smoserHi, my name is Scott Moser.  I'm a member of the Ubuntu Server team.16:01
smoserFor the past 18 months or so, I've been tasked with preparing and managing the "Official Ubuntu Images" that can be used on Ubuntu Enterprise Cloud (UEC) or on Amazon EC2.16:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Rebundling/re-using Ubuntu's UEC images - Instructors: smoser
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/24/%23ubuntu-classroom.html following the conclusion of the session.16:01
smoserSome links, for reference:16:02
smoser- https://help.ubuntu.com/community/UEC/Images16:02
smoser- http://uec-images.ubuntu.com/releases/16:02
smoserThe first gives some general information about the images, the second gives access to download the images for use in UEC or Eucalyptus and AMI ids on Amazon.16:03
smoserThe subject of my discussion here is "rebundling/re-using Ubuntu's UEC images".16:03
smoser"Rebundling" is the term used for taking an existing image (or instance), making some modifications to it, and creating a new image from it.16:03
smoserWith the Ubuntu images on UEC or EC2, there are generally 3 ways to rebundle an image.16:03
smoser * use a "bundle-vol" command from either euca2ools (euca-bundle-vol) or ec2-ami-tools (ec2-bundle-vol)16:03
smoser * use the EC2 'CreateImage' interface.16:03
smoser * make modifications to a pristine (unbooted) image via loop-back mounts16:03
smoserI'll talk a little bit about each one of these, and then open the floor to questions.16:04
smoser== General Re-Bundling Discussion ==16:04
smoserThe "Official Ubuntu Images" are generally stock Ubuntu server installs.  As with any default install, they're not much use out of the box.16:04
smoserThere are 2 basic ways to use Ubuntu images on EC2.16:05
smoser * "rebundle" one and have your own private (or public) AMI. (as we're discussing here)16:05
smoser * use the stock images, and customize the instance on first boot via scripted or manual methods.16:05
smoserWe've gone to a fair amount of trouble to make them generally malleable so that you can use them without the need to rebundle.16:05
smoserCloud-init (https://help.ubuntu.com/community/CloudInit) was developed to reduce the need for users to maintain their own AMIs.  Instead, the images are easily customizable on first boot.16:06
smoserkim0 has put together several blog posts about how to get cloud-init to do your bidding, so it's ready to use once you attach to it.16:06
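A minimal sketch of that first-boot customization, assuming a cloud-init version that understands the cloud-config 'packages' key (the EMI id and key name below are placeholders):

    $ cat > my-cloud-config.txt <<'EOF'
    #cloud-config
    apt_update: true
    packages:
     - apache2
    EOF
    $ euca-run-instances emi-XXXXXXXX -k mykey -f my-cloud-config.txt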
smoserThere is some work in maintaining a rebundled image; you can remove that effort by using stock images.16:07
smoserhttp://foss-boss.blogspot.com/ is kim0's blog.16:08
smoseryou should all have that bookmarked and available in your RSS reader of choice16:08
smoser:)16:08
smoserSo, while there are some costs involved in rebundling, there are reasons to rebundle.  If you have a large stack that you're installing on top of the stock image, it may take some time for you to do so.16:09
smoserRebundling generally allows you to have a more "ready-to-go", "custom" image.16:09
smoserBy rebundling an image, you can add your stack and then reduce the bringup time of your custom AMI.16:09
smoserany questions ?16:10
smoser== Bundle Volume ==16:10
smoserUsing 'ec2-bundle-vol' was the first way that I'm aware of to rebundle.  The euca2ools also provide a work-alike command 'euca-bundle-vol'.16:10
smoserThe way most people use this tool is to16:11
smoser * boot an instance that they want to start from16:11
smoser * add some packages to it, make some changes ...16:11
smoser * issue the rebundle command16:11
smoser[sorry, issue the 'ec2-bundle-vol' or 'euca-bundle-vol' command]16:12
smoserwhat this does is basically copy the contents of the root filesystem into a disk image16:12
smoserand then package that disk image up for uploading16:12
smoseras you can imagine, simply doing something like "cp -a / /mnt" (ignoring the recursive copy of /mnt) is not the most "clean" thing in the world.16:13
smoserthe euca-bundle-vol command and ec2-bundle-vol command both include some OS specific hacks, so they don't copy certain files.16:14
smoserand, inside the images themselves, we've made some changes to make this "just work".16:14
smoseronce you've set up the euca2ools, you might rebundle with something like:16:15
smoser  sudo euca-bundle-vol --config /mnt/cred/credentials --size 2048 --destination /mnt/target-directory16:15
smoser!question16:16
smoserhm..16:16
smoserthere was a question as to whether this applied to openstack16:17
smoserlargely, openstack's EC2 compatibility should make this work.16:17
smoseri've not tried rebundling an image in openstack exactly, but i do know that the euca2ools interact with openstack fine. and copying a filesystem is not really cloud specific16:18
ClassBotkoolhead17 asked: Is this class limited to eucalyptus image bundling or openstack as well ?16:19
smoserright. so my previous comments attempted to address that question16:19
ClassBotkim0 asked: is euca-bundle-vol only for running instances, can't I poweroff an instance and bundle its disk while powered off16:19
smosereuca-bundle-vol only runs in instances.16:20
smosereuca-bundle-image (and ec2-bundle-image) take a filesystem image as input.16:20
smoserafter using euca-bundle-image (or ec2-bundle-image) you then have to upload and register the output16:21
smoseri generally would suggest using 'uec-publish-image' instead, which is a wrapper that does those three things.  The most recent version of this tool in natty allows you to use either the ec2- tools or euca2ools under the covers.16:21
ClassBotkoolhead17 asked: but open-stack does not use the ramdisk part of image16:22
smoseri might be missing something.16:22
smoserit is my understanding that the issue with the ubuntu images and openstack was that openstack was hard coded to expect a ramdisk16:22
smoserwhereas the 10.04 and beyond images from Ubuntu do not use a ramdisk, so none was available in the tarball that you download.16:23
smoseri believe that a.) that bug is fixed16:23
smoserb.) there *is* in openstack a way to boot an instance with an internal kernel, i.e. not having a separate kernel/ramdisk at all, but relying on the bootloader installed in the disk image16:24
smoserboy... i'm getting loads of questions, and i'd like to kind of get back to my overall plan, and then i can take questions.16:24
smoserrather than sitting in interrupt mode for the whole hour16:24
smoserwe will *definitely* have time to take questions, so please queue them up in #ubuntu-classroom-chat16:25
smosernow where was i...16:25
smoserso, after bundling, then you have to use euca-upload-bundle or ec2-upload-bundle and <prefix>-register to register your image.16:26
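A sketch of that upload/register step with the euca2ools, assuming the bundle from the earlier command landed in /mnt/target-directory (the bucket name is a placeholder):

    $ euca-upload-bundle -b my-bucket -m /mnt/target-directory/image.manifest.xml
    $ euca-register my-bucket/image.manifest.xml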
smoserI should have noted above that this "bundle-vol" really is only for instance-store images.16:26
smoserEucalyptus (in 2.0.X) only supports instance store images.16:26
smoseri believe that they plan to have EBS root images in the future.16:27
smoserSo, that brings us to the second type of bundling16:27
smoser== CreateImage ==16:27
smoserWhen amazon began offering EBS root instances, they added an API call called 'CreateImage'16:28
smoserCreateImage is an AWS API call that basically does the following:16:28
smoser * stop the instance if it is not stopped16:28
smoser * snapshot its root volume16:28
smoser * register an AMI based on that snapshot16:28
smoser * start the instance back up.16:28
smoserThe CreateImage api is exposed via a command line tool (http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/ApiReference-cmd-CreateImage.html) and also via the EC2 Web Console.16:28
smoserThis feature makes it dramatically easier for anyone to create a custom AMI.  There is literally one button that you push in the EC2 Console, and then type in a name and description.16:29
smoserI would generally recommend using CreateImage if you're using an EBS based image.  It is an extremely useful wrapper, and will get you a consistent filesystem snapshot.16:29
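From the command line it is a one-liner; a sketch (the instance id is a placeholder):

    $ ec2-create-image i-1234abcd -n "my-custom-ami" -d "lucid plus my stack"

EC2 stops the instance first unless you pass --no-reboot, which trades snapshot consistency for uptime.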
smoserOnce you have a snapshot id of a filesystem, you could actually fairly easily upload an instance-store image based on that snapshot.16:30
smoserthis is left as an exercise for the reader.16:30
smoserSo, the final way of rebundling an image16:31
smoser== modify pristine download images ==16:31
smoserFew other image producers on EC2 make their images available as filesystem images for download.16:31
smoserUbuntu does this so you can easily grab the image, make some changes to it, and then upload and register your modified image16:32
smoserThis might be the most involved way of creating an image, but it is also the one that lends itself best to automation16:33
smoserFor a simple example, say I wanted to add a user to an ubuntu image so that I could log in as that user on initial boot.16:34
smoserWhat I would do is:16:34
smoser * launch a utility instance in EC216:35
smoserI'd pick a lucid 64 bit image, possibly even use an EBS root image and a t1.micro size.  The size would largely depend on what I wanted to do.16:35
smoseronce that image was up, I'd ssh to it.16:35
smoserthen, download an image tarball that I found a link to from https://uec-images.ubuntu.com/releases/lucid/16:36
smoser$ wget https://uec-images.ubuntu.com/releases/lucid/release/ubuntu-10.04-server-uec-amd64.tar.gz16:36
smoserthen, extract the image, mount it loopback and make my modifications16:36
smoser$ tar -Sxvzf ubuntu-10.04-server-uec-amd64.tar.gz16:37
smoser$ sudo mkdir /target16:37
smoser$ sudo mount -o loop *.img /target16:37
smoser$ sudo chroot /target adduser foobar16:37
smoser... follow some prompts ...16:37
smosermaybe make some other changes here16:37
smoser$ sudo umount /target16:38
smoserassuming you've also set up your credentials so that you can use euca-* or ec2-* tools, then you can do:16:38
smoseruec-publish-image x86_64 *.img my-s3-bucket16:39
smoserand out will pop an AMI-XXXXX id that you can then launch.16:39
=== niemeyer_lunch is now known as niemeyer
smoserThis process lends itself *very* well to scripting.  You can launch the instance, connect to it, and do all the modifications via a program and revision control them.16:40
smoserso you'll know exactly what you have16:40
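A sketch of the whole loop as one script, assuming it runs on the utility instance with credentials already set up (the motd tweak stands in for your real customization, and the bucket name is a placeholder):

    #!/bin/sh
    set -e
    tarball=ubuntu-10.04-server-uec-amd64.tar.gz
    # fetch and unpack the pristine image
    wget "https://uec-images.ubuntu.com/releases/lucid/release/$tarball"
    tar -Sxzf "$tarball"
    # mount it loopback and make the change
    sudo mkdir -p /target
    sudo mount -o loop ./*.img /target
    sudo chroot /target sh -c 'echo "built by rebundle script" >> /etc/motd.tail'
    sudo umount /target
    # bundle, upload, and register in one shot
    uec-publish-image x86_64 ./*.img my-s3-bucket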
smoserAlso, we make machine-consumable information about how to download the images available at https://uec-images.ubuntu.com/query/16:40
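For instance, the current released lucid build can be looked up with a plain HTTP fetch; the exact path below is an assumption about the query tree's layout:

    $ wget -qO- https://uec-images.ubuntu.com/query/lucid/server/released.current.txt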
smoserFor some things I was working on, I put together a script that does much of the above16:41
smoserIt assumes the instance is launched, and you're on it with credentials, but then does the rest16:41
smoserhttp://bazaar.launchpad.net/~smoser/+junk/kexec-loader/view/head:/misc/publish-uec-image16:41
smoserso...16:41
smosersorry for pushing through all that without taking interrupts, but I wanted to get through it.16:42
ClassBotobino asked: is there a "preferred" file system for instances?16:42
smoserthe 10.04 images, I believe, use the ext3 filesystem.16:42
smoserthat should have been ext4, as the images are really intended to be "stock ubuntu installs", and ext4 was the default filesystem in 10.0416:43
smoserthe 10.10 images use ext4, and so does natty.16:43
smoserit is possible that Ubuntu will move to btrfs as the default in 11.10. if that's the case, I'd like to follow that in the images (btrfs has some *really* nice features).16:44
smoserso as to "preferred"....16:44
smoserI know people use xfs, as it offers snapshotting functionality not available in ext416:44
smoserand Eric Hammond's "ec2-consistent-snapshot" is a popular tool that sits atop xfs for your data partitions.16:45
smoserI wrote a blog entry on how you can rebundle the Ubuntu images into an xfs based image at16:45
smoserhttp://ubuntu-smoser.blogspot.com/2010/11/create-image-with-xfs-root-filesystem.html16:45
smosernavanjr, regarding "are you suggesting we should use the CreateImage method on EC2 to create an image for my UEC Private Cloud?"16:47
smoseri might have somewhat covered that.16:47
=== sre-su_ is now known as sre-su
smoserbut you really cannot use CreateImage on EC2 to create an image for UEC16:47
smoserone general approach that would work would be to get your instance into a state that you're happy with16:47
smoserthen stop the instance16:47
smosersnapshot its root volume16:47
smoserstart the instance16:48
smoserattach that snapshot as another disk16:48
smoserthen copy the filesystem contents of the second disk to a disk image.16:48
smoserthat disk image then could be brought to UEC.16:48
smoseri'd have to think through that a bit more, but i believe the general path is correct.16:49
smosersemiosis pointed out: CreateImage will also snapshot any other (non-root) EBS volumes attached to the instance, and those snapshots are automatically restored & attached to new instances made from the AMI.16:49
smoserCreateImage is *really* a handy wrapper.16:49
ClassBotobino asked: thanks for cloud-init! Is there any plan to make cloud-init available for other distro?16:49
smoserwell, amazon has taken cloud-init to their CentOS derived "Amazon Linux".16:50
smoserand I believe that they intend to continue doing so.16:50
smoseri'm definitely interested in helping them, and have worked with some of their engineers.16:50
smoserI'd also like to get cloud-init into debian.  I know of one person who was trying to do that, and one person who was interested in getting it into fedora.16:51
ClassBotThere are 10 minutes remaining in the current session.16:51
smoserSo, yes, I would like to see that.  I think consistent bootstrapping code across Linux distributions would be a general win.16:52
smosernavanjr, asked 'so there is no "createImage" similar for use on a running UEC instance?'16:53
smoserthere is no createImage like functionality in UEC.  It relies upon EBS and block level snapshotting.  Eucalyptus does not have EBS root functionality in any released version that i'm aware of16:54
smoserhowever, navanjr you might be able to get some more information out of obino16:55
smoser(sorry, obino)16:55
smoserI suspect that question might have been planted.16:55
smoserit leads into the next session very well16:55
smoserTeTeT will talk about "UEC Persistency", which offers a way to get EBS root-like functionality on UEC, even on Ubuntu 10.04 LTS.16:56
smoseri won't steal his thunder, though.16:56
ClassBotThere are 5 minutes remaining in the current session.16:56
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: UEC persistency - Instructors: tetet
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/24/%23ubuntu-classroom.html following the conclusion of the session.17:01
TeTeTHello! Nice to have you in class today.17:01
TeTeTIt's a bit weird, but I'm as nervous as before giving a live class :)17:02
TeTeTIf you have any questions, ask them in #ubuntu-classroom-chat with the prefix17:02
TeTeTQUESTION17:02
TeTeTI am not very familiar with the classbot, but I'll do my best to check if there are any questions in the queue.17:02
TeTeTA brief introduction: I'm Torsten Spindler, been working for Canonical since December 2006 and I am part of the corporate services team, so unlike most other presenters here, I'm on the commercial side.17:02
TeTeTI bring this up as one of my responsibilities is to maintain and deliver the Ubuntu Enterprise Cloud classes.17:03
TeTeTThis is directly related to this session, as I present one of the case study exercises found in the UEC class, it's the latest addition to the course material and brand new17:03
TeTeTSo what do I want to present here?17:03
TeTeTDuring the UEC class I often got the question: 'How can I have an instance on UEC that stores all of its information on an EBS volume, so I can simply use it like a regular virtualized server?'17:04
TeTeTWhy is that of interest? The data on an instance is volatile, e.g. if the instance dies, all data of it is gone.17:04
TeTeTUnless you store the data on a persistent cloud datastore; in UEC we have two of them: Walrus S3 and EBS volumes.17:04
TeTeTAn EBS volume is a device in an instance that serves as a disk, very much like a USB stick you insert in your system.17:05
TeTeTyou can see a graphic for this at http://people.canonical.com/~tspindler/UEC/attach-volume.png17:05
TeTeTkeep in mind that the disk actually attaches to an instance running on the node controller, not on the node controller itself17:05
TeTeTthe technology used in UEC is 'ATA over Ethernet - AoE', which means that the EBS volume is exported via network from a server, the EBS storage controller.17:06
TeTeTWith Amazon Web Services (AWS) it is possible to boot from an EBS volume in the public cloud, with a technology named 'EBS root'. For more info on this, please see http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/index.html?Concepts_BootFromEBS.html17:06
TeTeTWith UEC for the private cloud we use Eucalyptus as the upstream technology. With Eucalyptus, booting an instance straight from an EBS volume is not available.17:06
TeTeTSo the situation looks a bit like this: http://people.canonical.com/~tspindler/UEC/01-instance-volume.png17:07
TeTeTforgive my lack of design skills ;)17:07
TeTeTin UEC people usually have this: http://people.canonical.com/~tspindler/UEC/02-instance-volume-standard.png17:07
TeTeTan instance holds the kernel and /, while the data is saved on a persistent EBS volume17:08
TeTeTUEC persistency is about moving the kernel and / to the EBS volume17:08
TeTeTdepicted in http://people.canonical.com/~tspindler/UEC/03-instance-volume-ebs-based-instance.png17:08
TeTeTthat is, the kernel of the instance launches a kernel stored on an EBS volume17:08
TeTeTthe kernel on the EBS volume will use / from the ebs volume and run completely from there17:09
TeTeTI asked the Ubuntu server team for advice on realizing such a service back in January 2011. It was motivated by the questions I received during the UEC class.17:09
TeTeTMy initial thought was to have a specialized initrd that calls pivot_root on a root filesystem stored in an EBS volume.17:10
TeTeTBut Scott Moser (smoser) had a much better idea: Why not use kexec to load the kernel and root from the EBS volume.17:10
TeTeTMore information on kexec-loader can be found at http://www.solemnwarning.net/kexec-loader/17:10
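The underlying idea, as a rough sketch: boot a tiny loader image, then kexec into the kernel on the volume (this assumes the EBS volume shows up as /dev/sdb1 and carries a kernel and initrd under /boot):

    $ sudo mount /dev/sdb1 /mnt
    $ sudo kexec -l /mnt/boot/vmlinuz --initrd=/mnt/boot/initrd.img \
          --append="root=/dev/sdb1 ro"
    $ sudo umount /mnt
    $ sudo kexec -e    # replaces the running kernel with the one from the volume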
TeTeTScott went ahead and implemented this and my colleague Tom Ellis tested it.17:10
TeTeTThen I used the docs from Scott, tested them and created an exercise for the UEC class out of it.17:10
TeTeTWe decided to publish this work in Launchpad and you can find the result in https://launchpad.net/uec-persistency.17:11
TeTeTThe branch under lp:uec-persistency contains the needed code and the exercise in the docs directory.17:11
TeTeTsmoser just told me that we use something like kexec-loader, not exactly the same tech17:12
TeTeTsee the chat channel for more background info ;)17:12
TeTeTIf you don't have bazaar installed right now, you can also take a look at the exercise PDF, found at http://people.canonical.com/~tspindler/UEC/ebs-based-instance.pdf17:12
TeTeTthe odt file is also freely available17:12
TeTeTI will now cover this exercise step by step.17:12
TeTeTNo questions so far on what the aim is?17:12
TeTeTto repeat, we want to use the kernel and root filesystem found on an EBS volume, not that in the instance itself17:13
TeTeTThe steps during the exercise have to be conducted from two different systems, one I named the [cloud control host], the other the [utility instance].17:13
TeTeTit would be useful to have the PDF or ODT open for the exercise now17:13
TeTeTWith cloud control host I mean any system that holds your UEC credentials, so you can issue commands to the front end.17:14
TeTeTThe utility instance is created during the exercise and is used to populate an EBS volume with an Ubuntu image.17:14
TeTeTThe first two steps are preparing your cloud with a loader emi. This loader emi will later be used to kexec the kernel on your EBS volume.17:15
TeTeTStep three and four setup the utility instance. This is a regular UEC instance that is large enough to hold an Ubuntu image and store it on the attached EBS volume.17:15
TeTeTThe EBS volume created and attached in step three will be the base for your EBS based instances later on.17:15
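Creating and attaching such a volume with the euca2ools looks roughly like this (size, zone, and ids are placeholders):

    $ euca-create-volume -s 5 -z mycluster
    $ euca-attach-volume -i i-1234abcd -d /dev/sdb vol-1234abcd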
TeTeTSteps five to nine are needed to copy an Ubuntu image to the EBS volume and ensure it boots fine later on.17:16
TeTeTIn step 11 a snapshot of the pristine Ubuntu EBS volume is made. While one could use the EBS volume right away, it's much nicer to clone it via the snapshot mechanism of EBS.17:16
TeTeTJust in case you later want another server based on that image.17:17
TeTeTSteps 12 and 13 are there to launch an instance based on an EBS volume.17:17
TeTeTThe final step 14 describes how to ssh to the instance and check if it is really based on the EBS volume, e.g. /dev/sdb1.17:17
TeTeTThat's really all there is to it, thanks to smoser's work. Perfectly doable by yourself within 2 - 4 hours, starting with two bare metal servers17:18
TeTeTOnce you've been through all the steps and you want more EBS based instances of the same image, simply repeat from step 11 boot_vol.17:18
TeTeTWith this you should have virtualized servers in your UEC within a few minutes, quite a nice time for provisioning.17:19
TeTeTEspecially useful might be to assign permanent public addresses to those instances.17:19
TeTeTThis can be done with help of euca-allocate-address and euca-associate-address.17:19
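A sketch of that (the address returned and the instance id are placeholders):

    $ euca-allocate-address
    ADDRESS 192.168.10.50
    $ euca-associate-address -i i-1234abcd 192.168.10.50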
TeTeTany questions so far? Everything crystal clear?17:20
TeTeTWell, it was a very short session then I fear...17:21
TeTeTClosing words: we're looking into automating the storage of the Ubuntu image on the EBS volume to make this step less work-intensive. So keep an eye on the Launchpad project.17:22
TeTeTIn the end you should be able to use any Ubuntu image within UEC on Ubuntu 10.04 LTS on an EBS volume in a few minutes, rather than hours17:22
TeTeTif there's anyone interested in contributing to the project or UEC in general, please get in touch with kim017:23
TeTeTwe're looking for people with any skills, from coding to writing documentation17:23
TeTeTwe'd also love to hear from you if you try UEC persistency and it works or doesn't work for you17:24
TeTeTI tested the exercise a few times, but you'll never know17:24
TeTeTyou can touch base with us in #ubuntu-cloud and myself also in #ubuntu-training17:25
ClassBotobino asked: have you even RAIDed the EBS volumes? Is there any advantage in doing so?17:25
TeTeTnope, I've never RAIDed the EBS volumes, but would think there might be a bit of a performance hit due to the ATA over Ethernet protocol17:26
TeTeTalso keep in mind that while served via network, the EBS volumes are likely to come from the same Storage Controller (SC)17:26
TeTeTso not sure if this is a good approach17:27
TeTeTmight be interesting to use drbd though, and maybe use the EBS volume as well as the ephemeral storage and see how that goes17:28
TeTeTI guess there's quite a bit of room for experimentation for EBS based instances17:28
TeTeTguess you have now 30 minutes left in the session, so enough time to actually do the exercise if you have a UEC ready :)17:30
TeTeTthanks for attending, catch me in #ubuntu-training if you run into problems with the exercise, bye now17:32
kim0So we finished a bit early on this session17:33
kim0Daviey starts in less than 30 mins with a puppet session17:33
kim0Time for a coffee break :)17:34
ClassBotThere are 10 minutes remaining in the current session.17:51
ClassBotThere are 5 minutes remaining in the current session.17:56
Davieykim0, Are you managing the session?18:01
kim0It's pretty much self managing .. topic will be changed in a few seconds18:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Puppet Configuration Management - Instructors: Daviey
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/24/%23ubuntu-classroom.html following the conclusion of the session.18:01
DavieyMy name is Dave Walker (Daviey), and I am a member of the Ubuntu Server Team.18:02
DavieyWelcome to the puppet classroom session.  This session is mainly targeted at those that have had minimal or no exposure to puppet.18:02
DavieyPuppet allows reproducible, consistent deployments, which is good for horizontal scaling and for replacing machines which have malfunctioned.18:03
DavieyA good reference for more details about the project is at:18:03
Davieyhttp://projects.puppetlabs.com/projects/puppet/wiki/About_Puppet18:03
DavieyPlease take a few moments to grok the content of that page; there is little point in my reproducing the content here.18:03
* Daviey waits a few minutes.18:03
DavieyNow, some of that might sound a little complicated, but it really is simple when you get started.18:04
* Daviey continues.18:05
DavieyPuppet focuses on 'configuration' management.  The initial operating system deployment is usually done with preseeding the installer, cobbler, FAI, or simply spawning a cloud machine, such as on EC2.18:05
DavieyIn regards to EC2.. people tend to use user-scripts or increasingly cloud-init.18:05
DavieyOnce the base operating system is installed, there are always some changes that need to be made to make the server usable for production.  This varies from performance tweaks to application configuration and even custom versions of packages.  This could all be handled with scripts and such, but this is less than clean and near impossible to maintain.  This is where puppet provides a clean solution.18:06
DavieyPuppet generally acts on a client/server method, to manage multiple nodes.  However, it is also possible to use puppet on a single host.  For simplicity, this session will demonstrate a single host deployment example and some of the features of puppet via their configuration format - called a manifest.18:06
DavieyIn this session, we will do the following:18:07
Daviey• Connect to an instance in the cloud18:07
Daviey• Install puppet18:07
Daviey• Initial configuration18:07
Daviey• Configure the same node to install and create a basic apache virtual host18:07
DavieyFirstly, i hope everyone will be able to look at a console window and this IRC session concurrently.18:07
DavieyI'm going to invite everyone to connect via ssh to a cloud instance:18:08
Daviey$ ssh demo@demo.daviey.com18:08
DavieyYou'll need to accept the host key18:08
DavieyI don't think it really requires verification in this instance.18:08
Daviey(Although it's generally good practice to compare the fingerprint)18:08
DavieyThe password is 'demo'18:09
Daviey(Secure huh?)18:09
* Daviey waits for a confirmation.18:10
=== pvo_ is now known as pvo
DavieyI will type in the IRC channel comments, so please multi-task by looking at both.. Thanks :)18:10
DavieySo, i just checked to see if we have apache2 installed... we do not!18:11
Daviey(You can check there is nothing running as an httpd on port 80 by visiting http://demo.daviey.com)18:11
Daviey(You should get a failure)18:12
DavieyI'm running sudo apt-get update, to make sure our indexes are updated18:12
DavieyThe observant amongst you will notice i'm running Natty18:12
DavieyThe current development version18:12
Daviey(I must be crazy doing a demo on this! :)18:12
DavieySo, i just ran sudo apt-get install puppet18:13
DavieyThis installs the puppet application and its dependencies.18:13
DavieyThis stage would normally be done automatically during installation18:14
Daviey(if you preseed it to do so)18:14
DavieyYou'll notice the output here:18:14
Davieypuppet not configured to start, please edit /etc/default/puppet to enable18:14
DavieyDid you all see the START=no, option18:14
DavieyThis means that the puppet client agent will not run automatically18:15
DavieyMy intention is to invoke puppet manually.. so i do not need the client to be running18:15
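With the puppet version in natty, a one-off manual run against a local manifest can look like this (a sketch; the manifest path is the conventional default):

    $ sudo puppet apply --noop /etc/puppet/manifests/site.pp    # dry run
    $ sudo puppet apply /etc/puppet/manifests/site.pp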
Daviey(one moment please)18:16
Daviey(slight technical issue, please hold)18:17
DavieyOkay!18:21
Davieywe are back18:21
Davieyokay, this is the directory structure we should see18:21
Davieyon a fresh installation18:21
DavieyOkay, i have just copied a manifest to /etc/puppet/manifest18:22
DavieyI hope everyone can see this18:22
DavieyIt's quite a quick one i have thrown together18:22
DavieyIt should:18:22
DavieyInstall apache218:22
Davieyadd a virtual host, called demo.daviey.com18:22
Davieyand enable it18:23
Daviey(I'll make it available afterwards via a pastebin)18:23
DavieyThe stanza towards the bottom mentions, ip-10-117-82-13818:23
Daviey(for the observant, you'll notice this is the hostname of the machine)18:23
DavieyI could equally have put 'default' here... which would mean that it would do it for every machine connected18:24
Daviey(in this instance, i am only using one machine)18:24
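Daviey's actual manifest isn't reproduced in the log, but a minimal sketch of a manifest doing roughly this (install the package, drop a vhost from a template, enable it) might look like the following; the template() path assumes an apache2 module with a templates directory:

    $ cat <<'EOF' | sudo tee /etc/puppet/manifests/site.pp
    node "ip-10-117-82-138" {
      package { "apache2": ensure => installed }
      file { "/etc/apache2/sites-available/demo.daviey.com":
        content => template("apache2/vhost.erb"),
        require => Package["apache2"],
      }
      exec { "a2ensite demo.daviey.com":
        path    => "/usr/sbin:/usr/bin:/bin",
        creates => "/etc/apache2/sites-enabled/demo.daviey.com",
        require => File["/etc/apache2/sites-available/demo.daviey.com"],
      }
    }
    EOF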
DavieyNow, the actual virtual host needs a template...18:24
Davieylets create it.18:24
=== daker is now known as daker_
Davieypuppet uses Ruby's ERB template system:18:25
DavieyYou'll notice that there are parts which can be expanded.18:25
DavieySo, this is a generic apache virtual host template, that could be used for other virtualhosts18:25
Davieyother than demo.daviey.com18:25
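The template itself isn't in the log either; a generic sketch, assuming the enclosing define passes the vhost name through as <%= name %>:

    $ cat <<'EOF' | sudo tee vhost.erb
    <VirtualHost *:80>
        ServerName <%= name %>
        DocumentRoot /var/www/<%= name %>
        ErrorLog /var/log/apache2/<%= name %>-error.log
        CustomLog /var/log/apache2/<%= name %>-access.log combined
    </VirtualHost>
    EOF

(Note the apache2 log directory; a typo in exactly that path is what bites a few minutes below.)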
DavieyNow... let's make puppet do its thing!18:26
DavieyI love it when a plan comes together.18:27
DavieyEssentially, i did a dry run with these configs before the session.. and didn't clean up properly!18:28
DavieyThis is why i should have used puppet to clean up, as it would have done it better than me!18:28
DavieySo, puppet installed apache2 and enabled the virtual host18:29
Davieypuppet knows which package handler to use18:29
Davieyi.e. apt, yum etc18:29
DavieyNow... if we check to see if apache started.. we'll see it failed... one moment18:30
DavieySo...18:31
Daviey(2)No such file or directory: apache2: could not open error log file /var/log/apache/demo.daviey.com-error.log.18:31
DavieyUnable to open logs18:31
DavieyThis means i made a typo in my template... suggestions on how i should fix this?18:31
Davieykim0, Is quite correct with:18:31
Daviey<kim0> Daviey: should be "apache2" there18:31
DavieyBut... How should i *fix* it?18:31
DavieyWe edit the template of course!18:32
DavieyNow, we should be able to go to http://demo.daviey.com/18:33
Daviey(My simple Task didn't try to start apache if it wasn't already running!)18:33
Davieynotice: /Stage[main]//Node[ip-10-117-82-138]/Apache2::Simple-vhost[demo.daviey.com]/File[/etc/apache2/sites-available/demo.daviey.com]/content: content changed '{md5}5047b9f9a688c04e2697d9fd961960ed' to '{md5}2c32102fd06543c85967276eeee797e2'18:33
Daviey^^ Puppet knew it should create a new virtual host, based on the template changing!18:34
DavieyHow neat is that?!18:34
DavieyNow, in a real life example - puppet would also manage pulling in the website..18:35
Davieypuppet provides a FileBucket interface..18:35
DavieyThis is similar to rsync, and allows files to be retrieved from there.18:35
DavieyHowever, for large files - people often use an external application which is configured via puppet.18:35
DavieyThis could be anything from rsyncd, nfs or even torrent!18:36
Davieyfacter is a really useful tool.  This is where the variables used in the templates are expanded from...  I think of it as lsb_release supercharged.18:36
DavieyHere is an example of the output, just generated:18:37
Davieyhttp://paste.ubuntu.com/584952/18:37
DavieyThis is a list of 'facts' about the system18:37
DavieyOne of the really nice things about the manifests... is that they can be conditional18:38
=== Mike is now known as Guest49828
DavieySo, i could do a different task based on the virtualization type (or lack thereof), for example.18:38
DavieyThere is no point trying to use this machine as a virtual machine server, if it doesn't fit the requirements18:39
DavieyUsually whether it's bare metal - and the amount of memory free18:39
DavieyThe configuration files are largely inheritance based, which fully supports overriding of configurations from the base class.18:39
DavieyWhen puppet is installed on a client / server basis... it uses SSL for secure communication between the elements18:40
DavieyThe server runs on port 8140, so make sure the firewall is open (or the ec2 security group allows communication!)18:41
DavieyClient (Agent) - puppetd18:41
DavieyServer - puppetmasterd18:41
Daviey^^ This is the name of the applications18:41
DavieyThe puppetd runs on all the clients, and polls the server every 30 minutes by default, looking for changes18:41
DavieyIt defaults to looking for the dns hostname of 'puppet'18:42
DavieySo, it's a good idea for the puppet master to have that dns entry set for a local network18:42
DavieyEqually, i could have set puppet.mydomain.com18:42
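In configuration terms that's just two agent settings; a sketch of /etc/puppet/puppet.conf (1800 seconds is the default poll interval):

    [agent]
    server      = puppet.mydomain.com
    runinterval = 1800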
DavieyThis is probably a good place to stop the demo.  I will make my puppet configuration available for others to experiment with.18:43
DavieyIt really is not as complicated as it seems to get started.18:43
DavieyWhen i first tried puppet, i found the 'getting started' docs to be somewhat complicated.18:43
DavieyI would recommend people start with a minimal example like this.. and then build from there.18:44
DavieyThe puppet website has some excellent recipes to use as an example... but it's probably a good idea to start simple.18:44
DavieyI will now take questions, and answer them as best as i can18:44
=== drc is now known as Guest51125
DavieyAnnnnd. classbot, i hate you18:45
Davieyclassbot isn't +v18:45
Daviey<ClassBot> sveiss asked: how large is 'large'? is there a rule of thumb as to how much data a FileBucket can cope with? -- There is 1 additional question in the queue.18:45
Davieysveiss, that is a good question.. I seem to remember reading that since 2.6, massive improvements have gone into increasing its efficiency18:46
DavieyHowever, it is still believed to be the likely bottleneck18:46
DavieyI haven't found the later versions to suffer too badly from this bottleneck18:47
Davieybut others have commented.18:47
DavieyI think it depends on load..18:47
DavieyI would ask that if you do try the filebucket, you report back to the ubuntu server team with your success.18:48
Daviey(We often don't get enough feedback)18:48
ClassBotsveiss asked: how large is 'large'? is there a rule of thumb as to how much data a FileBucket can cope with?18:48
Davieykim0 asked: Wouldn't clients looking for dns name "puppet" and blindly following it .. be a security risk18:49
DavieyWell yes.. this is true.. This is one of the reasons SSL is used.18:49
DavieyEssentially, the puppet master (usually has a self-signed key)18:49
Davieybut the client needs to accept it.18:49
DavieyThis would normally happen as part of the installation, or bootstrapping18:49
DavieyWhich is an area before puppet works.18:50
ClassBotkim0 asked: Wouldn't clients looking for dns name "puppet" and blindly following it .. be a security risk18:50
DavieyWow.. i now understand ClassBot18:51
ClassBotThere are 10 minutes remaining in the current session.18:51
ClassBotkim0 asked: Do you reuse ready made recipies18:51
DavieyYou would be foolish not to!18:52
DavieyThere is a treasure trove of samples on the puppet wiki, and in other locations.18:52
DavieyAdditionally, there are extra modules18:52
DavieyWhich allow you to reduce the burden of what you need to do18:52
Davieyhttp://forge.puppetlabs.com/18:53
DavieyIf there are no more questions, i will end my session.18:53
DavieyI would like to thank everyone for coming18:54
DavieyPlease do experiment with puppet, and report back to us.18:54
DavieyWe are a friendly team who hang around in #ubuntu-server18:54
DavieyThank you for your time.18:54
* kim0 claps18:55
kim0Thanks Daviey for the awesome session18:55
kim0Next up is Edulix .. Presenting "Hadoop" The ultimate hammer to bang on big data :)18:56
ClassBotThere are 5 minutes remaining in the current session.18:56
Edulixhello people, thanks for coming. this is the session titled "Using hadoop, divide and conquer"19:00
Edulixkim0 told me about these ubuntu cloud sessions, and kindly asked me to give a talk on hadoop, so here I am  =)19:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Using hadoop, divide and conquer - Instructors: edulix
Edulixfirst I must say that I am in no way a hadoop expert, as I have been working with hadoop just for a bit over a month19:01
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/24/%23ubuntu-classroom.html following the conclusion of the session.19:01
Edulixbut I hope that I can help to show you a bit of hadoop and ease the learning curve for those who want to use it19:02
EdulixI'm going to base this talk on the hadoop tutorial available at http://developer.yahoo.com/hadoop/tutorial/ as it helped me a lot, but it's a bit dense, so I'll do a condensed version19:03
EdulixSo what's hadoop anyway?19:03
Edulixit's a large-scale distributed batch processing infrastructure, designed to efficiently distribute large amounts of work across a set of machines19:03
Edulixhere large amounts of work means really really large19:03
EdulixHundreds of gigabytes of data is low end for hadoop!19:04
Edulixhadoop supports handling hundreds of petabytes... Normally the input data is not that big, but the intermediate data is or can be19:04
Edulixof course, all this does not fit on a single hard drive, much less in memory19:04
Edulixso hadoop comes with support for its own distributed file system: HDFS19:05
Edulixwhich breaks up input data and sends fractions  (blocks) of the original data to some machines in your cluster19:05
Edulixeveryone that has tried will know that performing large-scale computation is difficult19:06
Edulixwhenever multiple machines are used in cooperation with one another, the probability of failures rises: partial failures are expected and common19:06
EdulixNetwork failures, computers overheating, disks crashing, data corruption, maliciously modified data..19:07
Edulixshit happens, all the time (TM)19:07
EdulixIn all these cases, the rest of the distributed system should be able to recover and continue to make progress. the show must go on19:07
EdulixHadoop provides no security, and no defense to man in the middle attacks for example19:08
Edulixit assumes you control your computers so they are secure19:08
Edulixon the other hand, it is designed to handle hardware failure and data congestion issues very robustly19:08
Edulixto be successful, a large-scale distributed system must be able to manage resources efficiently19:09
EdulixCPU, RAM, Harddisk space, network bandwidth19:09
EdulixThis includes allocating some of these resources toward maintaining the system as a whole19:10
Edulix..... while devoting as much time as possible to the actual core computation19:10
EdulixSo let's talk about the hadoop approach to things19:10
Edulixbtw if you have any questions, just ask in #ubuntu-classroom-chat with QUESTION: your question19:11
EdulixHadoop uses a  simplified programming model which allows the user to quickly write and test distributed systems19:11
Edulixand also to test its efficient & automatic distribution of data and work across machines19:12
Edulixand also allows to use the underlying parallelism of the CPU cores19:13
EdulixIn a hadoop cluster, data is distributed to all the nodes of the cluster as it is being loaded in19:13
EdulixHDFS will split large data files into blocks which are managed by different nodes in the cluster19:13
EdulixAlso replicating data in different nodes, just in case19:13
ClassBotkim0 asked: Does hadoop require certain "problems" that fits its model ? can I throw random computations to it19:14
EdulixI'm going to answer that now =)19:15
Edulixbasically, hadoop uses the mapreduce programming paradigm19:16
EdulixIn hadoop, Data is conceptually record-oriented. Input files are split into input splits referring to a set of records.19:16
EdulixThe strategy of the scheduler is moving the computation to the data, i.e. the data that will be processed by a node is chosen based on its locality to the node, which results in high performance.19:17
EdulixHadoop programs need to follow a particular programming model (MapReduce), which limits the amount of communication, as each individual record is processed by a task in isolation from one another19:17
EdulixIn MapReduce, records are processed in isolation by tasks called Mappers19:18
EdulixThe output from the Mappers is then brought together into a second set of tasks called Reducers19:18
Edulixwhere results from different mappers can be merged together19:18
EdulixNote that if you for example don't need the Reduce step, you can implement a Map-only processing.19:18
EdulixThis simplification makes the Hadoop framework much more reliable, because if a node is slow or crashes, another node can simply replace it, taking the same inputsplit and processing it again19:19
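To make the model concrete: a word count with Hadoop Streaming, which lets the mapper and reducer be plain shell commands (a sketch; the streaming jar path varies by version). The mapper emits one word per line, the framework sorts by key between the two phases, so the reducer only has to count adjacent duplicates:

    $ bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar \
        -input /data/input -output /data/wordcount \
        -mapper 'tr -s "[:space:]" "\n"' \
        -reducer 'uniq -c'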
ClassBotchadadavis asked: Is there any facility for automatically determining how to partition the data, i.e. based on how long one chunk of processing takes?19:19
Edulixto be able to partition the data,19:21
Edulixyou first need to have a structure for that data. for example,19:21
Edulixif you have a png image that you need to process, then the input data is the image file. you might partition your image in chunks that start at a given position (x,y) and have a height and a width19:22
Edulixbut the partitioning is usually done by you, the hadoop program developer19:22
Edulixthough hadoop is in charge of selecting where to send that partition, depending on data locality19:23
Edulixwhen you partition the input data, you don't send the data (input split) to the node that will process it: ideally it will already have that data!19:24
Edulixhow is this possible?19:24
Edulixbecause when you do the partition, the InputSplit only defines this partition (so in the image example it might be 4 numbers: x, y, height, width) and depending on which nodes the file blocks of the input data reside on, hadoop will send that split to that node19:25
Edulixand then the node will open the file in HDFS for reading starting (fseek) in there19:26
Edulixok, I continue =)19:26
Edulixseparate nodes in a Hadoop cluster still communicate with one another, implicitly19:26
Edulixpieces of data can be tagged with key names which inform Hadoop how to send related bits of information to a common destination node19:27
EdulixHadoop internally manages all of the data transfer and cluster topology issues19:27
EdulixOne of the major benefits of using Hadoop in contrast to other distributed systems is its flat scalability curve19:27
EdulixUsing other distributed programming paradigms, you might get better results for 2, 5, perhaps a dozen machines. But when you need to go really large scale, this is where hadoop excels19:28
EdulixAfter your program is written and functioning on perhaps ten nodes (to test that it can be used on multiple nodes with replication etc and not only in standalone mode),19:29
Edulixthen  very little --if any-- work is required for that same program to run on a much larger amount of hardware efficiently19:29
Edulix== distributed file system ==19:29
Edulixa distributed file system is designed to hold a large amount of data and provide access to this data to many clients distributed across a network19:29
EdulixHDFS is designed to store a very large amount of information, across multiple machines, and also supports very large files19:30
Edulixsome of its requirements are:19:30
Edulixit should store data reliably even if some machines fail19:30
Edulixit should provide fast, scalable access to this information19:30
EdulixAnd finally it should integrate well with Hadoop MapReduce, allowing data to be read and computed upon locally when possible19:31
EdulixThis last point is crucial. HDFS is optimized for MapReduce, and thus has made some decisions/tradeoffs:19:31
EdulixApplications that use HDFS are assumed to perform long sequential streaming reads from file because of MapReduce19:31
Edulixso HDFS is optimized to provide streaming read performance19:31
Edulixthis comes at the expense of random seek times to arbitrary positions in files19:32
Edulixi.e. when a node reads, it might start reading in the middle of a file, but then it will read byte after byte, not jumping here and there19:32
EdulixData will be written to the HDFS once and then read several times; AFAIK there is no file update support19:32
Edulixdue to the large size of files, and the sequential nature of reads, the system does not provide a mechanism for local caching of data19:33
Edulixdata replication strategies combat machines or even whole racks failing19:33
Edulixhadoop comes configured to have each file block stored on three nodes by default: two in the same rack, and the third on another machine19:34
Edulixif the first rack fails, speed might degrade relatively but information wouldn't be lost19:35
EdulixBTW HDFS design is based on google file system (GFS)19:35
Edulixand as you probably have guessed, in HDFS data/files is/are split in blocks of equal size across DataNodes (machines in the cluster)19:36
ClassBotgaberlunzie asked: would this sequential access mean hadoop can work with tape?19:36
EdulixI haven't heard anyone doing such a thing,19:37
Edulixand I don't think it's a good idea19:37
Edulixwhy? because the reads are sequential, but you need to do the first seek to start reading at the point your inputsplit indicates19:38
Edulixdoing this first seek might be too slow for a tape, but I might be completely wrong  here19:38
Edulixnote that the data stored in HDFS is supposed to be temporary, mostly, just for working19:39
Edulixso you copy the data there, do your thing, then copy the output result back19:39
Edulixin contrast, tapes are mostly used for long-term storage19:39
Edulix(continuing) the default block size in HDFS is very large (64MB)19:40
EdulixThis decreases metadata overhead and allows for fast streaming reads of data19:40
EdulixBecause HDFS stores files as a set of large blocks across several machines, these files are not part of the ordinary file system19:41
EdulixFor each DataNode machine, the blocks it stores reside in a particular directory managed by the DataNode service, and these blocks are stored as files whose filenames are their blockid19:41
EdulixHDFS comes with its own utilities for file management equivalent to ls, cp, mv, rm, etc19:41
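A few of those, as they look in the 0.20-era command set:

    $ hadoop fs -mkdir /data
    $ hadoop fs -put local-file.txt /data/
    $ hadoop fs -ls /data
    $ hadoop fs -cat /data/local-file.txt
    $ hadoop fs -rm /data/local-file.txt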
EdulixThe metadata (names of files and dirs, and where the blocks are stored) can be modified by multiple clients concurrently. To orchestrate this, metadata is stored and handled by the NameNode, which stores metadata usually in memory (it's not much data), so that it's fast (because this data *will* be accessed randomly).19:42
ClassBotchadadavis asked: If I first have to copy the data (e.g. from a DB) to HDFS before splitting, couldn't the mappers just pull/query the data directly from the DB as well?19:43
Edulixyes you can =)19:43
Edulixand if the data is in a DB, you should19:43
Edulixinput data is read from an InputFormat19:45
Edulixand there are different input formats provided by hadoop: FileInputFormat for example to read from a single file19:45
Edulixbut there's also DBInputFormat, for example19:45
Edulixin my experience, you will probably create your own =)19:46
EdulixDeliberately I haven't explained any code, but I recommend that if you're interested you start playing with hadoop locally on your own machine19:46
Edulixjust download hadoop from http://hadoop.apache.org/ and follow the quickstart http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html19:47
Edulixfor quickstart and for development, you typically use hadoop as standalone in your own machine19:47
Edulixin this case HDFS will simply refer to your own file system19:47
EdulixYou just need to download hadoop, configure Java (because hadoop is written in java), and execute the example as mentioned in the quickstart page19:47
Edulixas mentioned earlier, with hadoop you usually operate as follows, because of its batching nature: you copy input data to HDFS, then request to launch the hadoop task with an output dir, and when it's done, the output dir will have the task results19:47
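The standalone flow from the 0.20 quickstart, roughly (run from the unpacked hadoop directory; the grep example ships with the tarball):

    $ mkdir input
    $ cp conf/*.xml input
    $ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
    $ cat output/*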
EdulixFor starting to develop a hadoop app, I used this tutorial, because it explains pretty much everything I needed: http://codedemigod.com/blog/?p=12019:48
Edulixbut note that it's a bit old19:48
Edulixand one of the things that I found most frustrating in hadoop while developing was that there are duplicated classes, i.e. org.apache.hadoop.mapreduce.Job and org.apache.hadoop.mapred.jobcontrol.Job19:48
EdulixIn that case, always use org.apache.hadoop.mapreduce, because it is the new, improved API19:49
Edulixbe warned, the examples in http://codedemigod.com/blog/?p=120 use the old mapred api :P19:49
Edulixand hey, now I'm open to even more questions !19:50
Edulixand if you have questions later on, you can always join us in freenode.net, #hadoop, and hopefully someone will help you there =)19:51
ClassBotThere are 10 minutes remaining in the current session.19:51
ClassBotgaberlunzie asked: does hadoop have formats to read video (e.g., EDLs and AAFs)?19:52
Edulixmost probably.. not, but maybe someone has done that before19:52
Edulixanyway, creating a new input format is really easy19:52
ClassBotchadadavis asked: Mappers can presumably also be written in something other than Java? Are there APIs for other languages (e.g. Python?) Or is managed primarily at the shell level?19:53
Edulixgood question!19:54
Edulixyes, there are examples in python and in C++19:54
EdulixI haven't used them though19:55
ClassBotkim0 asked: Can I use hadoop to crunch lots of data running on Amazon EC2 cloud ?19:55
Edulixheh I forgot to mention it =)19:55
Edulixanswer is yes!19:56
Edulixmore details in http://aws.amazon.com/es/elasticmapreduce/19:56
ClassBotThere are 5 minutes remaining in the current session.19:56
Edulixthat's one of the nice things about using hadoop: many big players use it in the industry. yahoo, for example, and amazon has support for it too19:56
Edulixso you don't really need to have lots of machines for doing large computation19:57
Edulixjust use amazon =)19:57
ClassBotgaberlunzie asked: is there a hadoop format repository?19:57
EdulixI don't know huh19:58
Edulix:P19:58
EdulixI didn't investigate much about this because I needed to have my own19:58
Edulixbut probably in contrib there is19:58
Edulixok so that's it!20:00
EdulixThanks for attending the talk, and thanks to the organizers20:00
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: UEC/Eucalyptus Private Cloud - Instructors: obino
* kim0 claps .. Thanks Edulix 20:01
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/24/%23ubuntu-classroom.html following the conclusion of the session.20:01
obinothanks Edulix20:01
obinovery nice presentation20:01
obinoI am graziano obertelli and I work at eucalyptus systems20:02
obinofeel free to ask questions at any time20:02
obinoif they are about eucalyptus I may be able to answer them :)20:02
obinoEucalyptus powers the UEC20:03
obinoUbuntu added a nice theme to Eucalyptus, an image store, and a very nifty way to autoregister the components20:04
obinowhich makes it a breeze to install UEC on Ubuntu clusters20:04
obinoat http://open.eucalyptus.com/learn/what-is-eucalyptus you can quickly check what eucalyptus is20:05
obinowith it you can have your own private cloud20:05
obinocurrently Eucalyptus supports AWS EC2 and S3 API20:05
obinothus a lot of client tools written for EC2 should work with Eucalyptus20:06
obinominus minor changes like the endpoint URL20:06
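Typically you source the eucarc file from your credentials zip and it sets those endpoints for you; spelled out by hand they look like this (host names are placeholders):

    $ export EC2_URL=http://clc.example.com:8773/services/Eucalyptus
    $ export S3_URL=http://walrus.example.com:8773/services/Walrus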
obinoEucalyptus has a modular architecture: there are 5 main components20:07
obinothe cloud controller (CLC)20:07
obinowalrus (W)20:07
obinothe cluster controller (CC)20:07
obinothe storage controller (SC)20:08
obinoand the node controller (NC)20:08
obinothe CLC and W are the user facing components20:08
obinothey are the endpoints for the client tools20:08
obinorespectively for the EC2 API and for the S3 API20:08
obinothere has to be 1 CLC and 1 W per installed cloud20:09
obinoand they need to be publicly accessible20:09
obinothe CC is the middle man20:09
obinoit controls a set of NCs20:10
obinoand reports them to the CLC20:10
obinoit controls the network for the instances running on its NCs20:10
obinothere can be multiple CCs in a cloud20:11
obinothe SC usually sits with the CC20:11
obinothere has to be one SC per CC20:11
obinootherwise EBS won't be available for that cluster20:11
obinoSC and CC need to be able to reach (talk to) the CLC and W20:12
obinothe NC is the real worker20:12
obinoinstances run on the machine hosting the NC20:12
obinothe previous tutorials explained a great deal of the user interaction, so I'll talk a bit about what happens behind the scenes20:13
obinofor example, what happens when an instance is launched20:14
obinothe CLC receives the request20:14
obinodepending on the 'zone' the request asks for, it will select the corresponding CC20:15
obinoafter of course having checked that there is enough capacity left in that zone20:15
obinoalong with the request, it sends information about the network to set up for the instance20:15
obinosince every instance belongs to a security group and each security group has its own private network20:16
obinothe CC will then decide which NC will run the instance20:17
obinobased on the selected scheduler20:17
obinoand it will setup the network for the security group20:17
obinothis step is dependent on how Eucalyptus is configured, since there are 4 different Networking Modes20:18
obinoonce the NC receives the request it will need the emi file (the root fs of the future instance)20:18
obinothe NC keeps a local cache of the emis it has seen before20:19
obinoit's an LRU cache, so the least recently used image will be evicted if the cache grows too big20:19
obinoso the NC will check first to see if the emi is in the cache20:19
obinoif not it will have to contact W to get it20:20
obinothis is why W needs to be accessible by the NCs20:20
obinoof course it's not only the emi that the NC downloads but the eki and the eri too20:20
obinoonce the image is transferred, it is copied into the cache first20:21
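The cache behaviour described above can be pictured with a small illustrative LRU sketch in Python (not Eucalyptus code; the sizes and the fetch callback are made up):

    from collections import OrderedDict

    # Illustrative only: an LRU cache of emi files, keyed by emi id,
    # evicting the least recently used image when it grows too big.
    class ImageCache:
        def __init__(self, max_bytes):
            self.max_bytes = max_bytes
            self.used = 0
            self.images = OrderedDict()  # emi id -> size in bytes

        def get(self, emi, size, fetch_from_walrus):
            if emi in self.images:
                self.images.move_to_end(emi)  # hit: mark most recently used
                return emi
            fetch_from_walrus(emi)            # miss: extra network transfer from W
            self.images[emi] = size
            self.used += size
            while self.used > self.max_bytes:
                old, old_size = self.images.popitem(last=False)  # evict LRU
                self.used -= old_size
            return emi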
obinoafter that the emi, eki and eri are assembled for the specific hypervisor the NC has access to20:22
obinoso, in the case of KVM, a single disk is created20:22
obinothe size of which depends on the configuration the cloud administrator gave to the instances20:22
obinoand the emi is copied into the first partition20:23
obinothe 3rd partition is populated with the swap20:23
obinoand the second one will be ephemeral20:23
obinolibvirt is finally instructed to start the instance20:24
obinoand of course the NC will take all the steps to set up the network for the instance20:25
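The last step can be sketched with the libvirt Python bindings; the domain XML and paths below are simplified placeholders, not what the NC actually generates:

    import libvirt

    # A transient KVM domain backed by the single disk the NC assembled
    # (partition 1: root fs from the emi, partition 2: ephemeral, partition 3: swap).
    domain_xml = """
    <domain type='kvm'>
      <name>i-12345678</name>
      <memory>524288</memory>
      <vcpu>1</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='file' device='disk'>
          <source file='/var/lib/eucalyptus/instances/i-12345678/disk'/>
          <target dev='vda' bus='virtio'/>
        </disk>
      </devices>
    </domain>
    """
    conn = libvirt.open("qemu:///system")
    dom = conn.createXML(domain_xml, 0)  # instruct libvirt to start the instance
    print(dom.name(), dom.isActive())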
obinofrom this quick run-down, you will see why the first boot of an instance on a given NC takes longer20:25
obinothere is an extra network transfer (from W) and an extra disk copy (to populate the cache) that take place20:26
ClassBotsmoser asked: is Eucalyptus expecting to have EBS-root and accompanying API calls (StartInstances StopInstances ...) ?20:27
obinoboot from EBS is expected to be in the next release20:27
obinoat least that's what they told me :)20:28
obinoI'm not sure about the start and stop instances call20:28
obinothe above instance life cycle that I went through should be helpful in understanding how to debug the most frequent problem on a Eucalyptus installation20:29
obinothe instance won't reach the running state20:29
=== smspillaz is now known as smspillaz|zzz
obinofrom the above it is easy to see that working backward may be helpful20:30
obinoso start from the NC logs to see if the instance started correctly (or at least that libvirt tried to start it)20:30
obinoand if nothing is there, check the CC logs20:30
obinoto finish with the CLC logs20:30
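A tiny helper along those lines, assuming a default install that logs under /var/log/eucalyptus (adjust names and paths to your setup):

    import os

    # Scan the logs in the order suggested above: NC first, then CC, then CLC.
    LOGS = ["nc.log", "cc.log", "cloud-output.log"]

    def trace_instance(instance_id, log_dir="/var/log/eucalyptus"):
        for name in LOGS:
            path = os.path.join(log_dir, name)
            if not os.path.exists(path):
                continue  # this machine doesn't host that component
            with open(path) as fh:
                hits = sum(1 for line in fh if instance_id in line)
            print("%s: %d matching lines" % (name, hits))

    trace_instance("i-12345678")  # hypothetical instance id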
obinodespite the complexity, eucalyptus is fairly easy to install20:32
obinoand the UEC has taken this step even further20:32
obinoso if you want to play with Eucalyptus or the UEC, you just need 2 machines available20:32
obinoif instead you want to play with Eucalyptus before installing, to see what it can do and how good the EC2/S3 APIs are20:33
obinothen you can try our community cloud http://open.eucalyptus.com/CommunityCloud20:33
obinocalled ECC20:33
obinothe ECC is available to everybody20:34
obinothe SLAs are designed to avoid abuse20:34
obinoso your instance(s) will be terminated after 6 hours of running time20:35
obinoyou can of course re-run instances at will, but no more than 4 at any point in time20:35
obinosame idea for buckets, volumes and snapshots20:35
obinothe ECC runs the latest stable version of Eucalyptus, currently 2.0.220:36
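Using the ECC looks exactly like using EC2 once the connection from the earlier sketch points at the ECC endpoint; the emi id below is hypothetical:

    # List what the ECC offers, then start a single small instance.
    for image in conn.get_all_images():
        print(image.id, image.location)
    reservation = conn.run_instances("emi-12345678", instance_type="m1.small")
    print(reservation.instances[0].id)  # remember: terminated after 6 hours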
obinoif you are a developer and you are more interested in the code and architecture, we have assembled a few pages at http://open.eucalyptus.com/participate/contribute20:37
obinowhich may be useful20:37
obinostarting from our launchpad home, and the API version we support20:38
obinowe have 2 launchpad branches: one for the stable version and one for the development of the next version20:39
obinoboth are accessible of course20:39
obinowe also provide some 'nightly builds'20:39
obinothey are actually produced on a weekly basis, but the name stuck20:40
obinofinally we give some information on how to contribute back to eucalyptus20:40
obinoand the final page is an assortment of various tips which may be of use to developers20:41
obinolike debugging tricks, or using eclipse or partial compile/deploy20:41
obinowe are hoping to expand this area soon20:41
obinofinally under http://open.eucalyptus.com/participate you will see various ways to interact with us and Eucalyptus20:42
obinoin particular the forum is quite active and is quite a resource for solving issues20:42
obinoas well as the community wiki20:43
obinois there anything in particular that you want to hear about eucalyptus?20:43
obinoor questions?20:44
obinowell then, it looks like I managed to put everyone to sleep! :)20:44
obinothis http://open.eucalyptus.com/contact contains all the different ways to reach us in case you have questions20:45
obinoand of course there is the UEC page https://help.ubuntu.com/community/UEC20:46
obinowhich contains very good information about Eucalyptus/UEC20:47
ClassBottonyflint1 asked: are there any plans for addition tools/utilities/documentation for creating custom images?20:48
obinowe currently have a few pages under our community wiki, under the images tab: http://open.eucalyptus.com/wiki/images20:49
obinowhich can be a chore to follow at times20:49
obinobut they are useful for understanding how things work20:50
obinomost of the EC2 images should work with UEC/Eucalyptus20:50
obinoso any way you have to generate images should work20:50
obinothe kernel/initrd combination depends of course on the hypervisor the instances will use20:51
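Registering a bundled image together with a matching kernel and ramdisk can be sketched with the same boto connection (all ids and names below are hypothetical):

    # Tie the root fs image to the eki/eri that suit your hypervisor.
    emi = conn.register_image(
        name="my-image",
        image_location="my-bucket/my-image.manifest.xml",
        kernel_id="eki-12345678",   # hypothetical kernel id
        ramdisk_id="eri-12345678",  # hypothetical ramdisk id
    )
    print(emi)  # the new emi id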
ClassBotThere are 10 minutes remaining in the current session.20:51
obinoin short we don't have short-term plans to create new tools, but we are working with the current tools to make sure they are compatible with Eucalyptus20:52
obinoif you have a favorite tool to generate images, let us know :)20:52
obinotonyflint1: does that answer your question?20:52
ClassBotgholms|work asked: Boxgrinder seems to be a decent tool for building images.  Any idea if that works with Ubuntu?20:55
obinogood question20:55
obinowe are in contact with a developer (marek) of boxgrinder, who has been very helpful, so we are hoping to have an official Eucalyptus plugin soon20:55
obinoas for the question itself it could be interpreted in 2 ways: will boxgrinder create ubuntu images? or will boxgrinder be packaged for ubuntu?20:56
ClassBotThere are 5 minutes remaining in the current session.20:56
obinoboth are probably better answered by the boxgrinder developers20:57
obinoI would hope yes20:57
obinoboxgrinder should be fairly portable20:57
obinoand it has a nice plugin structure20:58
* kim0 claps 21:00
kim0Thanks a lot for this wonderful session21:00
* obino bows21:00
kim0Alright everyone .. Thank you for attending Ubuntu Cloud Days21:01
kim0I hope it was fun and useful21:01
kim0You can find us at #ubuntu-cloud21:01
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/24/%23ubuntu-classroom.html21:01
kim0and feel free to ping me personally later21:01
kim0Thanks .. best regards .. till next time21:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat ||
=== niemeyer is now known as niemeyer_away
=== sre-su_ is now known as sre-su
=== Meths_ is now known as Meths
