/srv/irclogs.ubuntu.com/2011/03/23/#ubuntu-classroom.txt

hungtranhello01:02
=== Omega is now known as bootri
=== james is now known as Guest60011
=== r is now known as Guest20791
=== _LibertyZero is now known as LibertyZero
=== msnsachin12 is now known as msnsachin
sadeedtechon10:55
Siri_hello10:55
sadeedtechhi10:56
Siri_can you tell me what's going on here10:56
Siri_i don't find kim010:56
sadeedtechI can't find him either, maybe it's the timing10:57
daiveras you see he is not online yet10:57
Siri_ohh...this was supposed to start at 4 right10:58
sadeedtechI think the event is three hours from now11:00
sadeedtechinfo11:01
sadeedtechhelp11:01
=== smspillaz|zzz is now known as smspillaz
=== ziviani is now known as JRBeer
anebihi guys15:02
JoeyIhi15:06
ChrisRutHi15:06
=== sadeeb is now known as olutayo
=== olutayo is now known as sadeeb
nateais there a session on GlusterFS happening right now?15:39
nateai didn't get the timezone conversion right, so i'm coming late to the party :/15:40
EvilPhoenixnatea:  what timezone are you?15:41
nateaEvilPhoenix: EST15:41
EvilPhoenixif you read the schedule, it starts at 4PM15:41
EvilPhoenixoh wait15:41
EvilPhoenixthat's GMT15:41
* EvilPhoenix does the math15:41
EvilPhoenixUTC... -0400...15:41
EvilPhoenixoh15:42
EvilPhoenix12PMish15:42
nateaoh, it looks like it's not until 13:0015:42
nateaaccording to http://www.timezoneconverter.com/15:42
natea"17.00 UTC Scaling shared-storage web apps in the cloud with Ubuntu & GlusterFS — semiosis"15:42
natea17.00 UTC -> 13.00 EST15:43
EvilPhoenixgrah my system isnt displaying times right15:43
* EvilPhoenix shoots his system15:43
EvilPhoenix13:00 is about 1PM15:43
* EvilPhoenix shall return after destroying his system15:43
kim0Howdy16:00
kim0Hello everyone, welcome to the very first Ubuntu Cloud Days16:01
ttxyay!16:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Cloud Computing 101, Ask your questions - Instructors: kim0
* ttx attends two conferences at once.16:01
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html following the conclusion of the session.16:01
kim0So again, good morning, good afternoon and good evening wherever you are16:02
kim0Please be sure you're joined to this channel plus16:02
kim0#ubuntu-classroom-chat : For Questions16:02
kim0In case you would like to ask a question16:03
kim0please start it with "QUESTION: <question goes here>"16:03
kim0and write it down in the #ubuntu-classroom-chat channel16:04
kim0This session is mostly about taking questions and making sure everyone is well seated :)16:04
kim0Seems like I have a question already16:04
ClassBotEvilPhoenix asked: I think this could be the start of it.  Could you give a brief explanation of what "Cloud Computing" is defined as?16:05
kim0Hi EvilPhoenix .. Good question indeed16:05
kim0Trying to answer your question .. I will begin by saying16:06
kim0Cloud has so many different definitions already :)16:06
kim0Almost all companies have bent it to mean whatever product they're selling16:06
kim0the term has really been abused16:06
kim0There are also various definitions by institutions like NIST and others16:07
kim0since there is no one single true definition .. I'll lay down some properties16:07
kim0that almost everyone agrees should be present in a "cloud"16:07
kim01- Pay per use .. Clouds are online resources that can be characterized by "pay per use"16:08
kim0you only pay for the resources that you need .. the storage you consume16:08
kim0the CPU/Memory compute capacity that you are using ..etc16:08
kim0You never really (or should never) pay in advance .. (just in case you need that resource)16:09
kim02- Instant scalability: Cloud solutions should be instantly scalable16:09
kim0that is .. with one api call (that's one command, or a click of a button for non programmers)16:09
kim0you should be able to allocate more resources16:09
kim0Clouds convey the feeling of infinite scale .. of course in reality it's not truly infinite .. but it's large enough16:10
kim03- API programmability .. Most cloud solutions are going to have an API .. an API is a programmatic way to control your resources16:11
kim0Taking a prime example .. The largest commercial compute and storage cloud today is Amazon's AWS cloud16:11
kim0With Amazon's cloud, with an api call (or running a command)16:11
kim0you can instantly allocate "servers"16:12
kim0so it's got an API interface16:12
kim0it's scalable .. since you can always add more servers (or S3 storage) should you want to16:12
kim0and you only pay for the consumed CPU hours .. or gigabytes of storage16:12
kim0Clouds are usually split up by their type as well16:13
kim0IaaS , PaaS and SaaS16:13
kim0let me quickly comment on those types16:13
kim0IaaS : Infrastructure as a Service16:13
kim0This basically means you get "infrastructure" components (that is servers, storage space, networking ...etc) as a service ..16:14
kim0You use those to build your own cloud or application16:14
kim0PaaS : Moves a little up the value stack16:14
kim0It provides a complete development environment as a service16:14
kim0so you basically upload some code .. and without needing to worry about servers or networks/switches or storage ..etc16:15
kim0your application just runs on the "cloud" .. is scalable, is redundant16:15
kim0someone else (the PaaS provider) did that work for you16:15
kim0Examples of PaaS would be Google's AppEngine .. salesforce.com or others16:16
kim0The last type is SaaS : Software as a Service16:16
kim0This basically means providing a full complete application, that you are directly using in the cloud16:16
kim0examples of that would be facebook, gmail, twitter ..etc16:16
kim0Those are "applications" if you come to think of it .. more so than the notion of webpages16:17
* kim0 checks if he has more questions 16:17
ClassBotBluesKaj asked: ok then what is Ubuntu Cloud about ?16:17
kim0Hi BluesKaj16:18
kim0Very good question as well16:18
kim0So Amazon's cloud is a very popular IaaS cloud. However, some people are not totally happy with the fact that they'd upload their data to amazon's datacenters16:19
kim0some enterprises or ISPs .. would like to utilize the improved economics of the cloud model16:20
kim0however still keeping their data and servers in-house (whatever that means to them)16:20
kim0In order to build a cloud that competes with Amazon's cloud16:20
kim0you need various software components16:20
kim0Ubuntu packages, integrates and makes available the best of breed open-source software16:21
kim0that enables you to build and operate your own cloud should you want to16:21
kim0In the upcoming 11.04 natty release16:21
kim0Ubuntu packages two open-source complete cloud stacks16:21
kim0those would be16:21
kim0- Ubuntu Enterprise Cloud : An Ubuntu integrated and polished cloud stack based on the popular Eucalyptus stack16:22
kim0- OpenStack : A new opensource cloud stack that's gaining a lot of popularity16:22
kim0Actually we have dedicated sessions for each of those cloud stacks!16:22
kim0An interesting fact .. is that UEC and OpenStack both allow you to expose an API that is the equivalent of Amazon's API16:23
kim0that means you can use the same management tools to control both the public (Amazon's ) cloud and your own private one!16:23
kim0This is also great for providers wanting to run their own clouds16:24
kim0so that was an overview of the cloud stacks available to enable16:24
kim0you to build your own cloud environment16:24
kim0Other than that .. and to fully answer the question of "What is ubuntu cloud" .. I need to add a few more points16:25
kim0Ubuntu makes available official Ubuntu images that run on the Amazon cloud as well16:25
kim0You can check them out (as they're regularly updated) on http://cloud.ubuntu.com/ami/16:25
kim0you basically search for what you want, like (maverick 64 us-east) pick the ami-id16:26
kim0and launch that16:26
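The lookup-and-launch flow described above can be sketched with euca2ools (mentioned later in this session). This is a hypothetical, non-runnable sketch: the AMI id, key pair name, and instance type are placeholders — look up the current id for your region at http://cloud.ubuntu.com/ami/.

```shell
# Placeholder sketch: launching an official Ubuntu AMI with euca2ools.
# ami-XXXXXXXX and mykey are NOT real values -- substitute the ami-id
# found on the AMI locator and your own EC2 key pair.
euca-run-instances ami-XXXXXXXX \
    --instance-type m1.small \
    --key mykey
euca-describe-instances    # confirm the new instance is pending/running
```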
kim0Also Canonical makes available Landscape, a cloud management tool .. you can check it out at https://landscape.canonical.com/16:27
kim0Also, Ubuntu is soon unleashing a cloud management and orchestration tool called "ensemble"16:27
kim0that is going to revolutionize cloud deployments and management .. it's still in early tech-preview stage16:28
kim0however we're having an ensemble session and demo today16:28
kim0I think that mostly covers a broad definition of ubuntu and cloud16:28
ClassBotKruptein asked: so dropbox isn't cloud related? as you don't have to pay for it (basic)16:28
kim0Hi Kruptein16:29
kim0Well .. dropbox is cloud storage indeed16:29
kim0I meant that with cloud .. when you want to grow you pay for what you used/need16:29
kim0as opposed to buying a 1TB disk that lies on your desk so that when you need the capacity it'll be available for you16:30
=== cmagina is now known as cmagina-lunch
kim0with dropbox you pay for what you use .. although I believe they only allow payment in coarse packages16:30
kim0as opposed to Amazon's S3 which charges you per GB of storage per month16:30
kim0which is a more fine grained model16:31
ClassBotBluesKaj asked: ok then what is Ubuntu Cloud about ?16:31
kim0So I believe we covered that16:31
kim0To quickly recap16:31
kim0- Building your own private cloud : UEC/Eucalyptus or OpenStack16:31
kim0- Running over the Public Amazon Cloud : Official Ubuntu Server images http://cloud.ubuntu.com/ami/16:32
kim0- Systems Management tools : https://landscape.canonical.com/16:32
kim0- Infrastructure automation : Ensemble (tech-preview)16:32
kim0Again all of those tools and technologies (except for landscape) are having their own sessions that you'll enjoy :)16:33
kim0Let me not forget as well about "Ubuntu ONE"16:34
kim0a personal storage cloud (very similar to dropbox)16:34
kim0Check it out at https://one.ubuntu.com/16:34
ClassBotpopey asked: Should your average end-user care about Ubuntu cloud? If so, why? If not, what do we say to end users when they see all this promotion of Ubuntu cloud stuff?16:34
kim0Hi popey16:34
kim0Great question16:35
kim0It really depends on your point of view16:35
kim0The usual-suspects to care about "cloud" stuff are going to be sys-admins, devops, IT professionals .. people who care about server environments and such .. However!16:36
kim0If you ask me, yes non IT pros should care as well16:36
kim0because the computing model is quickly shifting to a cloud model16:37
kim0that is .. instead of you buying a pc, loading it with your personal applications and settings16:37
kim0and being a sysadmin for yourself .. handling backups .. troubleshooting, software upgrades ..etc16:37
kim0the world is shifting into an ipad/iphone/thin-client/mobile devices world16:38
kim0where your data lives on a cloud16:38
kim0is accessible by a wide variety of tools16:38
kim0and all tools sync up together16:38
kim0obviously the point of interest is going to be different, however it remains that the cloud touches all of us16:39
ClassBotcdbs asked: The Cloud world is buzzing about OpenStack. Natty will include support for OpenStack along with Eucalyptus. Once OpenStack Nova becomes stable enough (should happen soon, by May) then will Ubuntu begin recommending OpenStack for its cloud offerings?16:39
kim0Hi cdbs16:40
kim0Seems you're on top of things hehe I can't really claim to foresee the future. Ubuntu is and has always aimed at providing the best of class open-source cloud technologies and software16:40
kim0As it stands, the UEC product is based on Eucalyptus because it is a mature product16:41
kim0however since openstack is rapidly maturing, it has been packaged and made available as well16:41
kim0I am confident Ubuntu will continue to make available all mature choices of best of breed software16:42
ClassBotYuvi_ asked: you can differentiate between public cloud and private cloud?16:42
kim0Hi Yuvi_16:42
kim0Well, yeah I guess16:42
kim0Public clouds are clouds operated by an entity you don't control16:43
kim0and that provide services to multiple other tenants16:43
kim0examples would be Amazon cloud, rackspace, go-grid, terremark ...etc16:43
kim0A private cloud, is a cloud that probably runs behind your firewall on your own servers16:43
kim0and that you can control, i.e. is operated by IT people you have direct influence upon16:44
ClassBotat141am asked: Is the demo open to all for ensemble, if so when and where?16:44
kim0Hi at141am16:44
kim0Yes absolutely!16:44
kim0The Ensemble session is today in less than a couple of hours16:45
kim0right here in this same channel16:45
kim0The session leader is probably going to be copy/pasting text so that you can follow along with the demo16:46
kim0I'm not really sure how it would go .. but I'm sure it's gonna be loads of fun16:46
ClassBotmarenostrum asked: What does "Ubuntu One" have to do something with "cloud" concept?16:46
kim0Hi marenostrum16:46
kim0Ubuntu ONE is a personal cloud service16:47
kim0It is designed for end-users .. that is non IT pros16:47
kim0It provides services to sync your files and folders to the cloud16:47
kim0sharing them to other people16:47
kim0not only that .. but also16:47
kim0sync's your "notes" across multiple machines16:47
kim0your music16:47
kim0Bookmarks16:47
kim0I think soon it might sync application settings and the apps installed16:48
kim0so that when you get a new Ubuntu machine .. it installs all your applications, applies all settings, syncs your data/notes/bookmarks ..etc16:48
kim0that would be lovely indeed .. I'm not sure if it can do all that just yet though16:48
ClassBotsveiss asked: do the official Ubuntu EC2 images receive updates? Specifically kernel updates, which are a bit of a pain to deal with via apt-get on boot.16:49
* kim0 trying to answer questions quickly :)16:49
kim0Hi sveiss16:49
kim0The answer is absolutely YES16:49
kim0they do receive regular updates16:49
kim0of course you can always apt-get upgrade them any way16:49
kim0the one potential pain point .. is the one you have mentioned "kernel upgrades"16:50
kim0for that .. I've some good news16:50
kim0Newer AMIs are designed to use pv-grub16:50
kim0which is a method exposed by Amazon to load the kernel from inside the image16:50
kim0which means .. you can now apt-get upgrade your kernel .. and very simply reboot into it16:51
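The pv-grub workflow just described can be sketched as below. This is a sketch under the stated assumption that the instance was launched from an AMI recent enough to boot via pv-grub (per the next line, check in #ubuntu-cloud for the exact version cutoff).

```shell
# Sketch: in-place kernel upgrade on a pv-grub-backed Ubuntu EC2 instance.
sudo apt-get update
sudo apt-get upgrade    # pulls in the new kernel package among the updates
sudo reboot             # pv-grub loads the upgraded kernel from inside the image
```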
ClassBotThere are 10 minutes remaining in the current session.16:51
kim0if you need to know which exact version switched to pvgrub .. check in at #ubuntu-cloud16:51
ClassBotIdleOne asked: Repost for AndrewMC :What would be the benifits of using the "cloud" instead of, say  a dedicated server?16:51
kim0Hi IdleOne16:52
kim0the main benefits is really16:52
kim0- Pay per use .. I might need ten servers today .. but only one tomorrow .. cloud allows that .. dedicated servers don't (you'd have to buy 10 servers all the time)16:52
kim0- flexibility .. If my web application gets slashdotted .. and the load is too high .. within a few seconds .. I can spin up 20 extra cloud servers to handle the load16:53
kim0- Also .. since almost all clouds provide an extensive API16:53
kim0it really helps with IT automation .. spin up servers, assign them IPs, attach storage to them, mount a load balancer on top16:54
kim0all by running a script .. not by running around connecting cables :)16:54
ClassBotYuvi_ asked: What is hybrid cloud? Under which scenario we can use that16:54
kim0A hybrid cloud is a mix of public + private16:54
kim0a typical use case would be16:55
kim0you prefer running everything on a private cloud that you own and operate16:55
kim0*however* should the incoming load be too high16:55
kim0like your application was slashdotted16:55
kim0you would dynamically "expand" to using a public cloud like amazon/rackspace16:56
kim0to take some heat for you .. to lessen the load on your servers16:56
ClassBotThere are 5 minutes remaining in the current session.16:56
kim0You can pull off something like that today with UEC and some smart scripts16:56
ClassBotchadadavis asked: what advantage does a private cloud provide, vs a traditional server cluster, assuming that then the sysadmin work is not outsourced?16:56
kim0running out of time ..16:56
kim0trying to quickly answer16:57
kim0well basically it's the same concept of public cloud16:57
kim0Benefits would be16:57
kim0- Complete infrastructure automation16:57
kim0- Enabling "teams" to handle their own needs .. a team would spin up/down servers according to their needs16:57
kim0lessening the load on IT staff16:57
kim0also .. "pooling" of IT servers into one private cloud16:58
kim0means providing a better service to everyone16:58
kim0since everyone can use some of the resources when they need it16:58
kim0so in short .. pooling, self service, low overhead, spin up/down16:58
kim0Great16:59
kim0Seems like I did manage to bust all questions :)16:59
kim0If anyone would like to get a hold of me afterwards16:59
kim0I am always hanging out in #ubuntu-cloud16:59
kim0you can ping me any time and I will get back to you once I can17:00
kim0The next session is by semiosis17:00
kim0o/17:00
=== cmagina-lunch is now known as cmagina
kim0Using gluster to scale .. very interesting stuff!17:00
kim0I love scalable file systems :)17:00
semiosisThanks kim017:00
semiosisHello everyone17:00
semiosisThis Ubuntu Cloud Days session is about scaling legacy web applications with shared-storage requirements in the cloud.17:01
semiosisI should mention up front that I'm neither an official nor an expert, I don't work for Amazon/AWS, Canonical, Gluster, Puppet Labs, or any other software company.17:01
semiosisI'm just a linux sysadmin who appreciates their work and wanted to give back to the community.17:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Scaling shared-storage web apps in the cloud with Ubuntu & GlusterFS - Instructors: semiosis
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html following the conclusion of the session.17:01
semiosisMy interest is in rapidly developing a custom application hosting platform in the cloud.  I'd like to avoid issues of application design by assuming that one is already running and can't be overhauled to take advantage of web storage services.17:01
semiosisI'll follow the example of migrating a web site powered by several web servers and a common NFS server from a dedicated hosting environment to the cloud.  In fact this is something I've been working on lately, as I think others are as well.17:02
semiosisI invite you to ask questions throughout the session.  I had a lot of questions when I began working on this problem, but finding answers was very time-consuming and sometimes impossible.17:02
semiosisMy background is in Linux system administration in dedicated servers & network appliances, and I just started using EC2 six months ago.  I'll try to keep my introduction at a high level, and assume some familiarity with standard Linux command line tools and basic shell scripting & networking concepts, and the AWS Console.17:02
semiosisSome of the advanced operations will also require euca2ools or AWS command line tools (or the API) because they're not available in the AWS Console.17:02
semiosisCloud infrastructure and configuration automation are powerful tools, and recent developments have brought them within reach of a much wider audience.  It is easier than ever for Linux admins who are not software developers to get started running applications in the cloud.17:03
semiosisI've standardized my platform on Ubuntu 10.10 in Amazon EC2, using GlusterFS to replace a dedicated NFS server, and CloudInit & Puppet to automate system provisioning and maintenance.17:03
semiosisGlusterFS has been around for a few years, and its major recent development (released in 3.1) is the Elastic Volume Manager, a command-line management console for the storage cluster.  This utility controls the entire storage cluster, taking care of server setup and volume configuration management on servers & clients.17:04
semiosisBefore the EVM, a sysadmin would need to tightly manage the inner details of configuration files on all nodes; now that burden has been lifted, enabling management of large clusters without requiring complex configuration management tools.  Another noteworthy recent development in GlusterFS is the ability to add storage capacity and performance (independently if necessary) while the cluster is online and in use.17:04
semiosisI'll spend the rest of the session talking about providing reliable shared-storage service on EC2 with GlusterFS, and identifying key issues that I've encountered so far.  I'd also be happy to take questions generally about using Ubuntu, CloudInit, and Puppet in EC2.  Let's begin.17:04
semiosisThere are two types of storage in EC2, ephemeral (instance-store) and EBS.  There are many benefits to EBS: durability, portability (within an AZ), easy snapshot & restore, and 1TB volumes; the drawback of EBS is occasionally high latency.17:05
semiosisEphemeral storage doesn't have those features, but it does provide more consistent latency, so it's better suited to certain workloads.17:05
semiosisI use EBS for archival and instance-store for temporary file storage.  And I can't recommend enough the importance of high-level application performance testing to determine which is best suited for your application.17:05
semiosisGlusterFS is an open source scale-out filesystem.  It's developed primarily by Gluster and has a large and diverse user community.  I use GlusterFS on Ubuntu in EC2 to power a web service.17:05
semiosisWhat I want to talk about today is my experience setting up and maintaining GlusterFS in this context.17:06
semiosisFirst I'll introduce glusterfs architecture and terminology.  Second we'll go through some typical cloud deployments, using instance-store and EBS for backend storage, and considering performance and reliability characteristics along the way.17:06
semiosisI'll end the discussion then with some details about performance and reliability testing and take your questions.17:06
semiosisI think some platform details are in order before we begin.17:07
semiosisI use the Ubuntu 10.10 EC2 AMIs for both 32-bit and 64-bit EC2 instances that were released in January 2011.  You can find these AMIs at the Ubuntu Cloud Portal AMI locator, http://cloud.ubuntu.com/ami/.17:07
semiosisI configure my instances by providing user-data that cloud-init uses to bootstrap puppet, which handles the rest of the installation.  Puppet configures my whole software stack on every system except for the glusterfs server daemon, which I manage with the Elastic Volume Manager (gluster command.)17:07
semiosisI've deployed and tested several iterations of my platform using this two-stage process and would be happy to take questions on any of these technologies.17:07
semiosisUnfortunately the latest version of glusterfs, 3.1.3, is not available in the Ubuntu repositories.  There is a 3.0 series package but I would recommend against using it.17:07
semiosisI use a custom package from my PPA which is derived from the Debian Sid source package, with some metadata changes that enable the new features in 3.1; my Launchpad PPA's location is ppa:semiosis/ppa.17:08
semiosisGluster also provides a binary deb package for Ubuntu, which has been more rigorously tested than mine.  You can find the official downloads here: http://download.gluster.com/pub/gluster/glusterfs/LATEST/17:08
semiosisYou can also download and compile the latest source code yourself from Github here:  https://github.com/gluster/glusterfs17:08
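The PPA route just mentioned can be sketched as follows. This is a sketch, assuming Ubuntu 10.10 with `python-software-properties` installed, and assuming the PPA uses the Debian-style split package names (`glusterfs-server` / `glusterfs-client`) described later in the session.

```shell
# Sketch: installing glusterfs 3.1 from the speaker's PPA (ppa:semiosis/ppa).
sudo add-apt-repository ppa:semiosis/ppa
sudo apt-get update
sudo apt-get install glusterfs-server   # or glusterfs-client for a client-only install
```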
semiosisNow I'd like to begin with a quick introduction to GlusterFS 3.1 architecture and terminology.17:08
ClassBotEvilPhoenix asked: repost for marktma: any consideration for using Chef instead of Puppet?17:09
semiosisi chose puppet because it seemed to be best integrated with cloud-init, it's mature, and has a large user community17:09
ClassBotkim0 asked: Could you please mention a little intro about cloud-init17:10
semiosisCloudInit bootstraps and can also configure cloud instances.  This enables a sysadmin to use the standard AMI for different purposes, without having to build a custom AMI or rebundle to make changes.17:11
semiosisCloudInit takes care of setting the system hostname, installing the master SSH key and evaluating the userdata from EC2 metadata.  That last part, evaluating the userdata, is the most interesting.17:11
semiosisIt allows the sysadmin to supply a brief configuration file (called cloud-config), shell script, upstart job, python code, or a set of files or URLs containing those, which will be evaluated on first boot to customize the system.17:11
semiosisCloudInit even has built-in support for bootstrapping Puppet agents, which as I mentioned was a major deciding factor for me17:12
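CloudInit's puppet bootstrap mentioned above is driven by a cloud-config fragment passed as EC2 user-data. A minimal hypothetical sketch follows; the puppetmaster hostname is a placeholder, and `%i`/`%f` are cloud-init substitutions for the instance id and FQDN.

```yaml
#cloud-config
# Hypothetical user-data sketch: bootstrap a puppet agent on first boot.
# puppetmaster.example.com is a placeholder for your own master.
puppet:
  conf:
    agent:
      server: "puppetmaster.example.com"
      certname: "%i.%f"
```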
semiosisNow getting back to glusterfs terminology and architecture...17:13
semiosisOf course there are servers and there are clients.  With version 3.1 there came the option to use NFS clients to connect to glusterfs servers in addition to the native glusterfs client based on FUSE.17:13
semiosisMost of this discussion will be about using native glusterfs clients, but we'll revisit NFS clients briefly at the end if there's time.  I haven't used the NFS capability myself because I think that the FUSE client's "client-side" replication is better suited to my application17:13
semiosisServers are setup in glusterfs 3.1 using the Elastic Volume Manager, or gluster command.  It offers an interactive shell as well as a single-executable command line interface.17:13
semiosis In glusterfs, servers are called peers, and peers are joined into (trusted storage) pools.  Peers have bricks, which are just directories local to the server.  Ideally each brick is its own dedicated filesystem, usually mounted under /bricks.17:14
ClassBotnatea asked: Given the occasional high latency of EBS, do you recommend it for storing database files, for instance PostgreSQL?17:14
semiosismy focus is hosting files for web, not database backend storage.  people do use glusterfs for both, but I haven't evaluated it in the context of database-type workloads, YMMV.17:15
semiosisas for performance, I'll try to get to that in the examples coming up17:15
ClassBotnatea asked: Can you briefly explain the differences between GlusterFS and NFS and why I would choose one over the other?17:16
semiosissimply put, NFS is limited to single-server capacity, performance and reliability, while glusterfs is a scale-out filesystem able to exceed the performance and/or capacity of a single server (independently) and also provides server-level redundancy17:17
semiosisthere are some advanced features NFS has that glusterfs does not yet support (UID mapping, quotas, etc.) so please consider that when evaluating your options17:18
semiosisGlusterfs uses a modular architecture, in which “translators” are stacked in the server to export bricks over the network, and in clients to connect the mount point to bricks over the network.  These translators are automatically stacked and configured by the Elastic Volume Manager when creating volumes (under /etc/glusterd/vols).17:18
semiosisA client translator stack is also created and distributed to the peers which clients retrieve at mount-time.   These translator stacks, called Volume Files (volfile) are replicated between all peers in the pool.17:18
semiosisA client can retrieve any volume file from any peer, which it then uses to connect directly to that volume's bricks.  Every peer can manage its own and every other peer's volumes, it doesn't even need to export any bricks.17:19
semiosisThere are two translators of primary importance: Distribute and Replicate.  These are used to create distributed or replicated, or distributed-replicated volumes.17:19
semiosisIn the glusterfs 3.1 native architecture, servers export bricks to clients, and clients handle all file replication and distribution across the bricks.17:19
semiosisAll volumes can be considered distributed, even those with only one brick, because the distribution factor can be increased at any time without interrupting access (through the add-brick command).17:19
semiosisThe replication factor however can not be changed (data needs to be copied into a new volume).17:19
semiosisIn general, glusterfs volumes can be visualized as a table of bricks, with replication between columns, and distribution over rows.17:19
semiosisSo a volume with replication factor N would have N columns, and bricks must be added in sets (rows) of N at a time.17:20
semiosisFor example, when a file is written, the client first figures out which replication set the file should be distributed to (using the Elastic Hash Algorithm) then writes the file to all bricks in that set.17:20
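The replica-N "table of bricks" model above maps directly onto the gluster 3.1 command line. A sketch, with placeholder hostnames and brick paths: with `replica 2`, bricks are supplied (and later added) in sets of two, each set forming one row of the table.

```shell
# Sketch: a distributed-replicated volume, replication factor 2.
# server1..server4 and /bricks/a are placeholders.
gluster volume create myvol replica 2 \
    server1:/bricks/a server2:/bricks/a   # first replica set (one row)
gluster volume start myvol
# Later, grow distribution by one row without interrupting access:
gluster volume add-brick myvol \
    server3:/bricks/a server4:/bricks/a
```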
semiosisSome final introductory notes... First as a rule nothing should ever touch the bricks directly, all access should go through the client mount point.17:20
semiosisSecond, all bricks should be the same size, which is easy with using dedicated instance-store or EBS bricks.17:20
semiosisThird, files are stored whole on a brick, so not only can't volumes store files larger than a brick, but bricks should be orders of magnitude larger than files in order to get good distribution.17:20
semiosisNow I'd like to talk for a minute about compiling glusterfs from source on Ubuntu.  This is necessary if one wants to use glusterfs on a 32-bit system, since Gluster only provides official packages for 64-bit.17:21
semiosis(as a side note, the packages in my PPA are built for 32-bit, but they are largely untested; I only began testing the 32-bit builds myself yesterday, and although it's going well so far, YMMV)17:21
semiosisCompiling glusterfs is made very easy by the use of standard tools.17:22
semiosis First, some required packages need to be installed, these are: gnulib, flex, byacc, gawk, libattr1-dev, libreadline-dev, libfuse-dev, and libibverbs-dev.17:22
semiosisAfter installing these packages you can untar the source tarball and run the usual “./configure; make; make install” sequence to build & install the program.17:22
semiosisBy default, this will install most of the files under /usr/local, with the notable exceptions of the initscript placed in /etc/init.d/glusterd, the client mount script placed in /sbin/mount.glusterfs, and the glusterd configuration file /etc/glusterfs/glusterd.vol.17:22
semiosis(that's a static config file which you'll never need to edit, btw)17:23
semiosisIf you wish to install to another location (using for example ./configure --prefix=/opt/glusterfs) make sure those three files are in their required locations.17:23
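The build steps above can be collected into one sketch. Assumptions: an Ubuntu 10.10 box with `build-essential` already present, and the 3.1.3 release tarball (the version named earlier) downloaded into the current directory.

```shell
# Sketch: building glusterfs from source with the dependencies listed above.
sudo apt-get install gnulib flex byacc gawk libattr1-dev \
    libreadline-dev libfuse-dev libibverbs-dev
tar xf glusterfs-3.1.3.tar.gz
cd glusterfs-3.1.3
./configure          # default prefix is /usr/local
make
sudo make install
```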
semiosisOnce installed, either from source or from a binary package, the server can be started with "service glusterd start".  This starts the glusterd management daemon, which is controlled by the gluster command.17:23
semiosis The glusterd management daemon takes care of associating servers, generating volume configurations (for servers & clients,) and managing the brick export daemon (glusterfsd) processes.  Clients that only want to mount glusterfs volumes do not need the glusterd service running.17:23
semiosisAnother packaging note... the official deb package from Gluster is a single binary package that installs the full client & server, but the packages in my PPA are derived from the Debian Sid packages, which provide separate binary pkgs for server, client, libs, devel, etc allowing for a client-only installation17:24
semiosisNow, getting back to glusterfs architecture, and setting up a trusted storage pool...17:25
semiosisSetting up a trusted storage pool is also very straightforward.  I recommend using hostnames or FQDNs, rather than IP addresses, to identify the servers.17:25
semiosisFQDNs are probably the best choice, since they can be updated in one place (the zone authority) and DNS takes care of distributing the update to all servers & clients in the cluster, whereas with hostnames, /etc/hosts would need to be updated on all machines17:26
semiosisServers are added to pools using the 'gluster peer probe <hostname>' command.  A server can only be a member of one pool, so attempting to probe a server that is already in a pool will result in an error.17:26
semiosisTo add a server to a pool the probe must be sent from an existing server to the new server, not the other way.  When initially creating a trusted storage pool, it's easiest to use one server to send out probes to all of the others.17:26
ClassBotremib asked: Would you recommend using separate glusterfs servers or use the webservers both as glusterfs server/client?17:26
semiosisexcellent question!  there are benefits to both approaches.  Without going into too much detail, read-only can be done locally but there are some reasons to do writes from separate clients if those clients are going to be writing to the same file (or locking on the same file)17:28
semiosisthere's a slight chance for coherency problems if the client-servers lose connectivity to each other, and writes go to the same file on both... that file will probably not be automatically repaired, but that's an edge case that may never happen in your application.  testing is very important17:30
semiosisthats called a split-brain in glusterfs terminology17:30
semiosiswrites can go to different files under that partition condition just fine, it's only an issue if the two server-clients update the same file and they're not synchronized17:31
semiosisand i don't even know if network partitions are likely in EC2, it's just a theoretical concern for me at this point, so go forth and experiment!17:31
semiosisWhen initially creating a trusted storage pool, it's easiest to use one server to send out probes to all of the others.17:32
semiosisAs each additional server joins the pool its hostname (and other information) is propagated to all of the previously existing servers.17:32
semiosisOne cautionary note, when sending out the initial probes, the recipients of the probes will only know the sender by its IP address.17:32
semiosisTo correct this, send a probe from just one of the additional servers back to the initial server – this will not change the structure of the pool but it will propagate an IP address to hostname update to all of the peers.17:32
semiosisFrom that point on any new peers added to the pool will get the full hostname of every existing peer, including the peer sending the probe.17:32
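Putting the probe workflow above into a script can help; here is a minimal sketch (hostnames are hypothetical, and the `run` helper only prints each command so the sequence can be reviewed before pointing it at real servers):

```shell
#!/bin/sh
# Dry-run sketch of building a trusted storage pool from fileserver1.
# `run` just prints the command; change its body to "$@" to execute for real.
run() { echo "+ $*"; }

# From the initial server, probe each new peer:
for peer in fileserver2 fileserver3 fileserver4; do
  run gluster peer probe "$peer"
done

# From any ONE of the new peers, probe back so the initial server is
# known by hostname instead of IP address:
run gluster peer probe fileserver1
```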
ClassBotkim0 asked: What's your overall impression of glusterfs robustness and ability to recover from split-brains or node failures17:33
semiosisit depends heavily on your application's workload, for my application it's great, but Your Mileage May Vary.  this is the biggest concern with database-type workloads, where you would have multiple DB servers wanting to lock on a single file17:34
semiosisbut for regular file storage i've found it to be great17:34
semiosisand of course it depends also a great deal on the cloud-provider's network, not just glusterfs...17:34
semiosisresolving a split-brain issue is relatively painless... just determine which replica has the "correct" version of the file, and delete the "bad" version from the other replica(s) and glusterfs will replace the deleted bad copies with the good copy and all future access will be synchronized, so it's usually not a big deal17:35
ClassBotnatea asked: Is the performance of GlusterFS storage comparable to a local storage? What are the downsides?17:36
semiosisthat sounds like a low-level component performance question, and I recommend concentrating on high-level aggregate application throughput.17:37
semiosisi'll get to that shortly talking about the different types of volumes17:37
semiosisOnce peers have been added to the pool volumes can be created.  But before creating the volumes it's important to have set up the backend filesystems that will be used for bricks.17:37
semiosisIn EC2 (and compatible) cloud environments this is done by attaching a block device to the instance, then formatting and mounting the block device filesystem.17:38
semiosisBlock devices can be added at instance creation time using the EC2 command ec2-run-instances with the -b option.17:38
semiosisEBS volumes are specified for example with -b /dev/sdd=:20 where /dev/sdd is the device name to use, and :20 is the size (in GB) of the volume to create.17:38
semiosis Glusterfs recommends using ext4 filesystems for bricks, since ext4 has good performance and is well tested.17:38
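Concretely, after attaching the EBS device you would format it with mkfs.ext4, create the mount point (following the /bricks convention used in this session), mount it, and add an /etc/fstab entry so the brick survives reboots. A sketch of that entry, using the device name from the example above (the noatime option is my assumption, not a session recommendation):

```
# /etc/fstab — one line per brick filesystem
/dev/sdd  /bricks/bigstorage1  ext4  defaults,noatime  0  2
```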
semiosisAs I mentioned earlier, the two translators of primary importance are Distribute and Replicate.  All volumes are Distributed, and optionally also Replicated.17:38
semiosisSince volumes can have many bricks, and servers can have bricks in different volumes, a common convention is to mount brick filesystems at /bricks/volumeN.  I'll follow that convention in a few common volume configurations to follow.17:39
semiosisThe first and most basic volume type is a distributed volume on one server.  This is essentially unifying the brick filesystems to make a larger filesystem.17:39
semiosisRemember though that files are stored whole on bricks, so no file can exceed the size of a brick.  Also please remember that it is a best-practice to use bricks of equal size.  So, let's consider creating a volume of 3TB called “bigstorage”.17:39
semiosisWe could just as easily use 3 EBS bricks of 1TB each, 6 EBS bricks of 500GB each, or 10 EBS bricks of 300GB each.  Which layout to use depends on the specifics of your application, but in general spreading files out over more bricks will achieve better aggregate throughput.17:39
semiosisso even though the performance of a single brick is not as good as a local filesystem, spreading over several bricks can achieve comparable aggregate throughput17:40
semiosisAssuming the server's hostname is 'fileserver', the volume creation command for this would be simply “gluster volume create bigstorage fileserver:/bricks/bigstorage1 fileserver:/bricks/bigstorage2 … fileserver:/bricks/bigstorageN”.17:40
semiosisThis trivial volume which just unifies bricks on a single server has limited performance scalability.  In EC2 the network interface is usually the limiting factor, and although in theory a larger instance should get a larger slice of the network interface bandwidth, in practice I have found that it doesn't work out that way.17:40
semiosisAnd by this I mean what I've found is that larger instances do not get much more bandwidth to EBS or other instances (going beyond Large instance anyway, i'm sure smaller instances could get worse but haven't really evaluated them.)17:40
semiosisGlusterfs is known as a scale-out filesystem, and this means that performance and capacity can be scaled by adding more nodes to the cluster, rather than increasing the size of individual nodes.17:41
ClassBotneti asked: Is GLusterFS using local caching in memory?17:41
semiosisyes it does do read-caching and write-behind caching, but I leave their configuration at the default, please check out the docs at gluster.org for details, specifically http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Setting_Volume_Options17:42
semiosisSo the next example volume after 'bigstorage' should be 'faststorage'.  With this volume we'll combine EBS bricks in the same way but using two servers.17:43
semiosisFirst of course a trusted storage pool must be created by probing from one server (fileserver1) to the other (fileserver2) by running the command 'gluster peer probe fileserver2' on fileserver1, then updating the IP address of fileserver1 to its hostname by running 'gluster peer probe fileserver1' on fileserver2.17:43
semiosisAfter that, the volume creation command can be run, “gluster volume create faststorage fileserver1:/bricks/faststorage1 fileserver2:/bricks/faststorage2 fileserver1:/bricks/faststorage3 fileserver2:/bricks/faststorage4 ...” where fileserver1 gets the odd-numbered bricks and fileserver2 gets the even-numbered bricks.17:43
semiosisIn this example there can be an arbitrary number of bricks.  Because files are distributed evenly across bricks, this has the advantage of combining the network performance of the two servers.17:43
semiosis(interleaving the brick names is just my convention, it's not required and you're free to use any convention you'd like)17:44
ClassBotkim0 asked: Since you have redudancy through replication, why not use instance-store instead of ebs17:44
semiosisah I was just about to get into replication, great timing.  in short, you can, and I do!  instance-store has consistent latency going for it, but EBS volumes can be larger, can be snapshotted & restored, and can be moved between instances (within an availability zone) so that makes managing your data much easier17:46
semiosisNow I'd like to shift gears and talk about reliability.17:46
semiosis In glusterfs clients connect directly to bricks, so if one brick goes away its files become inaccessible, but the rest of the bricks should still be available.  Similarly if one whole server goes down, only the files on the bricks it exports will be unavailable.17:46
semiosisThis is in contrast to RAID striping where if one device goes down, the whole array becomes unavailable.  This brings us to the next type of volume, distributed-replicated.  In a distributed-replicated volume as I mentioned earlier files are distributed over replica sets.17:46
semiosisSince EBS volumes are already replicated in the EC2 infrastructure it should not be necessary to replicate bricks on the same server.17:46
semiosis In EC2 replication is best suited to guard against instance failure, so it's best to replicate bricks between servers.17:47
semiosisThe most straightforward replicated volume would be one with two bricks on two servers.17:47
semiosisBy convention these bricks should be named the same, so for a volume called safestorage the volume create command would look like this, “gluster volume create safestorage replica 2 fileserver1:/bricks/safestorage1 fileserver2:/bricks/safestorage1 fileserver1:/bricks/safestorage2 fileserver2:/bricks/safestorage2 ...”17:47
semiosisBricks must be added in sets of size equal to the replica count, so for replica 2, bricks must be added in pairs.17:47
semiosisScaling performance on a distributed-replicated volume is similarly straightforward, and similar to adding bricks, servers should also be added in sets of size equal to the replica count.17:47
semiosisSo, to add performance capacity to a replica 2 volume, two more servers should be added to the pool, and the volume creation command would look like this, “gluster volume create safestorage replica 2 fileserver1:/bricks/safestorage1 fileserver2:/bricks/safestorage1 fileserver3:/bricks/safestorage2 fileserver4:/bricks/safestorage2 fileserver1:/bricks/safestorage3 fileserver2:/bricks/safestorage3 fileserver3:/bricks/safestorage4 fileserver4:/bricks/safestorage4...”17:47
semiosisUp to this point all of the examples involve creating a volume, but volumes can also be expanded while online.  This is done with the add-brick command, which takes parameters just like the volume create command.17:48
semiosisBricks still need to be added in sets of size equal to the replica count though.17:48
semiosisalso note, the "add-brick" operation requires a "rebalance" to spread existing files out over the new bricks, this is a very costly operation in terms of CPU & network bandwidth so you should try to avoid it.17:49
semiosisA similar but less costly operation is "replace-brick" which can be used to move an existing brick to a new server, for example to add performance with the addition of new servers without adding capacity17:50
ClassBotThere are 10 minutes remaining in the current session.17:51
semiosisanother scaling option is to use EBS bricks smaller than 1TB, and restore from snapshots to 1TB bricks.  this is an advanced technique requiring the ec2 commands ec2-create-volume & ec2-attach-volume17:51
semiosisWell looks like my time is running out, so I'll try to wrap things up.  please ask any questions you've been holding back!17:52
semiosisGetting started with glusterfs is very easy, and with a bit of experimentation & performance testing you can have a large, high throughput file storage service running in the cloud.  Best of all in my opinion is the ability to snapshot EBS bricks with the ec2-create-snapshot API call/command, which is also available in the AWS console17:53
ClassBotkim0 asked: Did you evaluate ceph as well17:53
semiosisI am keeping an eye on ceph, but it seemed to me that glusterfs is already well tested & used widely in production, even if not yet used widely in the cloud... it sure will be soon17:54
ClassBotneti asked: Is GlusterFS Supporting File Locking?17:54
semiosisyes glusterfs supports full POSIX semantics including file locking17:55
semiosisone last note about snapshotting EBS bricks... since bricks are regular ext4 filesystems, they can be restored from snapshot & read just like any other EBS volume, no hassling with mdadm or lvm to reassemble volumes like with RAID17:56
ClassBotremib asked: Does GlusterFS support quotas?17:56
ClassBotThere are 5 minutes remaining in the current session.17:56
semiosisno quota support in 3.117:57
semiosisThank you all so much for the great questions.  I hope you have fun experimenting with glusterfs, I think it's a very exciting technology.  One final note for those of you who may be interested in commercial support...17:58
semiosisGluster Inc. has recently released paid AMIs for Amazon EC2 and VMware that are fully supported by the company.  I've not used these, but they are there for your consideration.17:59
semiosisThe glusterfs community is large and active.  I usually hang out in #gluster which is where I've learned the most about glusterfs.  There's a lot of friendly and knowledgeable people there, as well as on the mailing list, who enjoy helping out beginners18:00
semiosisthanks again!18:00
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: What is Ensemble? - Presentation and Demo - Instructors: SpamapS
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html following the conclusion of the session.18:01
SpamapSSo, I have prepared a short set of slides to try and explain what Ensemble is here: http://spamaps.org/files/Ensemble%20Presentation.pdf18:02
SpamapSI will elaborate here in channel.18:02
SpamapSEnsemble is an implementation of Service Management18:03
SpamapSup until now this has also been called "Orchestration", and the term is not all that inaccurate, though I feel that Service Management is more appropriate18:03
SpamapS"What is Service Management?"18:03
SpamapSService Management is focused on the things that servers do that end users consume18:04
SpamapSUsers connect to websites, dns servers, or (at a lower level) databases, cache services, etc18:04
SpamapSEnsemble models how services relate to one another.18:04
SpamapSWeb applications need to connect to a number of remote resources. Load balancers need to connect to web application servers.. monitoring services need to connect to services and test that they're working.18:05
SpamapSEnsemble models all of these in what we call "formulas" (more on this later)18:05
SpamapSIf this starts to sound like Configuration Management, you won't be the first to make that mistake.18:06
SpamapSHowever, this sits at a higher level than configuration management.18:06
SpamapS"Contrast With Configuration Management"18:06
SpamapSConfiguration management grew from the time when we had a few servers that were expensive to buy/lease/provision, and lived a long time.18:07
SpamapSBecause of this, system administrators modeled system configuration very deeply. Puppet, chef, etc., first and foremost, model how to configure *a server*18:07
SpamapSAs the networks grew and became more dependent on one another, the config management systems have grown the ability to share data about servers.18:08
SpamapSHowever the model is still focused on "how do I get my server configured"18:08
SpamapSEnsemble seeks to configure the service.18:09
SpamapSWith the cloud, we have the ability to rapidly provision and de-provision servers. So service management is tightly coupled with provisioning.18:09
SpamapSChef, in particular, from the config management world, has done a good job of adding this in with their management tools.18:10
SpamapSHowever, where we start to see a lot of duplication of work in config management is in sharing the knowledge of service configuration.18:10
SpamapSPuppet and Chef both have the ability to share their "recipes" or "cookbooks"18:11
SpamapSHowever, most of these are filled with comments and variables "change this for your site"18:11
SpamapSThe knowledge of how and when and why is hard to encode in these systems.18:11
SpamapSEnsemble doesn't compete directly with them on this level. Ensemble can actually utilize configuration management to do service management.18:12
SpamapSThe comparison is similar to what we all used to do 15+ years ago with open source software18:12
SpamapSdownload tarball, extract, cd, ./configure --with-dep1=/blah && make && sudo make install18:13
SpamapSThis would be an iterative process where we would figure out how to make the software work for our particular server every time.18:13
SpamapSThen distributions came along and created packaging, and repositories, to relieve us from the burden of doing this for *most* low level dependencies.18:13
SpamapSSo ensemble seeks to give us, in the cloud, what we have on the single server.. something like 'apt-get install'18:14
SpamapS"Terms"18:14
SpamapS"service unit" - for the most part this means "a server", but it really just means one member of a service deployment. If you have 3 identical web app servers, these are 3 service units, in one web app service deployment.18:15
SpamapS"formula" - this is the sharable, ".deb" for the cloud. It encodes the relationships and runtime environment required to configure a service18:16
SpamapS"environment" - in ensemble, this defines the machine provider and settings for deploying services together. Right now this means your ec2 credentials and what instance type. But it could mean a whole host of things.18:17
SpamapS"bootstrap" - ensemble's first job in any deployment is to "bootstrap" itself. You run the CLI tool to bootstrap it18:17
SpamapSthat means it starts a machine that runs the top level ensemble agent that you will communicate with going forward18:18
SpamapS"Basic Workflow"18:18
SpamapSThis is how we see people using ensemble, though we have to imagine the details of this will change as ensemble grows, since it hasn't even been "released" yet.18:19
SpamapS(though, as a side note, it is working perfectly well, and available for lucid at https://launchpad.net/~ensemble/+archive/ppa)18:19
SpamapS0. (left this out of the slide) - configure your environment. This means establish AWS credentials, and record them in .ensemble/environment.yaml18:20
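For orientation, that environment file might look something like the sketch below. The key names here are my assumptions (extrapolated from ensemble's EC2 focus), not the authoritative schema; check the documentation at lp:ensemble before relying on any of them:

```
# ~/.ensemble/environment.yaml — illustrative sketch only; key names are
# assumptions, not the real schema
environments:
  sample:
    type: ec2
    access-key: <your AWS access key>
    secret-key: <your AWS secret key>
    control-bucket: <a unique S3 bucket name>
```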
=== daker is now known as daker_
SpamapS1. Bootstrap (ensemble bootstrap) - this connects to your machine provider (EC2 right now) and spawns an instance, and seeds it using cloud-init to install ensemble and its dependencies18:20
SpamapS2. Deploy Services (ensemble deploy mysql wiki-db; ensemble deploy mediawiki demo-wiki)18:21
SpamapSThis actually spawns nodes with the machine provider, and runs the ensemble agent on them, telling them what service they're a part of and running the service "install" hooks to get them ready to participate in the service18:21
=== niemeyer is now known as niemeyer_bbl
SpamapS3. Relate Services (ensemble add-relation demo-wiki:db wiki-db:db)18:22
SpamapSThis part won't always be necessary. Automatic relationship resolution is being worked on right now. But sometimes you will want to be explicit, or do a relation that is optional.18:23
SpamapSIn the example above, this tells demo-wiki and wiki-db about each other. I will pastebin a formula example to clear this up.18:23
SpamapShttp://paste.ubuntu.com/584424/18:24
SpamapSThis is the metadata portion of the mediawiki formula, which I created recently as part of the "Principia" project, which is a collection of formulas for ensemble: https://launchpad.net/principia18:24
SpamapSIf you look there, you see that it 'requires:' a relationship called 'db'18:25
SpamapSthe interface for that relationship is "mysql"18:25
SpamapSThese interface names are used to ensure you don't relate two things which have different interfaces18:25
SpamapS(almost done, will take questions shortly)18:26
SpamapShttp://paste.ubuntu.com/584425/18:26
SpamapSThis is the corresponding metadata for mysql..18:26
SpamapSas you see, it provides a relationship called 'db' as well, which uses the interface 'mysql'18:27
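In outline, the two pasted metadata fragments boil down to the following (a reconstruction from the description above, since the pastebin links may no longer resolve; any fields beyond requires/provides/interface are omitted rather than guessed):

```
# mediawiki formula (consumer side):
requires:
  db:
    interface: mysql

# mysql formula (provider side):
provides:
  db:
    interface: mysql
```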
SpamapSWhat this means is that the 'requires' side of the formula can expect certain data to be passed to it when it joins this relationship18:27
SpamapSand likewise, the provides side knows that its consumers will need certain bits18:28
SpamapSWhen this relationship is added, "hooks" are fired18:28
SpamapSThese are just scripts that are run at certain events in the relationship lifecycle18:29
SpamapSThese scripts use helper programs from ensemble to read and write data over the two-way communication channel.18:29
SpamapSIn the case of mysql, whenever a service unit joins a relationship, it creates a database for the service if it doesn't exist, and then creates a username/password/etc. and sends that to the consumer18:30
SpamapSand the mediawiki hook for the relationship will configure mediawiki to use that database18:31
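A hypothetical sketch of what the mediawiki side of that hook could look like. `get_relation` is a stand-in stub for ensemble's real relation helper (whose name and interface aren't shown in this session), stubbed with fixed values so the sketch is self-contained:

```shell
#!/bin/sh
# Hypothetical relation-hook sketch. get_relation stands in for ensemble's
# helper that reads data off the relation's communication channel; it is
# stubbed here with fixed example values.
get_relation() {
  case "$1" in
    host)     echo db.example.com ;;
    user)     echo wikiuser ;;
    password) echo s3cret ;;
  esac
}

DB_HOST=$(get_relation host)
DB_USER=$(get_relation user)
DB_PASS=$(get_relation password)
# A real hook would now write these values into mediawiki's LocalSettings.php
echo "configuring mediawiki to use $DB_USER@$DB_HOST"
```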
SpamapSThe code for all of this is in lp:principia if you are curious.18:31
SpamapSthe final slide is just an overview of ensemble's architecture under the hood.18:32
SpamapSI will take questions now...18:32
SpamapSmarktma: GREAT question. Definitely. One of the goals is to make it easy to write new "machine providers". By doing EC2 first though, we should have a reasonable chance at working with UEC/Eucalyptus and maybe even OpenStack out of the box.18:34
ClassBotmarktma asked: is there any chance ensemble will be used for private clouds as well?18:35
SpamapShah, ok, see answer ^^18:35
ClassBotkim0 asked: What does the interface: mysql .. actually mean18:35
SpamapSI think I may have answered that already in the ensuing description..18:35
SpamapSbut essentially its a loose contract between providers/requirerers/peers on what will be passed through the communication channel18:36
ClassBotEvilPhoenix asked: (for kim0): that contract .. is it defined somewhere18:37
SpamapSIt is only defined via the formulas. It is intentionally kept as a loose coupling to make formulas flexible. I could see it being strengthened a bit in the future.18:38
SpamapSNow, I wanted to stream my desktop to demo ensemble in action..18:39
SpamapSbut that has proven difficult given the 20 minutes I had to attempt to set that up.18:39
SpamapSSo I will paste bin the terminal output of an ensemble run...18:39
SpamapSI have setup a lucid instance for this, and the only commands not seen here are: sudo add-apt-repository ppa:ensemble/ppa ; apt-get update ; apt-get install ensemble ; bzr branch lp:principia ; cat > aws.sh18:40
SpamapSthe last bit is to store my aws credentials18:40
SpamapShttp://paste.ubuntu.com/584430/18:41
SpamapSthis is the boostrap phase18:41
SpamapSbootstrap even18:41
SpamapSI now need to wait for EC2 to start an instance18:42
=== daneroo_ is now known as daneroo
SpamapSubuntu@ip-10-203-81-87:~$ ensemble status18:42
SpamapS2011-03-23 18:42:02,263 INFO Connecting to environment.18:42
SpamapS2011-03-23 18:42:18,586 INFO Environment still initializing.  Will wait.18:42
SpamapSAnd now it has spawned my bootstrap18:43
SpamapSthere will be live DNS names here, so hopefully my security groups will keep your prying eyes out..18:43
SpamapSmachines: 0: {dns-name: ec2-50-17-142-155.compute-1.amazonaws.com, instance-id: i-10f63f7f}18:43
SpamapSservices: {}18:43
SpamapS2011-03-23 18:42:50,216 INFO 'status' command finished successfully18:43
ClassBotTeTeT asked: what would a system administrators task with ensemble be - write formulas or just deploy them or a mix?18:43
SpamapSI'd imagine sysadmins would write the formulas for an organization's own services which consume existing services.18:44
SpamapSThe common scenario is a LAMP application which takes advantage of memcached, mysql, and has a load balancer18:44
SpamapSThe lamp app needs to have its config files written with the db, cache servers, etc., so the sysadmin would write the relation hooks for mysql and memcached. OR a developer could write these. The devops paradigm kind of suggests that they work together on this.18:45
SpamapSOk now I'll run my "demo.sh" script which builds a full mediawiki stack18:46
SpamapSWhile this is going, I will stress that this is *unreleased* alpha software, though the dev team has been very diligent and the code is of a very high quality (written in python with twisted, and available at lp:ensemble)18:47
SpamapShttp://paste.ubuntu.com/584433/18:47
SpamapSNow we'll need to wait a few minutes while all of those nodes spawn18:47
SpamapSNow, I'm using t1.micro, so these provision *fast* .. we can watch their hooks run w/ debug-log...18:49
SpamapSHowever they may already be done..18:49
SpamapSIdeally, we'll have a wiki accessible at the address of 'wiki-balancer' .. lets see18:50
ClassBotThere are 10 minutes remaining in the current session.18:51
ClassBotTeTeT asked: is the deployment through ensemble itself or via cloud-init or puppet or other config tools?18:51
SpamapShttp://paste.ubuntu.com/584437/18:52
SpamapSWhile you guys try to decipher that I'll answer TeTeT18:52
SpamapSTeTeT: the nodes are configured via cloud-init to run ensemble's agent. After that, ensemble is in control running hooks. The formulas are pushed into S3, and then downloaded by the agent once it starts.18:52
SpamapSSo unfortunately, our load balancer has failed.. it is "machine 4" http://ec2-50-17-47-115.compute-1.amazonaws.com/ ... but the individual mediawiki nodes *are* working..18:53
SpamapShttp://ec2-204-236-202-35.compute-1.amazonaws.com/mediawiki/index.php/Main_Page18:53
SpamapSAhh, there was a bug in my demo.sh :)18:55
SpamapS$ENSEMBLE add-relation wiki-balancer demo-wiki:reverseproxy18:55
SpamapSmediawiki has no relation named reverseproxy18:55
SpamapS2011-03-23 18:46:37,900 INFO Connecting to environment.18:55
SpamapSNo matching endpoints18:55
SpamapS2011-03-23 18:46:38,473 ERROR No matching endpoints18:55
SpamapS2011-03-23 18:46:38,865 INFO Connecting to environment.18:55
SpamapSWe actually had that error but missed it. ;)18:56
SpamapSlets relate the load balancer now18:56
ClassBotThere are 5 minutes remaining in the current session.18:56
SpamapSubuntu@ip-10-203-81-87:~$ ensemble add-relation wiki-balancer:reverseproxy demo-wiki:website18:56
SpamapS2011-03-23 18:56:28,059 INFO Connecting to environment.18:56
SpamapS2011-03-23 18:56:28,691 INFO Added http relation to all service units.18:56
ClassBotkim0 asked: Can't a cache and a wiki service-units share the same ec2 instance18:56
SpamapS2011-03-23 18:56:28,691 INFO 'add_relation' command finished successfully18:56
SpamapSkim0: the idea is that in that instance, it's simpler to use something like LXC containers to make it easier to write formulas. However, in the case of purely non-conflicting formulas, there should be a way in the future to do that yes18:57
SpamapShttp://ec2-50-17-47-115.compute-1.amazonaws.com/mediawiki/index.php/Main_Page18:57
SpamapSAnd there you have a working mediawiki18:57
ClassBotTeTeT asked: will ensemble also provide service monitoring, or is that better left to munin/nagios and alike18:58
SpamapSTeTeT: The latter. nagios/munin/etc are just services in themselves. And they speak the same protocols as consuming services. If a formula wants to explicitly expose *more* over a monitoring interface they certainly can18:59
SpamapSI think that's about all the time we have18:59
SpamapSThanks so much for taking the time to listen. https://launchpad.net/ensemble has more information!19:00
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Using Linux Containers in Natty - Instructors: hallyn
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html following the conclusion of the session.19:01
hallynOk, hey all19:02
hallynI'm going to talk about containers on natty.19:03
hallynIn the past, that is up through lucid, there were some constraints which made containers more painful to administer -19:03
hallyni.e.  you couldn't safely upgrade udev19:03
hallynthat's now gone!19:03
hallynbut, let me start at the start19:04
hallyncontainers, for anyone really new, are a way to run what appear to be different VMs, but without the overhead of an OS for each VM, and without any hardware emulation19:04
hallynso you can fit in a lot of containers on old hardware with little overhead19:04
hallynthey are similar to openvz and vserver - they're not competition, though.19:04
hallynrather, they're the ongoing work to upstream the functionality from vserver and openvz19:05
hallynContainers are a userspace fiction built on top of some nifty kernel functionality.19:05
hallynThere are two popular implementations right now:19:05
hallynthe libvirt lxc driver, and liblxc (or just 'lxc') from lxc.sf.net19:05
hallynHere, I'm talking about lxc.sf.net19:06
hallynAll right, in order to demo some lxc functionality, I set up a stock natty VM on amazon.  You can get to it as:19:06
hallynssh ec2-50-17-73-23.compute-1.amazonaws.com -l guest19:06
hallynpassword is 'none'19:06
hallynthat should get you into read-only screen session.  To get out, hit '~.' to kill ssh.19:06
hallynOne of the kernel pieces used by containers is the namespaces.19:07
hallynYou can use just the namespaces (for fun) using 'lxc-unshare'19:07
hallynit's not a very user-friendly command, though.19:07
hallynbecause it's rarely used...19:07
hallynwhat I just did there on the demo is to unshare my mount, pid, and utsname (hostname) namespaces19:08
hallynusing lxc-unshare -s 'MOUNT|PID|UTSNAME' /bin/bash19:08
hallynlxc-unshare doesn't remount /proc for you, so I had to do that.  Once I've done that, ps only shows tasks in my pid namespace19:08
hallynalso, I can change my hostname without changing the hostname on the rest of the system19:08
hallynWhen I exited the namespace, I was brought back to a shell with the old hostname19:09
hallynall right, another thing used by containers is bind mounts.  Not much to say about them, let me just do a quick demo of playing with them:19:10
ClassBotToyKeeper asked: Will there be a log available for this screen session?19:10
hallynyes,19:11
hallynoh, no. sorry19:11
hallyndidn't think to set that up19:11
hallynhm,19:11
hallynok, i'm logging it as of now.  I'll decide where to put it later.  thanks.19:11
hallynnothing fancy, just bind-mounting filesystems19:12
hallynwhich is a way of saving a lot of space, if you share /usr and /lib amongst a lot of containers19:13
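As a sketch, a container's config can carry fstab-style bind-mount entries so shared trees from the host are mounted read-only into the container's rootfs. The key name and paths below follow lxc 0.7-era conventions and the /var/lib/lxc layout from this session; treat the exact lines as illustrative:

```
# bind /usr and /lib from the host into the container, read-only
# (fstab format: <host path> <path in container rootfs> none <options> 0 0)
lxc.mount.entry = /usr /var/lib/lxc/natty1/rootfs/usr none ro,bind 0 0
lxc.mount.entry = /lib /var/lib/lxc/natty1/rootfs/lib none ro,bind 0 0
```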
hallynanyway, moving on to actual usage19:13
hallynTypically there are 3 ways that I might set up networking for a container19:14
hallynOften, if I'm lazy or already have it set up, I'll re-use the libvirt bridge, virbr0, to bind container NICs to19:14
hallynwell, at least apt-get worked :)19:16
hallynIf I'm on a laptop using wireless, I'll usually do that route, because you can't directly bridge a wireless NIC.19:16
hallynAnd otherwise I'd have to set up my own iptables rules to do the forwarding from containers bridge to the host NIC19:16
hallynIf I'm on a 'real' host, I'll bridge the host's NIC and use that for containers.19:17
hallynthat's what lxc-veth.conf does19:17
hallynSo first you have to set up /etc/network/interfaces to have br0 be a bridge,19:17
hallynhave eth0 not have an address, and make eth0 a bridge-port on br019:18
hallynas seen on the demo19:18
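[Editor's sketch, not part of the session: the /etc/network/interfaces setup just described might look like the following; interface names are assumed, eth0 gets no address of its own and becomes a port on br0.]

```
auto eth0
iface eth0 inet manual

auto br0
iface br0 inet dhcp
    bridge_ports eth0
    bridge_stp off
    bridge_fd 0
```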
hallynSince that's set up, I can create a bridged container just using:19:18
hallyn'lxc-create -f /etc/lxc-veth.conf -n nattya -t natty'19:18
hallynnattya is the name of the container,19:18
hallynnatty is the template I'm using19:18
hallynand /etc/lxc-veth.conf is the config file to specify how to network19:18
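[Editor's sketch, not part of the session: a bridged network config like /etc/lxc-veth.conf might contain something along these lines; key names follow lxc 0.7-era configs, and the bridge name br0 is assumed.]

```
lxc.network.type = veth
lxc.network.link = br0
lxc.network.flags = up
```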
hallynruh roh19:19
hallynso lxc-create is how you create a new container19:20
hallynThe rootfs and config files for each container are in /var/lib/lxc19:20
hallynyou see there are three containers there - natty1, which I created before this session, and nattya and nattyz which I just created19:20
hallynThe config file under /var/lib/lxc/natty1 shows some extra information,19:21
hallynincluding how many tty's to set up,19:21
hallynand which devices to allow access to19:21
hallynthe first devices line, 'lxc.cgroup.devices.deny = a' means 'by default, don't allow any access to any device.'19:21
hallynfrom there any other entries are whitelist entries19:21
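[Editor's sketch, not part of the session: the device-access section of such a config might look roughly like this; the specific whitelist entries are illustrative, with the usual major:minor numbers for these device nodes.]

```
lxc.tty = 4
lxc.cgroup.devices.deny = a            # by default, deny access to all devices
lxc.cgroup.devices.allow = c 1:3 rwm   # /dev/null
lxc.cgroup.devices.allow = c 1:5 rwm   # /dev/zero
lxc.cgroup.devices.allow = c 5:1 rwm   # /dev/console
```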
ClassBotkim0 asked: Can I run a completely different system like centos under lxc on ubuntu ?19:22
hallynyes, you can, and many people do.19:22
hallynThe main problem, usually, is in actually first setting up a container with that distro which works19:22
hallynYou can't 100% use a stock iso install and have it boot as a container19:23
hallynIt used to be there was a lot of work you had to do to make that work,19:23
hallynbut now we're down to very few things.  In fact, for ubuntu natty, we have a package called 'lxcguest'19:23
hallynif you take a stock ubuntu natty image,19:23
hallynand install 'lxcguest', then it will allow that image to boot as a container19:23
hallynIt actually only does two things now:19:24
hallyn1. it detects that it is in a container (based on a boot argument provided by lxc-start),19:24
hallynuh, that wasn't supposed to be 1 :),19:24
hallynand based on that, if it is in a container, it19:24
hallyn1. starts a console on /dev/console, so that 'lxc-start' itself gets a console (like you see when i start a container)19:24
hallyn2. it changes /lib/init/fstab to one with fewer filesystems,19:25
hallynbecause there are some which you cannot or should not mount in a container.19:25
hallynnow, lxc ships with some 'templates'.19:25
hallynthese are under /usr/lib/lxc/templates19:25
hallynsome of those templates, however, don't quite work right.  So a next work item we want to tackle is to make those all work better, and add more19:26
hallynlet's take a look at the lxc-natty one:19:26
hallynit takes a MIRROR option, which I always use at home, which lets me point it at an apt-cacher-ng instance19:27
hallynit starts by doing a debootstrap of a stock natty image into /var/cache/lxc/natty/19:28
hallynso then, every time you create another container with natty template, it will rsync that image into place19:28
hallynthen it configures it, setting hostname, setting up interfaces,19:29
hallynshuts up udev,19:29
hallynsince the template by default creates 4 tty's, we get rid of /etc/init/tty5 and tty619:29
hallynsince we're not installing lxcguest, we just empty out /lib/init/fstab,19:30
hallynactually, that may be a problem19:30
hallynupstart upgrades may overwrite that19:30
hallynso we should instead have lxc-natty template always install the lxcguest package19:30
hallyn(note to self)19:30
hallynand finally, it installs the lxc configuration, which is that config file we looked at before with device access etc19:30
hallynok, i've been rambling, let me look for and address any/all questions19:31
ClassBotkapil asked: What's the status of using lxc via libvirt?19:31
hallyngood question, zul has actually been working on that.19:31
hallynlibvirt-lxc in natty is fixed so that when you log out from console, you don't kill the container any more :)19:32
hallynsecondly, you can use the same lxcguest package I mentioned before in libvirt-lxc,19:32
hallynso you can pretty easily debootstrap an image, chroot to it to install lxcguest, and then use it in libvirt19:32
hallynwe still may end up writing a new libvirt lxc driver, as an alternative to the current one, which just calls out to liblxc, so that libvirt and liblxc can be used to manipulate the same containers19:33
hallynbut still haven't gotten to that19:33
ClassBotkim0 asked: can I live migrate a lxc container19:34
hallynnope19:34
hallynfor that, we'll first need checkpoint/restart.19:34
hallynI have a ppa with some kernel and userspace pieces - basically packaging the current upstream efforts.  But nothing upstream, nothing in natty, not very promising short-term19:34
ClassBotToyKeeper asked: Why would you want regular ttys in a container?  Can't the host get a guest shell similar to openvz's "vzctl enter $guestID" ?19:35
hallynnope,19:35
hallynif the container is set up right, then you can of course ssh into it;19:35
hallynor you can run lxc-start in a screen session so you can get back to it like that,19:35
hallynwhat the regular 'lxc.tty = 4' gives you is the ability to do 'lxc-console' to log in19:36
hallynas follows:19:36
hallynI start the container with '-d' to not give me a console on my current tty19:36
hallynthen lxc-console -n natty1 connects me to the tty...19:36
hallynctrl-a q exits it19:37
hallynnow, the other way you might *want* to enter a container, which i think the vzctl enter does,19:37
hallynis to actually move your current task into the container19:37
hallynThat currently is not possible19:37
hallynthere is a kernel patch, being driven now by dlezcano, to make that possible, and a patch to lxc to use it using the 'lxc-attach' command.19:37
hallynbut the kernel patch is not yet accepted upstream19:38
hallynso you cannot 'enter' a container19:38
=== niemeyer_bbl is now known as niemeyer
ClassBotrye asked: Are there any specific settings for cgroup mount for the host?19:38
hallynCurrently I just mount all cgroups.19:38
hallynUsing fstab in the demo machine, or just 'mount -t cgroup cgroup /cgroup'19:39
hallynthe ns cgroup is going away soon,19:39
hallynso when you don't have ns cgroup mounted, then you'll need cgroup.clone_children to be 119:39
hallynhowever, you don't need that in natty.  in n+1 you probably will.19:40
ClassBotkim0 asked: How safe is it to let random strangers ssh into containers as root ? how safe is it to run random software inside containers .. can they break out19:40
hallynnot safe at all19:40
hallynIf you google for 'lxc lsm' you can find some suggestions for using selinux or smack to clamp down19:40
hallynand, over the next year or two, I'm hoping to keep working on, and finally complete, the 'user namespaces'19:41
=== Jackson is now known as Guest46715
hallynwith user namespaces, you, as user 'kim0' and without privilege, would create a container.  root in that container would have full privilege over things which you yourself own19:41
hallynSo any files owned by kim0 on the host;  anything private to your namespaces, like your own hostname;19:41
hallynBUT,19:41
hallyneven when that is done, there is another consideration:  nothing is sitting between your users and the kernel19:42
hallynso any syscalls which have vulnerabilities - and there are always some - can be exploited19:42
hallynnow,19:42
hallynthe fact is of course that similar concerns should keep you vigilant over other virtualization - kvm/vmware/etc - as well.  The video driver, for instance, may allow the guest user to break out.19:43
ClassBotkim0 asked: Can one enforce cpu/memory/network limits (cgroups?) on containers19:43
hallynyou can lock a container into one or several cpus,19:44
hallynyou can limit its memory,19:44
hallynyou can, it appears (this is new to me) throttle block io (which has been in the works for years :)19:44
hallynthe net_cls.classid has to do with some filtering based on packet labels.  I've looked at it in the past, but never seen evidence of anyone using it19:45
hallynfor documentation on cgroups, I would look at Documentation/cgroups in the kernel source19:46
hallynoh yes, and of course you can control access to devices19:46
hallynyou remove device access by writing to /cgroup/<name>/devices.deny, an entry of the form19:47
hallynmajor:minor rwm19:47
hallynwhere r=read,w=write,m=mknod19:47
hallynoh, i lied,19:47
hallynfirst is 'a' for any, 'c' for char, or 'b' for block,19:48
hallynthen major:minor, then rwm19:48
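[Editor's aside, not from the session: the whitelist entry format just described, type then major:minor then access flags, can be sanity-checked with a shell pattern; the regex below is illustrative only.]

```shell
# Hypothetical sketch: validate the form of a devices-cgroup whitelist
# entry, "<a|b|c> <major>:<minor> <rwm...>", e.g. "c 1:3 rwm" for /dev/null.
entry="c 1:3 rwm"   # char device, major 1, minor 3 (i.e. /dev/null)
if echo "$entry" | grep -Eq '^[abc] ([0-9]+|\*):([0-9]+|\*) [rwm]{1,3}$'; then
    echo "valid: $entry"
fi
```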
hallynyou can see the current settings for a container in /cgroup/<name>/devices.list19:48
hallynand allow access by writing to devices.allow19:48
ClassBotsveiss asked: is there any resource control support integrated with containers? Limiting CPU, memory/swap, etc... I'm thinking along the lines of the features provided by Solaris, if you're familiar with those19:48
hallynyou can pin a container to a cpu, and you can track its usage, but you cannot (last I knew) limit % cpu19:49
hallynoh, there is one more cgroup i've not mentioned, 'freezer', which as the name suggests lets you freeze a task.19:49
hallynso i can start up the natty1 guest and then freeze it like so19:50
hallynlxc-freeze just does 'echo "FROZEN" > /cgroup/$container/freezer.state' for me19:50
hallynlxc-thaw thaws it19:50
hallynmake that lxc-unfreeze :)19:50
hallyncan't get a console when it's frozen :)19:51
ClassBotThere are 10 minutes remaining in the current session.19:51
hallynthere are a few other lxc-* commands to help administration19:51
hallynlxc-ls lists the available containers in the first line,19:52
hallynand the active ones in the second19:52
hallynlxc-info just shows its state19:52
hallynlxc-ps shows tasks in the container, but you have to treat it just right19:52
hallynlxc-ps just does 'ps' and shows you if any tasks in your bash session are in a container :)19:53
hallynlxc-ps --name natty1 shows me the processes in container natty119:53
hallynand lxc-ps -ef shows me all tasks, prepended by the container any task is in19:53
hallynlxc-ps --name natty1 --forest is the prettiest :)19:53
hallynnow, i didn't get a chance to try this in advance so I will probably fail, but19:54
hallynhm19:54
ClassBotThere are 5 minutes remaining in the current session.19:56
hallynthere is the /lib/init/fstab which the lxcguest package will use19:56
hallynok, what i did there,19:57
hallynwas i had debootstrapped a stock image into 'demo1', i just installed lxcguest,19:57
hallynand fired it up as a container19:57
hallynonly problem is i don't know the password :)19:57
ClassBotkim0 asked: Any way to update the base natty template that gets rsync'ed to create new guests19:57
hallynsure, chroot to /var/cache/lxc/natty and apt-get update :)19:58
hallynok, thanks everyone19:58
kim0Thanks a lot .. It's been a great deep dive session19:59
kim0Next Up is OpenStack Intro session19:59
soreno/19:59
sorenkim0: How does it work? Do you copy questions from somewhere else or do I need to do that myself?20:00
sorenOr do people just ask here?20:00
kim0soren: you "/msg ClassBot !q" then !y on every question20:00
kim0soren: please join #ubuntu-classroom-chat as well20:01
sorenThis is complicated :)20:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat || Event: Ubuntu Cloud Days - Current Session: Open-Stack Introduction - Instructors: soren
sorenHello, everyone!20:01
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html following the conclusion of the session.20:01
sorenI'm Soren, I'm one of the core openstack developers.20:02
sorenOpenStack consists of two major components and a couple of smaller ones.20:02
sorenThe major ones are OpenStack Compute, codenamed nova.20:02
soren...and OpenStack Storage, codenamed Swift.20:02
sorenSwift is what drives Rackspace Cloud Files, which is a service very much like Amazon S3.20:03
sorenIt's *massively* scalable, and is used to store petabytes of data today.20:03
sorenI work on Nova, though, so that's what I'll spend most time talking about today.20:03
sorenNova is a project that started at NASA.20:03
sorenApart from sending stuff into space, NASA also does a bunch of other research things for the US government.20:04
sorenAmong them: "Look into this cloud computing thing"20:04
sorenThis is what turned into the NASA Nebula project.20:04
sorenIf you google it (I forgot to do so in advance), you'll find images of big containers that say Nebula on the side.20:05
sorenThey're building blocks for NASA's cloud.20:05
sorenAnyways, they started out running this on Eucalyptus.20:05
sorenThe same stuff that drives UEC.20:05
sorenThis got.. uh... "old" eventually, and they decided to throw it out and write their own thing.20:06
soren..so they did, and they open sourced it.20:06
sorenRackspace had plans for open sourcing their cloud platform, too, so they called NASA and said "wanna play?" (paraphrasing a little bit), and they were up for it.20:07
sorenSo Rackspace had Swift, NASA had Nova. We put it together and called it OpenStack.20:07
sorenIf you go to look at them, and they don't look like two pieces of the same puzzle, this is why. They share no ancestry, really.20:08
sorenThey now work happily together, though.20:08
* soren attempts to work that questions thing20:08
ClassBotEvilPhoenix asked: What exactly IS Open-Stack?20:09
sorenI guess that one is answered..20:09
ClassBotmedberry asked: Can you briefly differentiate openstack from eucalyptus20:09
sorenYes. Yes, I can.20:09
sorenSo, Eucalyptus corresponds to Nova.20:10
sorenThey both focus on the compute side of things, while providing a *very* simple object store. Neither try to do any sort of large scale stuff.20:10
sorenErr..20:10
sorenFor storage, I mean.20:10
sorenFor the compute part, the architectures are *very* dissimilar.20:11
sorenSo, last I looked (admittedly 1½ years ago, but I'm told this is still true), Eucalyptus is strictly hierarchical.20:11
sorenThere's one "cloud controller" at the top.20:12
sorenThere's a number of cluster controllers beneath this one cloud controller.20:12
soren...and there's a number of "node controllers" beneath the cluster controllers.20:12
sorenEucalyptus is written in Java, and uses XML and web services for all its communication.20:13
sorenIt polls from the top down.20:13
sorenNever the other way around.20:13
sorenNova uses message queues.20:14
sorenNova is written in Python.20:14
sorenWe have no specific structure that must be followed.20:14
sorenThere are a number of components: compute, network, scheduler, objectstore, api, and volume.20:14
sorenThere can be any number of each of them.20:14
sorenSo Nova itself has no single points of failure.20:15
sorenOh, Eucalyptus's cluster and node controllers are written in C, by the way. I forgot.20:15
sorenAll of Nova is Python.20:16
sorenAFAIK, Eucalyptus supports KVM and Xen.20:16
sorenWe support KVM, Xen, Hyper-V, user-mode-linux, LXC (if not now, then *very* soon), VMware vSphere..20:16
sorenEerr..20:17
sorenYeah, I think that's all.20:17
sorenWe also support a number of different storage backends (for EBS-like stuff): iSCSI, sheepdog, Ceph, AoE..20:17
sorenAnd one more, which I forget what is.20:17
sorenWe're very, very modular in this way.20:17
sorenLast I checked, Eucalyptus supported AoE. They may or may not support more now. I'm not sure.20:18
ClassBotkim0 asked: I understand openstack focuses on large scale deployments .. How suitable is it for openstack to be deployed in a small setting (5 servers?)20:18
sorenI'm glad you asked.20:18
sorenThe Ubuntu packages I made of Nova work out-of-the-box on a single machine.20:19
sorenScaling it out to 5 servers shouldn't be much work. There are some networking things that need to be set up, you need to point it at a shared database (for now; we're working towards a completely distributed data store) and a rabbitmq server.20:20
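[Editor's sketch, not from the session: a Nova flagfile for the multi-server setup soren describes might have looked roughly like this in that era; hostnames and credentials are invented, and exact flag names varied by release.]

```
# hypothetical /etc/nova/nova.conf flagfile for a small multi-host deployment
--sql_connection=mysql://nova:secret@db.example.com/nova
--rabbit_host=rabbit.example.com
--network_manager=nova.network.manager.FlatDHCPManager
```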
sorenWe're suffering a bit from our flexibility, really.20:20
sorenWe can make very few assumptions about people's setup, so there might be a number of things that need to be set up correctly (e.g. which IP to use to reach an API server (or a load balancer in front of them), which server to use for this, which server to use for that).20:21
sorenIt's pretty obvious pretty quickly, though, if something isn't pointed the right way.20:22
sorenWe're "blessed" with a team of people in Europe and in most US timezones, so if you run into trouble #openstack (irc channel) is open almost 24/7 :)20:22
ClassBotkim0 asked: Is nova deployed at rackspace in production yet ? did you guys go with xen or kvm, and why ?20:23
sorenNova is not in production at Rackspace yet, no.20:23
sorenRackspace has an existing platform with which we've not completely hit feature parity.20:23
soren...and apparently, it's not ok to make Rackspace's customers suffer because we want to run a different platform :)20:23
sorenRackspace will be using Xen Server.20:23
sorenOh, I forgot to list that as a supported hypervisor. It is.20:24
sorenThat's what they're used to, and that's what they can get support for for running Windows and stuff.20:24
ClassBotmarkoma asked: Gluster was mentioned in a previous discussion. Is swift the right way to go, or Gluster?20:24
sorenThey do very different things.20:25
sorenGluster aims to provide a POSIX compliant filesystem.20:25
sorenSwift is an object store.20:25
sorenYou address full objects. You cannot seek back and forth, replace parts of objects, etc.20:25
sorenVery much like Amazon S3.20:26
sorenGluster recently announced they want to contribute to Swift. I don't know exactly how, but something's afoot :)20:26
ClassBotjrisch asked: I think it's still unclear from the documentation, but it mentions something about a cloudpipe vm, but doesn't clarify it's role nor it's usage. Can you elaborate on that?20:26
sorenAh, yes.20:26
sorenCloudpipe is something NASA uses.20:27
sorenI don't think anyone else does, and perhaps no one ever will.20:27
sorenEach project has its own private subnet assigned.20:27
sorenTypically in the 10.0.0.0/8 range.20:27
sorenIt's not reachable from the internet.20:27
sorenCloudpipe images are images with an openvpn server in them.20:27
sorenEach project has such an instance running. They can connect to it using openvpn, and they can then reach their instances.20:28
sorenIt's not required at all.20:28
sorenI've never used it.20:28
ClassBottopper1 asked: is rabbitmq a SPOF since its clustering doesn't replicate queues?20:28
sorenIn a sense.20:29
sorenFrom Nova's point of view, it's a bit of a black box.20:29
sorenWe speak to something that speaks AMQP. We expect it to behave.20:30
sorenJust like we use an SQL database of some sort and expect it to behave.20:30
sorenRabbitMQ is way more stable than what we could have hacked up in the time it took to run "apt-get install rabbitmq-server".20:31
soren*way* more stable.20:31
sorenThere's work in progress to build a queueing service for OpenStack, but in general, we try to use existing components.20:32
ClassBotn1md4 asked: There seems to be install guides for CentOS, RHEL, and Ubuntu, is there nothing specifically for Debian?20:32
sorenNot right now, I don't think.20:32
sorenI'd be *thrilled* if a DD stepped up and put OpenStack into Debian.20:33
soren...and sorted out all the dependencies.20:33
sorenIt's silly not to, really.20:33
sorenIt's just that noone has done it yet.20:33
ClassBotmarkoma asked: do you, would you, use Ensemble to manage services for OpenStack?20:33
sorenI've no clue about what Ensemble does at the moment, so I can't really answer that.20:34
ClassBotjrisch asked: If cloudpipe isn't required, how do you set up access to the VM's, IP mappings and stuff. Do the physical node act as a pipe/NAT device?20:34
sorenI tend to use floating IPs.20:34
sorenThey're public IPs that you can dynamically assign to instances.20:34
sorenAlternatively, you can just use one of the other network managers and use a subnet that's routable.20:35
ClassBotjrisch asked: So if you speak AMQP to the message queue, could one use ActiveMQ instead? (it supports clustering as far as I know).20:35
sorenAFAIK, we don't do anything that requires RabbitMQ.20:35
sorenSo I guess ActiveMQ would work, if it speaks AMQP.20:36
ClassBottopper1 asked: Is there work afoot to create API documentation (rest api) for swift... right now it requires 'you read the python'20:36
sorenUh, there's plenty of docs.20:36
sorenHang on.20:36
sorenhttp://www.rackspace.com/cloud/cloud_hosting_products/files/api/20:37
sorenSame thing.20:37
sorenI don't know where the ones labeled "openstack" are, but it's the same thing.20:37
sorenAh, question queue is empty..20:38
sorenWhere was I? :)20:38
* soren scrolls up20:38
sorenNowhere, apparently.20:38
sorenOk, process..20:38
sorenWe do time based releases.20:38
sorenJust like Ubuntu.20:39
sorenExcept we have 3-month cycles, rather than 6-month ones.20:39
sorenWe align with Ubuntu so that every other OpenStack release should almost coincide with an Ubuntu release.20:39
sorenWe have feature freezes, beta freezes, RC freezes and final freezes just like Ubuntu.20:40
sorenThis is no coincidence :)20:40
sorenUbuntu is our reference platform.20:40
sorenI'm a core dev of Ubuntu, too, so if we have a problem with a component outside Nova, we can fix it and get it into our reference platform quite easily.20:41
sorenThis holistic view of the distribution has served us very well, I think.20:41
sorenNova can be way cool, but if there are bugs in libvirt, we're going to suffer, too, for instance.20:42
sorenOk, so say you wanted to work on something in Nova (or other parts of Openstack).20:42
sorenYou can branch the code from launchpad (which we use for everything: blueprints, bugs, code, answers) using "bzr branch lp:nova"20:43
sorenHack on it, upload a branch to launchpad, and click the "propose for merge" button.20:43
sorenWithin a couple of days someone should have looked at it and reviewed it.20:43
sorenIf it's good, it gets approved. If it's less good, we (try to) give constructive feedback so that you can fix it.20:44
sorenOnce it's good, it's approved.20:44
sorenOnce approved, a component called Tarmac takes over.20:44
sorenTarmac is run from our Jenkins instance: http://jenkins.openstack.org/20:44
sorenIt looks for approved branches on Launchpad, merges them, and runs our test suite.20:45
sorenWe have around 75% code coverage, I think.20:45
sorenFar from ideal, but it catches quite a few things.20:45
sorenIf the tests pass, your branch is merged.20:45
sorenAnd that's it.20:45
sorenIf the tests fail, your branch gets set back to "needs review" and you can go and fix it again.20:46
sorenThis is fine. It happens all the time. Don't sweat it.20:46
sorenWe're also doing some integration tests.20:46
sorenOh, one other thing:20:46
sorenWhen a patch gets merged, it triggers a package build.20:47
sorenThis means that if Launchpad doesn't have a huge backlog, less than 20 minutes after your branch has been reviewed, you can "apt-get upgrade" and get a fresh version of Nova with your patch in it.20:48
sorenSo we continuously test that our packages build.20:48
sorenI have a Jenkins instance that checks the PPA for updates.20:48
sorenIf there are updates, it installs the updates and runs a bunch of integration tests.20:48
sorenSo within... I dunno, 35 minutes or so, probably, your patch has gone through unit tests, packages builds, and integration tests.20:49
sorenI think that's pretty cool.20:49
sorenWe're working on expanding these tests.20:50
sorenSo that we test more combinations of stuff.20:50
sorenI currently test KVM with the EC2 API using iSCSI volumes on Lucid, Maverick, and Natty.20:51
sorenWe provide backported versions of stuff that is needed to run Openstack on Lucid, which we do support.20:51
ClassBotThere are 10 minutes remaining in the current session.20:51
soren...as well as Maverick and Natty.20:51
sorenWell, there's nothing backported for Natty, because we put that directly into Ubuntu.20:51
ClassBotkim0 asked: Can you talk a bit about nova's roadmap20:52
sorenSort of.20:52
sorenThere are some things on the road map already.20:52
soren...but we have a design summit coming up, where we'll be talking much more about the roadmap.20:53
sorenIt's an open event in Santa Clara in about a month, if anyone wants to come.20:53
sorenShould be fun.20:53
sorenThings that I do know on the road map already:20:53
* soren looks desperately for the list.20:54
sorenhttps://blueprints.launchpad.net/nova20:55
sorenWell, this is the list of everything.20:55
sorenCactus is the release we're working on now.20:55
sorenBexar is the previous one.20:55
sorenDiablo the next one.20:55
sorenLots of different companies work on OpenStack. They have their own priorities.20:56
sorenWhatever they want to work on, they can.20:56
ClassBotThere are 5 minutes remaining in the current session.20:56
sorenSo in that respect, it's hard to say what's going to land at any given time. It depends on what people feel like working on.20:56
sorenWe're going to split out some stuff from nova (volume and network services), though.20:57
sorenThat seems pretty certain right now.20:57
sorenAnd add support for the EC2 SOAP API.20:57
sorenPeople keep telling me no-one uses it, but... meh. I want to add it.20:57
sorenMan, I can't really remember more stuff right now :(20:58
ClassBotjrisch asked: I know that Swift is in production several places (other than Rackspace) - do you know of any companies that are using NOVA (besides NASA)...?20:58
sorenNot at the moment, no.20:58
sorenThis current dev cycle has been one focused on stability and deployability.20:58
sorenThe goal has been to get Nova to a point where people could actually use it in production.20:59
sorenI've blogged a bit about some of the stuff I've done on that.20:59
soren..but lots of others have worked on it, too.20:59
sorenI guess that's it?20:59
sorenI hope it's been useful.21:00
kim0Thanks soren21:00
kim0This has been great21:00
kim0Thanks everyone ..21:00
kim0Hope you enjoyed the sessions21:00
kim0See you tomorrow for the second day21:01
ClassBotLogs for this session will be available at http://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html21:01
=== ChanServ changed the topic of #ubuntu-classroom to: Welcome to the Ubuntu Classroom - https://wiki.ubuntu.com/Classroom || Support in #ubuntu || Upcoming Schedule: http://is.gd/8rtIi || Questions in #ubuntu-classroom-chat ||
* DigitalFlux Missed today's Cloud day :(21:09
Methshttp://irclogs.ubuntu.com/2011/03/23/%23ubuntu-classroom.html21:09
DigitalFluxMeths: Cool Thanks, may be tomorrow i can catch up21:10
=== neversfelde_ is now known as neversfelde
=== sre-su_ is now known as sre-su
=== niemeyer is now known as niemeyer_dinner

Generated by irclog2html.py 2.7 by Marius Gedminas - find it at mg.pov.lt!