[14:49] <EBC3> morning all
[15:49] <the_hydra> i am ready for the upcoming session...
[15:54] <resno> 6 more mins
[15:58] <nigelb> Folks, special announcement.
[15:58] <nigelb> Today is the_hydra's birthday!
[15:58] <nigelb> And he's celebrating that by giving a session in -classroom!
[15:58]  * the_hydra bows to everyone 
[15:59] <resno> happy birthday the_hydra
[15:59] <resno> by the way, we arent muted ;)
[15:59] <the_hydra> resno: thanks...I hope I do something meaningful
[15:59] <the_hydra> ok, is it now?
[15:59] <resno> the_hydra: me too
[16:00] <resno> its 11 am!
[16:00] <the_hydra> ok, let's begin, shall we?
[16:01] <nigelb> hang on for a minute
[16:01] <the_hydra> k...
[16:01] <nigelb> the classbot should kick in
[16:02] <resno> whats the chat channel?
[16:03] <the_hydra> "ubuntu-classroom-chat" AFAIK
[16:03] <resno> thanks
[16:03] <the_hydra> ok people, let's roll
[16:03] <the_hydra> today, I am gonna discuss about how to use monitoring tools in Linux to pinpoint system problems such as lack of RAM and so on
[16:04] <the_hydra> what we're gonna quickly observe here is top, vmstat and friends
[16:04] <the_hydra> "top" is a classic...you will likely find it in every UNIX flavour....
[16:05] <the_hydra> if you look closer, "top" display almost everything, your CPU usage, memory usage, process statistics
[16:05] <the_hydra> they're either displayed as percentage or number...mostly..
[16:06] <the_hydra> for example, CPU usage...are divided into user and system (marked as "us" and "sy")
[16:06] <the_hydra> anyone has idea what they mean?
[16:07] <c2tarun> us: processes that are controlled by user account
[16:07] <c2tarun> and don't have there system account (i guess)
[16:07] <the_hydra> other ideas?
[16:07] <obengdako> well like you said us for user and sy for system
[16:08] <the_hydra> alright, "us" means time spent in user mode
[16:08] <resno> user: processes that the user owns or runs
[16:08] <the_hydra> while "sy" is system
[16:08] <the_hydra> let's pick cp
[16:08] <the_hydra> when you copy a file with it, cp do two things actually:
[16:08] <the_hydra> 1. read the data from disk
[16:09] <the_hydra> 2. copy it to memory area, and write it somewhere
[16:09] <the_hydra> read and write to disk are example of system work
[16:09] <the_hydra> while pasting them to temporary buffer in RAM is user mode
[16:10] <the_hydra> or to make it simple, "sy" is anything where kernel is involved
[16:10] <c2tarun> the_hydra: What does it actually mean by user mode?
[16:10] <the_hydra> meaning, it could entirely done without kernel involved
[16:10] <c2tarun> ok
[16:10] <the_hydra> e.g: number operation (substract, multiplication)
[16:11] <the_hydra> so, let's bring this knowledge when observing top
[16:11] <the_hydra> high "us" means CPU is busy doing very likely number crunching
[16:11] <the_hydra> you guys got a figure what number crunching is?
[16:11] <obengdako> sorry guys but aren't we supposed to ask questions in the chat section because i just asked there and see others ask here
[16:12] <the_hydra> ok wait
[16:13] <the_hydra> Q: obengdako : i realise for cpu there is also ni id wa hi si st what do they mean?
[16:13] <the_hydra> ni is a field that tells us how much CPU time is spent to work on "niced" jobs
[16:14] <the_hydra> nice means any process with priority higher than 0
[16:14] <the_hydra> these jobs are likely to be scheduled less frequently...because CPU prioritize bigger ones
[16:15] <the_hydra> id --> idle... CPU idle a.k.a resting :)
[16:15] <the_hydra> wa--> ok this is interesting
[16:15] <the_hydra> ever wonder on event like waiting cp or rsync doing large files transfer?
[16:16] <the_hydra> although they spent time in user and system mode, but they spend more in waiting....
[16:16] <the_hydra> why wait? because the data need to be read from the disk spindle
[16:17] <the_hydra> reading from disk are thousands time slower rather than from RAM...so you got the picture
[16:17] <the_hydra> other example are waiting data from network
[16:17] <the_hydra> so next time you see high number it %wa, go check if an application is sitting in background doing I/O
[16:18] <the_hydra> Q: <obengdako> when you say higher you mean greater than zero and meaning a lower priority?
[16:18] <the_hydra> in priority, bigger number means lower priority
[16:18] <the_hydra> so +5 vs -5, -5 wins :)
[16:19] <the_hydra> next: si and hi...
[16:19] <the_hydra> they are interrupts actually
[16:20] <the_hydra> if it's quite big, let's say 50%, then very likely a device is "interrupting" your CPU too much
[16:20] <the_hydra> it could be sending data, a damaged hardware and so on
[16:21] <the_hydra> lately, thanks to DMA (Direct Memory access), CPU is less likely involved in hardware operation, so you can be more relaxed in regards to CPU utilization
[16:21] <the_hydra> Q: <obengdako> so ni will be for the higher priority stuff that have figures less than zero eg. pulseaudio at -11
[16:22] <the_hydra> nope, -11 isn't accounted in %ni
[16:22] <the_hydra> only if e.g pulse audio is marked as +5
[16:23] <the_hydra> Q: <obengdako> so would si be system interrupt and hi be hardware interrupt?
[16:23] <the_hydra> nope, si is "soft interrupt"
[16:23] <the_hydra> while "hi" is hard interrupt
[16:23] <the_hydra> "hi" for example is interrupt coming from your keyboard received by the CPU
[16:24] <the_hydra> and "si"? you can think like this: Linux kernel splits the job of taking care your hardware in two "session"...initially, in "hard interrupt" mode
[16:24] <the_hydra> which is done as fast as possible
[16:25] <the_hydra> and later, the rest is done in "soft interrupt"
[16:25] <the_hydra> so both represents the job when system is handling hardware most of the times
[16:25] <the_hydra> let me pause briefly here...
[16:26] <the_hydra> note: likely you will just need to watch %id
[16:26] <the_hydra> because, by knowing it, you could have idea how busy the CPUs are :)
[16:26] <the_hydra> for example, %id is 99%
[16:26] <the_hydra> so CPU is only busy 1%
[16:27] <the_hydra> press "1" to get per core/ per physical CPU statistic
[16:27] <the_hydra> by default what you see there is  the sum of all CPUs
[16:28] <the_hydra> Q: <obengdako> final about ni so desktopcouch-se with ni of 10 will be accounted for in %ni and not pulseaudio with -11 ?
[16:28] <the_hydra> correct obengdako
[16:28] <the_hydra> and if you suspect slowness but you think Linux is "doing nothing", check %wa
[16:29] <the_hydra> if it's bigger than 0, then.....it's doing I/O e.g reading your USB thumbdrive
[16:29] <the_hydra> Q: <himuraken-mobile> How to switch back to sum of all cpu?
[16:30] <the_hydra> simple, "1" again :D
[16:30] <the_hydra> it's a toggle :D
[16:30] <the_hydra> ok, slipped something...
[16:30] <the_hydra> see "zombie" field?
[16:30] <the_hydra> got idea what that is?
 zombies are processes that havent quit yet
[16:31] <the_hydra> correct...
[16:32] <the_hydra> so, if you see that field is bigger than zero...very likely you see bugs in the current running applications
[16:32] <the_hydra> to pin point it, you can use "ps" or "top" in batch mode and look out for process with "Z" status
[16:33] <the_hydra> such as "ps auxww | grep Z"
[16:33] <the_hydra> AFAIK to get rid of it, you have to kill the "master" process
[16:33] <the_hydra> pstree could help locate the parent of the zombie
[16:34] <the_hydra> Q: <obengdako> how do i use top in batch mode?
[16:34] <the_hydra> use -b switch
[16:34] <the_hydra> please note that batch mode will use any setting you have written to ~/.toprc
[16:35] <the_hydra> for example, you have set that you only display 20 process (by pressing "n" in top"), then in batch mode you will only see 20 process listed too
[16:36] <the_hydra> batch mode is an alternative way to continously monitor your system but in non interactive mode :)
[16:36] <the_hydra> ok, may I move to memory part?
[16:37] <the_hydra> you will likely think "great, all my RAM are used?"
[16:37] <the_hydra> "all are my app that hungry"
[16:37] <the_hydra> most likely no...check buffers and cached field
[16:38] <the_hydra> the bigger they are, meaning Linux is caching some of recently accessed files in your RAM
[16:38] <the_hydra> the purpose? to speed up the next access toward those files
[16:39] <the_hydra> Q: <resno> the_hydra: did you talk about st?
[16:39] <the_hydra> st is "steal"
[16:39] <the_hydra> what is stealed?
[16:39] <the_hydra> it's introduced in this virtualization hype era
[16:40] <the_hydra> it means some of your CPU time are accounted for doing something for the virtual machine
[16:40] <the_hydra> so if you run plenty virtual machines (be it KVM, Xen, Qemu in KVM mode etc), you will see this field increased
[16:41] <the_hydra> swap used is bigger than 0? meaning: you need more RAM... :)
[16:42] <the_hydra> seriously: swapping should be avoided actually...so if you have budget and can afford for bigger RAM, buy them..it will make your machine runs faster
[16:42] <the_hydra> these days, 2 GiB is the lowest amount of RAM you should have IMHO
[16:43] <nigelb> doh
[16:46] <ClassBot> resno asked: Is there a way to tell when the CPU is not keeping up.  As in when more RAM wont help.
[16:47] <the_hydra> while %sy and %us are nice clue, for better indication use load average field
[16:47] <the_hydra> those three number represents loads during the last 5, 10 and 15 minutes respectively
[16:47] <the_hydra> but what is "load"?
[16:48] <the_hydra> the simplest meaning is: the average number of process running per core
[16:48] <the_hydra> so if you have quad core and you see -/+ 4.0, it means per core are fairly constantly running 1 process
[16:49] <ClassBot> obengdako asked: so if my swap is never used according to top can i delete the swap partition?
[16:49] <the_hydra> better not..think swap like the emergency room
[16:50] <the_hydra> if you have no swap and suddenly your application need more "memory", you will ended with situation called OOM (out of memory)
[16:50] <the_hydra> the effect? your app will be killed....:D
[16:50] <the_hydra> must be hurry :)
[16:50] <ClassBot> There are 10 minutes remaining in the current session.
[16:51] <the_hydra> process field....you can sort them here..press "M" to sort them by memory usage
[16:51] <the_hydra> and "P
[16:51] <the_hydra> for CPU usage
[16:52] <the_hydra> so if you wanna get quick idea which one is hungry for RAM...press M...then kazaam...you know it :)
[16:53] <the_hydra> pay attention that for finer grained statistics, you need to lower the update...press "s" and make it somewhere between 3 and 5
[16:53] <the_hydra> anything higher will make top update quite slow
[16:54] <the_hydra> but if it is lower than that, you will make top itself one of top  CPU consumer :)
[16:54] <the_hydra> VSZ or RSS, which one represents memory usage?
[16:54] <the_hydra> correct answer is RSS
[16:55] <the_hydra> you can think VSZ like the final plan construction of  a building
[16:55] <the_hydra> while RSS is current progress of building construction itself
[16:55] <ClassBot> There are 5 minutes remaining in the current session.
[16:56] <the_hydra> effectively use "P" and "M" will help you to find out question like "machine feels slow, ok who's the suspect?"
[16:56] <the_hydra> again, questions?
[16:56] <ClassBot> obengdako asked: sorry but where from RSS and VSZ are they in top? or should i google?
[16:57] <the_hydra> oh sorry, I mean RES and VIRT
[16:57] <the_hydra> VIRT=virtual size (final plan construction"
[16:57] <the_hydra> RES= current construction progress
[16:58] <the_hydra> i think we got 2-3 minutes left
[16:58] <the_hydra> btw, press "f" in top...you have plenty choice of fields there
[16:59] <the_hydra> perhaps another time,I shall discuss them
[16:59] <ClassBot> obengdako asked: so is that like the maximum possible allocation for an app (virt) and res the current allocation?
[16:59] <the_hydra> not really...virt itself can expand
[17:00] <the_hydra> but compared to RSS, it is likely to change too frequently
[17:00] <the_hydra> it is not, I mean
[17:01] <ClassBot> Logs for this session will be available at http://irclogs.ubuntu.com/2011/01/27/%23ubuntu-classroom.html
[17:01] <obengdako> wow great session the_hydra
[17:02] <nigelb> Thanks for the wonderful session the_hydra and Happy birthday again!
[17:03] <the_hydra> sorry if the time feels too narrow, guys