[03:54]  * kiall really hopes I haven't found another eucalyptus bug ;) Metadata sent in via EC2 PHP SDK ends up as gibberish :/
[04:24] <kiall> <-- feels stupid ... needed to base64_encode it ;)
[04:25] <kiall> (But - at least EC2 told me .. unlike euca :P)
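For reference, the fix kiall lands on above (base64-encoding the metadata before handing it to the API) can be sketched as follows. This is Python rather than the PHP SDK used in the chat, and `encode_user_data` is a hypothetical helper name, not part of any SDK.

```python
import base64


def encode_user_data(user_data: str) -> str:
    """Return the user data as base64 text (hypothetical helper, not an SDK call)."""
    return base64.b64encode(user_data.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    script = "#!/bin/sh\necho hello\n"
    encoded = encode_user_data(script)
    print(encoded)
    # Round trip to confirm the payload survives encoding unchanged.
    assert base64.b64decode(encoded).decode("utf-8") == script
```

Whether a given SDK encodes the payload for you varies, so it is worth checking before encoding twice.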
[06:10] <kiall> grr - UEC elastic IP "stuck" on an instance that no longer exists .. can't disassociate it :/
[07:04] <Makere> how do I remove the store-images from cloud and disk?
[11:13] <TritoLux_> Hello there, I upgraded my cloud from 10.04 LTS to 10.10. The nodes went fine, but the CLC is now extremely slow with high CPU usage, even when eucalyptus is shut down. It seems as if the kernel is not doing well. Has anybody here experienced the same, or is anyone willing to point me in the right direction to solve this annoying problem, please?
[11:15] <TritoLux_> If I run top, then all processes seem at 0%, but cpu usage is always between 10% and 50% anyway. The CLI is quite unresponsive and I cannot even troubleshoot well.
[11:18] <TritoLux_> ksoftirqd processes often reach 20%, it seems that the system is struggling
[11:22] <TritoLux_> the ubuntu forums are down at the moment, so there's not much room for investigation out there. any help here would be appreciated
[11:25] <kiall> TritoLux_, I haven't updated from 10.04, but my 10.10 install is pretty responsive (unless I do loads of euca-get-console-output's in a row .. where euca-cloud will chew 100%*Num of cores for a few mins!)...  Can't be of any real help tho .. sorry ;)
[11:27] <TritoLux_> thanks kiall, my problem persists even if eucalyptus is not running actually
[11:27] <kiall> Wait - the CLC is slow even when all euca services are stopped? It can't be UEC related then!
[11:27] <kiall> dooh .. beat me to it ;)
[11:27] <TritoLux_> I have no idea what is going wrong
[11:28] <TritoLux_> yes kiall, I just found out that it is UEC unrelated
[11:28] <TritoLux_> it took me ages to shut down eucalyptus though
[11:29] <kiall> I presume you didn't do any HW changes at the same time? and you've shut down everything you can in case it's another service?
[11:30] <kiall> Beyond that .. it kinda has to be a kernel issue once you have everything off, and know the hardware's good..
[11:30] <TritoLux_> I'm still trying to understand what other service it could be, but apart from ksoftirqd all other processes seem to be sleeping
[11:30] <TritoLux_> the HW is still the same, I performed the upgrade over ssh
[11:31] <TritoLux_> the server is quite fast and it was running well with 10.04
[11:32] <kiall> I wonder if you had any 3rd party hardware drivers installed (RAID cards etc) that may have been replaced by an open source alternative now?
[11:33] <TritoLux_> I had not installed any driver manually to be honest, I have an Areca controller that was automatically recognized flawlessly with 10.04
[11:33] <TritoLux_> and since the system starts, I guess that it was recognized on 10.10 as well
[11:34] <TritoLux_> hey wait a sec..
[11:35] <TritoLux_> it is apparently eucalyptus related
[11:35] <kiall> Humm? I thought euca was off?
[11:35] <TritoLux_> it just took more than 30 mins to shutdown the cloud
[11:35] <TritoLux_> but now the system is responsive again
[11:35] <TritoLux_> 99.9% idle
[11:35] <kiall> Oh - wow. that can't be right
[11:35] <kiall> anything in the logs between 30mins ago and now?
[11:36] <TritoLux_> I'll try to see the logs now.. as I finally can access them
[11:41] <TritoLux_> I received a mail from the cloud saying: [FATAL] There is nothing to do here, since there are no nodes with any plugins.  Please refer to http://munin-monitoring.org/wiki/FAQ_no_graphs at /usr/share/munin/munin-html line 38
[11:42] <kiall> I was wondering why munin was a dep ;) .. but that *probably* shouldn't cause any major issues... nothing else?
[11:42] <TritoLux_> I'm checking
[11:43] <kiall> `tail -n 50 /var/log/eucalyptus/* | less` and done ;)
[11:44] <TritoLux_> I have many java cloud errors
[11:45] <kiall> Not sure .. I don't get many errors showing up in mine ..
[11:46] <TritoLux_> ERROR [TxHandle:New I/O server worker #1-7] javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
[11:46] <TritoLux_> javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Cannot open connection
[11:46] <kiall> Can't connect to the database .. something must have mucked up during the upgrade
[11:47] <TritoLux_> to be honest, at the end of the CLC-CC-SC-Walrus machine upgrade, I received some weird errors, such as:
[11:47] <TritoLux_> File descriptor 42 (/dev/pts/0) leaked on lvs invocation. Parent PID 3754: /bin/sh
[11:47] <TritoLux_>   /dev/etherd/e0.4p1: read failed after 0 of 4096 at 0: Input/output error
[11:47] <TritoLux_>   /dev/etherd/e0.5p1: read failed after 0 of 2048 at 0: Input/output error
[11:47] <TritoLux_>   /dev/etherd/e0.6p1: read failed after 0 of 4096 at 0: Input/output error
[11:47] <TritoLux_>   /dev/etherd/e0.6p2: open failed: No such device
[11:48] <kiall> looks like it was shutting down some EBS volume exports, or trying to?
[11:48] <TritoLux_> those are the only ones I saw, apart from a non fatal dhcpd3 failure
[11:49] <TritoLux_> the reason I wanted to upgrade to 10.10 is actually because I suspected some AoE issues
[11:50] <kiall> humm ... anyway .. i've gotta run! good luck
[11:50] <TritoLux_> thanks kiall
[11:50] <kiall> (btw - I'd be 99% sure that failing to connect to the DB is your issue ...)
[11:51] <kiall> (assuming that log repeats a few times a min ...)
[11:51] <TritoLux_> yes, it's quite often
[11:51] <TritoLux_> what should I look for in order to troubleshoot this DB issue?
[11:53] <kiall> Honestly - Not sure .. still at the trial stage here :)
[11:53] <kiall> I'd check /var/lib/eucalyptus/db/ and find out how hsql or whatever it's called works!
[12:00] <TritoLux_> ok thanks
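kiall's rule of thumb above (the DB error matters if it repeats a few times a minute) can be checked mechanically. A minimal sketch in Python, assuming each log line starts with an `HH:MM:SS` timestamp; that line format, the helper name `errors_per_minute`, and the sample lines are assumptions, not something confirmed in the chat, so adjust the pattern to whatever the files under /var/log/eucalyptus/ actually contain.

```python
import re
from collections import Counter

# ASSUMPTION: every log line begins with "HH:MM:SS". Adjust to the real format.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}):\d{2}\b")


def errors_per_minute(lines, needle="Cannot open connection"):
    """Count lines containing `needle`, grouped by their HH:MM prefix."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without a recognisable timestamp are skipped
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines; only the error text is taken from the chat.
    sample = [
        "11:46:01 ERROR GenericJDBCException: Cannot open connection",
        "11:46:30 ERROR GenericJDBCException: Cannot open connection",
        "11:47:10 ERROR GenericJDBCException: Cannot open connection",
    ]
    for minute, count in sorted(errors_per_minute(sample).items()):
        print(minute, count)
```

To run it against a real log, pass an open file object instead of the sample list, since iterating a file yields its lines.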
[17:47] <TritoLux_> kiall, dunno if you are still there.. about the slow response issue after my maverick upgrade, the funny part is that if I reboot the server I have a slow system, but if I manage to stop eucalyptus and then start it manually, the system responds fine. As if there is some conflict during startup.
[17:48] <TritoLux_> those db messages were only reported during the shut down process, which took ages, so the system was still looking for the db I guess
[17:50] <TritoLux_> why eucalyptus is slow at startup and why it takes ages to stop.. that I dunno yet
[17:50] <TritoLux_> but it works fine if I manually start it again, really weird
[18:02] <TritoLux_> did anybody else experience the above slow response at startup?