[00:14] will trigger a new image so I can have a working emulator [03:25] something happened with the download taking way too long for the latest image it seems... I restarted them, so results should start to show up in a bit [05:12] good morning [05:18] robru: I'll test unity8 AP too.. [05:18] robru: I've gotten sometimes problematic results where another run(s) have been fluent [05:31] cihelp daily-release-executor is reported offline [05:31] so all builds halted as well [05:33] plars: there was a regression in the recovery image, which is fixed in the archive but needs a newer image [05:33] not sure if the regression would cause the device to fail in someway [07:12] Mirv: hey! around already? :) [07:14] cihelp: hey, all executors on q-jenkins are down, (so everything is blocked). Seems the reason why robru/ken couldnt publish Mir yesterday [07:14] didrocks, Mirv: on it [07:15] thanks [07:15] vila: I guess the daily-release executor is the first one to investigate why it went down [07:16] didrocks: yes, Mirv said that already, we all agree [07:21] didrocks: yeah, as written on langing plan [07:22] didrocks: robert also had some unity8 AP problem, but I couldn't reproduce it so it's probably his environment (like wrongly updated unity8 from daily-build, or click tests being run and messing up unity8 testing) [07:22] Mirv: ah, the notes were not from yesterday [07:22] Mirv: yeah, probably [07:22] Mirv: I'm not sure to understand the unity-mir rebuild? [07:23] if it wasn't rebuilt already, you can't test, right? [07:23] didrocks: the mir 0.1.1 build-dep was not in, otherwise it's ok. same for u-s-c. 
[07:23] Mirv: ah, it's just the bump build-dep MP [07:23] Mirv: well, if we release without that one, I won't die TBH ;) [07:23] as long as the things are rebuilt as expected [07:24] didrocks: ok ;) yeah the libmirserver10 dependency is there anyway, as it was built after the mir build [07:24] Mirv: ok, just don't stress on that, if you feel that we should skip the build, please do (if we are ready to publish) [07:25] ogra_: asac: did one of you requested a new image? [07:25] ogra_: asac: I see a new image and 2 android uploads in particular to the archive (the second to fix a regression it seems) [07:25] didrocks: ok. we're ready to publish, aside from that I don't see a note if unity-system-compositor was smoke tested. [07:25] so that needs to be done still [07:25] ogra_: asac: but the image which was built was between the 2 android uploads, so the "recovery regression" is in the current proposed image? [07:26] Mirv: ok, maybe you should kill in jenkins if there are some mir/platform/unity8 stacks planned to be rebuilt? [07:26] so that one q-jenkins is repaired, you are not trapped by a rebuild :) [07:26] didrocks: yes, I did that already [07:27] (except for platform it seems, now did that too) [07:27] thanks Mirv :) [07:30] waow, it's snowing here [07:31] didrocks: cool, it's just rain currently here :) [07:32] I'd welcome snow already, it's been unusually warm (which means also dark) [07:32] Mirv: well, I think it'll transform soon in rain as well :) [07:32] ah ah, for some kind of "warm" ;) [07:33] here, for us, it's freezing (like we have the temperature we do have generally in 3 weeks) [07:33] daily-release-executor is back [07:41] thanks vila [08:04] ok got to the desktop with updated mir + u-s-c too [08:04] Mirv: and still get graphics? ;) [08:04] didrocks: even that :) not much multimonitor though [08:04] Mirv: not a surprise, but not a bad surprise either :) [08:05] sweetness! 
[08:31] didrocks: the mir stack had already queued the prepare tasks before daily release executor broke earlier, so they got rebuilt. mir is a no-change rebuild and u-s-c got now the mir 0.1.1 b-d in. I'm rerunning unity8 AP still and after then asking for packaging acks to publish. [08:31] Mirv: sounds good, just do minor dogfooding [08:32] don't stress it too much in some words :) [08:32] thanks! [08:50] didrocks: ok then, http://q-jenkins.ubuntu-ci:8080/view/cu2d/view/Head/view/Mir/job/cu2d-mir-head-3.0publish/lastSuccessfulBuild/artifact/packaging_changes_mir_0.1.1+14.04.20131120-0ubuntu1.diff + http://q-jenkins.ubuntu-ci:8080/view/cu2d/view/Head/view/Mir/job/cu2d-mir-head-3.0publish/lastSuccessfulBuild/artifact/packaging_changes_unity-system-compositor_0.0.2+14.04.20131120-0ubuntu1.diff + http://q-jenkins.ubuntu-ci:8080/view/cu2d/view/Head/view [08:50] so the actual u-s-c session moved to ubuntu-desktop-mir which I already noticed [08:50] Mirv: sorry, my system is in autodestruction mode after removing the emulator, I can't use sudo, ls or any other tool like chromium, can you get someone to review it? [08:50] I can't even open a webpage [08:56] didrocks: ok, have a nice debugging session.. [08:57] well, nice with no "ls", I hope that my home dir is still intact… [09:01] didrocks: from backlog: [09:01] 09:59 < -bip> 20-11-2013 00:08:09 -!- alesage!~alesage@c-98-212-4-39.hsd1.il.comcast.net has quit [Quit: Leaving] [09:01] 09:59 < rsalveti> 00:15:21> will trigger a new image so I can have a working emulator [09:01] :/ [09:02] asac: well, and no second image with "recovery fixed" [09:02] right? [09:02] or did you find anything in the backlog? [09:02] didrocks: let me look more careful :) [09:02] asac: let's discuss in a few, my install is broken by the emulator I guess [09:03] 09:59 < plars> 03:25:24> something happened with the download taking way too long for the latest image it seems... 
I restarted them, so results should start to show up in a bit [09:03] 09:59 < rsalveti> 05:33:30> plars: there was a regression in the recovery image, which is fixed in the archive but needs a newer image [09:03] 09:59 < rsalveti> 05:33:42> not sure if the regression would cause the device to fail in someway [09:03] didrocks: next line spoken here was you pinging Mirv [09:03] so guess not [09:03] though the need for an image was identified [09:04] didrocks: so feels we want another image? whatelse happened in the meantime? [09:05] asac: yeah, I guess we want one (but again, I can't even ls right now, let me fix my box) [09:05] and post a warning if it's really due to the emulator replacing libc6 with the armhf version on your system [09:05] didrocks: ok. guess just boot in a live session and fix it from there [09:06] :/ [09:06] asac: I'm trying with busybox here [09:06] or from initrd if you are brave [09:06] yeah [09:06] but dpkg from busybox doesn't show which arch is installed :p [09:07] didrocks: guess check in /var/lib/dpkg on your own [09:07] 2013-11-20 09:48:34 status half-configured libc-bin:amd64 2.17-93ubuntu4 [09:07] from removing the emulator [09:09] * asac wonders what the emulator does that can cause this [09:09] didrocks, weird, usually app ask you to type the "yes, I'm sure I want to do that" before removing libc-bin [09:10] seb128: I think it didn't remove, it has overriden with the armhf version [09:10] seb128: I clearly didn't get "remove libc6" [09:10] half-configured probably means that things fell over during upgrade [09:10] ok, busybox didn't work from my needs, let me try to find an usb stick and fix from there [09:10] yeah [09:10] I guess there are some divert magic for the emulator [09:11] didrocks: where did you get the emulator from? 
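Editor's note: the pasted line "2013-11-20 09:48:34 status half-configured libc-bin:amd64 2.17-93ubuntu4" looks like a dpkg log status entry. Below is a minimal sketch of how such lines could be scanned to list packages left in an unfinished state, for example from a live session with the broken root mounted. The function names and the set of "settled" states are illustrative assumptions; the caller supplies the log lines, and nothing here comes from the channel.

```python
import re

# Matches dpkg log entries of the form:
#   YYYY-MM-DD HH:MM:SS status <state> <package> <version>
STATUS_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"status (?P<state>\S+) (?P<package>\S+) (?P<version>\S+)$"
)

# Assumption: these states mean dpkg finished what it was doing.
SETTLED = {"installed", "not-installed", "config-files"}


def last_states(lines):
    """Return {package: (state, version)} from the last status line per package."""
    states = {}
    for line in lines:
        match = STATUS_RE.match(line.strip())
        if match:
            states[match.group("package")] = (
                match.group("state"),
                match.group("version"),
            )
    return states


def unfinished(lines):
    """Return the packages whose most recent recorded state is not settled."""
    return {
        package: info
        for package, info in last_states(lines).items()
        if info[0] not in SETTLED
    }
```

Fed the line quoted above, `unfinished` would report `libc-bin:amd64` as half-configured, which matches the "things fell over during upgrade" reading given in the channel.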
[09:11] and something when went on removing [09:11] ack [09:11] asac: followed the wiki page [09:11] (but then, tried to remove the android-emulator package) [09:12] I hope other people don't try the emulator [09:12] well, it's not 100% confirmed, but it was the only thing I was doing at this time [09:12] didrocks, do you want me to follow up on the list to "be careful, Didier bricked his system with that"? [09:12] didrocks: you hvae a URL to the wiki? [09:12] seb128: would be nice to have someone confirming (in a vm first) [09:12] asac: well, I just have a weechat window here :p it was on the ML [09:13] kk [09:13] ok, see you (hopefully) in a few [09:13] asac, https://wiki.ubuntu.com/Touch/Emulator [09:14] you should try on amd64 to reproduce, as jibel noted, it's probably a i386<->amd64 issue [09:14] so seb128 is safe, you can try in production! :p [09:14] lol [09:15] it refers to an android-emulator package [09:15] but thats noot in the archive... which ppa is it? [09:15] asac, binaries are only available for i386 [09:15] in the archive [09:16] asac, it's in multiverse for i386 [09:16] https://launchpad.net/ubuntu/+source/android/20131120-0225-0ubuntu1 [09:16] oh its part of android [09:16] yeah i can imagine that that might cause accidential havoc [09:20] on i386 it wants to install libc6-amd64 [09:20] not sure I want to risk trying on my system :p [09:22] so the package doesnt look harmful by itself [09:23] trying in an amd64 VM, it's installing 723MB of i386 packages [09:23] seb128: it includes stuff like: ./usr/share/android/emulator/out/host/linux-x86/bin/emulator64-arm [09:23] so it somewhat makes sense that it brings in amd64 stuff [09:23] Commandline: apt-get install android-emulator [09:24] Install: android-emulator:i386 (20131120-0225-0ubuntu1), lib64gcc1:i386 (4.8.2-1ubuntu2, automatic), lib64stdc++6:i386 (4.8.2-1ubuntu2, automatic), libc6-amd64:i386 (2.17-93ubuntu4, automatic) [09:24] (that was what I had) [09:24] and it worked after that, it's 
really when I removed the package and autoremove the deps [09:24] asac, it's a bit weird that it's an i386 package and that it pulls in amd64 binaries [09:24] didrocks, I reproduced your problem [09:24] http://paste.ubuntu.com/6447226/ [09:24] and broke the VM after removing the emulator [09:24] asac, ^ [09:25] jibel, can you paste the log of what apt did when you removed the emulator? [09:25] right. that would be good to know [09:25] seb128, sure [09:25] jibel: ah, so not rebooting immediatly, would be interesting to get your VM fix rather than me being offline :) [09:25] didrocks, indeed, 1 minute [09:26] (ok, just finished getting an usb stick ready with an old trusty image) [09:26] so the emulator package has nothing in the postrm postinst scripts [09:26] so the bustage must come genuinely from apt and libc6-amd64 removal [09:26] right [09:27] my guess is that if i just instlal libc6-amf64 and remove it will also be broken [09:27] seb128, http://paste.ubuntu.com/6447229/ [09:27] -> e.g. if thats the case thats a doko bug, no? [09:27] at least, I'm looking at busybox dpkg and it doesn't have a lot of options [09:27] asac: you're probably right, indeed [09:28] seb128, asac to reproduce on an amd64 VM, dpkg --add-architecture i386 && apt-get update && apt-get install -y android-emulator && apt-get autoremove --purge android-emulator [09:28] libstdc++6:i386* [09:28] odd. [09:31] yeah, so it probably removed the libc6-amd64 files configured to be used against [09:31] and then, can't run the postrm script to configure the new links? [09:32] jibel: can you reproduce by just installing the libc-amd64 and purging that? 
[09:32] (in case you have a VM image easily available) [09:32] asac, hold on, I'm trying to recover the system first to help didrocks save his machine [09:32] I would appreciate, indeed :) [09:32] sure [09:33] that package has a postrm [09:33] if [ "$1" = remove ]; then [09:33] ARCH=${DPKG_MAINTSCRIPT_ARCH:-$(dpkg --print-architecture)} [09:33] if [ "${ARCH}" = "amd64" ] && [ "libc6-amd64" = "libc6-i386" ]; then [09:33] if [ -h /lib/ld-linux.so.2 ] && [ ! -f /lib/ld-linux.so.2 ]; then [09:33] rm /lib/ld-linux.so.2 [09:33] however, maybe first narrowing it down will help :) [09:33] greedilyu removing ld-linux [09:34] hum, I still have /lib/ld-linux.so.2 here [09:34] seb128: that only happens on amd64, no? [09:34] i think that might not be it [09:34] but the postrm failed itself [09:34] so it's not executed [09:34] (as per log) [09:34] all those postrm fail [09:34] right [09:34] seems stuff is already busted after prerm [09:34] yep [09:34] and after just removing the files [09:34] int he packages [09:35] maybe having the log from installing might show some diversions [09:35] etc. [09:35] I guess when removing libc6-amd64 files [09:36] are you 100% sure the last android build made it into the image ? [09:36] asac, trying now [09:36] * ogra_ sees that publishing and image build are very close [09:37] ogra_: was it supposed to be reflected in your diff? [09:37] no [09:37] the android package isnt inside anywhere [09:37] so the manifest generator cant pick it up [09:37] Get:1 http://ftpmaster.internal/ubuntu/ trusty/multiverse android all 20131120-0225-0ubuntu1 [348 MB] [09:37] Fetched 348 MB in 48s (7243 kB/s) [09:38] ok, seems it made it [09:38] (you can find it in the livefs build log) [09:38] http://people.canonical.com/~ubuntu-archive/livefs-build-logs/trusty/ubuntu-touch/20131120.1/livecd-20131120.1-armhf.out [09:38] ogra_: I guess you need a new image then [09:39] ? 
[09:39] ogra_: see the followup android upload at 5 UTC [09:39] 20131120-0225-0ubuntu1 is the last upload [09:39] ah, you got with that one? [09:39] see above [09:39] ogra_: can't click on links right now :p [09:40] if you follow what was discussed, system hosed ^ [09:40] asac, confirmed: apt-get install libc6-amd64 && apt-get remove libc6-amd64 [09:40] and get a broken system for free [09:40] I would have prefered to not have it at all than for free ;) [09:41] thanks for digging jibel :) [09:41] didrocks, yeah [09:42] jibel: thats on i386? [09:43] asac, amd64 with i386 enabled [09:43] which is the same cofigiration than didrocks. I can try on i386 too [09:43] kk [09:43] dont worry. guess file a bug so we can have doko etc. look at it [09:44] k [09:59] didrocks: I filed bug #1253008 about pitti's worry about u-s-c configuration file removal (or actually it's a move to another package) [09:59] bug 1253008 in Unity System Compositor "Clean configuration file on upgrade" [Undecided,New] https://launchpad.net/bugs/1253008 [10:00] Mirv: thanks [10:35] r26 looks ok on magureo [10:35] (havent done any call or sms tests, but it boots fine) [10:42] blimey, 101MB to go to 26 [10:42] did you add the kitchen sink to it? === vrruiz_ is now known as rvr [11:08] ogra_: mind updating the spreadsheet with latest infos? I'm a little bit lost as the android update is still to TODO [11:08] so not sure if the upload was that one (it seems so though) [11:12] * ogra_ can look, but i wasnt involved in any of the last three android uploads [11:12] (i didnt even know about them until i saw them on the changes ML) [11:16] ogra_: I guess the planned image # needs to change as well [11:26] ogra_: hum, ok, I think, I'll bump everyting to 27 [11:27] well, system settings is in 25 from what I see [11:27] didrocks, oh, you wanted me to bump the others ? 
[11:27] sorry, i understood i should sort the android line [11:27] ogra_: yeah, as they are not exact, but no worry :) [11:28] sorry [11:29] jibel: lool: can one of you confirm https://bugs.launchpad.net/ubuntu/+source/ubuntu-download-manager/+bug/1240656? that will get it moved and then we can kick a saucy image build [11:29] Ubuntu bug 1240656 in ubuntu-download-manager (Ubuntu) "disable debug logging by default" [Critical,In progress] [11:40] ogra_: didrocks: not sure if that's known, but with image 26, we see "failed to copy '/var/lib/jenkins/phablet-flash/imageupdates/pool/ubuntu-418162f2a9ba9a320a2b460938e3d333f84b6f35826200c2a6b6c342ba1e21ef.tar.xz' to '/cache/recovery//ubuntu-418162f2a9ba9a320a2b460938e3d333f84b6f35826200c2a6b6c342ba1e21ef.tar.xz': No space left on device" on mako [11:40] have seen that in two devices [11:41] interesting, as the broken recovery was installed in image 25? that can be it, right? [11:41] psivaa, the recovery from r25 was broken ... could be that you still havethat in place [11:41] i just did an OTA upgrade on maguro, that one worked fine [11:42] i tried on two devices, not sure if both of them had attempted r25 as well, i'll try another device [11:42] didrocks, you should probably not use IRC nicks (or make clear they are IRC nicks) in mails to ubuntu-phone :) [11:42] ogra_: good point [11:43] "infinity is looking into it" sounds a bit cryptic for non IRC users [11:43] :) [11:43] (like "eternity will solve it") [11:44] go, builders, go [11:45] waiting for amd64+i386 on https://launchpad.net/~ubuntu-unity/+archive/daily-build/+sourcepub/3671940/+listing-archive-extra ... 
[11:45] ogra_: well, it can be true :) [11:45] haha [11:46] * didrocks doesn't want to hear about amd64 and i386 in the same sentence today [11:46] oh, how rude of me [11:49] yippiiee [11:49] root@ubuntu-phablet:/# lxc-console -nandroid -t0 [11:49] Connected to tty 0 [11:49] Type to exit the console, to enter Ctrl+a itself [11:49] root@android:/ # [11:49] it works ! [11:51] cihelp, hey, unity8 otto is getting stuck due to some dependency issue http://s-jenkins.ubuntu-ci:8080/job/autopilot-testrunner-otto-trusty/817/console [11:54] I *feel* like it might be caused by https://code.launchpad.net/~kgunn72/unity-mir/bump-mir-dep-0.1.1/+merge/193521 [11:55] if unity-mir built against mir > 0.1.1 somehow [11:55] and is kept in the mbs repo [11:55] but then there's no mir > 0.1.1 in the archive yet, so I'm not sure how that could happen [11:55] Mirv, ideas ↑? [11:55] how did that even get through autolanding? [12:00] Saviq: the transition is in the daily-build ppa normally [12:00] didrocks, right... but otto isn't using it [12:01] grr [12:01] Saviq: otto for integration tests is [12:01] Saviq: aanyhow, once the amd64 PPA build of this https://launchpad.net/~ubuntu-unity/+archive/daily-build/+sourcepub/3671940/+listing-archive-extra finally starts, we're really close of having Mir 0.1.1 in archives [12:01] maybe not for upstream merger? not sure [12:03] my plan is 1. wait for amd64 build of that, 2. upgrade desktop, 3. reboot and login, 4. publish mir + platform-api + unity-mir + unity-system-compositor [12:03] Mirv, ok [12:04] we'll just have to wait then [12:04] didrocks, you mean in cu2d it does? 
[12:04] Saviq: yeah, when releasing, we are adding the ppa in the otto integration tests [12:04] didrocks, in -ci -autolanding it doesn't, as I understood it using daily is unsafe 'cause you never know if what you've built against is really going to end up in distro [12:06] Saviq: yeah, and we have issues when there are some transitions [12:06] like today [12:06] that's why we need CI Airline :p [12:06] didrocks, ok, so nobody is doing anything wrong, just that our tools are not letting do us what we really need, gotcha [12:06] we'll just wait, then [12:06] Saviq: I guess yeah === alan_g is now known as alan_g|afk [12:18] didrocks: ogra_: so the issue of no space in /cache/recovery/ on mako is not due to image r25 remnants, i flashed a device which had r23 and it still occurs [12:19] weird [12:19] psivaa: ok, if you changed the recovery as well and you're sure you don't have image 25 recovery, one less ;) [12:22] on maguro this is not happening though === alan_g|afk is now known as alan_g === MacSlow is now known as MacSlow|lunch [12:31] didrocks: the only files in /cache/recovery/ are very small log files so they can not be the reasons. [12:32] i mean before flashing with r26 [12:41] check one level up in /cache [12:42] i think /cache is the partition, recovery/ is only a dir [12:46] ogra_: 32k on /cache [12:47] used you mean ? [12:47] yes [12:47] before flashing with r26 [12:48] thats not much ... and should really not cause out of space warnings [12:49] right during the flashing on *maguro i see 1069 M in /cach [12:49] *cache [12:51] ogra_: not sure what's the capacity though on each devices [12:51] http://phablet.ubuntu.com/gitweb?p=CyanogenMod/android_build.git;a=log;h=refs/heads/phablet-trusty [12:51] i see nothing that even touches recovery ... 
strange [12:52] but as you said it could be anything that goes into /cache [12:52] now it's 1386M [12:53] aha [12:55] http://phablet.ubuntu.com/gitweb?p=CyanogenMod/android_system_core.git;a=blobdiff;f=init/init.c;h=e5f265858d0e0890deff18424ab09b174a4ce3f0;hp=462130711e331820467e2b8c427c1e2f7be9cdf2;hb=a1527ce86c90842b693941723888f7739f3bf473;hpb=7a318cec11f017c5ae9c186dae2b01f2dc4d3ef4 [12:55] psivaa, does /sbin/recovery exist ? [12:56] sil2100: I hoped you wouldn't block the stacks, that's why I asked for publishing jobs only [12:58] ogra_: the installation on maguro is finished and now /sbin/recovery is not there.. but i dont know if it would have existed during the flash [12:59] psivaa, well, it should be existent in the recovery mode only [13:00] oh, meeting ... [13:00] * ogra_ totally forgot ... gimme a sec to relocate [13:02] sil2100: coming? [13:02] hey-- I'm looking at landing no 303 and wondering if now is the time to upload [13:03] (it is marked TODO) [13:03] didrocks: I'll be a bit late, start off without meee [13:05] Mirv: you told me that too late I guess, I just redeployed unity8 though [13:05] sil2100: you didn't wait for answer :) you also started a build, so it's now waiting for the scope build [13:08] Mirv: ok, so I'm not building/doing anything until I get a greenish light from you now ;p [13:08] Mirv: since I would need to rebuild ubuntu-settings-components as well... [13:09] But I'll do that after you're done ;) [13:09] sil2100: ok :) let's wait for the current build to finish, then I'll publish, then I'll let you a sign that you can publish / do what you want [13:10] ;) [13:12] sil2100: when the build job is published, isn't it in theory ok if I cancel the check job since you need to rebuil more anyway? [13:12] s/published/finished/ [13:13] Mirv: you can cancel everything besides build ;p [13:13] Mirv: since it's not being tested anyway anywhere [13:13] sil2100: yeah. 
what I think I'll do that I'll wait for build to finish, then cancel check, then do my publishing and then I can let it run check again via foo [13:15] Mirv: ok, but I think that for my needs a check re-run won't be needed anyway [13:15] Since for me it's enough that it builds, so I can just do a direct force publish after your publishing is done [13:17] right [13:17] rsalveti, could the above git commit have any impact on that "/cache running out of space" issue when flashing ? [13:18] sil2100: and actually the check didn't run since unity8 build happens to fail [13:18] Oh [13:18] we should look into that at some point too, and file a bug as usual, there just hasn't been time for that [13:20] and we probably need to remind the other team mates too that the usual cu2d vanguarding starts again now that it's up [13:21] Mirv: quite, publish mir then! [13:21] quick* [13:21] as the stack stopped :p [13:22] didrocks: already ... done! [13:22] sooo fast, it's not an airline, it's a jet \o/ [13:22] sil2100: so I'll wait still to see with my own eyes that the packages are in LP, and ping you after that [13:24] rsalveti, oooh, i think i see whats wrong with your patch ... you actually want the mkdir for /proc and /sys outside of the "if" in http://phablet.ubuntu.com/gitweb?p=CyanogenMod/android_system_core.git;a=blobdiff;f=init/init.c;h=e5f265858d0e0890deff18424ab09b174a4ce3f0;hp=462130711e331820467e2b8c427c1e2f7be9cdf2;hb=a1527ce86c90842b693941723888f7739f3bf473;hpb=7a318cec11f017c5ae9c186dae2b01f2dc4d3ef4 [13:28] psivaa, could you boot one of the failing devices into recovery and check that ? [13:29] ogra_: the devices are in the lab and the failing devices are not reachable by adb. [13:29] hmm [13:29] i dont know any other way to connect them. [13:29] so we need someone in the lab to check them ... 
ok [13:30] ogra_: yea waiting till rfowler comes online === MacSlow|lunch is now known as MacSlow [13:36] sil2100: so "ping", feel free [13:45] ogra_: that patch is fine [13:45] ogra_: the problem was the previous one that got us into a broken recovery [13:45] ogra_: http://phablet.ubuntu.com/gitweb?p=CyanogenMod/android_system_core.git;a=commitdiff;h=aefb9653a367a6fd528f2ad48a1cce8945aecb3c;hp=87b40ee4a93bd682506fa082f2c2d1a0f5a60f04 [13:46] rsalveti, well, that still doesnt create /proc or /sys [13:46] it tries to mount them to a nonexisting mountpoint [13:46] didrocks: I did trigger another one with the fix included in it [13:46] rsalveti: hum, it seems to have never gotten anywhere (we only had one new image) [13:46] rsalveti, though i still dont have any explanation for the out of space issue [13:46] or ogra's script is broken :p [13:46] didrocks, ? [13:47] ogra_: that is ok in our default image, because they are created by lxc [13:47] there are 20 and 20.1 [13:47] 20 is the broken one [13:47] yes [13:47] 20.1 is fine [13:47] ogra_: but you did build 20.1, no? [13:47] rsalveti, in recovery ? [13:47] ogra_: in recovery it'll be created [13:47] oh, i see what you are saying [13:47] yeah [13:47] ogra_: + if (stat("/sbin/recovery", &s) == 0) { [13:47] well, still, all makos run our of space when flaching [13:48] that's the fix I did yesterday when I noticed the regression [13:48] how? 
[13:48] 12:40:27 psivaa | ogra_: didrocks: not sure if that's known, but with image 26, we see "failed to copy [13:48] | '/var/lib/jenkins/phablet-flash/imageupdates/pool/ubuntu-418162f2a9ba9a320a2b460938e3d333f84b6f35826200c2a6b6c342ba1e21ef.tar.xz' to [13:48] | '/cache/recovery//ubuntu-418162f2a9ba9a320a2b460938e3d333f84b6f35826200c2a6b6c342ba1e21ef.tar.xz': No space left on device" on mako [13:49] rsalveti: ^ [13:49] rsalveti, ogra_: didrocks: not sure if that's known, but with image 26, we see "failed to copy '/var/lib/jenkins/phablet-flash/imageupdates/pool/ubuntu-418162f2a9ba9a320a2b460938e3d333f84b6f35826200c2a6b6c342ba1e21ef.tar.xz' to '/cache/recovery//ubuntu-418162f2a9ba9a320a2b460938e3d333f84b6f35826200c2a6b6c342ba1e21ef.tar.xz': No space left on device" on mako [13:50] didrocks: why did you remove the emulator? [13:50] :P [13:50] ogra_: right, but is 26 = 20.1? [13:51] rsalveti, he did an apt-get autremove ... [13:51] rsalveti, it is [13:51] rsalveti: yeah, I know you wanted to punish people doing that! :p [13:51] I'll note this for next time "never ever remove what rsalveti installed on your system" :) [13:51] ogra_: right, not sure if that is indeed using the cache partition, need to give it a try here [13:51] rsalveti, r26 works fine on my maguro with OTA upgrade [13:51] ogra_: I know the cdimage one works fine [13:51] just flashed it [13:53] ogra_: are we flashing the lab devices with -b? [13:53] could be, not sure [13:53] doanac` should know [13:53] or sergio [13:54] the devices in the lab use - phablet-flash ubuntu-system -b --channel trusty-proposed [13:55] psivaa: I just got to my desk, looks like there's some trouble this morning? 
[13:56] rsalveti, ogra_: we are using -b for quite some time now [13:56] plars: yes, a couple but the latest impacting one is that flashing on mako fail due to no space available error on /cache/recovery [13:56] let me flash my mako to see [13:57] the cdimage one worked fine [13:57] Mir 0.1.1 now in release pocket [13:57] and time for UDS [14:04] psivaa: we seem to have a lot of devices showing up with the 0123456789ABCDEF adb id again now [14:05] plars: yes, vila mentioned about it. dont seem to be an id that's used for smoke [14:05] psivaa: it's a generic one that's used when it can't sort out the real one [14:06] psivaa: we'll probably need rfowler to reflash them [14:06] plars: ahh ok, could be that we have a bunch of mako's that failed to flash due to the no space reason and they could be it [14:07] plars: on another issue, we had adb protocol fault in kinnara in the morning [14:07] psivaa: just one? :) [14:07] psivaa: we do get it a lot less frequently I think, but it still happens from time to time [14:08] even running 'adb devices' from kinnara constantly threw this error [14:08] plars: how do you normally recover from it [14:08] psivaa: oh, that's different [14:09] psivaa: in a case like that, I guess we'd just need to restart adb server... I'm not sure I've seen that happen before [14:09] psivaa: we added quite a few more devices to that system though [14:10] plars, psivaa, this will be fixed within the next weeks [14:10] plars: yea, it turned out to be one of the old adb processes was causing the issue [14:10] plars: ogra_: mako worked fine here with system-image [14:10] very weird [14:10] plars: killing that process fixed it [14:10] rsalveti: you flashed with -b also? 
[14:11] Installed: 1.0+14.04.20131108-0ubuntu1 is the phablet tools in the lab [14:11] plars: yes [14:11] i dont think it has anything to do with phablet-tools, after all the out of space message seems to come from the device itself [14:12] plars: http://paste.ubuntu.com/6448304/ [14:12] brb [14:14] ogra_: ok [14:22] ogra_: so I think the cache related issue could be a side effect from the broken image [14:23] rsalveti, thats what i thought too, but -b pushes a new recovery [14:23] so it should boot the fixed one [14:24] (it actually boots the new recovery via fastboot) [14:25] ogra_: right, but that doesn't necessarily mean that it'll clean up the cache partition [14:26] INFO:phablet-flash:Clearing /data and /cache [14:26] h,mm, i thought we do that from ubuntu_commands or how thats called [14:26] but it should in theory clean it as part of the flashing process [14:26] so was the bad recovery actually on image 25 then? [14:26] right [14:26] plars, yes [14:28] plars: ogra_ i tried flashing from image 23 as well [14:28] and the same thing happend [14:29] psivaa, if you use -b it weill use the new recovery ... and if you were unlucky you had r25 [14:30] though looking at the above it smells more like the cleanup step failed [14:30] indeed [14:30] but you should have seen that in the log [14:31] ogra_: I don't see anywhere in phablet-tools where fastboot erase is called, could be that we need to really erase those partitions [14:31] i dont think it calls erase [14:32] plars: the clean is a shell call done in recovery [14:32] yeah [14:32] not via fastboot [14:32] so that might be failing [14:32] rsalveti: yeah, it just does rm [14:48] plars: you might want to force a clean with fastboot then and retry to see [14:49] rsalveti: yeah, I'm trying to reproduce locally. Flashing 25 right now. 
[14:55] fastboot -w created too many support issues [14:55] yeah [14:56] well, i can imagine that if there was no /dev and no /proc in r25's recovery that the rm might indeed have failed ... i just dont get why it would also fail in r26 === alex-abreu|afk is now known as alex-abreu [15:22] plars: were you able to reproduce the issue locally? [15:23] rsalveti: yes, but psivaa said he's seeing the issue going from 23->26 also [15:23] oh =\ [15:23] rsalveti: I was also able to get it to flash just fine from 25->26 after it failed once... it was in recovery and not getting the bad adbid like we see in the lab [15:23] rsalveti: and I just added -d mako [15:23] rsalveti: it all just worked... so I'm trying to figure out what's going on === alan_g is now known as alan_g|tea === alan_g|tea is now known as alan_g [15:55] psivaa, rsalveti: so I just watched it get the same error when trying to go from 26->25 [15:55] wait, maybe not [15:55] this one might have just been an adb failure [15:56] adb/mtp thing probably [15:56] right [15:57] I think you might be able to extract the logs from the broken flags if you're able to boot the device [15:57] check /cache/recovery/last_log [16:22] rsalveti: ok, after coming back up, all I have is busybox and there's no fstab or /cache. 
If I mount mmcblk0p22 by hand though, I found what seems to be a cache partition there, and here's the last_log: http://paste.ubuntu.com/6448925/ [16:22] rsalveti: this was after installing 25, and then trying to install 26 [16:23] oddly: 'Skipping missing file: version-25.tar.xz' was in the log [16:25] plars: hm, right, but the log is from 25 I believe [16:25] weird [16:26] rsalveti: yeah, so maybe we didn't get far enough for that log to happen [16:26] if you are in busybox you are not in recovery [16:26] so indeed there is no /cache [16:26] yeah [16:26] since it happened on the adb push, that makes sense [16:27] plars: let me try to flash 25 to see [16:40] plars: got the issue here [16:40] plars: so, what happens, when flashing 25 with -b it will first flash the recovery image with fastboot [16:40] plars: then it'll try to load recovery, which will fail [16:41] plars: meanwhile it waits for adb, thinking that the recovery will be up [16:41] as recovery fails to boot, the device boots the previous image instead [16:41] and phablet-tools thinks that the recovery is then up, because adb is there [16:41] rsalveti: ah, I see. so we think we're booting 25 but really we're not [16:42] rsalveti: I see [16:42] http://paste.ubuntu.com/6449025/ [16:42] sergiusens: we should be checking for /sbin/recovery when flashing [16:42] sergiusens: to check if we're indeed in recovery [16:43] rsalveti, sure, I'm already checking for something; checking for recovery specifically shouldn't be complicated [16:43] rsalveti, my wonder is, what did it boot? [16:44] plars, any indications of why it failed to boot? [16:44] sergiusens: it fails to load the recovery, then it tries to boot the previous image [16:44] but it seems that the 25->26 migration worked fine [16:44] rsalveti, yeah, but why? :-) [16:45] so the error when copying the file into /cache happens because it's not really in the recovery [16:45] rsalveti: really? 
for me, I get a broken system - just boots to busybox [16:45] sergiusens: that's the default behavior [16:45] rsalveti, how could've recovery been broken :-) [16:45] sergiusens: we had a regression [16:45] rsalveti, that's the how I'm asking :-) [16:45] ack [16:45] rsalveti: you are flashing cdimage? or ubuntu-system image? [16:45] plars, rsalveti super easy to have a better check if we are in recovery, no worries [16:46] sergiusens: http://phablet.ubuntu.com/gitweb?p=CyanogenMod/android_system_core.git;a=commitdiff;h=aefb9653a367a6fd528f2ad48a1cce8945aecb3c;hp=87b40ee4a93bd682506fa082f2c2d1a0f5a60f04 [16:46] which was fixed at http://phablet.ubuntu.com/gitweb?p=CyanogenMod/android_system_core.git;a=commitdiff;h=a1527ce86c90842b693941723888f7739f3bf473;hp=7a318cec11f017c5ae9c186dae2b01f2dc4d3ef4 [16:46] plars: ubuntu-system [16:46] ogra_: can we get a new image build? [16:46] ogra_: now that Mir entered :) [16:46] strange... for me, I'm getting exactly what we have seen on multiple devices in the lab [16:46] didrocks: sure :-) [16:47] rsalveti: thanks! [16:47] didrocks, yeah [16:47] why does everyone get to push but me? :-P [16:47] it's christmas [16:47] rsalveti, do you already ? [16:47] I saw some snow this morning ;) [16:47] didrocks: having newer images is always a good thing [16:47] didrocks: easier to test and revert broken changes as well [16:47] we should just build one every hour ! [16:47] ogra_: nops [16:47] ogra_: yeah [16:47] then i'll do [16:47] rsalveti, ogra_ you will starve the testing infra though [16:47] rsalveti: fully agreed, we just need to have a way to communicate those to you and us more easily [16:48] didrocks: why do we need the communication here? 
[16:48] sergiusens, nah, we'll have our fast emulator and all will be fine [16:48] we don't [16:48] the communication is the changelog :-) [16:48] ogra_, big LOL [16:48] :) [16:48] rsalveti: well, we do plan landings on certain image/try to communicate that [16:48] didrocks: and that's wrong [16:48] ogra_, the emulator would be mostly useful for apps, but not image testing ;-) [16:48] didrocks: you should only know when something lands, not planning when it'll land [16:48] === Image r27 building === [16:49] rsalveti: will be easier once we have the CI Airline I hope to know more on feature-based what landed [16:49] ogra_, we can certainly only run on real hw if the emulator test pass [16:49] sergiusens, well, i thought the plan was to use it in all testing infra [16:49] we don't need to know when it'll be landing [16:49] we just need to know when it *landed* [16:49] well, we want to know when big thangs are landing at the same time [16:49] to avoid clashes [16:49] but apart from that, yeah [16:49] right, only have a test on real HW if all emu tests were good [16:50] didrocks, and that ifno comes from the changelogs [16:50] right, but even if we get clashes, we want more images to test all the steps [16:50] ogra_: note on a "feature-approach", and not planning clashes [16:51] rsalveti, the point is that asac wants us to have stuff tested in full context before we upload [16:51] nobody does that [16:51] (or do you run the full AP suite for an android change you do) [16:51] ogra_: right, but that's not a blocker here [16:51] having more images is *always* a good thing [16:51] thats true [16:51] but sergiusens has a point [16:51] we'd saturate the lab [16:52] right, that's our only limitation [16:52] but we can do as most images we can depending on how long it takes to validate an image [16:52] what would that be? [16:52] at every 4 hours? 
[16:52] rsalveti: in theory we can produce images in parallel and validate them [16:52] the problem is not parallelizing the validation [16:53] but the building [16:53] its not really clear what atomic snapshot you will take when you start the job i think :) [16:53] asac, we only talk about the actual images [16:53] not about any parallel test builds etc [16:53] (though you could mitigate that by pulling the packages file out of the tool and passing itas an input) [16:53] we currenntly only build two per day at most [16:54] right [16:54] and need to speed this up [16:54] so for now we talk about the future anyway because emulator is not there etc. [16:54] having a more fine grained view in the changelogs through building more of them will make identifying issues a lot easier [16:54] right, but I think we can already improve by building at least a few images a day [16:54] however, if the emulator is fully landed, i anticipate that we can produce as manages as we want a day. [16:54] from a builder POV we could build one every hour [16:54] as we always get stuff that lands in the archive which is not part of the landing spreadsheet [16:54] just not test all on all phones [16:54] so it's *always* good to produce and test newer images [16:55] 30min for cdimage + 15min for system-image post processing [16:55] we would basically validate all iamges in emlator and then pick those that we are interested in (for instance, because we consider them a release image) and test them everywhere [16:55] we discussed hooking image productionm up with the publisher runs [16:55] so we get one image somewhere stored for each injection of what really goes in the archive [16:56] has to wait for cloud buildd which is worked on somewhere [16:56] * cjwatson forces android-emulator back in despite p-m not quite understanding multiarch [16:56] p-m? 
[16:56] proposed-migration [16:56] ah [16:57] asac: https://rt.admin.canonical.com/Ticket/Display.html?id=62272 and https://bugs.launchpad.net/bugs/1247461 [16:57] Ubuntu bug 1247461 in launchpad-buildd "Move live filesystem building into Launchpad" [High,In progress] [16:58] right, that will give us parallel builds [16:58] asac: once an image + new packages is promoted don't all previous images need to be updated and retested? [16:59] not really. all we do is snapshotting the archive after a publisher run in the form of an image [16:59] why would they ? [16:59] so whatever is in there is already finally committed [16:59] until overwritten for which a new image would exist [17:01] we create an image for archive+A and archive+B in parallel, archive+A passes all tests so now archive = archive+A, if archive+B passes all tests we don't know that archive+A+B will pass all tests and need to test that scenario, right? [17:02] asac: right, but let's try to improve our current situation [17:02] josepht: yeah you are right, but you are thinking about a different level [17:02] how long does it take to test an image currently? [17:02] plars: ^ [17:02] josepht: you are thinking about the premerge level... there indeed we need to invalidate A after we land B [17:02] rsalveti: somewhere around 3 hours I believe [17:02] hwoever, we can try to be smart and only invalidate if the changes are in the same dependency tree for instance [17:02] rsalveti: assuming everything goes right :) [17:02] plars: but then, you rekick jobs, right? [17:02] didrocks: yes [17:03] josepht: we talk about images produced after the fact from whatever landed in the real archive [17:03] so how long until we get a clean state, I guess [17:03] and you rekicked enough? :) [17:03] didrocks: if I'm up and see it needs to be done, and there's not another image already coming [17:03] asac: ah I see [17:03] right, can we build a new image at every 6 hours then? [17:03] * asac hope that makes sense [17:03] or 8? 
[17:03] didrocks: for instance, if the devices could work right now, I wouldn't bother retesting because I see there's a new image on the way, and it would just delay [17:03] rsalveti: well. so i dont want random images [17:03] i want precisely cut images [17:03] asac: why not? [17:04] asac: that's *wrong* [17:04] its not :) [17:04] not sure how that can be wrong :) [17:04] asac: we want one image per change [17:04] or the most images we can [17:04] right. [17:04] otherwise it's a pita to find regressions [17:04] one image a day is *wrong* and only causes pain [17:04] one image per chagne... but now that we can't get that (for various reasons) short term, it doesnt mean we should give up [17:04] an image isn't any more precisely cut just because somebody asked for it manually [17:04] and cut images in the middle of changes :) [17:05] there shouldn't be a "middle of changes" [17:05] asac: give me one example [17:05] rsalveti: i am supportive of more images per day. just not at cronjob kicked [17:05] exactly [17:05] cjwatson: depends on which level you define changes :) [17:05] once it's in the archive, there's no "middle of changes" [17:05] get your packaging metadata right so that proposed-migration can migrate it properly [17:05] exactly [17:05] and then roll [17:05] we should put it in cron [17:06] manual image building requests are a workaround for lazy metadata maintenance [17:06] image is just an archive snapshot [17:06] "get your packaging metadata right" ... please give up on that hope. all you can hope for is "get youir packaging meta data right most of the times please" [17:06] but even if everyone does it right most of the times, if the engineering team is big enough you will always have problems :) [17:06] I'm not going to give up on that hope, because working on that basis made, empirically, a *massive* difference to the day-to-day quality of Ubuntu. 
[17:06] cjwatson, *my* manually built images are *allways* more precisely cut, i sand and olish them :P [17:06] *polish [17:07] In fact I think it's mostly reality rather than hope. [17:07] as long we're not saturating our test lab, we should be fine to produce the most images we can [17:07] exactly... its mostly [17:07] :) [17:07] Yes, there's the odd exception, but they're, well, exceptional [17:07] We shouldn't structure processes around exceptions [17:07] +1 [17:07] and will alwayus be mostly... but 600* mostly might mean rarely :) [17:07] the point of automated builds is to simply have an as small changeset as possible ... [17:08] doing only two manual builds completely goes against that [17:08] and causes lots and lots of guesswork on our side until we found the actual issue [17:08] yeah [17:08] its human. noone is perfect. hence designing the process must work in continuous exceptions due to human failures induced [17:08] horrible [17:08] see the udev one we had a few weeks ago [17:08] which wastes more developer time than having more images per day [17:08] the package was uploaded at wednesday and we found the issue at saturday [17:08] asac, its wasting human resources [17:09] ogra_: the problem can be solved by using a cronjob or by kicking builds again at a 3 times a day schedule [17:09] with shbuman smartness [17:09] asac, how often did we get onto the wrong path in the recent times if an image was broken [17:09] wasting hours to find it was the other change that we didnt look at yet [17:09] right, even if we build it at every 12 hours, it's already better than what we have now [17:09] rsalveti, we do [17:09] i agree that we should revisit the image schedule :) [17:09] ogra_: not currently [17:10] we usually have two manual built ones per day [17:10] well, it's manual [17:10] not sure if crontab is the right approach. 
lets talk about it tomorrow on the standup [17:10] (if there isnt UDS) [17:10] i would love to see at least 3 images :) [17:10] so the timing is already 12h [17:10] ogra_: right, I want it to be automatically triggered [17:10] ogra_: as the archive is always moving [17:10] right [17:10] 4 per day should be fine [17:10] automated and all [17:10] I'm completely boggled at the insistence that things must *not* be automated, frankly [17:10] can we try to put it back to the cron and see how it goes? [17:11] It goes against all the engineering practice I know [17:11] haha, exactly [17:11] cjwatson, all but asac and didrocks in here say we want automated [17:11] ogra_: I didn't say we don't want automation, I said that we want to cut after each big change as a first approach [17:11] I understand "automation is hard and we can't quite manage it yet" (have said that myself on various occasions) - I just don't understand "the ideal is manual" [17:11] right, but why ? [17:12] for the record: noone says "we dont want automation" :) [17:12] didrocks, you land all your stuff in one chunk anyway [17:12] lets discuss it tomorrow [17:12] i will join you guys there [17:12] where ? [17:12] OK, so you're arguing against regularity but not against automation, I see [17:12] standup [17:12] I disagree but at least that isn't so boggling :-) [17:12] hehe [17:12] its also not exactly what i am saying :) [17:12] asac, ah, the landing team one [17:13] but closer [17:14] asac: why tomorrow? 
[17:14] asac: let's just put it back to cron today and see how it goes [17:14] ogra_: if you land all the stuff in one chunk, well, you have to debug it, you decided the size of the chunk :) [17:14] why would that cause us any pain now [17:14] i want to first ensure people understand what we are saying :) [17:14] ogra_: we need to be able to produce images after each "chunk" landing, but a chunk is defined by the developer itself [17:14] I dont see us having any pain from having manual image building [17:14] it gives us many good [17:15] you can still manually trigger new images [17:15] but we should *always* have it in cron [17:15] didrocks, thats not true ... your chunks are controlled in the spreadsheet [17:15] because the archive is *always* moving [17:15] not defined by a developer [17:15] ogra_: I just want to avoid A + B entered, "A says I don't care, it's B", it's not me, and B says "I don't care, it's because of A" [17:15] so ... i know for sure that we will spin more than enough images once the landing stuff is fully back operational again [17:15] all I'm asking is to add it back in cron at every 8/12 hours [17:15] ogra_: well, not really in our current model, but yeah, we would need to define priorities rather than image #, agreed [17:15] hence, i believe the issue you try to solve is the low velocity due to the 1ss move paired with a UDS going on now :) [17:16] ogra_: and I think we do align with what rsalveti told seeing it that way [17:16] and yes, we should build at least 2 images a day even if not much is going on [17:16] didrocks, so we can switch cron back on and set it to say 6h ? [17:16] I just want to make sure we always have a fallback in case we don't have humans to trigger a new image [17:16] (to have 4 images / day) [17:17] well. 
as i said befeore the capacity on testing is really bound to emulator :) [17:17] ogra_: well, for me, it would be a +1 but rather 8 hours TBH, to let the CI team to rekick the tests jobs and giving some slack for that). 3 images a day will be a start [17:17] asac, we should have as many as the infra allows imho [17:17] so ... dont hurry [17:17] ogra_: yeah [17:17] ogra_: but I don't think I'm the one able to take the decision :) [17:17] ogra_: I'd first try with 8h [17:17] didrocks, why do you care if it is 4 or 8 ? [17:18] ogra_: and we can improve that as we go [17:18] ogra_: because then, the testing infra can't rekick the jobs [17:18] if your team only lands something every second image thats fine [17:18] i care because i want to ensur3e the CI folks that support those images whenthey hit validation [17:18] if a new image is produced [17:18] know what to do and how to prioritize [17:18] ogra_: they can't "restest an older image" [17:18] asac, but the CI team knows when cron runs [17:18] anyway [17:18] ogra_: it's not only that, most of their time is just "relaunch the jobs" [17:18] because the whole thing is flacky [17:18] not sure why is it suddenly such a big issue todya? [17:18] that's why let's try with either 8h or 12h [17:18] and as the one who does the manual builds i must say that the requirement for manual builds is largely always at the same time [17:19] asac: and so you need to reboot the device [17:19] rsalveti: +1 [17:19] so why not just have images inbetween too ? 
[17:19] rsalveti, i dont get it [17:19] ogra_: because you don't let the CI team to get test results [17:19] ogra_: we can, I just want to make sure we're not saturating our test lab [17:19] why does having 2 images help more than 4 [17:19] if you kick a new image [17:19] because we give them more time to rekick the jobs [17:20] when you finished building a new image [17:20] they can't retest an older one [17:20] didrocks, but if they dont make this image they will make the one in 4h [17:20] i dont get it [17:20] yeah [17:20] if noone can tell me why this is so much more important today than yesterday, lets really talk tomorrow in the standup. [17:20] :) [17:20] there is nothing that puts time pressure on anyone [17:20] ogra_: but then, there is the same issues, they need to rekick [17:20] that's why as long we're not saturating the lab, we're fine [17:20] ogra_: and you just rebuild the next next image [17:20] and the story repeats :) [17:21] just make sure you land your stuff inbetween two builds if you want to make sure all of it lands at the same time [17:21] asac: why can't we just add it there and see what happens [17:21] rsalveti: right now until we have the emuilator fully rolled out its about saturating support folks [17:21] and the schedule is known [17:21] didrocks, why do they need to rekick [17:21] or even care for the images [17:21] ogra_: repeating "because it's flacky" [17:21] rsalveti: understand taht i have to figure how to deal with that with CI process [17:21] asac: sure, but all I'm asking is more images than we have today, and that's for sure something we can improve even without the emulator [17:21] they just need to check the results for the image their change landed in [17:21] ogra_: see a vanilla test run [17:21] no matter when they landed their stuff [17:21] before plars or psivaa starts looking at it [17:22] you have a lot of tests not running [17:22] rsalveti: not without undersanding and clarifyuing with my team what they should focus 
on testing etc. [17:22] or with just 1 test [17:22] rsalveti: on technology side its easy [17:22] (and beeing representated as "green" btw :/) [17:22] ogra_: so, they restart the jenkins jobs as of now [17:22] if you guys feel so strongly about more images [17:22] right, but you don't get that we're also wasting resources by not triggering more images [17:22] lets really check how we can get that [17:22] because it's really a pain to find and revert regressions [17:23] i am not sayuing we shouldnt have more images [17:23] if we get tons of changes at every image we produce [17:23] just lets discuss how to do that brest [17:23] (btw, while we discuss, we still can't test an image I guess and powerpc is broken in distro ;)) [17:23] asac, it wastes immense amount of manpower to just make the CI team happy [17:23] while they would just have to obey to a cronned build schedule [17:24] finding an issue in an image that has 50 packages changed is lots and lots harder than finding it in one that only has 10 [17:25] +100 [17:25] it's just stupid to fight automation [17:25] right [17:25] as the archive is always moving [17:25] asac, i worked with this system for 4 months now ... and thats the result of my experience [17:26] we waste manpower and that doesnt need to happen [17:27] i am the first who wants more images. if the landing team would be operational they would at least spin 2 images a day [17:27] it should be closer to 3 [17:27] and its a trivial change ... just a crontab entry [17:27] and as we add more "smart automation" we woulc increase the pace [17:27] but why do we have to make it depend on the landing team [17:27] the landing team can land stuff even with automated builds [17:27] yeah, that's the wrong, as the landing team doesn't have the power to freeze the archive :-) [17:28] *that's wrong [17:28] its not like it is hard, the schedule will be known so you know when your stuff gets into the next image [17:28] didn't make at one image? 
wait for the next [17:28] right [17:29] just don't make everyone to wait on your call [17:29] cihelp: publisher on d-jenkins fails to create new jobs on public instance [17:29] about triggering a new image [17:29] example http://d-jenkins.ubuntu-ci:8080/job/trusty-adt-python-misaka/1/? [17:29] jibel, looking [17:32] retoaded, it is only the creation of a new jobs, publication of new builds work fine. [17:34] jibel, it wouldn't happen to be a matrix job would it? [17:35] Not sure who to blame, but having DNS on the CI VPN is pretty cool. Thanks! [17:37] retoaded, it is [17:37] jibel, http://10.98.3.6:8080/plugin/build-publisher/instance/0/publisherThread/currentState/output makes me think that it could be related to the just updated matrix-reloaded plugin. [17:37] jibel, i will probably have to revert it back. [17:40] retoaded, I'm not sure, I see no reference to the plugin in the trace. But if it's a recent change revert it back and we'll see [17:43] jibel, it near the bottom: Caused by: java.lang.NullPointerException at hudson.matrix.MatrixProject.onLoad(MatrixProject.java:474) [17:43] jibel, and as soon as the currently running jobs complete I'll revert [17:49] retoaded, it is a reference to a Matrix object from jenkins core, not the plugin. [17:51] === Image r27 DONE === [17:51] * popey updates [17:51] (totally forgot that over the abive discussion) [17:51] *above [17:53] retoaded, it is not a new problem apparently [17:53] retoaded, trusty-adt-simplejson failed to publish too [17:56] retoaded, oh, and it vanished from d-jenkins [17:56] WTF [17:56] ogra_, can I update my phone then? :P [17:57] retoaded, so on the filesystem there is a /var/lib/jenkins/jobs/trusty-adt-simplejson but not on the UI [17:57] Ursinha, well, my maguro just updated fine here [17:57] retoaded, and publication fails with the same error 500 [17:58] bah, since when is facebook on the home lens [17:58] ogra_, are you using -proposed? read-only? 
[17:58] yes [17:58] ogra_, at least since yesterday when I flashed mine with the current devel [17:58] well, i do OTA updates [17:59] Ursinha, heh, that was ricardos fault ... r25 was broken [17:59] ogra_, it rarely is rsalveti's fault, hehe [17:59] haha [17:59] never ever :) [17:59] rarely ;) [18:03] lol [18:41] ogra_, so I heard you test all the images :P I have a nexus 4 and whenever I put some music on and change to another app the music stops playing after a few seconds, is that a known issue? [18:42] it shouldn't [18:42] Ursinha, i test only on maguro ... [18:42] I just updated with the latest -proposed image [18:42] what image are you running Ursinha ? [18:42] popey, is the makoman [18:42] Ursinha: are you using the music app or something else? [18:43] popey, Home lens, click on music and then play === fginther changed the topic of #ubuntu-ci-eng to: Ubuntu CI Engineering Team | Vanguard: fginther | Landing instructions: http://paste.ubuntu.com/6292280/ | Known issues: Some services are down (1SS move), network slowness [18:43] about this phone says r27 [18:43] my phone is dead.. one moment [18:45] works fine here [18:45] popey, music app has the same behavior, but the interesting thing is that the music stops playing but when I return to the music app it resumes playing as if it never stopped [18:45] there's no sound but it seems the app was playing it [18:45] i am listening to music which is playing in the music app but I'm on the home screen [18:45] it even plays when locked [18:46] and continies to play after a song finishes [18:46] popey, what should I do then? [18:46] is it launching the music app ? 
[18:47] popey, yes, it launches and I'm able to listen to music, the only problem is when I change screens it stops after a few seconds [18:47] http://popey.com/~alan/phablet/device-2013-11-20-184641.png [18:47] like that [18:47] it's playing as it should (I think), displaying the cover and band/song name [18:47] was this a clean install or an update to an existing install? [18:48] popey, it was a clean install yesterday, with the current devel image [18:48] odd [18:48] I switched to -proposed today and the problem is the same [18:48] I dont understand why it's working for me and not you [18:48] hehe [18:49] popey, btw it keeps playing when screen is locked [18:50] popey, I'll try a clean install and let you know [18:51] http://people.canonical.com/~cjwatson/tmp/emulator-yay.png [18:51] didn't *quite* manage to get it up on the hangout just now ... [18:51] :) [18:52] thats quite a wide screen [18:52] dual monitor [18:52] * ogra_ has triple ... but screenshots always only show one screen [18:52] import(1) doesn't really understand that and just gives me the whole thing. I should probably use the screenshot thing in the desktop but old habits die very hard [18:52] might be nvidias fault [18:53] ah [18:53] yeah, i only use alt+print [19:09] cyphermox, kenvandine, robru, we need to restart q-jenkins as soon as possible [19:10] fginther, fine by me... [19:10] fginther, go for it [19:10] yeah, fine [19:10] there is a build in progress (for 11 hours) is it safe to kill ti? [19:10] it [19:11] fginther, yeah, whatever. we can restart it if necessary. 
i'm not waiting on anything myself [19:11] fginther, also 11 hours usually indicates a problem anyway [19:11] robru, I kinda figured that :-) [19:11] robru, kenvandine cyphermox thanks for the input === Ursinha is now known as Ursinha-afk === Ursinha-afk is now known as Ursinha [20:19] Ursinha: background music working fine with image 27, let me help you debugging the issue [20:19] rsalveti, okay, thanks [20:47] cyphermox, is it difficult to re-deploy all of the cu2d jobs? [20:48] well, i'd love to avoid it if possible, why? [20:49] cyphermox, the jenkins root moved from /var/lib/jenkins to /iSCSI/jenkins. and some of the scripts had the old path hard coded [20:49] cyphermox, I have MPs to update the scripts and the job. I just want a plan for how best to do this [20:50] would prefer to actually do the update tomorrow when more help is available [20:58] are you saying you would prefer or are you asking my opinion? [21:00] cyphermox, I just want some idea of what is involved? does it require lots of downtime, do we need to make special arrangments to make it happen, etc.? [21:03] not really, I think we just need to update the paths in lp:cupstream2distro and lp:cupstream2distro-config (if any there), redeploy all the jobs, etc, make sure all the necessary machines have their copies of those branches updated [21:03] it would take an hour of work maybe? [21:03] cyphermox, good. I do have MPs here: [21:03] but I could see it being done much more cleanly if we use a symlink as a temporary step to avoid breaking things [21:03] https://code.launchpad.net/~fginther/cupstream2distro/jenkins-home/+merge/195848 [21:03] https://code.launchpad.net/~fginther/cupstream2distro-config/update-stack_status/+merge/195956 [21:05] cyphermox, things are already broken :-( [21:06] how so? [21:06] the check jobs are looking for job data under /var/lib/jenkins. 
it's not there [21:06] right [21:06] that's partly why i mentioned a symlink [21:07] if not /var/lib/jenkins -> /iSCSI/jenkins then one for just cu2d [21:07] that allows a smooth transition, but not meant to stay in place [21:07] cyphermox, hmm, ok [21:07] so, both branches look fine [21:08] thanks, I'll email out a plan [21:10] fwiw I see other places in cupstream2distro-config that need a path change [21:47] cyphermox, ack! can you add them to the bug: https://bugs.launchpad.net/cupstream2distro/+bug/1252750 [21:47] Ubuntu bug 1252750 in Canonical Upstream To Distro "latest_autopilot_results hardcodes a jenkins root directory" [Undecided,New] [21:48] do you have an equivalent bug for cupstream2distro-config? [21:48] cyphermox, no [21:50] fginther, no CI after 5 hours? can you look please? https://code.launchpad.net/~robru/friends-app/autopilot-py3/+merge/195979 thanks [21:52] robru, all the phones are blocked due to an image flash problem [21:52] robru, it is being worked [21:53] fginther, oh ok thanks. just really curious about what happens with that branch ;-) [21:53] no worries [23:01] fginther, afraid makos started to get stuck in flashing, too: http://s-jenkins.ubuntu-ci:8080/computer/mako-04cb53b598546534/? http://s-jenkins.ubuntu-ci:8080/computer/mako-04cbcc545f5328a5/? [23:04] Saviq, thanks for the ping. Someone is working on it right now, but they have to manually reflash each device this time [23:06] Saviq, all three of them were dead a couple hours ago [23:06] curious: what happened to them? [23:06] fginther, ah [23:06] asac, AIUI there was a problem with the image today, plars do you know more? [23:08] ok thanks. 
nevermind then [23:09] we're also still being hit by https://bugs.launchpad.net/phablet-tools/+bug/1249162 [23:09] Ubuntu bug 1249162 in android-tools (Ubuntu) "Devices lose adb connection after phablet-flash loop" [Undecided,New] [23:10] * asac wonders what a phablet-flash loop is :) [23:11] ok i think i see what it tries to describe [23:13] the phablet-flash loop is a way to reproduce the problem we see on our devices [23:13] check my comment please === fginther changed the topic of #ubuntu-ci-eng to: Ubuntu CI Engineering Team | Vanguard: cihelp | Landing instructions: http://paste.ubuntu.com/6292280/ | Known issues: Some services are down (1SS move), network slowness [23:47] asac, fginther Saviq recovery was broken, that was the problem [23:48] I'm adding a safeguard to phablet-flash to avoid flashing if recovery is broken
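Editor's sketch of the safeguard discussed above (the 16:40-16:46 analysis and the 23:48 note): an adb connection alone does not prove the device is in recovery, because a device whose recovery fails to load boots the previous image and still answers adb. The idea from the log is to look for /sbin/recovery before pushing anything to /cache. This is not the actual phablet-flash change. The function names, the marker strings, and the assumption that the device shell supports `[ -e ... ]` are all illustrative.

```python
import subprocess

# Path named in the discussion as the thing to check for.
RECOVERY_BINARY = "/sbin/recovery"


def adb_shell(command, serial=None):
    """Run one command through `adb shell` and return its stdout (hypothetical helper)."""
    argv = ["adb"]
    if serial:
        argv += ["-s", serial]
    argv += ["shell", command]
    result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return result.stdout


def in_recovery(run_shell=adb_shell):
    """Return True only if the device reports that the recovery binary exists.

    A marker string is echoed instead of relying on the exit status,
    on the assumption that the adb in use may not propagate it.
    """
    probe = "[ -e {0} ] && echo recovery-present || echo recovery-missing".format(
        RECOVERY_BINARY
    )
    return run_shell(probe).strip() == "recovery-present"


def guarded_push(push, run_shell=adb_shell):
    """Call push() only when the device really is in recovery."""
    if not in_recovery(run_shell):
        raise RuntimeError(
            "device answers adb but is not in recovery; refusing to push update files"
        )
    push()
```

The shell runner is passed in as a parameter so the check can be exercised without a device attached; in real use `guarded_push` would wrap whatever step copies the update files to /cache.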