[06:12] <fabbione> morning
[06:14] <fabbione> braddr: i can't access the ALOM on your box
[06:14] <fabbione> braddr: did you unplug it?
[06:32] <braddr> crap.. damage from my nat box croaking
[06:33] <braddr> give me 10 minutes, I'll move it to a public ip address
[06:33] <fabbione> take the time you need :)
[06:33] <fabbione> even 20
[06:35] <fabbione> oh i see
[06:35] <fabbione> ok please don't rush
[06:35] <fabbione> both davem and I can reproduce the problem now
[06:36] <fabbione> and he is working on a fix locally
[06:36] <fabbione> your box was to do the last test once we have the fix
[06:36] <fabbione> it's not required right now
[06:36] <braddr> excellent, I guess.  Won't take me but a few minutes to reconfig the network info, I've got spare ip addresses
[06:41] <braddr> ok.. reconfigured, I updated the host entry for it, so you can still get to it via the t2000-sc hostname
[06:41] <fabbione> thanks
[06:41] <braddr> no problem
[06:41] <braddr> that just leaves my tivo off the air. :)
[06:42] <fabbione> :)
[06:42] <fabbione> it works.. thanks
[06:43] <braddr> heh.. come to think of it, I could have just told you to use minicom on bellevue directly since either way it's all the same speed.
[06:43] <fabbione> ehhe
[06:43] <fabbione> true that
[06:46] <braddr> any idea what changed that caused your respective boxes to start exhibiting the problem?
[06:46] <fabbione> davem removed 8GB of RAM from his box to go down to 8GB
[06:46] <fabbione> mine started exhibiting the problem with one image that i was using to fix your
[06:46] <fabbione> s/fix/test
[06:47] <fabbione> so basically it was just a matter of changing something else
[06:47] <braddr> interesting
[06:47] <fabbione> probably mine didn't show it immediately because it has more hw inside
[06:47] <fabbione> like 2 PCI-E controllers
[06:47] <fabbione> who knows..
[06:48] <fabbione> davem just disappeared testing...
[06:48] <fabbione> i am sure he will find a fix
[06:50] <braddr> for what it's worth.. still nothing from sun.
[06:51] <braddr> I've been avoiding asking them what's up.. I'm over 2 weeks overdue now. :)
[06:51] <fabbione> yeah
[06:51] <fabbione> don't ask :)
[06:51] <fabbione> i should return IBM hw as well
[06:52] <braddr> been a while since I got to play with any interesting ibm hardware.
[06:52] <braddr> their SP frames are kinda interesting.
[06:52] <braddr> huge though
[06:54] <fabbione> eheh
[06:54] <fabbione> nah i got a small machine from them to do some research
[06:54] <fabbione> nothing fancy
[06:55] <fabbione> but i should have returned it last week
[07:28] <braddr> 2 boots in a row..
[07:29] <fabbione> both with the wrong image
[07:29] <fabbione> and there was still corruption
[07:29] <fabbione> Unknown localized field:
[07:29] <fabbione> Description-mk.UTF-8:
[07:30] <fabbione> that can't happen
[07:30] <braddr> I wasn't sure if that was the kernel or the installer
[07:30] <fabbione> it's memory corruption ;)
[07:32] <fabbione> -13 with a hack seems good
[07:59] <fabbione> nevermind
[08:24] <braddr> no corrupted string that time I see
[08:24] <braddr> are hi5's in order?
[08:24] <fabbione> this one looks good yeah
[08:24] <fabbione> not yet
[08:24] <fabbione> this is still a hack
[08:24] <braddr> well, one for progress regardless
[08:25] <braddr> btw, I never did check to see what was on disk1 to make sure it was ok to reformat/reinstall
[08:25] <fabbione> no need to
[08:25] <fabbione> since we can reproduce it locally, we can scratch here
[08:26] <braddr> no dhcp server right now -- soon it'll be back.
[08:27] <fabbione> no problem
[08:27] <braddr> 209.189.198.125/255.255.255.224  gw .97
[08:27] <fabbione> i don't need it :)
[08:28] <braddr> didn't know how far into the install you were gonna go
[08:29] <fabbione> :)
[08:29] <fabbione> i did check enough to say that the initrd was not corrupted
[08:34] <fabbione> looks good
[08:34] <fabbione> the image i mean
[08:35] <fabbione> i have seen that before
[08:35] <fabbione> we have the aoe support in the kernel
[08:35] <fabbione> it's just another block device over ethernet
[08:35] <braddr> yupp
[08:35] <fabbione> i used it a lot to do cluster testing ;)
[08:35] <fabbione> before i got a real SAN
[08:36] <braddr> yeah yeah.. lucky you. :)
[08:36] <braddr> feel free to ship me your excess toys.
[08:36] <fabbione> i don't use it 24/7
[08:36] <fabbione> it's too expensive to run at home
[08:36] <fabbione> and very very warm
[08:36] <fabbione> you really need a/c for that
[08:37] <ajmitch> sounds perfect for me at the moment
[08:39] <fabbione> braddr:  so -13 has a hack with a fake page_size of 128 * 1024 that doesn't work
[08:39] <fabbione> -14 has 256*1024 and it works
[08:39] <fabbione> but it's still a hack
[08:39] <fabbione> now time to produce a final fix
[08:39] <braddr> and -11 and -12?  last one in my notes was -10
[08:39] <fabbione> ajmitch: did you ever get access to your Niagara boxes?
[08:40] <fabbione> braddr: oh hell.. wait.. let me remember...
[08:40] <fabbione> -11 was a broken patch
[08:40] <braddr> not really that important, but if you remember I can shove it in the log
[08:40] <fabbione> -12 the same patch using the proper PAGE_SIZE (that clearly doesn't work)
[08:42] <ajmitch> not proper access at the moment
[08:42] <braddr> boots.txt updated to record those notes, but didn't bother capturing the sequence of events from tonight
[08:43] <fabbione> no problem
[08:43] <fabbione> we are close to a solution now
[08:43] <braddr> seems like it
[08:43] <fabbione> ajmitch: sucks to be you :)
[08:43] <fabbione> i got a T2000 with 32 threads, 32GB of ram and 1.2GHz proc
[08:44] <fabbione> the top class ;)
[08:44] <ajmitch> nice :)
[08:44] <braddr> of course, you're actually using yours. :)
[08:44] <fabbione> ehhe clearly
[08:45] <braddr> without considerably faster disks, my primary usage wouldn't even really keep that many threads busy.
[08:46] <fabbione> remember that this box is designed for http stuff
[08:46] <braddr> yup.
[08:46] <braddr> I keep my 16 mostly busy
[08:47] <fabbione> i can keep much more than that busy for what i do ;)
[08:47] <fabbione> as soon as i release, i want to install the Niagara at the datacenter and play distcc or something ;)
[08:48] <braddr> I'd just started playing with using /tmp under solaris.. using linux and tmpfs I ought to do a lot better
[08:48] <braddr> but seeks and solaris' slower file systems were hurting me badly.
[09:06] <fabbione> the only reason i would use solaris is for the hotplug support in their kernel
[09:06] <fabbione> for all the hotadd/hotremove of the hw
[09:06] <fabbione> otherwise it can screw
[09:13] <braddr> agreed.. once we're done here linux is primary, though I'll keep solaris around since I'm porting a compiler to support both on sparc
[09:16] <fabbione> eheh