=== _thumper_ is now known as thumper | ||
blahdeblah | ec0, xavpaice: woutervb tells me one of you might be able to help with testing of my fix for https://github.com/paulgear/ntpmon/issues/5 | 09:34 |
---|---|---|
blahdeblah | I've also been talking to ganso from support about the same issue. | 09:34 |
manadart | stickupkid, achilleasa: Still chasing a review of https://github.com/juju/juju/pull/12118 | 09:48 |
achilleasa | manadart: on it | 09:49 |
achilleasa | manadart: did you see my comment in 12111? | 09:49 |
manadart | achilleasa: Yep. | 09:49 |
manadart | Thanks. | 09:50 |
ec0 | blahdeblah: sure thing, although I haven't seen that issue myself. I'm not sure about converting those conditions to warning though, or did you have something else in mind? | 11:00 |
stickupkid | manadart, here is the thing we just discussed https://github.com/juju/description/pull/91 | 11:42 |
manadart | stickupkid: OK, gimme a bit. | 11:44 |
Hybrid512 | Hi everyone | 12:02 |
Hybrid512 | Can I bother somebody regarding prometheus-ceph-exporter from the "-next" branch? | 12:03 |
Hybrid512 | there are great improvements in this branch that we'd love to use but deployment always fails due to this bug: https://bugs.launchpad.net/charm-prometheus-ceph-exporter/+bug/1895531 | 12:04 |
mup | Bug #1895531: -next fails to deploy with TypeError: 'str' object is not callable in ceph_client.auth() <Prometheus Ceph Exporter Charm:New> <https://launchpad.net/bugs/1895531> | 12:04 |
xavpaice | blahdeblah, to be honest we've disabled that check on the clouds where it was a problem, because of the noise. Re testing, we didn't have a reliable reproducer so confirming yay or nay is going to be tricky | 20:16 |
blahdeblah | ec0: Yeah - I am not planning to convert either problem from critical to warning; just planning to prevent the NaN from leaking through to the alert value. | 21:42 |
blahdeblah | xavpaice: Understood re: the noise and the difficulty of reproducing. ganso has a couple of clouds where he seems able to reproduce it fairly frequently, so I'll work with him on that. | 21:42 |
blahdeblah | Mostly just wanting someone to test the patches, and if possible, do a code review on an upcoming test suite addition. | 21:42 |
ec0 | @blahdeblah - that makes sense to me, if you get a patch together I'll review & test | 21:43 |
ec0 | great to see you still hilight on NTP in a round-about way :) | 21:44 |
blahdeblah | ec0: Actually jsing poked me about it a few weeks back. :-P | 22:12 |
blahdeblah | ec0: Also, drewn3ss submitted https://github.com/paulgear/ntpmon/issues/6 a while back, but the Nagios check is stateless. | 22:34 |
blahdeblah | Given that you're just muting the check at the moment, I'm reluctant to invest time on introducing state management. | 22:34 |
blahdeblah | The alternative is using telegraf -> prometheus and adding a minimum time period to the check in prometheus alerter. | 22:34 |
ec0 | well, we shouldn't be muting it, frankly | 22:34 |
blahdeblah | I agree, but when it's hard to find time to make progress on actually fixing the reason for the sync failure, and it's intermittent, I can understand making that choice... | 22:35 |
blahdeblah | I've also got limited time I can put into this, and I feel like it's probably better spent making better tests and helping ganso fix the underlying cause of the sync failure (at least in the 2 clouds he's working on at the moment). | 22:38 |
ec0 | totally understand | 22:45 |
ec0 | the other way to approach it is we could move it into a shared namespace and have some of the people reporting these issues help to contribute and review | 22:45 |
blahdeblah | Happy to consider that - any suggestions as to where? | 22:50 |
ec0 | we could set something up on Launchpad maybe? | 22:56 |
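
For readers following the NaN discussion at 21:42: below is a minimal sketch of one way to keep a NaN out of a Nagios-style check's reported value without changing the alert state. It is not ntpmon's code; the function name `perfdata`, the label, and the six-decimal format are assumptions made for illustration. It relies on the Nagios plugin convention that a literal `U` in performance data means the value could not be determined.

```python
import math


def perfdata(label, value):
    """Format one Nagios perfdata item (hypothetical helper, not from ntpmon)."""
    # A missing or NaN measurement is reported as "U" (value undetermined)
    # so that a literal "nan" never reaches the alert value.
    if value is None or math.isnan(value):
        return f"{label}=U"
    # Normal case: fixed-point formatting of the measured value.
    return f"{label}={value:.6f}"


if __name__ == "__main__":
    print(perfdata("offset", 0.0125))
    print(perfdata("offset", float("nan")))
```

The check's OK/WARNING/CRITICAL decision is left untouched here, in line with the stated plan not to downgrade either condition.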