/srv/irclogs.ubuntu.com/2022/12/15/#ubuntustudio-devel.txt

OvenWerksEickmeyer: is OCR of any importance? That is taking a pdf that is a scan and converting to text.16:04
EickmeyerOvenWerks: Yeah, in pdfs it's a nice-to-have because it makes the PDF searchable by embedding the text in the PDF. a PDF is merely an image, but having the text embedded is certainly nice.16:05
OvenWerksCalibre is supposed to do that, but when I tried to use it I got a zero byte output file16:05
EickmeyerCalibre has a tendency to be completely broken at times, so I'm not surprised.16:05
OvenWerksI installed gimagereader-qt5 and it complained about no language file16:07
OvenWerksso I installed tesseract-ocr-eng16:08
OvenWerksThat worked pretty good16:08
OvenWerksIt may be that calibre also needed that installed16:09
EickmeyerThat's possible, it might be under suggested packages.16:09
Eickmeyertesseract is the de facto standard for OCR.16:09
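The point above about making a scan searchable by embedding the text can be sketched with the tesseract command line, which accepts `pdf` as an output config and then writes `<output_base>.pdf` with a text layer. This is a minimal sketch, not what anyone in the channel ran: the function names and file names are hypothetical, and it assumes the input is an image such as a PNG or TIFF, since tesseract does not read PDF files directly.

```python
import shutil
import subprocess


def searchable_pdf_cmd(image_path, output_base, lang="eng"):
    """Build the tesseract command that writes <output_base>.pdf with a text layer.

    Hypothetical helper: the trailing "pdf" is tesseract's output config name.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "pdf"]


def make_searchable_pdf(image_path, output_base, lang="eng"):
    """Run tesseract if it is installed; raise a clear error otherwise."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found; install the tesseract-ocr package")
    subprocess.run(searchable_pdf_cmd(image_path, output_base, lang), check=True)
```

Calling `make_searchable_pdf("scan.png", "scan_ocr")` would be expected to produce `scan_ocr.pdf`, assuming the `eng` language data is installed.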
OvenWerksNot in recommends16:10
OvenWerksMuon does not allow highlighting text in its depends window :P16:11
EickmeyerObviously not in recommends otherwise it would be installed with it. It might be in suggests though. I'd look but ERR:NotEnoughTimeRightNow16:11
EickmeyerI just looked in apt-cache depends calibre and it doesn't show up as a suggests. Should probably be.16:13
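The check described here (looking through `apt-cache depends calibre` for a Suggests entry) can be scripted. The sketch below groups the command's output by relation type. The helper names are hypothetical, and the sample text is an invented package used only to show the line format, not calibre's real metadata.

```python
import re
import shutil
import subprocess
from collections import defaultdict

# Relation labels that apt-cache depends prints before a package name.
KNOWN_RELATIONS = {
    "Depends", "PreDepends", "Recommends", "Suggests",
    "Conflicts", "Breaks", "Replaces", "Enhances",
}

# Matches lines such as "  Depends: foo" or " |Depends: foo" (alternative).
RELATION_LINE = re.compile(r"\s*\|?([A-Za-z-]+):\s+(\S+)")


def parse_apt_depends(text):
    """Group `apt-cache depends <pkg>` output into {relation: [packages]}."""
    relations = defaultdict(list)
    for line in text.splitlines():
        match = RELATION_LINE.match(line)
        if match and match.group(1) in KNOWN_RELATIONS:
            relations[match.group(1)].append(match.group(2))
    return dict(relations)


def is_suggested(text, package):
    """True if `package` appears under Suggests in the given output."""
    return package in parse_apt_depends(text).get("Suggests", [])


def apt_depends_output(package):
    """Run apt-cache if available (Debian/Ubuntu only) and return its stdout."""
    if shutil.which("apt-cache") is None:
        raise RuntimeError("apt-cache not found")
    proc = subprocess.run(["apt-cache", "depends", package],
                          capture_output=True, text=True, check=True)
    return proc.stdout


# Invented sample showing the layout only; not real package data.
SAMPLE = """\
examplepkg
  Depends: libexample1
  Recommends: example-data
  Suggests: example-doc
"""
```

On a Debian or Ubuntu system, `is_suggested(apt_depends_output("calibre"), "tesseract-ocr-eng")` would answer the question raised in the channel.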
OvenWerksNo worries, tesseract being installed may be less than useful unless the user's language file is also installed16:13
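Whether the user's language data is present can be checked with `tesseract --list-langs`. The sketch below parses that listing. The helper names are hypothetical, the sample header line is illustrative (its exact wording and path vary by version), and both stdout and stderr are read on the assumption that some versions print the listing to stderr.

```python
import re
import shutil
import subprocess

# Language entries look like "eng", "chi_sim" or "script/Latin": no spaces.
LANG_NAME = re.compile(r"[\w/-]+")


def parse_list_langs(output):
    """Return the set of language names in `tesseract --list-langs` output.

    Header and warning lines contain spaces, so they fail the pattern.
    """
    langs = set()
    for line in output.splitlines():
        line = line.strip()
        if LANG_NAME.fullmatch(line):
            langs.add(line)
    return langs


def installed_langs():
    """Ask the local tesseract binary; empty set if it is not installed."""
    if shutil.which("tesseract") is None:
        return set()
    proc = subprocess.run(["tesseract", "--list-langs"],
                          capture_output=True, text=True, check=False)
    return parse_list_langs(proc.stdout + "\n" + proc.stderr)


def has_language(lang="eng"):
    """True if tesseract reports the given language data as available."""
    return lang in installed_langs()
```

If `has_language("eng")` is false on Ubuntu, installing `tesseract-ocr-eng` is the fix that worked in the conversation above.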
EickmeyerDirect sync from Debian, might be worthy of a "wishlist" bug report.16:13
OvenWerksI'm converting a PDF to chordpro. I am getting tired of showing up to practice with always the wrong key printed out.16:15
EickmeyerOoof, yeah.16:22
OvenWerksEven with tesseract installed calibre doesn't work for me.16:23
OvenWerksSo probably I am doing something wrong :)16:23
OvenWerksgimagereader works just fine so I will use that.16:24
OvenWerksMost songs we can use from CCLI but there are a few that are written locally.16:24
EickmeyerYeah, I'd say try the calibre snap as an experiment but it looks unmaintained.16:24
OvenWerksFor what I am doing calibre is really not the right tool anyway. I am not making a book, just a page.16:26
OvenWerksAnd I am editing the output anyway.16:26
OvenWerksEickmeyer: After reading through some more of calibre's docs, I have come to the conclusion that OCR is not included in the program and whoever I was reading that said it did was mistaken (plus one for google). It seems that some PDF documents have a scan that is what we see and also include an OCRed text portion in the file. Calibre is able to detect this text and grab it, but the OCR has to have already been done elsewhere.20:35
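The distinction drawn here, between a PDF that is only a scanned image and one that also carries an OCRed text portion, can be tested before handing a file to a converter. The sketch below is one possible approach, not something mentioned in the channel: it assumes `pdftotext` from poppler-utils is installed, and the helper names and the 20-character threshold are arbitrary choices for illustration.

```python
import shutil
import subprocess


def has_text_layer(extracted_text, min_chars=20):
    """Decide whether extracted text is substantial enough to count.

    pdftotext emits form feeds between pages even for image-only PDFs,
    so whitespace (including form feeds) is dropped before counting.
    """
    visible = "".join(extracted_text.split())
    return len(visible) >= min_chars


def pdf_has_text_layer(pdf_path):
    """Extract text with pdftotext ("-" sends it to stdout) and test it."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; it ships in poppler-utils")
    proc = subprocess.run(["pdftotext", str(pdf_path), "-"],
                          capture_output=True, text=True, check=True)
    return has_text_layer(proc.stdout)
```

A file for which `pdf_has_text_layer` returns false would need OCR done elsewhere first, for example in gimagereader, before calibre could extract any text from it.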
EickmeyerAh, that explains a lot.20:35
OvenWerksSo maybe look at gimagereader-qt if we want an OCR app.20:35
* OvenWerks guesses the gimagereader (minus the -qt5) is a gtk app20:36
EickmeyerMost likely. We could throw that in the publishing seed, which doesn't get installed by default anymore.20:37
OvenWerksI would include at least the english language file it needs20:38
EickmeyerWhat's the package name on that?20:39
Eickmeyer(fwiw, it looks like we already seed libtesseract5 somehow)20:40
OvenWerkstesseract-ocr-eng20:40
OvenWerksMaybe there is another application that tries to do ocr?20:40
EickmeyerIt's wanted by the graphics and video tasks, so it's in there somewhere.20:41
OvenWerkslibtesseract4 in the LTS20:41
EickmeyerEither way, I have no issue adding gimagereader-qt and at least tesseract-ocr-eng20:43
OvenWerksIt's sort of a scanner like application20:45
EickmeyerRight, kinda like gscan2pdf but with OCR I'd imagine, therefore much more handy.20:45
OvenWerksNot needed to create but a utility20:45
EickmeyerFYI, tesseract-ocr and tesseract-ocr-eng are circular-deps of each other, so I'll just add tesseract-ocr.20:47
OvenWerksEickmeyer: sure, I picked the language one because that's what my error came up with as missing20:48
OvenWerksIt did add some deps too20:48
EickmeyerHeh, interesting.20:48
EickmeyerI'm keeping skanlite since that still can access network scanners, which most scanning applications cannot.20:49
OvenWerksI still install simple-scan20:50
OvenWerksit looks ugly but I am used to it20:51
EickmeyerHeh, no worries. You make your entire desktop ugly, but I don't judge. :)20:51
OvenWerksyou might consider simple-scan beautiful then20:52
EickmeyerHehe20:52
