this post was submitted on 01 Jan 2024

Forgive my ignorance, but I've got a question about OCR tools. Until now, I have used a paid service to upload, scan, and store my handwritten Uni notes and convert them into searchable documents. Handwritten because, frankly, my brain seems to engage with the content "better" that way than with digital note-taking.

It worked fine for what I needed, so I have never looked into open-source options or had real ownership or control over my uploaded notes. As my work expands and the database of notes grows, data privacy is becoming a huge concern, and I do not want to use the same system for interviews and such. My Uni has been, well, unhelpful sadly.

Are there any recommendations for having a similar system that puts more control and privacy in my hands?

Samsy@lemmy.ml 3 points 10 months ago

I work in a digitisation environment, and we use OCR in different ways, sometimes with Tesseract and sometimes with Adobe. The two differ in how effective they are: Tesseract needs training, while Adobe's proprietary recognition is mostly better. Handwriting is a special case that mostly needs manual checking.
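For the manual checking part, one rough way to narrow down what to look at is to ask Tesseract for its per-word confidence and only review the words it was unsure about. This is a minimal sketch, not anyone's actual workflow: it assumes the `tesseract` command-line tool is installed, and the function names and the threshold of 60 are made up for illustration.

```python
import csv
import io
import subprocess


def run_tesseract_tsv(image_path, lang="eng"):
    """Run the tesseract CLI on one image and return its TSV output as text.

    Assumes the `tesseract` binary is on PATH.
    """
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "-l", lang, "tsv"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def low_confidence_words(tsv_text, threshold=60.0):
    """Return (word, confidence) pairs whose confidence is below the threshold.

    Rows without text, or with a confidence of -1 (layout rows), are skipped.
    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    flagged = []
    for row in reader:
        text = (row.get("text") or "").strip()
        try:
            conf = float(row.get("conf"))
        except (TypeError, ValueError):
            continue  # malformed or truncated row
        if text and 0 <= conf < threshold:
            flagged.append((text, conf))
    return flagged
```

Usage would be something like `low_confidence_words(run_tesseract_tsv("page1.png"))`, where `page1.png` is a placeholder for a scanned page. With handwriting, expect the flagged list to be long.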

In my private environment I use a mix built around paperless-ngx (which only runs Tesseract OCR on a document if it hasn't already been OCR-recognised). Paperless can modify the output of the PDFs and export it to a JSON database, which I partly convert to Trilium (a database-based notebook).
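The conversion step could look roughly like the sketch below, which turns exported document records into Markdown text, one note per document. This is an assumption-heavy illustration: it assumes the export is a JSON list of records with `model`, `pk` and `fields` keys, where document records carry `title` and `content`. Check your own export before relying on that layout. The function names are hypothetical, and getting the resulting files into Trilium is left out.

```python
import json
import re


def slugify(title):
    """Turn a title into a lowercase, filename-friendly slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def manifest_to_notes(manifest_json):
    """Map an exported JSON manifest to {filename: markdown_text}.

    Assumed layout (verify against your own export): a list of records like
    {"model": "documents.document", "pk": 7,
     "fields": {"title": "...", "content": "..."}}
    Records for other models (tags, correspondents, ...) are ignored.
    """
    notes = {}
    for record in json.loads(manifest_json):
        if record.get("model") != "documents.document":
            continue
        fields = record.get("fields", {})
        title = fields.get("title") or "Untitled"
        content = (fields.get("content") or "").strip()
        filename = f"{record.get('pk')}-{slugify(title)}.md"
        notes[filename] = f"# {title}\n\n{content}\n"
    return notes
```

Writing each entry of the returned dictionary to disk gives a folder of Markdown files that stays readable without any particular app, which helps with the ownership concern in the original question.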

I haven't found a better solution yet, and my documents are mostly not handwritten.

its_me_xiphos@beehaw.org 2 points 10 months ago

I have some reading and learning to do, and I appreciate your reply.