OCR of Greek text --> convert to Helena?


6 replies to this topic

#1 preterist1

preterist1

    Member

  • Active Members
  • Pip
  • 27 posts
  • Gender:Male
  • Location:Bradford, Pennsylvania USA
  • Interests:Preterist Theology and Eschatology, Josephus, Yosippon, Enoch, Judaica, and Apostolic Fathers (esp. Barnabas)
  • Accordance Version:10.x

Posted 15 March 2006 - 05:24 PM

I have a *book published over 100 years ago which has a lot of Greek material in it. I need to be able to have that Greek material in the computer so I can copy and paste it into the body of a scholarly article I'm writing.

I'm using the ReadIris Pro 11 (Corporate Edition) program to do the OCR. I am using it on a Powerbook G4 running OS 10.4 (Tiger). However, it does not recognize the Greek text and put it in the Helena font for me.

Is there something I need to do to set up the ReadIris program so that it will recognize Greek and put it into a Greek font for me? Does it need some other font besides Helena to do that (such as Teknia or LaserGreek or Symbol)? Or can it be done directly into Helena? Does it require me to "TRAIN" the OCR software in order for it to recognize the Greek and put it into the right font and format?

Since you folk have done a LOT of OCR work with the Greek and Hebrew, I suspect you could give me a few tips on how to do this. Would be much appreciated.

Thanks in advance for your help on this.

-- Ed Stevens (preterist1@aol.com)

*The name of the book: The New Testament in the Apostolic Fathers (produced by the Oxford Society of Historical Theology, 1905, Printed in England by Henry Frowde at Oxford at the Clarendon Press).
Edward E. Stevens, President
International Preterist Association
www.preterist.org

#2 Guest_frgpeter_*

Guest_frgpeter_*
  • Guests

Posted 15 March 2006 - 06:26 PM

Did you go to Menu->Settings->Language... and select "Greek" from the languages before running "recognition"?

#3 preterist1

Posted 15 March 2006 - 09:28 PM

Yes, I selected "Greek-English" as the language, and it seemed to let me make corrections to some of the characters it couldn't recognize, but it didn't do a very good job recognizing the rest of the text. It didn't ask for learning help on all of them. It also did not seem to offer any help on the accents and breathing marks. I'm wondering if any OCR program is able to handle the accents and breathing marks? Do we simply ignore those for OCR purposes and use some kind of software conversion utility later to look at the unaccented text and apply the proper accents? Or do we have to do that manually?

How do the folks here at Oak Tree handle the OCR of Greek texts? How do they get the accents and breathing marks in there?
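
The idea raised above, of OCRing without accents and restoring them afterwards, can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the OCR output and a trusted accented word list are both available as Unicode Greek, and it does not address conversion to the Helena font. The function names (`strip_diacritics`, `build_lookup`, `restore`) are hypothetical, not part of ReadIris or any other OCR product.

```python
import unicodedata


def strip_diacritics(text):
    """Remove accents, breathing marks and iota subscripts, keeping base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def build_lookup(accented_words):
    """Map each unaccented form to the set of accented forms that reduce to it."""
    lookup = {}
    for word in accented_words:
        lookup.setdefault(strip_diacritics(word), set()).add(word)
    return lookup


def restore(token, lookup):
    """Return the accented form only when exactly one candidate exists.

    Unknown or ambiguous tokens are returned unchanged so that they can be
    corrected by hand.
    """
    candidates = lookup.get(strip_diacritics(token), set())
    if len(candidates) == 1:
        return next(iter(candidates))
    return token


if __name__ == "__main__":
    # Hypothetical reference list; in practice this would come from a trusted e-text.
    reference = ["ἐν", "ἀρχῇ", "λόγος"]
    lookup = build_lookup(reference)
    ocr_tokens = ["εν", "αρχη", "λογος", "θεος"]
    print(" ".join(restore(t, lookup) for t in ocr_tokens))
```

A lookup of this kind leaves ambiguous forms untouched (for example, words that differ only in their breathing mark or accent), so some manual proofreading would still be needed.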

#4 Helen Brown

Helen Brown

    Mithril

  • Admin
  • 8,225 posts
  • Gender:Female
  • Location:heart in Israel
  • Accordance Version:10.x

Posted 15 March 2006 - 09:59 PM

We do not OCR texts at all! On the whole we receive our etexts and just have to convert, correct, and mark them up. That's why I did not reply before: I do not know if it is possible to get accurate OCR in Greek.

The texts that do need etexting are outsourced. The company we use is not cheap, but their work is excellent. If you are interested, write to me personally and I will put you in touch. They can convert to Helena, since that is what they use for us.
Helen Brown
OakTree Software

#5 Guest_frgpeter_*


Posted 16 March 2006 - 05:17 AM


I tried both the Greek and the Greek-English settings. The output had no accents, but it did have the breathing marks.

I suppose it depends a lot on your scanner and its settings as well. The condition of the text (you say it's 100 years old) can also play into this. Perhaps the type on the page is "heavy"?

Sorry - not able to suggest anything else.

--G. Peter

#6 Alistair

Alistair

    Platinum

  • Active Members
  • PipPipPipPipPip
  • 514 posts
  • Gender:Male
  • Accordance Version:10.x

Posted 16 March 2006 - 05:45 AM

On an obliquely related note, I recently discovered a wonderful website with links to PDFs of scanned manuscripts and printed Bibles, including Codices Alexandrinus, Sinaiticus and Vaticanus, Tischendorf's NT, Scrivener's NT, Stephanus (1546 & 1550), Scrivener (1881), Erasmus (1516, 1518, & 1522), Elzevir (1624, 1633), Beza (1565, 1588, & 1598), the Complutensian Polyglot, etc.

I downloaded The Lot, obviously (6.62 GB!), but I did not record the URL of where I found them (oops!).

I don't know, therefore, if this comment is helpful or not.

But if you trawl the web, you can find some amazing stuff.

~Alistair

#7 preterist1

Posted 16 March 2006 - 10:54 AM

That was very helpful, Helen. Thanks! I'll pass on the outsourcing idea. That sounds too expensive for me. I don't need the whole book, only about two chapters. So, I'll just grit my teeth and struggle through text entry on my keyboard! Ugggh!



