Import USFM Bible


bible4all

Our organization promotes Bible translation around the world. I use and enjoy Accordance for my study, but would like an easier way to import user Bibles in USFM format. For those of us in translation work, USFM is the standard for literally thousands of translations that are either finished or in progress. Recently, a number of open-source Bibles have been created and made available through sites such as http://ebible.org and http://openenglishbible.org. I expect these resources to continue to increase, and they are all in USFM.


From all I have been able to gather on the forum, it is only possible to import a text file. The amount of editing it takes to convert USFM to the proper Accordance format is daunting. Since USFM is such a significant format, are there any plans to make it possible to import USFM texts in the future?


 

The amount of editing it takes to convert USFM to the proper Accordance format is daunting.

I scanned the markup of a file, and it doesn't look that daunting. Stripping the poetry markup is no biggie. The whole thing can't be done, though, with a series of 'replace all' searches, because some intelligence is needed to copy the chapter references over to each verse, etc.
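
To illustrate why plain 'replace all' falls short, here is a minimal Python sketch of the chapter-tracking step. USFM gives the chapter once on a `\c` line and then only verse numbers on `\v` lines, so a converter has to remember the current chapter as it goes. The sample text, the function name `verses_with_refs`, and the "Gen 1:1 text" output shape are assumptions for illustration; check the Accordance documentation for what the import really expects. It also ignores verse text that continues on later lines (poetry, for example).

```python
import re

# Made-up sample; real USFM files carry many more marker types.
SAMPLE = r"""\id GEN
\c 1
\p
\v 1 In the beginning God created the heavens and the earth.
\v 2 The earth was formless and empty.
\c 2
\p
\v 1 The heavens and the earth were finished.
"""

def verses_with_refs(usfm, book):
    """Yield (book, chapter, verse, text), carrying the chapter number forward."""
    chapter = None
    for line in usfm.splitlines():
        m = re.match(r"\\c\s+(\d+)", line)
        if m:
            chapter = int(m.group(1))  # remember it for the verses that follow
            continue
        # Verse is kept as a string so bridged verses such as "1-2" survive.
        m = re.match(r"\\v\s+(\S+)\s*(.*)", line)
        if m and chapter is not None:
            yield book, chapter, m.group(1), m.group(2).strip()

for book, ch, vs, text in verses_with_refs(SAMPLE, "Gen"):
    print(f"{book} {ch}:{vs} {text}")
```

The state carried between lines (the current chapter) is exactly the part a search-and-replace cannot do.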

 

If I get a slow patch in a month, I can do one.


Well, looked at the docs, and see how daunting it could be.

Problem is, it is not a markup language for bible texts... it is a markup language for bible publishing, with footnotes, commentary, etc.

 

However, you simply need to identify only the markers that have bible text following them... and delete all the markers and their text that are not in that group... (tables, footnotes, embedded paragraphs, etc.).
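
A rough Python sketch of that filtering idea, for anyone who wants to experiment. It deletes footnotes (`\f ... \f*`) and cross-references (`\x ... \x*`) together with their text, drops whole lines for headings and identification markers, and unwraps a few inline character markers while keeping their text. The marker sets below are a small illustrative selection I picked, not a complete list from the USFM documentation, and the function name `scripture_only` is made up.

```python
import re

# Markers whose whole line is non-scripture material (illustrative, incomplete).
SKIP_LINE = {"id", "ide", "h", "toc1", "toc2", "toc3",
             "mt", "mt1", "mt2", "s", "s1", "s2", "r", "rem"}

# Footnotes and cross-references: remove the markers and everything inside.
NOTE = re.compile(r"\\(f|x) .*?\\\1\*", re.DOTALL)

# Inline character markers: remove the markers, keep the enclosed text.
# Only three are listed here as examples.
CHAR_MARKER = re.compile(r"\\(?:wj|add|nd)(?:\*| )")

def scripture_only(usfm):
    """Return the USFM with notes, headings and inline styling removed."""
    usfm = NOTE.sub("", usfm)
    usfm = CHAR_MARKER.sub("", usfm)
    kept = []
    for line in usfm.splitlines():
        m = re.match(r"\\([a-z]+\d*)", line)
        if m and m.group(1) in SKIP_LINE:
            continue  # heading, title, book id, remark, ...
        kept.append(line)
    return "\n".join(kept)

sample = r"""\id GEN Example
\h Genesis
\mt1 Genesis
\c 1
\s1 The Creation
\p
\v 1 In the beginning\f + \fr 1:1 \ft A footnote.\f* God created \add the\add* heavens."""

print(scripture_only(sample))
```

What is left still has `\c`, `\p` and `\v` markers, so a second pass is needed to turn those into whatever the target format wants.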


However, you simply need to identify only the markers that have bible text following them... and delete all the markers and their text that are not in that group... (tables, footnotes, embedded paragraphs, etc.).

One man's simple is another man's ... Oh well, I guess that's not a saying. Anyway, I spent quite a few hours trying to fix up a version for import and didn't get very far. It is not just a matter of deleting some of the codes. It is also necessary to change the format of all of the codes that are necessary. I'm sure a skilled programmer could do it faster. But it would be great if the programmers worked out a utility to do that one time and we could all benefit from it.


  • 9 months later...

 But it would be great if the programmers worked out a utility to do that one time and we could all benefit from it.

+1

Edited by Fabian

I'm going to regret this but I'll ask. Is there a document describing the USFM file format?

 

Had a look at the doc. As Joe points out the format has one foot in the publishing world.

On the upside there are at least a couple of Python libraries published on this. I do not know how good they are but they exist. And there are a couple of converters to various formats, none I can see to UB import for Acc. Given the extensive nature of the markup you need a parsing library to do the heavy lifting. After that you need to discard lots, and I suspect in some cases lots really is LOTS. You need to extract the text from amongst the formatting and commentary and so on. You'll lose almost all the formatting.

 

So yep it's a bunch of work, most of it to do with filtering out stuff that is not actually the biblical text.

 

So many formats ....

 

Thx

D

Edited by Daniel Semler

Hey Eric,

 

Interesting. He has an ambitious road map, trying to create a bible format translation tool covering as many input and output formats as he can. The closest thing to an Accordance User Bible compatible import that might get half or more of the work done would be the simple HTML output format designed just for reading. You could convert to that and then strip tags and such. It's a pity it's not Python as that's what I've been using - it's C#, which is why you need Mono for unixy installs. He plans but does not yet have output formats for various bible software apps. Curiously absent is any mention of TEI. I haven't looked at USFX yet and I don't know how much existing stuff there is in USFM. XML is a bit easier to handle because there are decent XML libraries that can just suck it in. It would be nice if USFM had similar support. There are several libraries around for it though. It would be interesting to see a comparison of USFM/X and TEI or the IGNTP derivative. Hmmm.... I'll have to hunt around.
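
The "convert to HTML, then strip tags" route could be sketched in Python with the standard library's `html.parser`, roughly as below. I have not looked at what the tool's HTML output really looks like, so the sample markup (one `<p>` per verse with the number in `<b>`) is purely hypothetical, as are the names `TextOnly` and `strip_tags`.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect text content, starting a new line at block-level tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("p", "br", "div"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def strip_tags(html):
    """Return the plain text of an HTML fragment."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()

# Hypothetical markup, not the tool's actual output.
html = "<p><b>1</b> In the beginning &amp; so on.</p><p><b>2</b> Next.</p>"
print(strip_tags(html))
```

Using a parser rather than a regex means entities such as `&amp;` are decoded along the way.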

 

Thx

D


 I haven't looked at USFX yet and I don't know how much existing stuff there is in USFM. XML is a bit easier to handle because there are decent XML libraries that can just suck it in. It would be nice if USFM had similar support. 

I think he has a USFM to USFX converter and vice versa. You can contact him via e-mail, I think.


  • 1 year later...

By chance, has anyone taken this further? Not having my translations (and a few others that are also in USFM) side by side in Accordance is frustrating to the point where I think I'll have to sit down and work out how to write a translator of some sort.

 

The USFM part seems easy; it's more a matter of working out the Accordance format. Is it documented clearly anywhere? I did find this link (below), but it seems like it may be somewhat out of date: it says the files are required to be in "Mac Roman" encoding? :D

 

http://accordancefiles2.com/helpfiles/STC/content/topics/05_dd/preparing_the_text-ub.htm

 

(It seems Apple replaced Mac Roman as the default with UTF8 in 2001, so it seems that this document may be extremely out of date. https://en.wikipedia.org/wiki/Mac_OS_Roman )
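
If the import really does turn out to want Mac Roman, Python can at least do the conversion and report which characters will not survive it. A minimal sketch, where the function name `to_mac_roman` is made up and whether Accordance actually requires this encoding is still an open question:

```python
def to_mac_roman(text):
    """Return (encoded bytes, set of characters Mac Roman cannot represent)."""
    missing = set()
    for ch in text:
        try:
            ch.encode("mac_roman")
        except UnicodeEncodeError:
            missing.add(ch)
    # Unrepresentable characters become "?" in the output bytes.
    return text.encode("mac_roman", errors="replace"), missing

data, missing = to_mac_roman("café λόγος")
print(data)
print(sorted(missing))
```

Accented Latin letters make it through, but Greek text like the second word does not, which would be a real limitation for a lot of translations.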

Edited by Ιακοβ

I eventually found what I think is a more updated version of this file:

http://accordancefiles2.com/helpfiles/OSX11/content/topics/05_dd/preparing_the_text-ub.htm

I hope to have time to work on a basic USFM converter after the coming exam period. I'll be sure to share it here once I get around to it. :)


There was a program called Simple Bible Reader, which was created to convert files between the formats of different Bible software packages. Unfortunately, its creator has discontinued development.

 

Maybe it is possible to get in touch with him. 

 

Greetings

 

Fabian


  • 4 weeks later...

Thanks for the updates. One of our future projects is to develop a converter from USFM to other formats. Perhaps when that is complete we will be able to import our translations from another format.


As a starter, I've done USFM to HTML, but I need to spend some time in the Accordance documentation before I can do USFM to Accordance. I hope to get it done in the next few months.


  • 1 year later...
  • 3 months later...

+1 for this feature

and

+1 for support of the free OSIS format.

 

Greetings

 

Fabian


  • 2 years later...
