
Import USFM Bible

import usfm user bible

10 replies to this topic

#1 bible4all

bible4all

    Member

  • Members
  • 4 posts
  • Accordance Version:10.x

Posted 25 July 2014 - 01:05 PM

Our organization promotes Bible translation around the world. I use and enjoy Accordance for my study, but would like an easier way to import user Bibles in USFM format. For those of us in translation work, USFM is the standard for literally thousands of translations that are either finished or in process. Recently, a number of open-source Bibles have been created and made available through sites such as http://ebible.org and http://openenglishbible.org. I expect these resources to continue to increase, and they are all in USFM.
 
From all I have been able to gather on the forum, it is only possible to import a text file. The amount of editing it takes to convert USFM to the proper Accordance format is daunting. Since USFM is such a significant format, are there any plans to make it possible to import USFM texts in the future?

  • EricC and Timothy Jenney like this

#2 EricC

EricC

    Bronze

  • Active Members
  • 88 posts
  • Gender:Male
  • Accordance Version:11.x
  • Platforms:Mac OS X, iOS

Posted 25 July 2014 - 02:10 PM

+1 (or as many votes as I'm allowed to cast)



#3 Joe Weaks

Joe Weaks

    Platinum

  • Active Members
  • 1,093 posts
  • Gender:Male
  • Location:Odessa, TX
  • Interests:I like things that are Orange, and possibly Blue.
  • Accordance Version:11.x

Posted 25 July 2014 - 04:45 PM

 

The amount of editing it takes to convert USFM to the proper Accordance format is daunting. 

I scanned the markup of a file, and it doesn't look that daunting. Stripping the poetry markup is no biggie. The whole thing can't be done with a series of 'replace all' searches, though, because it takes some intelligence to copy the chapter references over to each verse, etc.

 

If I get a slow patch in a month, I can do one.
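
The "intelligence" about chapter references mentioned above amounts to remembering the most recent chapter marker while walking through the verses. Here is a minimal Python sketch of that idea. It assumes the import wants one verse per line in the form "Book chapter:verse text" (an assumption, check the Accordance help for the exact rules), that the input has already been reduced to \c and \v lines, and that each verse sits on a single line. The function name verse_lines and the sample text are made up for illustration.

```python
import re

# \c <number> starts a chapter; \v <number> <text> starts a verse.
CHAPTER = re.compile(r"\\c\s+(\d+)")
VERSE = re.compile(r"\\v\s+(\d+)\s+(.*)")

def verse_lines(usfm, book):
    # Build one "Book chapter:verse text" line per verse marker.
    chapter = None
    out = []
    for line in usfm.splitlines():
        line = line.strip()
        c = CHAPTER.match(line)
        if c:
            chapter = c.group(1)  # remembered until the next \c marker
            continue
        v = VERSE.match(line)
        if v and chapter is not None:
            out.append(f"{book} {chapter}:{v.group(1)} {v.group(2)}")
    return out

# Hypothetical sample, not taken from any real translation file.
sample = r"""\c 1
\v 1 First verse.
\v 2 Second verse.
\c 2
\v 1 Another first verse."""

for line in verse_lines(sample, "Gen"):
    print(line)
```

Verse ranges such as "\v 1-2" and verse text that continues onto following lines are not handled here; a real converter would need both.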


Joe Weaks
The Macintosh Biblioblog

Sometimes I'm so helpful even I can't stand it.

#4 Joe Weaks

Joe Weaks

    Platinum

  • Active Members
  • 1,093 posts
  • Gender:Male
  • Location:Odessa, TX
  • Interests:I like things that are Orange, and possibly Blue.
  • Accordance Version:11.x

Posted 25 July 2014 - 04:57 PM

Well, I looked at the docs, and now I see how daunting it could be.

Problem is, it is not a markup language for bible texts... it is a markup language for bible publishing, with footnotes, commentary, etc.

 

However, you simply need to identify only the markers that have bible text following them... and delete all the markers and their text that are not in that group... (tables, footnotes, embedded paragraphs, etc.).
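
A rough Python sketch of that filtering step, under stated assumptions: the marker lists below are a small illustrative subset chosen for this example, not the full USFM marker set, and the function name keep_bible_text is made up. Footnotes (\f ... \f*) and cross-references (\x ... \x*) are deleted together with their content, title and heading lines are dropped, and paragraph, poetry, and character markers are removed while the text after them is kept.

```python
import re

# Notes are removed together with their content.
NOTE_SPAN = re.compile(r"\\(f|x)\s.*?\\\1\*", re.DOTALL)
# Lines that carry no Bible text (illustrative subset only).
NON_TEXT = re.compile(r"^\\(id|ide|rem|h|toc\d|mt\d?|ms\d?|s\d?|r)(\s|$)")
# Closing character markers such as \wj* or \add*.
CLOSER = re.compile(r"\\\+?[a-z]+\d*\*")
# Any other opening marker except \c and \v, which are kept.
OPENER = re.compile(r"\\\+?(?![cv]\s)[a-z]+\d*\s?")

def keep_bible_text(usfm):
    text = NOTE_SPAN.sub("", usfm)
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if NON_TEXT.match(line):
            continue  # headings, titles, identification lines
        line = CLOSER.sub("", line)
        line = OPENER.sub("", line)
        line = re.sub(r"\s+", " ", line).strip()
        if line:
            kept.append(line)
    return kept

# Hypothetical sample written for this sketch.
sample = r"""\id GEN Example
\h Genesis
\mt1 Genesis
\c 1
\s1 The Creation
\p
\v 1 In the beginning\f + \fr 1:1 \ft A note.\f* God created.
\q1
\v 2 \wj Some words\wj* here.
"""

for line in keep_bible_text(sample):
    print(line)
```

Only the \c and \v lines survive, with their text intact. Tables, alternate chapter or verse numbers, and other less common markers would need explicit handling against the USFM reference before trusting the output.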



#5 bible4all

bible4all

    Member

  • Members
  • 4 posts
  • Accordance Version:10.x

Posted 25 July 2014 - 05:51 PM

However, you simply need to identify only the markers that have bible text following them... and delete all the markers and their text that are not in that group... (tables, footnotes, embedded paragraphs, etc.).

One man's simple is another man's ... Oh well, I guess that's not a saying. Anyway, I spent quite a few hours trying to fix up a version for import and didn't get very far. It is not just a matter of deleting some of the codes. It is also necessary to change the format of all of the codes that are necessary. I'm sure a skilled programmer could do it faster. But it would be great if the programmers worked out a utility to do that one time and we could all benefit from it.


  • Fabian likes this

#6 Fabian

Fabian

    Gold

  • Active Members
  • 471 posts
  • Gender:Male
  • Interests:www.internetkirche.com
    www.iglesia-del-internet.com
  • Accordance Version:11.x
  • Platforms:Mac OS X, iOS

Posted 01 May 2015 - 06:13 PM

 But it would be great if the programmers worked out a utility to do that one time and we could all benefit from it.

+1


Edited by Fabian, 01 May 2015 - 06:13 PM.

Mac Air (13-inch, Mid 2013)

1,3 GHz Intel Core i5

4GB Ram

Next time: I'll buy only one with Retina, and hopefully without a glossy screen. A faster CPU and more RAM.

 

Yosemite 10.10.3

Accordance 11.0.5 and waiting on 11.1

 

iPhone 4S

iOS 8.3

iAccord 2.0


#7 Daniel Semler

Daniel Semler

    Mithril

  • Active Members
  • 2,027 posts
  • Gender:Male
  • Accordance Version:11.x

Posted 01 May 2015 - 06:23 PM

I'm going to regret this, but I'll ask: is there a document describing the USFM file format?

 

Had a look at the doc. As Joe points out the format has one foot in the publishing world.

On the upside there are at least a couple of Python libraries published on this. I do not know how good they are but they exist. And there are a couple of converters to various formats, none I can see to UB import for Acc. Given the extensive nature of the markup you need a parsing library to do the heavy lifting. After that you need to discard lots, and I suspect in some cases lots really is LOTS. You need to extract the text from amongst the formatting and commentary and so on. You'll lose almost all the formatting.

 

So yep it's a bunch of work, most of it to do with filtering out stuff that is not actually the biblical text.

 

So many formats ....

 

Thx

D


Edited by Daniel Semler, 01 May 2015 - 09:57 PM.

Sola lingua bona est lingua mortua

ἡ μόνη ἀγαθὴ γλῶσσα γλῶσσα νεκρὰ ἐστιν

lišanu ēdēnitu damqitu lišanu mītu

 

Accordance Configurations :
 
Mac : 2009 27" iMac                 Windows : HP 4540s laptop
      Intel Core Duo                          Intel i5 Ivy Bridge
      12GB RAM                                8GB RAM
      Accordance 11.0.4                       Accordance 11.0.4
      OSX 10.10.2 (Yosemite)                  Win 7 Professional x64 SP1


#8 EricC

EricC

    Bronze

  • Active Members
  • 88 posts
  • Gender:Male
  • Accordance Version:11.x
  • Platforms:Mac OS X, iOS

Posted 01 May 2015 - 11:09 PM

I'm going to regret this but I'll ask. Is there a document describing the USFM file format ?

See http://paratext.org/about/usfm

 

Michael Johnson has done a lot of work with programs that convert from USFM to other digital formats: http://haiola.org



#9 Daniel Semler

Daniel Semler

    Mithril

  • Active Members
  • 2,027 posts
  • Gender:Male
  • Accordance Version:11.x

Posted 02 May 2015 - 01:21 PM

Hey Eric,

 

Interesting. He has an ambitious road map, trying to create a Bible format conversion tool covering as many input and output formats as he can. The closest thing to an Accordance User Bible compatible import, one that might get half or more of the work done, would be the simple HTML output format designed just for reading. You could convert to that and then strip tags and such. It's a pity it's not Python, as that's what I've been using. It's C#, which is why you need Mono for unixy installs. He plans output formats for various Bible software apps but does not have them yet. Curiously absent is any mention of TEI. I haven't looked at USFX yet and I don't know how much existing stuff there is in USFM. XML is a bit easier to handle because there are decent XML libraries that can just suck it in. It would be nice if USFM had similar support. There are several libraries around for it though. It would be interesting to see a comparison of USFM/X and TEI or the IGNTP derivative. Hmmm.... I'll have to hunt around.
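
For the "convert to HTML, then strip tags" route, a minimal Python sketch using only the standard library's html.parser module. The sample markup is hypothetical; the actual structure of the converter's simple HTML output has not been checked here, so verse numbers, headings, and notes may need extra handling.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    # Collects text nodes and ignores every tag.
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed tags.
    return " ".join("".join(parser.parts).split())

# Hypothetical markup, invented for this example.
sample_html = (
    '<div class="chapter"><p><span class="verse">1</span> '
    "In the beginning &amp; so on.</p></div>"
)

print(strip_tags(sample_html))
```

This keeps every text node, so anything the HTML carries besides the biblical text (headings, footnote bodies, navigation) would still have to be filtered out by tag or class before or during parsing.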

 

Thx

D




#10 EricC

EricC

    Bronze

  • Active Members
  • 88 posts
  • Gender:Male
  • Accordance Version:11.x
  • Platforms:Mac OS X, iOS

Posted 02 May 2015 - 01:36 PM

 I haven't looked at USFX yet and I don't know how much existing stuff there is in USFM. XML is a bit easier to handle because there are decent XML libraries that can just suck it in. It would be nice if USFM had similar support. 

I think he has a USFM to USFX converter and vice versa. You can contact him via e-mail, I think.



#11 bible4all

bible4all

    Member

  • Members
  • 4 posts
  • Accordance Version:10.x

Posted 02 May 2015 - 02:47 PM

Daniel,

 

I also hope you don't regret this, but the USFM reference sheet can be found here: http://paratext.org/...ference2_35.pdf. It is also linked from the URL that Eric posted above.







Also tagged with one or more of these keywords: import, usfm, user bible
