
RegexForAccordance


Darin Franklin


I thought it would be fun to search Hebrew and Greek Bible texts with regular expressions, so I wrote a Mac app to do just that. RegexForAccordance can search any text module in Accordance (English too).

 

http://darinfranklin.github.io/RegexForAccordance/images/Screen%20Shot%20Hebrew.png

 

RegexForAccordance gets Unicode text from Accordance through AppleScript and then searches it line by line. Here are a few examples of what you can do.

 

Find Greek words that end with ημι

Text Module: GNT-T

Range: Matt-Rev

Search: \w+ημι\b

Filters: Remove Diacritics

Result: τιθημι, φημι, αφιημι, Συνιστημι, συμφημι
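For readers who want to try the same idea outside the app, here is a rough Python sketch of this search. It is not the app's code: the `remove_diacritics` helper is only an assumed stand-in for the Remove Diacritics filter, and the sample line is made up rather than taken from GNT-T.

```python
import re
import unicodedata

def remove_diacritics(text):
    """Drop accents and breathing marks (combining marks, category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)

# Same pattern as the search above: word characters, then ημι at a word end.
MI_VERB = re.compile(r"\w+ημι\b")

def find_mi_verbs(line):
    """Return every word in the line that ends with ημι once diacritics are gone."""
    return MI_VERB.findall(remove_diacritics(line))

sample = "τίθημι καὶ φημί καὶ ἀφίημι"  # made-up sample line, not a verse
print(find_mi_verbs(sample))
```

Without the diacritics step, accented endings such as ημί would not match the plain ημι in the pattern.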

 

Find Repeated Phrases in the Greek New Testament

Text Module: GNT-T

Range: Matt-Rev

Option: Ignore Case

Filters: None (Remove Diacritics for more matches)

Search: \b(.+)\s+\1\b

Result:

 

http://darinfranklin.github.io/RegexForAccordance/images/Screen%20Shot%20Greek%20Repeated.png
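The backreference `\1` is what makes this search work: it must match the same text that the first group captured. A minimal Python sketch, assuming a line whose diacritics have already been removed (the sample is illustrative, and `re.IGNORECASE` stands in for the Ignore Case option):

```python
import re

# Same pattern as the search above; IGNORECASE plays the role of "Ignore Case".
REPEATED = re.compile(r"\b(.+)\s+\1\b", re.IGNORECASE)

def repeated_phrases(line):
    """Return the first copy of each phrase that is immediately repeated."""
    return [m.group(1) for m in REPEATED.finditer(line)]

sample = "Αμην αμην λεγω υμιν"  # illustrative line, diacritics already removed
print(repeated_phrases(sample))
```

Because `(.+)` can span several words, the same pattern also catches repeated multi-word phrases, not just doubled single words.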

 

By including the verse reference in the search, you can produce some interesting verse lists.

 

Find verse 3:16 in every book

http://www.accordancebible.com/forums/index.php?showtopic=13286

Text Module: KJV

Range: Gen-Rev

Option: Include Reference

Search: \b3:16\b

Result: Gen 3:16, Exod 3:16, ..., Rev 3:16.
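A small Python sketch of why the word boundaries matter here. It assumes each line arrives as "reference, then text", which is only a guess at what Include Reference produces; the verse texts are shortened.

```python
import re

VERSE_3_16 = re.compile(r"\b3:16\b")

# Hypothetical "reference + text" lines; verse text is shortened.
lines = [
    "Gen 3:15 And I will put enmity between thee and the woman",
    "Gen 3:16 Unto the woman he said",
    "Gen 13:16 And I will make thy seed as the dust of the earth",
    "John 3:16 For God so loved the world",
]

# Keep the first two tokens (book and chapter:verse) of every matching line.
hits = [" ".join(line.split()[:2]) for line in lines if VERSE_3_16.search(line)]
print(hits)
```

The leading `\b` keeps 13:16 out of the results, because there is no word boundary between the 1 and the 3.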

 

Find "Tweetable" verses in ESV (140 characters or fewer, including the reference)

http://www.accordancebible.com/forums/index.php?showtopic=14557

Text Module: ESV

Range: Gen-Rev

Option: Include Reference

Filters: Remove Pilcrows and Trailing Spaces

Search: ^.{1,140}$

Results: 18445 verses
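The length check can be sketched in Python like this. The `clean` helper is an assumed approximation of the Remove Pilcrows and Trailing Spaces filter, not the app's actual behavior.

```python
import re

# Same pattern as the search above: the whole line must be 1 to 140 characters.
TWEETABLE = re.compile(r"^.{1,140}$")

def clean(line):
    """Rough stand-in for the 'Remove Pilcrows and Trailing Spaces' filter."""
    return line.replace("¶", "").rstrip()

def is_tweetable(line):
    """True if the cleaned line (reference included) fits in 140 characters."""
    return TWEETABLE.search(clean(line)) is not None

print(is_tweetable("John 11:35 Jesus wept. ¶ "))
```

The filter matters because a trailing pilcrow or space would otherwise count toward the 140 characters.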

 

The statistics table on the right side of the search window shows the count and length of each hit, so you can do statistical studies on letters, words, or phrases.

 

Count Hebrew Vowels in Psalm 22

Text Module: HMT-W4

Range: Ps 22

Filters: Remove Cantillation

Search: .

Result:

http://darinfranklin.github.io/RegexForAccordance/images/Screen%20Shot%20Hebrew%20Vowels.png
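A table of this kind can be approximated in Python with a `Counter`. This is a sketch under assumptions: cantillation is taken to mean the accents U+0591..U+05AF, and the sample is the phrase "Eli, Eli" from Ps 22:2 typed by hand as escapes, so the exact accents are illustrative.

```python
import re
import unicodedata
from collections import Counter

# Assumption: "cantillation" means the Hebrew accents U+0591..U+05AF.
CANTILLATION = re.compile("[\u0591-\u05AF]")

def hit_statistics(text, pattern="."):
    """Count each distinct hit of `pattern` after removing cantillation marks."""
    cleaned = CANTILLATION.sub("", text)
    return Counter(re.findall(pattern, cleaned))

# alef, tsere, lamed, hiriq, accent, yod (twice, with different accents)
sample = "\u05D0\u05B5\u05DC\u05B4\u05A3\u05D9 \u05D0\u05B5\u05AD\u05DC\u05B4\u05D9"

stats = hit_statistics(sample)
for hit, count in stats.most_common():
    # One row per distinct hit: character name, count, length of the hit.
    print(unicodedata.name(hit, "?"), count, len(hit))

# Restricting the pattern to the vowel points U+05B0..U+05BB counts vowels only.
vowels = hit_statistics(sample, "[\u05B0-\u05BB]")
```

Searching for `.` makes every character its own hit, so the table ends up listing each consonant and vowel point with its frequency.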

 

RegexForAccordance uses ICU Regular Expression syntax, provided by Apple.

 

Find Palindromes in Hebrew

(similar search in Accordance: http://www.accordancebible.com/Friday-Fun-Hannah-Is-A-Palindrome )

Text module: HMT-W4

Filter: Remove Cantillation and Points

Search: \b(?:(\w)\w?\1|(\w)(\w)\w?\3\2|(\w)(\w)(\w)\w?\6\5\4)\b

Result: 102 distinct hits

Prov 30:1 דברי ׀ אגור בן־יקה המשא נאם הגבר לאיתיאל לאיתיאל ואכל׃
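The same pattern runs unchanged in Python's `re` module, which makes it easy to see the difference between hits and distinct hits. A minimal sketch using the Prov 30:1 line quoted above, with points and cantillation already removed:

```python
import re

# Same pattern as the search above: palindromes of 2-3, 4-5, or 6-7 letters.
PALINDROME = re.compile(
    r"\b(?:(\w)\w?\1|(\w)(\w)\w?\3\2|(\w)(\w)(\w)\w?\6\5\4)\b"
)

# Prov 30:1 as quoted above (no points, no cantillation).
verse = "דברי ׀ אגור בן־יקה המשא נאם הגבר לאיתיאל לאיתיאל ואכל׃"

hits = [m.group(0) for m in PALINDROME.finditer(verse)]
distinct = sorted(set(hits))
print(len(hits), "hits,", len(distinct), "distinct")
```

Each alternative mirrors its captured letters around an optional middle letter, so the pattern as written tops out at seven-letter words.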

 

Find five consecutive words which begin with the letters אלהימ

Text module: HMT-W4

Filter: None

Search: \bא\w*\W+ל\w*\W+ה\w*\W+י\w*\W+מ\w*

Result: 2 verses

Zech 4:13 וַיֹּ֤אמֶר אֵלַי֙ לֵאמֹ֔ר הֲל֥וֹא יָדַ֖עְתָּ מָה־אֵ֑לֶּה וָאֹמַ֖ר לֹ֥א אֲדֹנִֽי׃

Lam 2:13 מָֽה־אֲעִידֵ֞ךְ מָ֣ה אֲדַמֶּה־לָּ֗ךְ הַבַּת֙ יְר֣וּשָׁלִַ֔ם מָ֤ה אַשְׁוֶה־לָּךְ֙ וַאֲנַֽחֲמֵ֔ךְ בְּתוּלַ֖ת בַּת־צִיּ֑וֹן כִּֽי־גָד֥וֹל כַּיָּ֛ם שִׁבְרֵ֖ךְ מִ֥י יִרְפָּא־לָֽךְ׃ ס
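A Python sketch of this search on the Zech 4:13 line above. One difference to be aware of: Python's `\w` does not match Hebrew points or accents, so the sketch strips all combining marks first. The `strip_marks` helper is my own assumption and is not needed in the app, which handles the pointed text with no filter.

```python
import re
import unicodedata

def strip_marks(text):
    """Remove points and accents (combining marks), keeping letters and maqqef."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

# Same pattern as the search above: five words starting with א, ל, ה, י, מ.
ACROSTIC = re.compile(r"\bא\w*\W+ל\w*\W+ה\w*\W+י\w*\W+מ\w*")

verse = "וַיֹּ֤אמֶר אֵלַי֙ לֵאמֹ֔ר הֲל֥וֹא יָדַ֖עְתָּ מָה־אֵ֑לֶּה וָאֹמַ֖ר לֹ֥א אֲדֹנִֽי׃"  # Zech 4:13

match = ACROSTIC.search(strip_marks(verse))
words = re.findall(r"\w+", match.group(0))
print(words)
```

The `\W+` between words accepts a space or a maqqef, which is why a hyphenated word can supply one of the five letters.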

 

I hope that some of you will find this useful.

It's free and open source. Download it from GitHub.

 

http://darinfranklin.github.io/RegexForAccordance/


Hi Darin, really like this.

 

Just a q’n. When I do the palindrome search, I get thousands of hits, not just 102. I can’t see that my setup is any different. Thoughts?


What a fun project. I look forward to checking it out, and I have several implementation questions in mind.

Thanks for sharing.


On more than one occasion I have wanted to be able to do this. Nice one.

I'll have to look back over the cases I had.

 

Thx

D


Of course, example no. 1 can already be done with "*ημι".

 

But the regex really opens up some quirky and interesting possibilities.


It'd be great to see this implemented in Windows as well. Would anyone know how to get chunks of Unicode text by script from Accordance for Windows?


So, when I use "\b3:16\b" with ESVS, it crashes every time.


This is pretty slick. Thanks for sharing.


When I do the palindrome search, I get thousands of hits, not just 102. I can’t see that my setup is any different. Thoughts?

 

The palindrome search should show "4358 hits | 3666 verses" on the left side, and "102 distinct hits" on the right.


So, when I use "\b3:16\b" with ESVS, it crashes every time.

 

Hmm...it works on my computer (Yosemite 10.10.1). I know that it will crash Accordance with NRSVS when you do the range Gen-Rev, but ESVS is working fine. I will debug this later. Thanks.


 

The palindrome search should show "4358 hits | 3666 verses" on the left side, and "102 distinct hits" on the right.

 

This morning I get 3414 hits | 2947 verses, with 89 distinct hits.

 

Here is my setup

 

post-29509-0-06589200-1418329790_thumb.png post-29509-0-10624400-1418329803_thumb.png


 

This morning I get 3414 hits | 2947 verses, with 89 distinct hits.

 

Here is my setup

 

Screen Shot 2014-12-12 at 07.29.18.png, Screen Shot 2014-12-12 at 07.27.51.png

 

The program factors in the time of day in producing the results.


 

This morning I get 3414 hits | 2947 verses, with 89 distinct hits.

 

Here is my setup

 

Screen Shot 2014-12-12 at 07.29.18.png, Screen Shot 2014-12-12 at 07.27.51.png

 

 

You have a leading space in front of the first \b, so you miss the words at the beginning of a line or after a maqqef.
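The effect of that stray space is easy to reproduce. Here is a small Python sketch using only the shortest branch of the palindrome pattern and a made-up line with palindromes at the start of the line, after a maqqef, and after a space:

```python
import re

GOOD = re.compile(r"\b(\w)\w?\1\b")         # pattern as intended
WITH_SPACE = re.compile(r" \b(\w)\w?\1\b")  # same pattern with a leading space

# Made-up line, not a verse.
text = "היה דבר־הנה אל היה"

def hits(pattern, line):
    """Return the matched words, trimming any matched leading space."""
    return [m.group(0).strip() for m in pattern.finditer(line)]

print(len(hits(GOOD, text)), len(hits(WITH_SPACE, text)))
```

Only the word preceded by a literal space survives with the second pattern, which is the kind of shortfall seen in the lower hit count.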


So, when I use "\b3:16\b" with ESVS, it crashes every time.

 

Hi Ken. I fixed the problem. The Accordance preferences for European verse notation and SBL standard abbreviations affect the verse reference format in the AppleScript output. I am using SBL, and I suspect that you are not. I added support for all combinations. You may download version 1.0.1 now. Thanks.


Thanks, Darin, for both explanations, though regarding the second, I am not using European notation. SBL, always.

 

post-29509-0-16776100-1418363648_thumb.png

 

However, you have fixed the crash! Thanks


 

The program factors in the time of day in producing the results.

 

Joe! I never knew! That’s brilliant!


Darin this is so great! Thank you for the work you put into it and for making it free!


I'm very impressed with this app, Darin! Thank you for taking the time to make this.

 

With kind regards

 

Peter Christensen

 

Oh, and for those who want to try out the app, but don't know how to work with regular expressions, the article on Wikipedia explains it nicely:

http://en.wikipedia.org/wiki/Regular_expression


  • 1 month later...

Would the kind folks at Accordance please sherlock this, and do the honors of making this a feature of Accordance itself? I've been longing for advanced search features in the notes (as well as texts). Excellent idea, and nicely done.

Share on other sites

Would the kind folks at Accordance please sherlock this, and do the honors of making this a feature of Accordance itself? I've been longing for advanced search features in the notes (as well as texts). Excellent idea, and nicely done.

 

 

+1

 

This would be a really great feature.


Would the kind folks at Accordance please sherlock this, and do the honors of making this a feature of Accordance itself? I've been longing for advanced search features in the notes (as well as texts). Excellent idea, and nicely done.

+1

 

Matt


  • 1 year later...

In your other post, I think you are asking how to search for two words in the same verse which have the same letters, but not in the same order. You can do this with regular expressions, but there are some complications.

 

In RegexForAccordance, click Filters and remove cantillation and points.

The commands to know for this search are

\b for word boundary 

\w for word character (i.e., not a space or punctuation).

 

Search for a two-letter word: \b\w\w\b

Now we want to capture the two letters and match another word with those same letters. Parentheses capture, and \1 and \2 represent what was captured.  The .* matches 0 or more of any character. 

\b(\w)(\w)\b.*\b\2\1\b

 

If you search Genesis, you will find a lot of hits with אל... לא.  In fact, the only one that isn't like that is your example of Gen 38:7:

ער בכור יהודה רע
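The capture-and-reverse step can be checked in Python as a rough sketch. It uses the Gen 38:7 fragment quoted above, with cantillation and points already removed.

```python
import re

# Same pattern as above: a two-letter word, then later the same letters reversed.
REVERSED_PAIR = re.compile(r"\b(\w)(\w)\b.*\b\2\1\b")

line = "ער בכור יהודה רע"  # Gen 38:7 fragment quoted above

m = REVERSED_PAIR.search(line)
first_word = m.group(1) + m.group(2)   # the captured two-letter word
second_word = m.group(2) + m.group(1)  # what \2\1 had to match
print(first_word, second_word)
```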

 

Notice that it does not find Gen 6:8.

 

ונח מצא חן בעיני יהוה׃ פ

 

That is due to two complications.

 

1. נח does not begin at a word boundary because of the ו prefix.  

2. חן has a final form nun.  Final form and medial form are different Unicode characters, so they do not match each other.

 

You could add a ו to fix the first problem:

\bו(\w)(\w)\b.*\b\2\1\b

 

That finds more hits with ואל...לא

 

To account for final forms, you could search each one separately.

\bונ(\w)\b.*\b\1ן\b

 

That finds Gen 6:8 and nothing else in the whole OT. 
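Both approaches can be sketched in Python. The first pattern is the one given above. The second is an alternative of my own, not a feature of the app: fold the five final forms to their medial forms before searching, so that one general pattern covers them all. The optional ו prefix in `GENERAL` is likewise an assumption, and it makes the pattern broader than the original.

```python
import re

GEN_6_8 = "ונח מצא חן בעיני יהוה׃ פ"  # as quoted above, points removed

# Pattern from the post: ו prefix, נ, one captured letter, then letter + final nun.
SPECIFIC = re.compile(r"\bונ(\w)\b.*\b\1ן\b")

# Alternative (assumption): map final forms to medial forms before matching.
FINAL_TO_MEDIAL = str.maketrans("ךםןףץ", "כמנפצ")
GENERAL = re.compile(r"\bו?(\w)(\w)\b.*\b\2\1\b")

def find_general(line):
    """Search the line after folding final letters to their medial forms."""
    return GENERAL.search(line.translate(FINAL_TO_MEDIAL))

print(SPECIFIC.search(GEN_6_8).group(1))
print(find_general(GEN_6_8).group(1, 2))
```

The folded text is only for matching; show the original line to the reader so the final forms stay intact.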

 

This search can be extended to three-letter words by adding one more (\w) to the captures and a \3 to the backreferences.

 

\b(\w)(\w)(\w)\b.*\b\3\2\1\b

 
Then you will probably want to check for the other permutations, not just reverse order.

\b(\w)(\w)(\w)\b.*\b\2\3\1\b

\b(\w)(\w)(\w)\b.*\b\1\3\2\b
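Writing out every ordering by hand gets tedious, so here is a Python sketch that generates them. The `permutation_patterns` function is a hypothetical helper, not part of RegexForAccordance, and the sample line is made up.

```python
import re
from itertools import permutations

def permutation_patterns(n):
    """Build one pattern per reordering of n captured letters (identity skipped)."""
    head = r"\b" + r"(\w)" * n + r"\b"
    identity = tuple(range(1, n + 1))
    patterns = []
    for order in permutations(identity):
        if order == identity:
            continue  # same order would only find the word repeated
        tail = "".join("\\" + str(i) for i in order)  # e.g. \3\2\1
        patterns.append(head + r".*\b" + tail + r"\b")
    return patterns

patterns = permutation_patterns(3)
sample = "שמר את רמש"  # made-up line: second word's letters reversed in the third
print(len(patterns), any(re.search(p, sample) for p in patterns))
```

For three letters this gives five patterns, including the three listed above. Keep n below 10 so that the numbered backreferences stay unambiguous.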

 
 
 
 

 


  • 2 weeks later...

This is great! Thanks. I wish this could be integrated into Accordance searches! The best I can come up with is searching for two words that contain the same characters, but I can't specify the order. Is there any workaround within the Accordance lex data?

 

 

Thanks!


  • 11 months later...

Hello Darin 

 

I have a request. After running the same search in RegexForAccordance and in Accordance, I got different hit counts.

 

See, for example, https://www.accordancebible.com/forums/topic/20885-searching-for-a-word-based-upon-its-location-in-the-verse/?p=101503 and post-32723-0-99530800-1489417837_thumb.png, and likewise in Mark. I have seen that in various verses the ESV has a superscript after the verse number. I would love to have a filter for this.

 

Greetings

 

Fabian

