News, How-tos, and assorted Views on Accordance Bible Software.

Wednesday, July 11, 2007  

Lexical Forms, Inflected Forms, and Cantillation, Oh My!

An Accordance user recently blogged about a Hebrew search he tried to do in Accordance that didn't get him the results he expected. He was looking at a Qumran fragment which supposedly contained a quote from the Bible, and wanted to find that quotation by searching the Hebrew Bible in Accordance for a sequence of letters contained in the fragment. Using wildcard symbols and a search command, he constructed the following search:

Basically, he wanted to find a word ending in either shin-tav or sin-tav, followed by another word beginning with tav. There are a number of ways he could have constructed this search, but the way he chose should have worked just fine.

When he performed the search, however, Accordance only returned two results, even though he was aware of others that should have been found. In looking at some of the clear examples which got missed, he noted the presence of certain cantillation marks which he assumed were throwing off the search. So he asked the readers of his blog how to get Accordance to ignore the cantillation marks and just do a consonantal search.

There are a couple of lessons to be learned here:

  • First, know where to get help. If you want an answer to a question about Accordance, you're better off turning to the Accordance Forums rather than to the blogosphere in general. Most of the comments on this user's blog post amounted to something along the lines of, "I'm a PC user who doesn't know anything about Accordance, but . . ."
  • Second, even power-users make mistakes. I point this out to make you first-year Hebrew students feel better that you're not the only ones who get confused. Professors and power-users sometimes do, too.
  • Third, it's important to understand the distinction between lexical forms and inflected forms.
    The reason this search failed to find every occurrence of that string of letters had nothing to do with vowel pointing or cantillation marks. Accordance ignores those marks unless you specifically indicate that you want them to be considered. The real reason he didn't get the results he expected was that he was searching for lexical forms when he really should have been searching for inflected forms.
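As an aside on those marks: the idea of a search that ignores vowel points and cantillation can be sketched in a few lines of Python. This is only an illustration of the concept, not how Accordance does it internally. The function name `consonantal` and its `keep_sin_shin` option are made up for this sketch, which relies on the fact that Unicode classifies Hebrew points and accents as non-spacing marks (category `Mn`).

```python
import unicodedata

SHIN_DOT = "\u05c1"  # the dot that marks shin
SIN_DOT = "\u05c2"   # the dot that marks sin


def consonantal(text, keep_sin_shin=True):
    """Strip vowel points and cantillation marks from Hebrew text.

    Hypothetical helper for illustration only. With keep_sin_shin=True the
    sin and shin dots survive, so the two letters stay distinct.
    """
    # Decompose first so precomposed presentation forms split into base + marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = []
    for ch in decomposed:
        is_mark = unicodedata.category(ch) == "Mn"
        if is_mark and not (keep_sin_shin and ch in (SHIN_DOT, SIN_DOT)):
            continue  # drop the point or accent
        kept.append(ch)
    return "".join(kept)


# beth + dagesh + qamats, resh + qamats + an accent (tipcha), aleph
pointed_bara = "\u05d1\u05bc\u05b8\u05e8\u05b8\u0596\u05d0"
print(consonantal(pointed_bara))  # the bare consonants beth-resh-aleph
```

Passing `keep_sin_shin=False` would also drop the sin and shin dots, leaving the unpointed letter that stands for either one.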

    By "lexical form" (or sometimes "lemma"), we mean the dictionary form of the word as representative of all forms of the word. For example, if you enter beth-resh-aleph in the Search window and click OK, Accordance will find all occurrences of the Hebrew word bara, no matter what particular inflected form it happens to take (yivra, barati, etc.). This would be like searching for "run" in English and finding "runs," "running," "ran," etc.

    In Accordance, if you want to find a particular inflected form of a word, you need to enter that form and enclose it in quotation marks. Here's how I would search for the inflected form yivra:

    If I were to enter this form without the quotes, Accordance would look for a lexical form with the spelling yodh-beth-resh-aleph, and give me an error message if no such lexical form exists.

    It's important to understand that inflected forms are the words as you see them in the text, while lexical forms represent every form of the word. Beginning users will sometimes try to copy an inflected form from the text, paste it into the argument entry box, and then wonder why they get an error message telling them no such lexical form can be found.
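To make the distinction concrete, here is a toy model in Python of a tagged text and a search that treats unquoted input as a lexical form and quoted input as an inflected form. It is a minimal sketch of the behavior described above, not Accordance's actual implementation. The `Token` class, the `search` function, and the four-word sample text (in transliteration) are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    inflected: str  # the word as it appears in the text
    lemma: str      # the dictionary (lexical) form it is tagged with


# A tiny made-up "tagged text"; real tagged texts carry far more data.
TEXT = [
    Token("bara", "bara"),
    Token("yivra", "bara"),
    Token("barati", "bara"),
    Token("elohim", "elohim"),
]


def search(tokens, query):
    """Return the positions of matching words.

    A query in double quotes matches inflected forms exactly; an unquoted
    query is looked up as a lexical form and fails if no such lemma exists.
    """
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        form = query[1:-1]
        return [i for i, t in enumerate(tokens) if t.inflected == form]
    if query not in {t.lemma for t in tokens}:
        raise LookupError(f"no lexical form {query!r} found")
    return [i for i, t in enumerate(tokens) if t.lemma == query]


print(search(TEXT, "bara"))     # every form of the lemma: [0, 1, 2]
print(search(TEXT, '"yivra"'))  # only that inflected form: [1]
```

Calling `search(TEXT, "yivra")` without the quotes raises `LookupError` in this sketch, which mirrors the "no such lexical form" message beginning users run into.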

    In the same way, the user searching for a phrase from a Qumran fragment entered the text as it appeared in the fragment (inflected forms), but because he did not enclose the phrase in quotes, Accordance assumed he was searching for lexical forms. Here's how his search should have looked:

    By the way, the reason this user entered two phrases and joined them with an OR command is that he wanted to find words containing either a sin or a shin. Did you know that typing shift-C in Accordance's Hebrew font will enter an unpointed sin/shin character? Thus, a simpler way to enter this search would be like this:
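For readers who think in code, the logic of that simpler search (a word ending in unpointed sin/shin plus tav, followed by a word beginning with tav) might be sketched in Python like this. It assumes the words have already been reduced to bare consonants, and the function name `find_pairs` and the three sample words are made up for illustration. They are not taken from the Qumran fragment or from any particular verse.

```python
SHIN = "\u05e9"  # unpointed letter, standing for either sin or shin
TAV = "\u05ea"


def find_pairs(words):
    """Return the index of each word that ends in sin/shin + tav and is
    immediately followed by a word beginning with tav.

    Assumes `words` is a list of unpointed (consonantal) Hebrew words.
    """
    hits = []
    for i in range(len(words) - 1):
        if words[i].endswith(SHIN + TAV) and words[i + 1].startswith(TAV):
            hits.append(i)
    return hits


# Toy input: aleph-shin-tav, tav-vav-resh-he, beth-resh-aleph
sample = ["\u05d0\u05e9\u05ea", "\u05ea\u05d5\u05e8\u05d4", "\u05d1\u05e8\u05d0"]
print(find_pairs(sample))  # [0]
```

Because the unpointed letter covers both sin and shin, one pattern does the work of the two phrases joined by OR.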

    In this post, I've talked at length about the distinction between lexical and inflected forms. Understanding this simple distinction, and knowing that Accordance defaults to searching for lexical forms, will help you avoid confusion when constructing Hebrew searches. (This distinction is also important when searching Greek texts, but for some reason, users seem more likely to get confused when working with Hebrew.)

    It seems to me this area of Accordance is badly designed, because the results you get will depend on whether you happen to be using a tagged or untagged text. You've got to know whether your text is tagged before you can decide how to enter your query. If you reuse a query on a different kind of text (tagged or untagged), the results will be different.

    A better decision would be to require some kind of explicit marking in the query that indicates that you want this search to do something not normal (lexical form searching). Something like [email protected][*] perhaps. Then if you use it on the wrong kind of text it can give an error.

    Is it really that hard to figure out if a text is tagged or not? And if a text is tagged, then the "marking" is the quotation marks (if you're re-using a search from an untagged text). I'm really not seeing the difficulty here. In fact, having seen the complicated method of defining searches in Logos and BibleWorks with all kinds of non-intuitive markers, Accordance is by far the best in this area of design.

    A few more things, adding to what Rob said. First of all, the chances seem quite low that you will be mixing tagged and non-tagged searches *of the same language*. No English Bibles are tagged (Strong's numbers don't count; that is a different type of tagging). Then, the chances are fairly low that someone who has tagged Greek or Hebrew texts will also have non-tagged Greek or Hebrew texts in common usage (a bit more likely in Greek than in Hebrew, but it still stands as relatively unlikely). Thus, someone reusing a query will either be running it against the same kind of text (still tagged or still untagged), or will need to modify it anyway to account for the different language.

    Secondly, the disparity disappears for one direction of the search types. If you are searching for an inflected form in the tagged text, it already will be surrounded by quotation marks, which Accordance has no problem with in the non-tagged text, so the query is still valid. Only if you do a search in a non-tagged text, then take it to a tagged text do you need to add quotations to keep it valid.

    To add on directly to what Rob said, quotation marks are far clearer than any indication of searching for a lexical form. Quotes clearly state "This is what I want exactly!" (like my written pun?) When I look at your [email protected][*] example, I am highly confused as to what you mean by the brackets. I would guess that you want the word when it occurs anywhere in the entire text, not that you want the lexical form. You could give me 20 examples, all of which may work, but which generally don't really mean lexical form. That is what makes Accordance searching clear: you can just read the search query and, knowing nothing about Accordance syntax, still have a perfect understanding (in nearly every case) of what is being searched.

    The point is, one query yields two different results. This is, to be blunt, BAD. Same book, same query, two results. I shouldn't need to enumerate how many ways this is bad, whether it be user interface consistency, confused users, invisible behavior, accidental incorrect results.

    Yes, it can be hard to know what texts are tagged unless you are careful and yes one typically DOES have and use both tagged and untagged texts. An example is GNT-T vs NA27-GBS. You need them both, you use them both.

    Or else someone might have a tagged NT but not own the tagged LXX. So they do the same search on both with cut and paste - ouch, invisible inconsistent results.

    The lack of a single-book Greek Bible (NT + LXX), like the one BibleWorks has, makes it all the more likely that users will encounter this problem. Why not have the text of the early church as one book?

    Pick a different syntax if you like, but consistent behaviour is a hallmark of good design.


    One query yields two different results because it is run on different types of data. There is nothing wrong with that at all. All databases function that way.

    To find out if a text is tagged, simply pass your cursor over any word in that text and look in the instant details box. If there is grammatical data, it's tagged; if not, it's not. There isn't anything difficult about that.

    Accordance searches are always consistent when run on the same type of data. Moreover, I would argue that they not only are consistent, but intuitive. To be honest, your suggestions are rather counter-intuitive. I would venture to say that most users search tagged texts for lexical forms more frequently than inflected forms, which is a more special type of search.

    If a user does not know what a tagged text is, then I could see your concerns being valid. In this case, the solution is not to change the way Accordance searches so an ignorant user doesn't get confused, but to educate the user on what a tagged text is so he gets the full benefit of using Accordance (insert plug for Training DVD here). Anyone who knows what a tagged text is will not run into the difficulties you mention, and when a user is unsure if a text is tagged, the instant details box will resolve that dilemma instantly and painlessly.

    I agree that having a tagged Greek Bible module would be convenient. You can use a text set as a workaround until Oak Tree decides to make such a module available, if they in fact do so.

    Finally, it would be nice if you posted who you are rather than remaining anonymous.
