
What does 'Search both Directions' mean for Syntax?


Peter Bekins


I asked this as a secondary question in my previous post, but the thread concentrated on my first question. What counts as 'both directions' when you have three items in your search?

 

I built three searches for a Hifil verb + 2 Complements (see attached workspace). For each search I ran it as unidirectional and bidirectional. I cannot make sense out of the results. 

 

Hifil + Comp + Comp: uni = 1098 hits and bi = 1105 hits

Comp + Comp + Hifil: uni = 26 hits and bi = 1105 hits

 

So 'search both directions' yields the same result (1105) for both orders, but the two unidirectional counts add up to 1098 + 26 = 1124.

 

For Comp + Hifil + Comp, uni = 258 and bi = 230! Why does 'search both directions' result in fewer hits than a unidirectional search?

 

Pete

 

2CompsHifil.accord.zip 

 

 


Hi Peter,

 

  This looks like a hit counting issue. The references in both CHC (Comp + Hifil + Comp) uni and CHC bi are identical. Both queries return 224 hit verses but different hit counts.

 

Thx

D


Pete,

It's a good question, but I can't answer it.

And Daniel's observation is one I've made often to myself -- how the hits are counted is sometimes a mystery to me (I'm sure there's a logical explanation, but I don't know it), but when I check the actual verse references, they are the same.


Rob,

 

That was a follow-up question I was going to ask. How do you check the verse references? Do you just skim through, or do you have a way to automate it? For ~200 it is no big deal, but for ~1000 it would be nice to be able to run a comparison.

 

Pete


I'm sure there's a slicker way to do it, but I just export the verse references into two text files and use BBEdit to compare the two front windows. It highlights every difference. Doesn't take very long.


That is what I was thinking, but I never paid for BBEdit. I'll see if TextWrangler can do it, otherwise I'll have to see if I can write a script.


I validated the verse refs in two ways.

 

  1. I did a [CONTENTS uni] <NOT> [CONTENTS bi] and then the reverse, where uni contained the results of the uni-directional query and bi the results of the search both directions query. Both returned no hits.

  2. I selected all results (cmd-A) in each result tab and did a Copy as References to a file in vi. Then I diffed the two files in a terminal window (diff f1 f2). That showed 0 differences.
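For anyone who would rather script the comparison than use an editor or diff, here is a minimal sketch in Python. It assumes the references were exported to plain text with one reference per line (the actual export layout may differ), and the names `load_refs` and `compare_refs` are made up for this example. It mirrors check 1 above by taking the set difference in both directions, so unlike `diff` it ignores ordering and repeated references.

```python
from pathlib import Path


def load_refs(path):
    """Read a file of exported references, assuming one reference per line."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare_refs(refs_a, refs_b):
    """Return (only_in_a, only_in_b), ignoring order and repeated references."""
    set_a, set_b = set(refs_a), set(refs_b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Placeholder references for illustration only, not real search results.
uni = ["Ref 1", "Ref 2", "Ref 3", "Ref 3"]
bi = ["Ref 3", "Ref 2", "Ref 4"]

only_uni, only_bi = compare_refs(uni, bi)
print("only in uni:", only_uni)
print("only in bi:", only_bi)
```

With real exports you would replace the placeholder lists with something like `compare_refs(load_refs("f1"), load_refs("f2"))`. Two empty lists mean the two searches hit the same set of verses, whatever the hit counts say.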

 

 

And Daniel's observation is one I've made often to myself -- how the hits are counted is sometimes a mystery to me (I'm sure there's a logical explanation, but I don't know it), but when I check the actual verse references, they are the same.

 

While it is nice to know that I'm not alone, it's a little disturbing. Because of this I tend to ignore the hit count except as a rough guide. I would love to understand better what it means.

 

Thx

D


Pete,

 

TextWrangler has the same "compare two front documents" feature.

 

Daniel,

 

The hit count is fundamentally morph-based, so it's a bit wonky with the syntax. The number of verses (vs. the number of hits) is often a better guide when I'm looking at large sets to compare. For fully accurate results, I always count manually.

 

In principle, though, I can't impress this on my students enough -- always do the manual work. Someone asked about this at the Accordance session at SBL and I was a bit disturbed by the discussion. Even with thousands of examples, if I were an external reviewer or if my student were authoring such a study, I wouldn't recommend it for publication if it wasn't clear that the search results were double-checked by the author. I want to know that the examples have been confirmed by the author's particular understanding of (Hebrew) grammar, not based on the options of the one who created the database being used. I'm suspicious enough not even to trust my own work! Some days and some cups of coffee are better than others; I am constantly rechecking decisions.

Edited by Robert Holmstedt

Hey Robert,

 

  I agree that hit counts are way more problematic for syntax searches. The hit count is mostly intelligible for morph searches. That said, I think it would be possible to come up with an intelligible and useful model for counting syntax search hits. The most obvious would be to count hit constructions overall, essentially counting at the topmost level of the search. An alternative would be to count the leaf elements matched. Another would be to count phrases and words separately, again by terminal elements. It could be made fairly complex, but I don't immediately see the use for that.
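To make the difference between these models concrete, here is a rough sketch in Python. Everything in it is hypothetical: the `Node` and `Match` classes, the labels, and the placeholder verses are invented for illustration, and none of it reflects how Accordance actually counts hits. It only shows how the same set of matches can give one number when counted at the topmost level, another when counted by leaf elements, and a third when counted by verse.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One element of a matched construction (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self):
        """Return the terminal elements (words) under this node."""
        if not self.children:
            return [self]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found


@dataclass
class Match:
    """A single matched construction and the verse it occurs in."""
    verse: str
    root: Node


def words(n):
    """Build n placeholder word nodes."""
    return [Node("word") for _ in range(n)]


def clause(comp1_words, comp2_words):
    """A verb plus two complements with the given numbers of words."""
    return Node("Clause", [
        Node("Verb"),
        Node("Complement", words(comp1_words)),
        Node("Complement", words(comp2_words)),
    ])


# Invented matches: two constructions in one verse, one in another.
matches = [
    Match("Verse A", clause(2, 1)),
    Match("Verse A", clause(1, 1)),
    Match("Verse B", clause(3, 1)),
]

constructions = len(matches)                              # topmost level
leaf_hits = sum(len(m.root.leaves()) for m in matches)    # terminal elements
verses = len({m.verse for m in matches})                  # distinct verses

print(constructions, leaf_hits, verses)
```

Under these assumptions the verse count stays the same however the constructions are counted, which fits the observation earlier in the thread that the verse lists agree even when the hit counts do not.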

 

  On the manual checking thing, this is absolutely true and I have no argument here. It is one reason why fuzzy searches, of various types, are still useful. It is necessary to cross-check the results in a variety of ways to ensure accuracy. And absolutely, the results themselves need to be reviewed manually. The result sets are not so large that this is really intractable. Tagging errors do exist and opinions vary.

 

Thx

D


I agree. I just haven't pushed the issue. There are more important syntax searching issues to address. Someday ...

Edited by Robert Holmstedt
