Hebrew construct count bug?


4 replies to this topic

#1 Steven van der Hoeven

    Bronze

  • Active Members
  • 69 posts
  • Gender:Male
  • Location:The Netherlands
  • Accordance Version:10.x

Posted 04 October 2013 - 02:32 PM

Win7-64 Acc 10.3.0 (applies to mac also)

 

When doing a Hebrew construct search for nouns that appear >2000 times I find the following:

1) The count input boxes are too small to contain numbers with more than three digits.

1b) IMO, having to add an upper bound for a 'greater than' search is not ideal. A small redesign with check boxes would look better, but that's a matter of taste.

2) When checking the count results in analysis I see 'lo' counted as 4. Looks like a bug, since it is <2000?

#2 Joel Brown

    Administrator

  • Admin
  • 5,692 posts
  • Gender:Male
  • Location:Houston, TX
  • Accordance Version:12.x
  • Platforms:Mac OS X, Windows

Posted 06 October 2013 - 07:55 AM

OK, there are lots of things going on here :)

 

1) Yup, we'll bump up the dialog size.  The Mac gave you four characters, but regardless, it's too small.

1b) We've had other requests for this, so we'll definitely be doing it in the future.  I doubt it will make it into the 10.3.1 bug fix release, but it's on our list.

2) This is an interesting one.  Technically, it's working correctly, but differently from what you expect.  The [COUNT] command means to give you the lemmas (or inflected words, or tag combinations) that occur x-y times.  What you said in your construct is "Give me the Lemmas that occur > 2000 times, that are also Nouns."  'lo' matches these criteria: it occurs well over 2000 times, but only 4 of those hits are Nouns (the rest are Particles).  It seems like you are trying to search instead for "Give me the Nouns that occur over 2000 times", which is a totally different way of doing or thinking about COUNT.  The easiest way to do this is simply a [NOUN] search, then looking at analysis to grab the words over a specific number of hits.  Or, use your search, but be aware of the subtle difference in the results.
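The difference between the two readings can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how Accordance implements COUNT; the lemma names, tags, and frequencies in `corpus` are invented toy values, not real counts from the Hebrew text.

```python
from collections import Counter

# Toy corpus of (lemma, part_of_speech) tags. The numbers are invented for
# illustration and are NOT real counts from any Hebrew text.
corpus = (
    [("lo", "particle")] * 2500
    + [("lo", "noun")] * 4
    + [("melek", "noun")] * 2100
    + [("bayit", "noun")] * 1500
)

THRESHOLD = 2000

# Total occurrences of each lemma, regardless of part of speech.
lemma_totals = Counter(lemma for lemma, _ in corpus)

# Occurrences of each lemma that are tagged as a noun.
noun_totals = Counter(lemma for lemma, pos in corpus if pos == "noun")

# Reading A (COUNT as described above): noun hits whose lemma occurs more
# than THRESHOLD times overall, whatever its tags.
count_then_noun = {
    lemma: n for lemma, n in noun_totals.items()
    if lemma_totals[lemma] > THRESHOLD
}

# Reading B (what the original poster expected): lemmas that occur more than
# THRESHOLD times *as nouns*.
noun_then_count = {
    lemma: n for lemma, n in noun_totals.items() if n > THRESHOLD
}

print(count_then_noun)   # includes 'lo' with only 4 noun hits
print(noun_then_count)   # 'lo' is absent
```

In reading A the frequency test is applied to the whole set of occurrences of a lemma, so 'lo' passes it and then shows up with its 4 noun hits. In reading B the test is applied to the noun subset only, so 'lo' drops out.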

 

It's an interesting question!  We've had COUNT for years, and none of us could recall a time when someone was blocked by this particular nature of the feature, until now :)


Joel Brown

By day: Lead Software Engineer at Accordance
By night: Freelance Trombonist


#3 Steven van der Hoeven


Posted 06 October 2013 - 02:49 PM

Thanks for your clarification. I am just an ignorant amateur, so most times I am wrong. But this helps learning :)

 

You are right, I want to search for nouns that occur over 2000 times. When defining the search in the Hebrew construct, I initially thought the second argument acted on the subset of nouns formed by the first argument, and hence would narrow the possible results. But your explanation makes sense. I need to think through the implications of applying constraints only to the complete set and not to subsets. Normally it will give the expected results, but in some cases subsets will overlap, as happened here.

 

Nevertheless, I would like to see the results of 'nouns that occur over 2000 times' as highlighted results in the Bible text. Would you recommend the following (screenshot)?

1) Set up search * @[noun] or set up construct with noun as search argument.

2) Results are shown in tab

3) Get [HITS] and add @ [count 2000-upper]

4) Results list is correct and is shown in text.

 

Thanks.

 

 


#4 Joel Brown


Posted 06 October 2013 - 03:57 PM

Unfortunately, your search doesn't really cut it, and I don't think this is currently possible in Accordance.  First, if you're not using a construct, all you need is [NOUN], rather than *@[NOUN].  More importantly, your end result highlights all words that occur more than 2000 times and that also sometimes occur as a NOUN.  Remember, the 'HITS' command gives you all of the words that were found in your search, not their specific instances.  So, by the end of your steps you are again saying "Give me all words that occur over 2000 times that are sometimes represented as Nouns", which you can confirm because 'lo' is still on the list.
 
Really, I don't think we have a way to search specifically for Nouns that occur over 2000 times.  Your best bet is to start with [NOUN]@[COUNT 2000-99999], and then if you notice a problem, tweak your search.  So, since you know 'lo' is a spurious hit (due to this intricacy), do a search for [NOUN]@[COUNT 2000-99999]@-לֹא.  Sorry there's not a better option for you right now!
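As a rough sketch of why the HITS route still picks up 'lo', and of what the suggested exclusion does, here is the same idea in Python. Again, this only models the behaviour described in this post; it is not Accordance's implementation, and the toy `corpus`, the lemma names, and the `excluded` set are assumptions made up for the example.

```python
from collections import Counter

# Invented toy data: (lemma, part_of_speech). Not real counts.
corpus = (
    [("lo", "particle")] * 2500
    + [("lo", "noun")] * 4
    + [("melek", "noun")] * 2100
    + [("bayit", "noun")] * 1500
)
THRESHOLD = 2000
totals = Counter(lemma for lemma, _ in corpus)

# HITS modelled as a plain word list: the tags of the original hits are lost.
hit_words = {lemma for lemma, pos in corpus if pos == "noun"}

# Applying the count to that word list keeps every frequent word that is
# *sometimes* a noun, so all of its instances get highlighted.
frequent_hit_words = {w for w in hit_words if totals[w] > THRESHOLD}
highlighted = [tok for tok in corpus if tok[0] in frequent_hit_words]

# Suggested workaround: noun instances of frequent lemmas, minus the words
# you have spotted as spurious (here only 'lo').
excluded = {"lo"}
workaround = [
    (lemma, pos) for lemma, pos in corpus
    if pos == "noun" and totals[lemma] > THRESHOLD and lemma not in excluded
]

print(sorted(frequent_hit_words))  # 'lo' is still on the list
print(len(highlighted), len(workaround))
```

The workaround depends on noticing each spurious word by hand, which is why it is a tweak rather than a general solution.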


#5 Steven van der Hoeven


Posted 06 October 2013 - 04:14 PM

Sorry, I really missed that lo in the list :blink:  I'll take your advice. Thanks.





