Word vs. Lemma with Count Command
Posted 20 February 2008 - 06:25 PM
I ran two samples, one using Lemmas and the other using Words, with both set to search for items occurring 5000 times or more. The number of results differed by one: the lemmas search produced 13 items and the words search produced 14. One of the results in the words search was listed as "(No lexical form) = 8934". There are other differences as well, the most significant being that some of the same items show different frequencies.
Any help in sorting these out would be much appreciated!
Posted 21 February 2008 - 12:22 AM
The first search, for lemmas, finds all 13 lemmas that occur 5000 times or more. If you use Set Analysis Display to add the inflected form under the lemma, you see that most forms of most of these words occur far fewer than 5000 times. For example, אמר־1 occurs 5317 times, but the most common form יֹּאמֶר occurs fewer than 2000 times.
The second search finds only the very common words themselves (and all the suffixes that have no lexical form), so other than the prefixes it finds only the Name. Adding INFLECT to the analysis shows that the words are the same except for the vowelling.
So the search you choose depends on whether you want to count the specific inflected forms or the number of times the lemma is used. Note, however, that the default Analysis shows the resulting lemmas unless you reset it.
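To make the difference concrete outside of Accordance, here is a rough Python sketch of the two ways of counting. It is not how Accordance works internally; the token list, the `count_at_least` helper, and the threshold of 3 are all made up for illustration. The point is only that a lemma's total is spread across its inflected forms, so a lemma can pass the threshold when none of its individual forms does.

```python
from collections import Counter

# Hypothetical toy corpus of (inflected_form, lemma) pairs.
# This is illustrative data only, not taken from any Accordance module.
tokens = [
    ("said", "say"), ("says", "say"), ("saying", "say"), ("said", "say"),
    ("the", "the"), ("the", "the"), ("the", "the"),
    ("went", "go"), ("goes", "go"),
]


def count_at_least(items, threshold):
    """Count each item and keep only those occurring `threshold` times or more."""
    counts = Counter(items)
    return {item: n for item, n in counts.items() if n >= threshold}


THRESHOLD = 3  # stands in for the 5000 used in the searches above

# Counting by lemma groups every inflected form under its lemma.
by_lemma = count_at_least((lemma for _, lemma in tokens), THRESHOLD)

# Counting by word treats each inflected form as a separate item.
by_form = count_at_least((form for form, _ in tokens), THRESHOLD)

print(by_lemma)  # {'say': 4, 'the': 3}
print(by_form)   # {'the': 3}
```

Here "say" reaches the threshold as a lemma (4 occurrences), but its most common form "said" occurs only twice, so it drops out of the count by word. The sketch does not model items with no lexical form, such as suffixes.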
Posted 21 February 2008 - 08:29 AM