
Frequency number different than manual math


Joe Weaks


When I search for "time" in the NRSV, it finds 867 words.

When I search for * in the NRSV, it finds 895888 words.

 

Computed manually, that gives a frequency per 1,000 words of (867/895888)*1000 = 0.97.

However, an Analysis window set to frequency reports those 867 hits as 0.81 words per 1,000.
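
To make the size of the gap concrete, here is a small sketch of the arithmetic in Python. It assumes the Analysis figure is also computed as hits divided by some total, times 1000; that is a guess about the formula, not something confirmed here. The variable names are made up for illustration, and since 0.81 is a rounded display value, the implied total is only approximate.

```python
# Figures quoted in the post above.
hits = 867                  # hits for "time" in the NRSV
star_search_total = 895888  # words found by a * search

# Manual frequency per 1,000 words.
manual_per_1000 = hits / star_search_total * 1000
print(f"manual: {manual_per_1000:.2f} per 1000")

# ASSUMPTION: the Analysis window uses hits / total * 1000 with a different total.
# Back out the total that a reported 0.81 would imply (approximate, 0.81 is rounded).
reported_per_1000 = 0.81
implied_total = hits / reported_per_1000 * 1000
extra_items = implied_total - star_search_total
print(f"implied total: about {implied_total:,.0f}")
print(f"items beyond the * search count: about {extra_items:,.0f}")
```

If that assumption holds, the Analysis window is dividing by a total noticeably larger than the * search count.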

 

I get the same problem searching for lexical forms in the GNT, and if I'm searching in a smaller range of text... my manual frequency is off from the analysis window.

 

Is it that a * search is not really giving me the correct word count to divide by? I thought articles are now included.

 

Thanks for any help,

Joe


If I remember right, it may have to do with the way Accordance handles/counts spaces, but the "authorities" will have to tell you how and why.

 

Rod

 

 


The Analysis and Table include punctuation in the word count. Only the * search counts words alone (or lemmas or inflected forms, depending on the text). Remember that a word with no lexical form won't be counted (for example, Hebrew suffixes).


 

Helen,

Thank you for getting back to me with an answer. This is VERY helpful.

So, for my own clarity, the frequency (words per 1k) number given in the Analysis and Table views uses a formula that includes punctuation? So, with this text:

 

Now that I am here, I want to be there.

 

A search for * would find 10 words, 9 forms.

But a frequency stat in an analysis search for "I" would find two hits in a total word count of 12, since it includes the comma and the period?
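
Here is a quick Python sketch of that example, to show how the two denominators change the frequency. The regex tokenizer is only a stand-in chosen for illustration; it is not how Accordance counts, and treating each punctuation mark as one counted item is an assumption based on the explanation above.

```python
import re

text = "Now that I am here, I want to be there."

# Words only: roughly what a * search would count in this sentence.
words = re.findall(r"\w+", text)

# Words plus punctuation marks.
# ASSUMPTION: each punctuation mark counts as one item in the Analysis total.
tokens = re.findall(r"\w+|[^\w\s]", text)

forms = set(words)       # distinct word forms ("I" occurs twice)
hits = words.count("I")  # hits for the search term

freq_words_only = hits / len(words) * 1000
freq_with_punct = hits / len(tokens) * 1000

print(len(words), "words,", len(forms), "forms,", len(tokens), "items with punctuation")
print(f"per 1000, words only: {freq_words_only:.1f}")
print(f"per 1000, with punctuation: {freq_with_punct:.1f}")
```

With only two punctuation marks in a ten-word sentence, the frequency for "I" already drops from 200 to about 167 per 1,000.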

 

This is helpful (if not crucial) knowledge for those of us who use Accordance to undergird academic research. For research purposes, presumably in original languages, I suggest that users get word counts and figure their own frequencies... since rarely would we be concerned with including punctuation in such stats.
