PDAWIKI (掌上百科)




Views: 2765 | Replies: 3

[Dictionary Proofreading] COCA word frequency: the ranking does not follow frequency alone



Posted on 2016-2-27 15:37:56
This started with fuxy526's version, which adds part-of-speech percentages.

Only then did I notice that some parts of speech have a higher frequency yet are ranked lower.

I can't find anything on the COCA website that says what the ranking is based on.



Posted on 2016-2-27 20:44:01

http://www.wordandphrase.info/h_dispersion.asp
Why doesn't the frequency ranking follow the absolute frequency of a word?
DISPERSION AND RANKING (1-60,000)

As you browse through the frequency listing, you may notice that words with a lower frequency than other nearby words have a higher ranking (1-60,000). This is because the ranking is a function of two numbers: [frequency x dispersion]. Dispersion is a score (0.00-1.00) that measures how "evenly" the word is spread across the entire corpus (with 1.00 being the most even). The idea is that if a word is concentrated in just one or maybe two genres (or worse, even just a few sub-genres or texts in that genre), then the word is more specialized, and shouldn't be ranked as high in the overall list 1-60,000.
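
To make the [frequency x dispersion] rule concrete, here is a minimal Python sketch. The words, counts, and dispersion scores are made up for illustration and are not COCA data, and the helper name `rank_by_adjusted_frequency` is hypothetical. The quoted page does not say how the dispersion score itself is computed, so the sketch simply takes it as a given input.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    word: str
    frequency: int     # raw corpus count (made-up value)
    dispersion: float  # 0.00-1.00, where 1.00 is the most even spread (made-up value)


def rank_by_adjusted_frequency(entries):
    """Sort entries by frequency * dispersion, highest product first."""
    return sorted(entries, key=lambda e: e.frequency * e.dispersion, reverse=True)


# Placeholder words: "alpha" has the highest raw count but the lowest dispersion.
entries = [
    Entry("alpha", 12000, 0.55),
    Entry("beta", 9000, 0.95),
    Entry("gamma", 10000, 0.80),
]

ranked = rank_by_adjusted_frequency(entries)
for rank, e in enumerate(ranked, start=1):
    print(rank, e.word, e.frequency, e.dispersion, round(e.frequency * e.dispersion))
```

With these numbers, "alpha" drops to last place despite having the highest raw frequency, which is the same effect described in the first post: a higher count can still end up with a lower rank.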

Most people won't need to see the dispersion score. If you do, you might consider downloading the data that contains this information. (See a sample (every seventh word, 1-60,000) with dispersion in the right column).

Also, please be aware that there are still some isolated "issues" with the frequency list, especially with words that occur mainly as a proper noun or in proper nouns (e.g. cook, ray, frost, savage). In most cases, these are already marked in the frequency list with parentheses, to let you know that there might be problems. But even with these issues, we believe that the frequency list here is more accurate than any other large frequency listing of English.

https://en.wikipedia.org/wiki/Statistical_dispersion


Impressive. I went through the whole site and somehow never found this.  Posted on 2016-2-27 23:32


Posted on 2016-3-8 12:30:10