Posted on 2017-6-4 20:46:11
This post was last edited by skywind3000 on 2017-6-4 20:51
Regex alone won't get you there. You need a Lemma List, i.e. a lookup table that maps each word to its variant forms, for example:
- be/4109826 -> is,was,are,were,'s,been,being,'re,'m,am,m
- have/1315648 -> had,has,'ve,having,'s,'d,of,d,ve
- it/1213224 -> its,they
- he/1196022 -> his,him,they
- i/1133697 -> my,me,we,is
- they/841960 -> their,them,'em
- you/804279 -> your,ya,ye
- not/767330 -> n't
- she/653505 -> her
- do/535646 -> did,does,done,doing,du,d'
- we/503360 -> our,us
- will/334612 -> 'll,wo,ll
- say/317317 -> said,says,saying
- would/278414 -> 'd
- can/263138 -> ca,cans,can,could
- go/227247 -> going,went,gone,goes,goin'
- get/212569 -> got,getting,gets,gotten
- make/209818 -> made,making,makes
- up/206976 -> ups,upping,upped
- see/184969 -> seen,saw,seeing,sees
- other/181277 -> others
- time/181080 -> times,timed,timing
- know/177717 -> knew,known,knows,knowing
- take/172773 -> took,taken,taking,takes
- year/161649 -> years
After that, a short script takes care of the rest. Click to download:
lemma.en.txt
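For anyone wondering what such a "short script" might look like, here is a rough sketch in Python. It is not the script from this post, just one way to do it. It assumes every entry looks like the sample above (`lemma/frequency -> form1,form2,...`, with an optional leading `- `) and silently skips any line that does not match. The names `parse_lemma_lines`, `lemmatize` and `load_lemma_file` are made up for illustration, and the UTF-8 encoding is a guess. Since one form can belong to several lemmas (`'s` is listed under both `be` and `have`), the sketch simply picks the lemma with the highest frequency count.

```python
import re
from collections import defaultdict

# Assumed line format (taken from the sample in this post):
#   [- ]lemma/frequency -> form1,form2,...
LINE_RE = re.compile(r"^\s*(?:-\s+)?([^/\s]+)(?:/(\d+))?\s*->\s*(.+?)\s*$")


def parse_lemma_lines(lines):
    """Build a reverse index: variant form -> list of (lemma, frequency)."""
    index = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank lines and anything not in the assumed format
        lemma = m.group(1).lower()
        freq = int(m.group(2) or 0)
        for form in m.group(3).split(","):
            form = form.strip().lower()
            if form:
                index[form].append((lemma, freq))
    return index


def lemmatize(word, index):
    """Return the highest-frequency lemma for word, or the word itself if unknown."""
    candidates = index.get(word.lower())
    if not candidates:
        return word.lower()
    return max(candidates, key=lambda c: c[1])[0]


def load_lemma_file(path, encoding="utf-8"):
    """Load a lemma list from disk; the encoding is an assumption."""
    with open(path, encoding=encoding) as f:
        return parse_lemma_lines(f)


# A few lines copied from the sample above.
sample = [
    "- be/4109826 -> is,was,are,were,'s,been,being,'re,'m,am,m",
    "- have/1315648 -> had,has,'ve,having,'s,'d,of,d,ve",
    "- go/227247 -> going,went,gone,goes,goin'",
    "- would/278414 -> 'd",
]
index = parse_lemma_lines(sample)
print(lemmatize("went", index))   # go
print(lemmatize("'s", index))     # be (higher frequency than have)
print(lemmatize("regex", index))  # regex (unknown words pass through)
```

With the downloaded file you would call `index = load_lemma_file("lemma.en.txt")` instead of using `sample`. Picking the most frequent lemma is crude; telling `'s` = is from `'s` = has properly needs context, which a plain lookup table cannot provide.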