Posted on 2020-9-2 00:34:41
I made a clumsy attempt with regular expressions, and it seems to work well enough:
# Requires `import re` at the top of the module.
@export('The_Little_Dict')
def The_Little_Dict(self):
    def_distribution = ''
    # With four capture groups, re.findall returns a list of 4-tuples;
    # groups that did not take part in a match are empty strings.
    m = re.findall(r'(<link.*?<hr/></div>)|(</div><div\s+class="word-frequency">.*?<div\s+class="coca">)|(<span\s+class="pos">.*?<div\s+class="total">\d+</div>)|(</div><div\s+class="coca2">.*?</span></div></div>)', self.get_html_all())
    if m:
        for i in range(len(m)):
            for j in range(len(m[i])):
                # Append only the group that actually matched.
                if not (m[i][j] == ''):
                    def_distribution += m[i][j]
        return def_distribution
    return ''
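For anyone wondering why the loop has to go two levels deep: when a pattern has several capture groups, re.findall returns a list of tuples (one tuple per match, one entry per group), and the groups that did not match come back as empty strings. Here is a minimal sketch of that behaviour. The pattern, the sample HTML and the name join_groups are made up for illustration and are not the dictionary's real markup or part of the plugin:

```python
import re

# Hypothetical, simplified pattern with two alternative capture groups.
PATTERN = re.compile(r'(<b>.*?</b>)|(<i>.*?</i>)')


def join_groups(html):
    """Concatenate every non-empty capture group, in match order."""
    return ''.join(
        group
        for match in PATTERN.findall(html)  # one tuple per match
        for group in match                  # one entry per capture group
        if group                            # skip groups that did not match
    )


sample = '<b>one</b> text <i>two</i>'
print(PATTERN.findall(sample))  # list of 2-tuples, one per match
print(join_groups(sample))      # the matched fragments joined together
```

Probably worth keeping in mind too that `.` does not match newlines unless re.DOTALL is passed, so a pattern like this only works as long as each fragment sits on a single line of the HTML.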