This post was last edited by instrumental on 2021-3-18 04:18

FastWordQuery

When making word cards for [[Anki]], pronunciation has always been a headache.
Youdao audio is easy to get, but its pronunciation is not authentic, and many sounds are read incorrectly.
TTS cannot match a human voice either, and machine-synthesized audio has a negative effect when memorizing words.
By comparison, the human recordings in the Longman dictionary are very authentic.
So FastWQ provides a dictionary query service made specifically for the Longman mdx dictionary, which quickly adds pronunciation to words and downloads the audio locally.
Following the steps below, you can quickly add audio to an entire word list.

When using it, remember not to import the dictionary; instead, set the path directly in the .py file.

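As a rough sketch of what "set the path in the .py" involves: the service file shown later in this post has a module-level `DICT_PATH` constant, and its auto-detection loop only runs when that constant is empty. The helper below, `looks_like_mdx_path`, is a hypothetical name of mine and not part of FastWordQuery; it only checks that a candidate path points at an existing `.mdx` file before you paste it into `DICT_PATH`. The example path is made up.

```python
import os


def looks_like_mdx_path(path):
    """Return True if path names an existing file ending in .mdx (hypothetical helper)."""
    return bool(path) and path.lower().endswith('.mdx') and os.path.isfile(path)


if __name__ == '__main__':
    # Example path only; replace it with the location of your own dictionary.
    candidate = r'D:\dicts\L6mp3.mdx'
    print(candidate, '->', looks_like_mdx_path(candidate))
```

On Windows, writing the path as a raw string (`r'...'`) or doubling the backslashes, as the service file does, avoids accidental escape sequences.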
The specially made Longman 6 local dictionary

- 1b5de81464fb46ca892eab4a698a207e#1b5de81464fb46ca892eab4a698a207e#1206#/LDOCE6双解 修改 可提取音频/entry.js
- f24817ae7bc736d4365908da3f87f77f#efd11ff3580a0f7dc0735119d296cf98#1297932816#/LDOCE6双解 修改 可提取音频/L6mp3.mdd
- ee13707d1b966a49bc3149b111cd6b4d#60a7f4d9a40eb2359b4ece19876f584d#26550272#/LDOCE6双解 修改 可提取音频/L6mp3.mdd.db
- 7f301751b90d10f33412de45a599f2da#60a24ed264dc49dd7a8b05a1ace848cf#124057083#/LDOCE6双解 修改 可提取音频/L6mp3.mdx
- aefbf1c0cae35980122ace73f533409e#df04641926e6d88694ecee7dcada3d19#11501568#/LDOCE6双解 修改 可提取音频/L6mp3.mdx.db
- e79f484789815550da723c944b8f27d8#e79f484789815550da723c944b8f27d8#12642#/LDOCE6双解 修改 可提取音频/LDOCE6.css

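Each line in the list above has four `#`-separated fields. The post does not say what the two hex strings are, so the sketch below simply assumes the layout `hash#hash#size-in-bytes#path`; the field names `hash_a` and `hash_b` and the function `parse_share_line` are my own labels, not anything defined by the post.

```python
from collections import namedtuple

# Field names are assumptions; the post only shows the raw lines.
ShareEntry = namedtuple('ShareEntry', ['hash_a', 'hash_b', 'size', 'path'])


def parse_share_line(line):
    """Split one 'hash#hash#size#path' line into a ShareEntry."""
    # maxsplit=3 keeps any '#' that might appear inside the path itself.
    hash_a, hash_b, size, path = line.strip().split('#', 3)
    return ShareEntry(hash_a, hash_b, int(size), path)


if __name__ == '__main__':
    entry = parse_share_line(
        'e79f484789815550da723c944b8f27d8#e79f484789815550da723c944b8f27d8'
        '#12642#/LDOCE6双解 修改 可提取音频/LDOCE6.css'
    )
    print(entry.size, entry.path)
```

Read this way, the third field is useful for checking that a download is complete, for example that `L6mp3.mdd` is 1297932816 bytes.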
Modify the program's YHCD, then make the cards with the regular expression.

- [FastWordQuery - GitHub](https://github.com/sth2018/FastWordQuery)
- [Adding human pronunciation to words (Longman mdx dictionary)](https://sth2018.github.io/FastWordQuery/docs/get_mdx_ldoce6_sounds.html)
- [The god-tier Anki plugin fastWQ: extracting local Longman audio - Yuque](https://www.yuque.com/purequant/anki/sudl9z)

    # -*- coding:utf-8 -*-
    import os
    import re
    import random
    from ..base import *

    # Both patterns capture the mp3 path from a sound:// link.
    VOICE_PATTERN = r'<a href="sound://([\w/]+\w*\.mp3)"><img src="img/spkr_%s.png"></a>'
    VOICE_PATTERN_WQ = r'<span class="%s"><a href="sound://([\w/]+\w*\.mp3)">(.*?)</span %s>'
    MAPPINGS = [
        ['br', [re.compile(VOICE_PATTERN % r'r'), re.compile(VOICE_PATTERN_WQ % (r'brevoice', r'brevoice'))]],
        ['us', [re.compile(VOICE_PATTERN % r'b'), re.compile(VOICE_PATTERN_WQ % (r'amevoice', r'amevoice'))]]
    ]
    LANG_TO_REGEXPS = {lang: regexps for lang, regexps in MAPPINGS}
    # Placeholder path: point this at your own .mdx file.
    DICT_PATH = 'D:\\111111111111111111111111111111.mdx.mdx'


    @register([u'xxxxx', u'xxxxx'])
    class xxxxx(MdxService):

        def __init__(self):
            dict_path = DICT_PATH
            # if DICT_PATH is a path, stop auto detect
            if not dict_path:
                from ...service import service_manager, service_pool
                for clazz in service_manager.mdx_services:
                    service = service_pool.get(clazz.__unique__)
                    title = service.builder._title if service and service.support else u''
                    service_pool.put(service)
                    if title.startswith(u'LDOCE6'):
                        dict_path = service.dict_path
                        break
            super(xxxxx, self).__init__(dict_path)

        @property
        def title(self):
            return getattr(self, '__register_label__', self.unique)

        @export('PHON')
        def fld_phonetic(self):
            html = self.get_html()
            m = re.search(r'<span class="pron">(.*?)</span>', html)
            if m:
                return m.groups()[0]
            return ''

        def _fld_voice(self, html, voice):
            """Get the pronunciation field."""
            for regexp in LANG_TO_REGEXPS[voice]:
                match = regexp.search(html)
                if match:
                    val = '/' + match.group(1)
                    name = get_hex_name('mdx-' + self.unique.lower(), val, 'mp3')
                    name = self.save_file(val, name)
                    if name:
                        return self.get_anki_label(name, 'audio')
            return ''

        @export('BRE_PRON')
        def fld_voicebre(self):
            return self._fld_voice(self.get_html(), 'br')

        @export('AME_PRON')
        def fld_voiceame(self):
            return self._fld_voice(self.get_html(), 'us')

        def _fld_image(self, img):
            val = '/' + img
            # file extension isn't always jpg
            file_extension = os.path.splitext(img)[1][1:].strip().lower()
            name = get_hex_name('mdx-' + self.unique.lower(), val, file_extension)
            name = self.save_file(val, name)
            if name:
                return self.get_anki_label(name, 'img')
            return ''

        @export('IMAGE')
        def fld_image(self):
            html = self.get_html()
            m = re.search(r'<span class="imgholder"><img src="(.*?)".*?></span>', html)
            if m:
                return self._fld_image(m.groups()[0])
            return ''

        @export('EXAMPLE')
        def fld_sentence(self):
            return self._range_sentence(list(range(100)))

        def _fld_audio(self, audio):
            name = get_hex_name('mdx-' + self.unique.lower(), audio, 'mp3')
            name = self.save_file(audio, name)
            if name:
                return self.get_anki_label(name, 'audio')
            return ''

        @export([u'例句加音频', u'Examples with audios'])
        def fld_sentence_audio(self):
            return self._range_sentence_audio(list(range(100)))

        @export('DEF')
        def fld_definate(self):
            # Fixed: the original had a duplicated assignment ("m = m = ...").
            m = re.findall(r'<span class="def"\s*.*>\s*.*<\/span>', self.get_html())
            if m:
                soup = parse_html(m[0])
                el_list = soup.findAll('span', {'class': 'def'})
                if el_list:
                    maps = [u''.join(str(content) for content in element.contents)
                            for element in el_list]
                    my_str = ''
                    for i_str in maps:
                        my_str = my_str + '<li>' + i_str + '</li>'
                    return self._css(my_str)
            return ''

        @export([u'随机例句', u'Random example'])
        def fld_random_sentence(self):
            return self._range_sentence()

        @export([u'首2个例句', u'First 2 examples'])
        def fld_first2_sentence(self):
            return self._range_sentence([0, 1])

        @export([u'随机例句加音频', u'Random example with audio'])
        def fld_random_sentence_audio(self):
            return self._range_sentence_audio()

        @export([u'首2个例句加音频', u'First 2 examples with audios'])
        def fld_first2_sentence_audio(self):
            return self._range_sentence_audio([0, 1])

        def _range_sentence(self, range_arr=None):
            m = re.findall(r'<span class="example"\s*.*>\s*.*<\/span>', self.get_html())
            if m:
                soup = parse_html(m[0])
                el_list = soup.findAll('span', {'class': 'example'})
                if el_list:
                    maps = [u''.join(str(content) for content in element.contents)
                            for element in el_list]
                    my_str = ''
                    if not range_arr:
                        # Fixed: randrange(0, len(maps) - 1) raised ValueError for a
                        # single example and could never pick the last one.
                        range_arr = [random.randrange(len(maps))] if maps else []
                    for i, i_str in enumerate(maps):
                        if i in range_arr:
                            # Drop the audio link and keep only the sentence text.
                            i_str = re.sub(r'<a[^>]+?href="sound\:.*\.mp3".*</a>', '', i_str).strip()
                            my_str = my_str + '<li>' + i_str + '</li>'
                    return self._css(my_str)
            return ''

        def _range_sentence_audio(self, range_arr=None):
            m = re.findall(r'<span class="example"\s*.*>\s*.*<\/span>', self.get_html())
            if m:
                soup = parse_html(m[0])
                el_list = soup.findAll('span', {'class': 'example'})
                if el_list:
                    # Keep only the examples that carry an audio link.
                    maps = []
                    for element in el_list:
                        i_str = u''.join(str(content) for content in element.contents)
                        sound = re.search(r'<a[^>]+?href="sound\:\/(.*?\.mp3)".*</a>', i_str)
                        if sound:
                            maps.append([sound, i_str])
                    my_str = ''
                    if not range_arr:
                        # Same off-by-one fix as above; also safe when maps is empty.
                        range_arr = [random.randrange(len(maps))] if maps else []
                    for i, e in enumerate(maps):
                        if i in range_arr:
                            i_str = e[1]
                            sound = e[0]
                            mp3 = self._fld_audio(sound.groups()[0])
                            i_str = re.sub(r'<a[^>]+?href="sound\:.*\.mp3".*</a>', '', i_str).strip()
                            my_str = my_str + '<li>' + i_str + ' ' + mp3 + '</li>'
                    return self._css(my_str)
            return ''

        @export([u'额外例句', u'Extra Examples'])
        def fld_extra_examples(self):
            lst = re.findall(r'href="/(@examples_.*?)">.*?<', self.get_html())
            if lst:
                str_content = u''
                for m in lst:
                    content = self.builder.mdx_lookup(m)
                    if len(content) > 0:
                        for c in content:
                            str_content += c.replace("\r\n", "").replace("entry:/", "")
                return self._css(str_content)
            return ''

        @with_styles(cssfile='_ldoce6.css')
        def _css(self, val):
            return val

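To see why `_fld_voice` picks up the audio link, here is a small standalone check. It copies `VOICE_PATTERN_WQ` from the service code above and runs it against a shortened entry in the shape of the replacement template given further down in this post. The function name `find_voice` and the file name `hello.mp3` are made up for illustration.

```python
import re

# Copied from the service code: matches the "brevoice" / "amevoice" span markup.
VOICE_PATTERN_WQ = r'<span class="%s"><a href="sound://([\w/]+\w*\.mp3)">(.*?)</span %s>'


def find_voice(html, css_class):
    """Return the mp3 path referenced by the given voice span, or None."""
    match = re.search(VOICE_PATTERN_WQ % (css_class, css_class), html)
    return match.group(1) if match else None


if __name__ == '__main__':
    # Shortened sample entry; 'hello.mp3' is an invented file name.
    sample = ('<span class="brevoice"><a href="sound://hello.mp3"></span brevoice>'
              '<img src="img/spkr_r.png"></a>')
    print(find_voice(sample, 'brevoice'))  # hello.mp3
    print(find_voice(sample, 'amevoice'))  # None
```

Note that `[\w/]` only covers word characters and slashes, so an mp3 name containing a hyphen or a space would not match this pattern and the field would come back empty.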
    ^([^\t]*)\t(.*)
    \1\n<link type="text/css" rel="stylesheet" href="LDOCE6.css"/><div id="LDOCE6_Zzz_1"><span class="entry" id="zzz" new="NewInLdoce"><span class="entryhead"><span class="hwd"></span><span class="hyphenation"></span> <span class="brevoice"><a href="sound://\2"></span brevoice><img src="img/spkr_r.png"></a> <span class="amevoice"><a href="sound://\2"></span amevoice><img src="img/spkr_b.png"></a><span class="buttons"></span></span><span class="sense" id="zzz_s1"> <span class="def"></span def></span></span></div><script src="entry.js"></script>\n</>

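The find/replace pair above can also be run from a script instead of a text editor. The sketch below assumes the input has one `headword<TAB>mp3 path` per line, and it uses a shortened stand-in for the replacement template to stay readable; `TEMPLATE` and `build_entries` are names I made up, and for real use you would paste in the full template.

```python
import re

# Same search pattern as above: group 1 = headword, group 2 = mp3 path.
LINE_PATTERN = re.compile(r'^([^\t]*)\t(.*)')

# Shortened stand-in for the full replacement template (an assumption, for brevity).
# re.sub expands \1, \2 and \n inside the replacement string.
TEMPLATE = (r'\1\n<span class="brevoice"><a href="sound://\2"></span brevoice>'
            r'<img src="img/spkr_r.png"></a>\n</>')


def build_entries(tsv_text):
    """Turn 'headword<TAB>mp3' lines into mdx source entries."""
    out = []
    for line in tsv_text.splitlines():
        if '\t' not in line:
            continue  # skip blank or malformed lines
        out.append(LINE_PATTERN.sub(TEMPLATE, line))
    return ('\n'.join(out) + '\n') if out else ''


if __name__ == '__main__':
    print(build_entries('hello\thello.mp3\nworld\tworld.mp3\n'), end='')
```

Processing line by line is a deliberate choice here: `[^\t]*` also matches newlines, so applying the pattern to a whole file at once could swallow a malformed line into the next headword.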
It is a bit complicated, so I will just explain the principle; those who understand will get it.

The author of FastWordQuery added dedicated support for Longman 6, consisting of a .py file and specially made mdx and mdd files.

Based on that specially made MDX, I built a new mdx directly, and then it was supported.
The py code is fairly complex and I do not understand it; I only changed the dictionary name in it (xxxxxxx).

The regular expression is used to build the mdx. The input is one line per word, in the form:

headword\tMP3

Come to think of it, probably few people will need this; there are likely not many people learning foreign languages other than English.