Last edited by instrumental on 2021-3-18 04:18

FastWordQuery

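When making vocabulary cards for Anki, pronunciation audio has always been a bit of a headache.
Youdao's audio is easy to get, but it is not authentic, and many words are not read quite correctly.
TTS cannot compare with a real human voice either, and machine-synthesized audio has a negative effect when memorizing words.
By comparison, the human recordings in the Longman dictionary are very authentic.
So FastWQ provides a dictionary query service made specifically for the Longman mdx dictionary, to quickly add pronunciations to words and download the audio locally.
Following the steps below, you can easily add audio to an entire word list.

When using it, be sure not to import the dictionary; set it directly in the .py file instead.
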
Specially prepared Longman 6 local dictionary

- 1b5de81464fb46ca892eab4a698a207e#1b5de81464fb46ca892eab4a698a207e#1206#/LDOCE6双解 修改 可提取音频/entry.js
- f24817ae7bc736d4365908da3f87f77f#efd11ff3580a0f7dc0735119d296cf98#1297932816#/LDOCE6双解 修改 可提取音频/L6mp3.mdd
- ee13707d1b966a49bc3149b111cd6b4d#60a7f4d9a40eb2359b4ece19876f584d#26550272#/LDOCE6双解 修改 可提取音频/L6mp3.mdd.db
- 7f301751b90d10f33412de45a599f2da#60a24ed264dc49dd7a8b05a1ace848cf#124057083#/LDOCE6双解 修改 可提取音频/L6mp3.mdx
- aefbf1c0cae35980122ace73f533409e#df04641926e6d88694ecee7dcada3d19#11501568#/LDOCE6双解 修改 可提取音频/L6mp3.mdx.db
- e79f484789815550da723c944b8f27d8#e79f484789815550da723c944b8f27d8#12642#/LDOCE6双解 修改 可提取音频/LDOCE6.css

Modify the program's YHCD, then make the cards with the regular expression.

- [FastWordQuery - GitHub](https://github.com/sth2018/FastWordQuery)
- [Adding human-voice pronunciation to words (Longman mdx dictionary)](https://sth2018.github.io/FastWordQuery/docs/get_mdx_ldoce6_sounds.html)
- [FastWQ, the godlike Anki add-on: extracting local Longman audio · Yuque](https://www.yuque.com/purequant/anki/sudl9z)
# -*- coding:utf-8 -*-
import os
import re
import random
from ..base import *

# Two HTML shapes that carry a pronunciation link; %s is filled in per accent.
VOICE_PATTERN = r'<a href="sound://([\w/]+\w*\.mp3)"><img src="img/spkr_%s.png"></a>'
VOICE_PATTERN_WQ = r'<span class="%s"><a href="sound://([\w/]+\w*\.mp3)">(.*?)</span %s>'
MAPPINGS = [
    ['br', [re.compile(VOICE_PATTERN % r'r'), re.compile(VOICE_PATTERN_WQ % (r'brevoice', r'brevoice'))]],
    ['us', [re.compile(VOICE_PATTERN % r'b'), re.compile(VOICE_PATTERN_WQ % (r'amevoice', r'amevoice'))]]
]
LANG_TO_REGEXPS = {lang: regexps for lang, regexps in MAPPINGS}
# Path to the mdx file; set it here instead of importing the dictionary.
DICT_PATH = 'D:\\111111111111111111111111111111.mdx.mdx'


@register([u'xxxxx', u'xxxxx'])
class xxxxx(MdxService):

    def __init__(self):
        dict_path = DICT_PATH
        # if DICT_PATH is a path, stop auto detect
        if not dict_path:
            from ...service import service_manager, service_pool
            for clazz in service_manager.mdx_services:
                service = service_pool.get(clazz.__unique__)
                title = service.builder._title if service and service.support else u''
                service_pool.put(service)
                if title.startswith(u'LDOCE6'):
                    dict_path = service.dict_path
                    break
        super(xxxxx, self).__init__(dict_path)

    @property
    def title(self):
        return getattr(self, '__register_label__', self.unique)

    @export('PHON')
    def fld_phonetic(self):
        html = self.get_html()
        m = re.search(r'<span class="pron">(.*?)</span>', html)
        if m:
            return m.groups()[0]
        return ''

    def _fld_voice(self, html, voice):
        """Get the pronunciation field."""
        for regexp in LANG_TO_REGEXPS[voice]:
            match = regexp.search(html)
            if match:
                val = '/' + match.group(1)
                name = get_hex_name('mdx-'+self.unique.lower(), val, 'mp3')
                name = self.save_file(val, name)
                if name:
                    return self.get_anki_label(name, 'audio')
        return ''

    @export('BRE_PRON')
    def fld_voicebre(self):
        return self._fld_voice(self.get_html(), 'br')

    @export('AME_PRON')
    def fld_voiceame(self):
        return self._fld_voice(self.get_html(), 'us')

    def _fld_image(self, img):
        val = '/' + img
        # file extension isn't always jpg
        file_extension = os.path.splitext(img)[1][1:].strip().lower()
        name = get_hex_name('mdx-'+self.unique.lower(), val, file_extension)
        name = self.save_file(val, name)
        if name:
            return self.get_anki_label(name, 'img')
        return ''

    @export('IMAGE')
    def fld_image(self):
        html = self.get_html()
        m = re.search(r'<span class="imgholder"><img src="(.*?)".*?></span>', html)
        if m:
            return self._fld_image(m.groups()[0])
        return ''

    @export('EXAMPLE')
    def fld_sentence(self):
        return self._range_sentence(list(range(100)))

    def _fld_audio(self, audio):
        name = get_hex_name('mdx-'+self.unique.lower(), audio, 'mp3')
        name = self.save_file(audio, name)
        if name:
            return self.get_anki_label(name, 'audio')
        return ''

    @export([u'例句加音频', u'Examples with audios'])
    def fld_sentence_audio(self):
        return self._range_sentence_audio(list(range(100)))

    @export('DEF')
    def fld_definate(self):
        m = re.findall(r'<span class="def"\s*.*>\s*.*<\/span>', self.get_html())
        if m:
            soup = parse_html(m[0])
            el_list = soup.findAll('span', {'class':'def'})
            if el_list:
                maps = [u''.join(str(content) for content in element.contents)
                        for element in el_list]
                my_str = ''
                for i_str in maps:
                    my_str = my_str + '<li>' + i_str + '</li>'
                return self._css(my_str)
        return ''

    @export([u'随机例句', u'Random example'])
    def fld_random_sentence(self):
        return self._range_sentence()

    @export([u'首2个例句', u'First 2 examples'])
    def fld_first2_sentence(self):
        return self._range_sentence([0, 1])

    @export([u'随机例句加音频', u'Random example with audio'])
    def fld_random_sentence_audio(self):
        return self._range_sentence_audio()

    @export([u'首2个例句加音频', u'First 2 examples with audios'])
    def fld_first2_sentence_audio(self):
        return self._range_sentence_audio([0, 1])

    def _range_sentence(self, range_arr=None):
        m = re.findall(r'<span class="example"\s*.*>\s*.*<\/span>', self.get_html())
        if m:
            soup = parse_html(m[0])
            el_list = soup.findAll('span', {'class':'example'})
            if el_list:
                maps = [u''.join(str(content) for content in element.contents)
                        for element in el_list]
                my_str = ''
                # No range given: pick one random index. randrange(n) covers 0..n-1,
                # so a lone example or the last one can be chosen too.
                range_arr = range_arr if range_arr else [random.randrange(len(maps))]
                for i, i_str in enumerate(maps):
                    if i in range_arr:
                        i_str = re.sub(r'<a[^>]+?href="sound\:.*\.mp3".*</a>', '', i_str).strip()
                        my_str = my_str + '<li>' + i_str + '</li>'
                return self._css(my_str)
        return ''

    def _range_sentence_audio(self, range_arr=None):
        m = re.findall(r'<span class="example"\s*.*>\s*.*<\/span>', self.get_html())
        if m:
            soup = parse_html(m[0])
            el_list = soup.findAll('span', {'class':'example'})
            if el_list:
                # Keep only the examples that carry a sound link.
                maps = []
                for element in el_list:
                    i_str = ''
                    for content in element.contents:
                        i_str = i_str + str(content)
                    sound = re.search(r'<a[^>]+?href="sound\:\/(.*?\.mp3)".*</a>', i_str)
                    if sound:
                        maps.append([sound, i_str])
                my_str = ''
                if not range_arr:
                    # maps is empty when no example has audio; randrange(0) would raise.
                    range_arr = [random.randrange(len(maps))] if maps else []
                for i, e in enumerate(maps):
                    if i in range_arr:
                        i_str = e[1]
                        sound = e[0]
                        mp3 = self._fld_audio(sound.groups()[0])
                        i_str = re.sub(r'<a[^>]+?href="sound\:.*\.mp3".*</a>', '', i_str).strip()
                        my_str = my_str + '<li>' + i_str + ' ' + mp3 + '</li>'
                return self._css(my_str)
        return ''

    @export([u'额外例句', u'Extra Examples'])
    def fld_extra_examples(self):
        lst = re.findall(r'href="/(@examples_.*?)">.*?<', self.get_html())
        if lst:
            str_content = u''
            for m in lst:
                content = self.builder.mdx_lookup(m)
                if len(content) > 0:
                    for c in content:
                        str_content += c.replace("\r\n","").replace("entry:/","")
            return self._css(str_content)
        return ''

    @with_styles(cssfile='_ldoce6.css')
    def _css(self, val):
        return val

^([^\t]*)\t(.*)
\1\n<link type="text/css" rel="stylesheet" href="LDOCE6.css"/><div id="LDOCE6_Zzz_1"><span class="entry" id="zzz" new="NewInLdoce"><span class="entryhead"><span class="hwd"></span><span class="hyphenation"></span> <span class="brevoice"><a href="sound://\2"></span brevoice><img src="img/spkr_r.png"></a> <span class="amevoice"><a href="sound://\2"></span amevoice><img src="img/spkr_b.png"></a><span class="buttons"></span></span><span class="sense" id="zzz_s1"> <span class="def"></span def></span></span></div><script src="entry.js"></script>\n</>

It's a bit complicated, so I'll just explain the principle; those who understand will get it.

The author of FastWordQuery added dedicated support for Longman 6, consisting of a .py file plus specially prepared mdx and mdd files.
Based on that specially prepared MDX, I simply built a new mdx, and it was then supported.
The .py code is fairly complex and I don't really understand it; I only changed the dictionary name inside it (xxxxxxx).

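To make the principle a little more concrete, here is a minimal sketch of why a self-built mdx can be picked up by the service: the entry only has to contain markup that the voice regular expressions in the .py file match. The pattern is copied from the service code in this post; `find_voice`, the sample entry and `hola.mp3` are made up for illustration and are not part of FastWordQuery.

```python
import re

# Pattern copied from the service file in this post.
VOICE_PATTERN_WQ = r'<span class="%s"><a href="sound://([\w/]+\w*\.mp3)">(.*?)</span %s>'
BRE_REGEXP = re.compile(VOICE_PATTERN_WQ % ('brevoice', 'brevoice'))
AME_REGEXP = re.compile(VOICE_PATTERN_WQ % ('amevoice', 'amevoice'))


def find_voice(entry_html, regexp):
    """Hypothetical helper: return the resource path the service would save, or ''."""
    match = regexp.search(entry_html)
    return '/' + match.group(1) if match else ''


# A made-up entry fragment shaped like the replacement template in this post.
entry = ('<span class="brevoice"><a href="sound://hola.mp3"></span brevoice>'
         '<img src="img/spkr_r.png"></a> '
         '<span class="amevoice"><a href="sound://hola.mp3"></span amevoice>'
         '<img src="img/spkr_b.png"></a>')

print(find_voice(entry, BRE_REGEXP))
print(find_voice(entry, AME_REGEXP))
```

One thing the pattern implies: the MP3 name may only contain word characters and slashes before `.mp3`, so a name with a hyphen or a space would not be matched.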
The regular expression is used to build the mdx. Each input line has the form:

headword\tMP3
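
The same find-and-replace can also be run in a script instead of a text editor. Below is a rough sketch, assuming a tab-separated word list as input. The search pattern is the one from this post; `ENTRY_TEMPLATE` is a shortened stand-in for the full replacement string, and `build_mdx_source` and the sample words are hypothetical.

```python
import re

# Search pattern from the post: headword, a tab, then the MP3 name.
LINE_PATTERN = re.compile(r'^([^\t]*)\t(.*)')

# Shortened stand-in for the post's replacement string (the real one carries
# more LDOCE6 markup). \1 is the headword, \2 the MP3 name, and "</>" closes
# one entry in the mdx source text.
ENTRY_TEMPLATE = (
    r'\1\n'
    r'<span class="brevoice"><a href="sound://\2"></span brevoice>'
    r'<img src="img/spkr_r.png"></a>\n'
    r'</>'
)


def build_mdx_source(tsv_text):
    """Turn 'headword<TAB>mp3' lines into mdx source entries; skip other lines."""
    entries = []
    for line in tsv_text.splitlines():
        if LINE_PATTERN.match(line):
            entries.append(LINE_PATTERN.sub(ENTRY_TEMPLATE, line))
    return '\n'.join(entries) + '\n'


sample = 'hola\thola.mp3\nno tab on this line\nadios\tadios.mp3'
print(build_mdx_source(sample))
```

Processing line by line keeps `[^\t]*` from running across line breaks, which it otherwise could when the whole file is handed to the regex at once. The resulting text still has to be compiled into an mdx with whatever packing tool you normally use.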

Come to think of it, probably not many people will need this; few people study foreign languages other than English.