Last edited by instrumental on 2021-3-18 04:18
FastWordQuery
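When making vocabulary cards for [[Anki]], pronunciation has always been a headache.
Youdao's pronunciations are easy to get, but they are not authentic, and many words are not read quite correctly.
TTS can't compare with a real human voice either, and machine-synthesized audio can have a negative effect when memorizing words.
By comparison, the human-recorded pronunciations in the Longman dictionary are very authentic.
So FastWQ provides a dictionary query service made specifically for the Longman mdx dictionary, so that pronunciations can be added to words quickly and the audio saved locally.
By following the steps below, you can easily and quickly add audio to an entire word list.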

When using it, be sure not to import the dictionary; set it up directly in the .py file instead.

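A small sketch of the "set it in the .py" step, assuming it means filling in the `DICT_PATH` constant of the service file shown later in this post. The path below and the helper `check_dict_path` are made up for illustration, and the check for a same-named `.mdd` rests on the assumption that the audio is read from that companion file.

```python
import os

# Hypothetical location; replace it with the full path to your own .mdx file.
DICT_PATH = r'D:\dicts\L6mp3.mdx'


def check_dict_path(path):
    """Return a list of problems found with an .mdx path (empty means it looks usable)."""
    problems = []
    if not path.lower().endswith('.mdx'):
        problems.append('path does not end with .mdx')
    if not os.path.isfile(path):
        problems.append('file not found: ' + path)
    # Assumption: the audio is stored in a same-named .mdd next to the .mdx.
    mdd = os.path.splitext(path)[0] + '.mdd'
    if not os.path.isfile(mdd):
        problems.append('no companion .mdd: ' + mdd)
    return problems
```

Calling `check_dict_path(DICT_PATH)` before pasting the path into the service file should catch a mistyped path early. A raw string (`r'...'`) avoids having to double every backslash in a Windows path.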
The specially made Longman 6 local dictionary

- 1b5de81464fb46ca892eab4a698a207e#1b5de81464fb46ca892eab4a698a207e#1206#/LDOCE6双解 修改 可提取音频/entry.js
- f24817ae7bc736d4365908da3f87f77f#efd11ff3580a0f7dc0735119d296cf98#1297932816#/LDOCE6双解 修改 可提取音频/L6mp3.mdd
- ee13707d1b966a49bc3149b111cd6b4d#60a7f4d9a40eb2359b4ece19876f584d#26550272#/LDOCE6双解 修改 可提取音频/L6mp3.mdd.db
- 7f301751b90d10f33412de45a599f2da#60a24ed264dc49dd7a8b05a1ace848cf#124057083#/LDOCE6双解 修改 可提取音频/L6mp3.mdx
- aefbf1c0cae35980122ace73f533409e#df04641926e6d88694ecee7dcada3d19#11501568#/LDOCE6双解 修改 可提取音频/L6mp3.mdx.db
- e79f484789815550da723c944b8f27d8#e79f484789815550da723c944b8f27d8#12642#/LDOCE6双解 修改 可提取音频/LDOCE6.css

Modify the program's YHCD, then just build the cards with the regular expression.

- [FastWordQuery - GitHub](https://github.com/sth2018/FastWordQuery)
- [Adding human-recorded pronunciation to words (Longman mdx dictionary)](https://sth2018.github.io/FastWordQuery/docs/get_mdx_ldoce6_sounds.html)
- [The god-tier Anki add-on fastWQ: extracting local Longman audio · Yuque](https://www.yuque.com/purequant/anki/sudl9z)
```python
#-*- coding:utf-8 -*-
import os
import re
import random
from ..base import *

# Sound links appear in two layouts; each language gets one regex per layout.
VOICE_PATTERN = r'<a href="sound://([\w/]+\w*\.mp3)"><img src="img/spkr_%s.png"></a>'
VOICE_PATTERN_WQ = r'<span class="%s"><a href="sound://([\w/]+\w*\.mp3)">(.*?)</span %s>'
MAPPINGS = [
    ['br', [re.compile(VOICE_PATTERN % r'r'), re.compile(VOICE_PATTERN_WQ % (r'brevoice', r'brevoice'))]],
    ['us', [re.compile(VOICE_PATTERN % r'b'), re.compile(VOICE_PATTERN_WQ % (r'amevoice', r'amevoice'))]]
]
LANG_TO_REGEXPS = {lang: regexps for lang, regexps in MAPPINGS}
# Placeholder path from the post; point this at your own .mdx file.
DICT_PATH = 'D:\\111111111111111111111111111111.mdx.mdx'


@register([u'xxxxx', u'xxxxx'])
class xxxxx(MdxService):

    def __init__(self):
        dict_path = DICT_PATH
        # if DICT_PATH is a path, stop auto detect
        if not dict_path:
            from ...service import service_manager, service_pool
            for clazz in service_manager.mdx_services:
                service = service_pool.get(clazz.__unique__)
                title = service.builder._title if service and service.support else u''
                service_pool.put(service)
                if title.startswith(u'LDOCE6'):
                    dict_path = service.dict_path
                    break
        super(xxxxx, self).__init__(dict_path)

    @property
    def title(self):
        return getattr(self, '__register_label__', self.unique)

    @export('PHON')
    def fld_phonetic(self):
        html = self.get_html()
        m = re.search(r'<span class="pron">(.*?)</span>', html)
        if m:
            return m.groups()[0]
        return ''

    def _fld_voice(self, html, voice):
        """Get the pronunciation field."""
        for regexp in LANG_TO_REGEXPS[voice]:
            match = regexp.search(html)
            if match:
                val = '/' + match.group(1)
                name = get_hex_name('mdx-'+self.unique.lower(), val, 'mp3')
                name = self.save_file(val, name)
                if name:
                    return self.get_anki_label(name, 'audio')
        return ''

    @export('BRE_PRON')
    def fld_voicebre(self):
        return self._fld_voice(self.get_html(), 'br')

    @export('AME_PRON')
    def fld_voiceame(self):
        return self._fld_voice(self.get_html(), 'us')

    def _fld_image(self, img):
        val = '/' + img
        # file extension isn't always jpg
        file_extension = os.path.splitext(img)[1][1:].strip().lower()
        name = get_hex_name('mdx-'+self.unique.lower(), val, file_extension)
        name = self.save_file(val, name)
        if name:
            return self.get_anki_label(name, 'img')
        return ''

    @export('IMAGE')
    def fld_image(self):
        html = self.get_html()
        m = re.search(r'<span class="imgholder"><img src="(.*?)".*?></span>', html)
        if m:
            return self._fld_image(m.groups()[0])
        return ''

    @export('EXAMPLE')
    def fld_sentence(self):
        return self._range_sentence(list(range(100)))

    def _fld_audio(self, audio):
        name = get_hex_name('mdx-'+self.unique.lower(), audio, 'mp3')
        name = self.save_file(audio, name)
        if name:
            return self.get_anki_label(name, 'audio')
        return ''

    @export([u'例句加音频', u'Examples with audios'])
    def fld_sentence_audio(self):
        return self._range_sentence_audio(list(range(100)))

    @export('DEF')
    def fld_definate(self):
        m = re.findall(r'<span class="def"\s*.*>\s*.*<\/span>', self.get_html())
        if m:
            soup = parse_html(m[0])
            el_list = soup.findAll('span', {'class':'def'})
            if el_list:
                maps = [u''.join(str(content) for content in element.contents)
                        for element in el_list]
                my_str = ''
                for i_str in maps:
                    my_str = my_str + '<li>' + i_str + '</li>'
                return self._css(my_str)
        return ''

    @export([u'随机例句', u'Random example'])
    def fld_random_sentence(self):
        return self._range_sentence()

    @export([u'首2个例句', u'First 2 examples'])
    def fld_first2_sentence(self):
        return self._range_sentence([0, 1])

    @export([u'随机例句加音频', u'Random example with audio'])
    def fld_random_sentence_audio(self):
        return self._range_sentence_audio()

    @export([u'首2个例句加音频', u'First 2 examples with audios'])
    def fld_first2_sentence_audio(self):
        return self._range_sentence_audio([0, 1])

    def _range_sentence(self, range_arr=None):
        m = re.findall(r'<span class="example"\s*.*>\s*.*<\/span>', self.get_html())
        if m:
            soup = parse_html(m[0])
            el_list = soup.findAll('span', {'class':'example'})
            if el_list:
                maps = [u''.join(str(content) for content in element.contents)
                        for element in el_list]
                my_str = ''
                # randrange's stop is exclusive, so len(maps) covers every index
                # (the old "len(maps) - 1" failed when there was a single example)
                range_arr = range_arr if range_arr else [random.randrange(len(maps))]
                for i, i_str in enumerate(maps):
                    if i in range_arr:
                        # drop the sound link, keep only the sentence text
                        i_str = re.sub(r'<a[^>]+?href="sound\:.*\.mp3".*</a>', '', i_str).strip()
                        my_str = my_str + '<li>' + i_str + '</li>'
                return self._css(my_str)
        return ''

    def _range_sentence_audio(self, range_arr=None):
        m = re.findall(r'<span class="example"\s*.*>\s*.*<\/span>', self.get_html())
        if m:
            soup = parse_html(m[0])
            el_list = soup.findAll('span', {'class':'example'})
            if el_list:
                # keep only the examples that carry a sound link
                maps = []
                for element in el_list:
                    i_str = ''
                    for content in element.contents:
                        i_str = i_str + str(content)
                    sound = re.search(r'<a[^>]+?href="sound\:\/(.*?\.mp3)".*</a>', i_str)
                    if sound:
                        maps.append([sound, i_str])
                if not maps:
                    # no example has audio; randrange(0) would raise ValueError
                    return ''
                my_str = ''
                range_arr = range_arr if range_arr else [random.randrange(len(maps))]
                for i, e in enumerate(maps):
                    if i in range_arr:
                        i_str = e[1]
                        sound = e[0]
                        mp3 = self._fld_audio(sound.groups()[0])
                        i_str = re.sub(r'<a[^>]+?href="sound\:.*\.mp3".*</a>', '', i_str).strip()
                        my_str = my_str + '<li>' + i_str + ' ' + mp3 + '</li>'
                return self._css(my_str)
        return ''

    @export([u'额外例句', u'Extra Examples'])
    def fld_extra_examples(self):
        lst = re.findall(r'href="/(@examples_.*?)">.*?<', self.get_html())
        if lst:
            str_content = u''
            for m in lst:
                content = self.builder.mdx_lookup(m)
                if len(content) > 0:
                    for c in content:
                        str_content += c.replace("\r\n","").replace("entry:/","")
            return self._css(str_content)
        return ''

    @with_styles(cssfile='_ldoce6.css')
    def _css(self, val):
        return val
```
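The lookup done in `_fld_voice` can be tried outside Anki. The sketch below copies the two patterns from the service code and runs them against a made-up entry fragment; `find_sound_path` and `SAMPLE` are illustrative names, not part of the plugin, and the plugin's file-saving step (`save_file`, `get_anki_label`) is left out.

```python
import re

# Patterns copied from the service code in this post.
VOICE_PATTERN = r'<a href="sound://([\w/]+\w*\.mp3)"><img src="img/spkr_%s.png"></a>'
VOICE_PATTERN_WQ = r'<span class="%s"><a href="sound://([\w/]+\w*\.mp3)">(.*?)</span %s>'
LANG_TO_REGEXPS = {
    'br': [re.compile(VOICE_PATTERN % 'r'),
           re.compile(VOICE_PATTERN_WQ % ('brevoice', 'brevoice'))],
    'us': [re.compile(VOICE_PATTERN % 'b'),
           re.compile(VOICE_PATTERN_WQ % ('amevoice', 'amevoice'))],
}


def find_sound_path(html, voice):
    """Return the resource path of the first matching sound link, or None."""
    for regexp in LANG_TO_REGEXPS[voice]:
        match = regexp.search(html)
        if match:
            return '/' + match.group(1)
    return None


# Made-up entry fragment in the span-wrapped layout the second pattern expects.
SAMPLE = ('<span class="brevoice"><a href="sound://media/hello_br.mp3">'
          '</span brevoice><img src="img/spkr_r.png"></a>')

print(find_sound_path(SAMPLE, 'br'))
print(find_sound_path(SAMPLE, 'us'))
```

Because the capture group is `[\w/]+\w*\.mp3`, only file names made of word characters and slashes are picked up; a name containing a space or a hyphen would not match either pattern.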

```
^([^\t]*)\t(.*)
\1\n<link type="text/css" rel="stylesheet" href="LDOCE6.css"/><div id="LDOCE6_Zzz_1"><span class="entry" id="zzz" new="NewInLdoce"><span class="entryhead"><span class="hwd"></span><span class="hyphenation"></span> <span class="brevoice"><a href="sound://\2"></span brevoice><img src="img/spkr_r.png"></a> <span class="amevoice"><a href="sound://\2"></span amevoice><img src="img/spkr_b.png"></a><span class="buttons"></span></span><span class="sense" id="zzz_s1"> <span class="def"></span def></span></span></div><script src="entry.js"></script>\n</>
```

It's a bit complicated, so I'll just explain the principle; those who get it will get it.

The author of FastWordQuery added dedicated support for Longman 6, consisting of a .py file plus specially made mdx and mdd files.

I built a new mdx modeled directly on that specially made MDX, and with that it was supported.
The .py code is fairly complex and I can't follow it; I only changed the dictionary name in it to xxxxxxx.

The regular expression is used to build the mdx. Its input is one line per entry:
headword\tMP3
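
The find-and-replace step can be sketched in Python as follows, assuming the input is a list of `headword<TAB>mp3` lines. The pattern and the replacement template are the ones given in this post; `build_mdx_source` is a made-up helper name, and compiling the resulting text into an actual .mdx file is a separate step not shown here.

```python
import re

FIND = r'^([^\t]*)\t(.*)'
# Replacement template from the post: \1 is the headword, \2 the MP3 name.
REPLACE = (
    r'\1\n<link type="text/css" rel="stylesheet" href="LDOCE6.css"/>'
    r'<div id="LDOCE6_Zzz_1"><span class="entry" id="zzz" new="NewInLdoce">'
    r'<span class="entryhead"><span class="hwd"></span>'
    r'<span class="hyphenation"></span> '
    r'<span class="brevoice"><a href="sound://\2"></span brevoice>'
    r'<img src="img/spkr_r.png"></a> '
    r'<span class="amevoice"><a href="sound://\2"></span amevoice>'
    r'<img src="img/spkr_b.png"></a><span class="buttons"></span></span>'
    r'<span class="sense" id="zzz_s1"> <span class="def"></span def></span>'
    r'</span></div><script src="entry.js"></script>\n</>'
)


def build_mdx_source(lines):
    """Turn 'headword<TAB>mp3' lines into MDX source records, skipping blank lines."""
    return '\n'.join(re.sub(FIND, REPLACE, line) for line in lines if line.strip())


print(build_mdx_source(['hello\thello.mp3']))
```

Each record comes out as three lines: the headword, the HTML body, and the `</>` separator. The same MP3 name is written into both the `brevoice` and the `amevoice` link, so both pronunciation fields of the service would resolve to the same file.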

Come to think of it, probably not many people will have a use for this; there can't be many people learning foreign languages other than English.