掌上百科 - PDAWIKI




Views: 2079 | Replies: 5

[Help] How to merge synonym entries in a StarDict dictionary



Posted 2015-1-18 04:35:56
Last edited by LYX1692 on 2015-1-18 04:38

I'm not sure whether anyone here has built StarDict (星际译王) dictionaries before, but I have a question.
I like reading on my Kindle with Duokan (多看), whose dictionaries use the StarDict format.

For example, searching for centre should automatically redirect to center.

I searched Baidu for a long time and found only one post, which said you need to edit a .syn file.
But it did not explain even the basics: which tool to write it with, or how to compile it.

The .syn file contains information for synonyms, that means, when you input a
synonym, StarDict will search another word that related to it.

The format is simple. Each item contain one string and a number.
synonym_word;  // a utf-8 string terminated by '\0'.
original_word_index; // original word's index in .idx file.
Then other items without separation.
When you input synonym_word, StarDict will search original_word;

The length of "synonym_word" should be less than 256. In other
words, (strlen(word) < 256).
original_word_index is a 32-bits unsigned number in network byte order.
Two or more items may have the same "synonym_word" with different
original_word_index.
The items must be sorted by stardict_strcmp() with synonym_word.
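
To make the layout described above concrete, here is a minimal Python sketch that packs and unpacks .syn items. It is an illustration, not the StarDict tools' own code. The helper names (`stardict_sort_key`, `build_syn`, `parse_syn`) and the sample indexes 0 and 1 are made up for this example. The sort key assumes that stardict_strcmp() compares ASCII case-insensitively first and falls back to a plain byte comparison on ties, and that the index is the 0-based position of the original word in the .idx file.

```python
import struct


def stardict_sort_key(word):
    """Assumed approximation of stardict_strcmp(): ASCII-caseless first, raw bytes second."""
    raw = word.encode("utf-8")
    return (raw.lower(), raw)  # bytes.lower() only changes ASCII letters


def build_syn(entries):
    """Pack (synonym_word, original_word_index) pairs into .syn bytes."""
    out = bytearray()
    ordered = sorted(entries, key=lambda e: (stardict_sort_key(e[0]), e[1]))
    for word, index in ordered:
        raw = word.encode("utf-8")
        if len(raw) >= 256:
            raise ValueError(f"synonym too long: {word!r}")
        # NUL-terminated UTF-8 string, then a 32-bit unsigned big-endian index
        out += raw + b"\0" + struct.pack(">I", index)
    return bytes(out)


def parse_syn(data):
    """Unpack .syn bytes back into a list of (synonym_word, index) pairs."""
    items, pos = [], 0
    while pos < len(data):
        end = data.index(b"\0", pos)
        (index,) = struct.unpack_from(">I", data, end + 1)
        items.append((data[pos:end].decode("utf-8"), index))
        pos = end + 5  # skip the NUL byte and the 4 index bytes
    return items


# Hypothetical indexes: pretend "apple" is entry 0 and "center" is entry 1 in the .idx file.
syn = build_syn([("centre", 1), ("Apples", 0)])
print(parse_syn(syn))
```

In practice the indexes have to match the order of the compiled .idx file, which is why a compiler such as StarDict Editor is usually the easier route.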


Posted 2015-1-19 04:06:52
.idx, .ifo and .syn are compiled files; you need to extract the text, edit it, and then recompile.
The tool for extracting the text and compiling is StarDict Editor.
StarDict Editor can compile the Babylon text format.
Example:
Babylon format
apple|apples
the meaning of apple

eeeee
kkkkk

With apple|apples, compiling generates an xxx.syn file, and looking up either apple or apples points directly to the same definition, "the meaning of apple".
This is the Babylon-format approach.

If you are not familiar with the StarDict format, you can decompile a dictionary to text with StarDict Editor and take a look.
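
If the word list already lives in a script, the Babylon source can be generated rather than typed by hand. The sketch below is only an illustration: `to_babylon` is a made-up helper, and the "center" definition is invented sample data. It follows the rules from the README quoted further down in this thread (blank first line, one empty line after each article, two empty lines at the end of the file).

```python
def to_babylon(articles):
    """Render (headword, synonyms, definition) tuples as Babylon source text."""
    chunks = []
    for headword, synonyms, definition in articles:
        heading = "|".join([headword, *synonyms])
        # heading line, definition line, then the empty line that closes the article
        chunks.append(f"{heading}\n{definition}\n\n")
    # Blank first line (no options), then the articles, then one extra
    # newline so that the file ends with two empty lines.
    return "\n" + "".join(chunks) + "\n"


text = to_babylon([
    ("apple", ["apples"], "the meaning of apple"),
    ("center", ["centre"], "the middle point"),  # invented sample entry
])
print(text)
```

The resulting text can be saved as UTF-8 and compiled with StarDict Editor, as described above.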
===
Tab file (text)
apple     the meaning of apple
eeeee     kkkkk
(in the text format, the synonym information is lost)
===Textual StarDict dictionary XML format===

<?xml version="1.0" encoding="UTF-8" ?>
<stardict xmlns:xi="http://www.w3.org/2003/XInclude">
<info>
  <version>2.4.2</version>
  <bookname>3</bookname>
  <author />
  <email />
  <website />
  <description />
  <date />
  <dicttype />
</info>
<article>
  <key>apple</key>
  <synonym>apples</synonym>
<definition type="m">
<![CDATA[
the meaning of apple
]]>
</definition>
</article>
<article>
  <key>eeeee</key>
<definition type="m">
<![CDATA[
kkkkk
]]>
</definition>
</article>
</stardict>
======
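
To check which lookups end up pointing at which definition, the textual XML can be read with Python's standard library. This is a rough sketch: `lookup_table` is a made-up helper, and the sample is a trimmed copy of the two articles above, assuming each article carries a <key>, optional <synonym> elements and one <definition>.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0" encoding="UTF-8" ?>
<stardict xmlns:xi="http://www.w3.org/2003/XInclude">
<article>
  <key>apple</key>
  <synonym>apples</synonym>
  <definition type="m"><![CDATA[the meaning of apple]]></definition>
</article>
<article>
  <key>eeeee</key>
  <definition type="m"><![CDATA[kkkkk]]></definition>
</article>
</stardict>"""


def lookup_table(xml_bytes):
    """Map every key and synonym to the definition text of its article."""
    root = ET.fromstring(xml_bytes)
    table = {}
    for article in root.findall("article"):
        definition = (article.findtext("definition") or "").strip()
        for tag in ("key", "synonym"):
            for node in article.findall(tag):
                table[node.text] = definition
    return table


table = lookup_table(SAMPLE.encode("utf-8"))
print(table)
```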


Posted 2015-1-19 04:10:05
Last edited by qunwang6 on 2015-1-19 11:29

Other notes
stardict-textual-dict-example.xml
http://code.google.com/p/stardic ... e3841c0fda092c68b6f

Converting a StarDict-format dictionary to txt
Method 1:
StarDict Editor
http://code.google.com/p/stardic ... xe&can=2&q=
1. Download and extract the StarDict dictionary archive xxx.tar.bz2
2. In the extracted folder, rename xxx.dict.dz to xxx.dict.zip, then extract it to get the xxx.dict text file
3. Open StarDict Editor, choose DeCompile, and select the file xxx.ifo
4. DeCompile
Method 2:
StarDict-to-txt program: cvtstardict2txt.zip
http://www.pythonclub.org/python-files/stardict

1. Download and extract the StarDict dictionary archive xxx.tar.bz2
2. In the extracted folder, rename xxx.dict.dz to xxx.dict.zip, then extract it to get the xxx.dict text file
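
As an alternative to the rename-and-unzip step, a .dict.dz file can be unpacked from a script. The dictzip format is gzip-compatible, so Python's gzip module should be able to read it. A minimal sketch, where `decompress_dictdz` is a made-up helper name and the file names in the usage note are placeholders:

```python
import gzip
import shutil


def decompress_dictdz(src_path, dst_path):
    """Decompress a dictzip (.dict.dz) file into a plain .dict file."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Usage would look like `decompress_dictdz("xxx.dict.dz", "xxx.dict")`, run from the extracted dictionary folder.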
Method 3:
pyglossary
1. Download and extract the StarDict dictionary archive xxx.tar.bz2
2. cd into the dictionary folder
3. python /Applications/Utilities/DictionaryDevelopmentKit/pyglossary/pyglossary.pyw --read-options=resPath=OtherResources --write-format=AppleDict xxx.ifo xxx.xml

Compiling into a StarDict-format dictionary
Method 1:
StarDict Editor
http://code.google.com/p/stardic ... xe&can=2&q=
Method 2:
pyglossary
python /Applications/Utilities/DictionaryDevelopmentKit/pyglossary/pyglossary.pyw --read-options=resPath=OtherResources --write-format=Stardict xxx.txt xxx.ifo

===
a     1\n2\n3
b     4\\5\n6
c     789

Each line begins with a word, followed by a Tab character (if your text editor has a "Tab = spaces" option, do not enable it), and then the word's phonetic transcription and definition. \n stands for a line break and \\ for a backslash. The last line of the dictionary file must be an empty line.

<A href='bword://DAKOS'>DAKOS</A>

How to add HTML tags to StarDict file?

a   <span style="font-color:#008000">prep.</span> <i>(en relaciones de proporción, equivalencia)</i> per.
abajo   <span style="font-color:green">adv.</span> below.
a mano   by hand.
abarcar   <span style="font-color:green">v.</span> <i>(temas/materias)</i> to cover; <i>(superficie/territorio)</i> to span, cover.
===
Compile any supported file format to StarDict dictionary.

Tab file format
---------------
Here is an example dict.tab file:
============
a     1\n2\n3
b     4\\5\n6
c     789
============
Each line contains a word - definition pair. The word is split from the definition with a tab character. You may use the following escapes: \n - new line, \\ - \, \t - tab character.
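
The escape rules above can be sketched in a few lines of Python. This is a reading of the README text, not code from StarDict Editor, and both function names are made up for the example.

```python
def unescape_definition(text):
    """Expand the newline, tab and backslash escapes of a tab-file definition."""
    mapping = {"n": "\n", "t": "\t", "\\": "\\"}
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in mapping:
            out.append(mapping[text[i + 1]])
            i += 2  # consume the backslash and the escape letter
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_tab_line(line):
    """Split one tab-file line into (word, unescaped definition)."""
    word, definition = line.rstrip("\n").split("\t", 1)
    return word, unescape_definition(definition)


# The three sample lines from the README, written as Python string literals.
for sample in ["a\t1\\n2\\n3", "b\t4\\\\5\\n6", "c\t789"]:
    print(parse_tab_line(sample))
```

Scanning left to right matters here: replacing \\ and \n in two separate passes would misread a sequence such as \\n.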

Babylon source file format
--------------------------
=====

apple|apples
the meaning of apple

2dimensional|2dimensionale|2dimensionaler|2dimensionales|2dimensionalem|2dimensionalen
two dimensional's meaning<br>the second line.

=====
Each article must be followed by an empty line. The file must end with two empty lines!
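
Read the other way round, the same rules let you split a Babylon source file back into articles. The sketch below is an assumption-laden illustration: `parse_babylon` is a made-up helper, it expects "\n" line endings, and it treats any block whose first line starts with "#" as the optional field header, so a headword that begins with "#" would be skipped.

```python
def parse_babylon(text):
    """Return a list of (headwords, definition) pairs from Babylon source text."""
    articles = []
    for block in text.split("\n\n"):
        lines = [ln for ln in block.split("\n") if ln.strip()]
        if not lines or lines[0].startswith("#"):
            continue  # empty block or the optional "#field=value" header
        headwords = lines[0].split("|")  # first entry is the main word, the rest are synonyms
        articles.append((headwords, "\n".join(lines[1:])))
    return articles


source = (
    "\n"
    "apple|apples\n"
    "the meaning of apple\n"
    "\n"
    "2dimensional|2dimensionale\n"
    "two dimensional's meaning<br>the second line.\n"
    "\n"
    "\n"
)
parsed = parse_babylon(source)
print(parsed)
```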

You may specify fields like bookname, author, and description that will be used in the generated StarDict dictionary. You may also specify options affecting the processing of the Babylon source file. See the libbabylongfile.cpp source file for the complete list of supported fields and options. To specify options and fields, leave the first line blank, then write the options, one per line. Precede each line with a hash sign.
For example:
=====

#bookname=My dictionary
#author=My name
#description=...
#other fields=

=====

Textual StarDict file format
See doc\TextualDictionaryFileFormat in source tarball for information about Textual StarDict dictionary.

word
definition (in HTML format, e.g. <B></B> for bold, <font face=xxx></font> to set the font, and so on)
(blank line)
The last line of the dictionary file must be an empty line.


Original poster | Posted 2015-1-19 15:29:07
qunwang6 posted on 2015-1-19 04:06
.idx, .ifo and .syn are compiled files; you need to extract the text, edit it, and then recompile.
The tool for extracting the text and compiling is StarDict Editor.
Sta ...

Thank you so much. I dropped the Txt Tab File I had been using and switched to the Babylon text format.
That was really impressive.

Posted 2017-9-2 04:17:44
qunwang6 posted on 2015-1-19 04:10
Other notes
stardict-textual-dict-example.xml
http://code.google.com/p/stardict-3/source/browse/dict/doc/ ...

Is there any way to handle pronunciation audio in StarDict or Babylon format dictionaries? I'd like the audio to travel with each entry, as in MDict, instead of packing the sound files into one big bundle and matching them to words by file name. For example, a single word could then have both British and American pronunciations....


Posted 2017-10-13 15:48:09
<key>krīḍā-bhūmi</key>
<synonym>krīḍābhūmi</synonym>
<synonym>རྩེ་བའི་ས།</synonym>

<definition type="g">
<![CDATA[
        遊戲其地
        <font size="3" face="Tibetan Machine Uni">རྩེ་བའི་ས།</font>
            rtze ba'i sa]]>
</definition>
------------------------------------------------