Posted on 2018-12-26 10:02:26
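XPath handles this without the mental gymnastics; using regular expressions here just turns a simple job into a complicated one. A Python implementation is attached below; it depends on the lxml library.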
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
File: replace_tilde_with_title.py
Author: zzhirong
Email: zzhirong@email.com
Description: replace "~" in each span with the d:title attribute of its d:entry
"""
from lxml import etree

s = """<?xml version="1.0" encoding="UTF-8"?>
<d:dictionary xmlns="http://www.w3.org/1999/xml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">
<d:entry id="_38ja" d:title="xxx">
<d:index d:value="steal" d:title="steal"/><span class="hw">steal</span><br/>
<span class="ex">~ a visit <span class="tag1">(an interview)</span> </span><span class="ex_c">测试<span class="tag1">(测试)</span></span>
<span class="ex">~ a kiss </span><span class="ex_c">测试</span>
<span class="ex">~ rides on the train </span><span class="ex_c">测试</span>
</d:entry>
</d:dictionary>
"""

xml = etree.XML(s)
D_NS = xml.nsmap["d"]      # namespace URI bound to the "d" prefix
XML_NS = xml.nsmap[None]   # default namespace URI (applies to the span elements)

for entry in xml.xpath("//d:entry", namespaces={"d": D_NS}):
    title = entry.get("{%s}title" % D_NS, "")
    # Only the direct span children of the entry are visited.
    for span in entry.iterfind("./{%s}span" % XML_NS):
        if span.text:  # span.text is None when the span has no leading text
            span.text = span.text.replace("~", title)
print(etree.tostring(xml))
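For anyone on Python 3, here is a minimal sketch of the same idea. The script above targets Python 2, where `s` is a byte string; on Python 3, lxml refuses a `str` that carries an XML encoding declaration, so the sketch passes bytes instead. The names `SAMPLE` and `replace_tilde` and the `x` prefix for the default namespace are my own choices for illustration and are not from the original post. The sample data is a shortened copy of the entry above.

```python
from lxml import etree

# Shortened copy of the sample entry from the post. Bytes input is used
# because the document starts with an encoding declaration.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<d:dictionary xmlns="http://www.w3.org/1999/xml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">
<d:entry id="_38ja" d:title="xxx">
<d:index d:value="steal" d:title="steal"/><span class="hw">steal</span><br/>
<span class="ex">~ a visit <span class="tag1">(an interview)</span> </span>
<span class="ex">~ a kiss </span>
</d:entry>
</d:dictionary>
"""


def replace_tilde(xml_bytes):
    """Replace "~" in the leading text of each entry's direct span children."""
    root = etree.XML(xml_bytes)
    # XPath needs a prefix for every namespace, so the default namespace
    # is bound to the (arbitrary) prefix "x".
    ns = {"d": root.nsmap["d"], "x": root.nsmap[None]}
    for entry in root.xpath("//d:entry", namespaces=ns):
        title = entry.get("{%s}title" % ns["d"], "")
        for span in entry.xpath("./x:span", namespaces=ns):
            if span.text:  # None when the span has no leading text
                span.text = span.text.replace("~", title)
    # encoding="unicode" returns a str instead of bytes
    return etree.tostring(root, encoding="unicode")


result = replace_tilde(SAMPLE)
print(result)
```

As in the original script, only the text before a span's first child element is touched; a "~" that appears after a nested element sits in that element's `tail` and would need separate handling.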