Posted on 2018-12-26 10:02:26
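XPath handles this problem without the mental gymnastics; using regular expressions turns a simple task into a complicated one. A Python implementation follows; it depends on the lxml library.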
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
File: replace_tilde_with_title.py
Author: zzhirong
Email: [email protected]
Description: replace "~" inside span elements with the d:title attribute of the enclosing d:entry
"""
from lxml import etree

# Written for Python 2. Under Python 3, lxml rejects a str that carries an
# encoding declaration, so pass bytes there, e.g. etree.XML(s.encode("utf-8")).
s = """<?xml version="1.0" encoding="UTF-8"?>
<d:dictionary xmlns="http://www.w3.org/1999/xml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">
<d:entry id="_38ja" d:title="xxx">
<d:index d:value="steal" d:title="steal"/><span class="hw">steal</span><br/>
<span class="ex">~ a visit <span class="tag1">(an interview)</span> </span><span class="ex_c">测试<span class="tag1">(测试)</span></span>
<span class="ex">~ a kiss </span><span class="ex_c">测试</span>
<span class="ex">~ rides on the train </span><span class="ex_c">测试</span>
</d:entry>
</d:dictionary>
"""

xml = etree.XML(s)
D_NS = xml.nsmap["d"]      # namespace bound to the "d" prefix
XML_NS = xml.nsmap[None]   # default namespace, used by the span elements

for entry in xml.xpath("//d:entry", namespaces={"d": D_NS}):
    # d:title is a namespaced attribute, so look it up in Clark notation
    title = entry.get("{%s}title" % D_NS, "")
    # only the span elements that are direct children of the entry
    for span in entry.iterfind("./{%s}span" % XML_NS):
        # span.text is None when a span starts with a child element
        if span.text:
            span.text = span.text.replace("~", title)
print(etree.tostring(xml))
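
One caveat about the script: it only touches `span.text` of the spans directly under each `d:entry`, so a `~` that comes after a nested element (which the tree stores in that element's `tail`) is left unchanged. Below is a rough sketch of one way to cover that case. It is not the poster's method: it swaps lxml for the standard library's `xml.etree.ElementTree` so that it runs on a plain Python 3 install, and the sample is a trimmed-down, hypothetical variant of the data above (the title value `steal` and the `(fig.)` span are made up for illustration, as are the names `SAMPLE` and `replace_tilde`).

```python
import xml.etree.ElementTree as ET

D_NS = "http://www.apple.com/DTDs/DictionaryService-1.0.rng"
X_NS = "http://www.w3.org/1999/xml"  # default namespace copied from the post's sample

# Hypothetical, shortened sample: the last span starts with a nested element,
# so its "~" lives in the nested element's tail rather than in span.text.
SAMPLE = (
    '<d:dictionary xmlns="%s" xmlns:d="%s">'
    '<d:entry id="_38ja" d:title="steal">'
    '<span class="hw">steal</span>'
    '<span class="ex">~ a kiss </span><span class="ex_c">测试</span>'
    '<span class="ex"><span class="tag1">(fig.)</span> ~ a glance</span>'
    '</d:entry>'
    '</d:dictionary>' % (X_NS, D_NS)
)


def replace_tilde(root):
    """Replace "~" with the entry's d:title in all text inside its spans."""
    span_tag = "{%s}span" % X_NS
    for entry in root.iter("{%s}entry" % D_NS):
        title = entry.get("{%s}title" % D_NS, "")
        for span in entry.findall(span_tag):  # direct child spans, as in the post
            for node in span.iter():          # the span itself plus all descendants
                if node.text:
                    node.text = node.text.replace("~", title)
                # A descendant's tail is still inside the outer span; the
                # outer span's own tail is not, so that one is left alone.
                if node is not span and node.tail:
                    node.tail = node.tail.replace("~", title)
    return root


root = replace_tilde(ET.fromstring(SAMPLE))
# Collect the visible text of every span, in document order.
texts = ["".join(s.itertext()) for s in root.iter("{%s}span" % X_NS)]
print(texts)
```

The same loop should carry over to lxml nearly unchanged, since lxml elements also expose `text`, `tail`, `iter()` and `findall()`; note, though, that lxml's `iter()` also yields comment and processing-instruction nodes, so `span.iter(tag=etree.Element)` would be the safer call there.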