Posted on 2018-12-26 10:02:26
With XPath this problem doesn't have to be such a brain-teaser; using regular expressions just makes a simple task complicated. Here is a Python implementation, which depends on the lxml library.
#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
File: replace_tilde_with_title.py
Author: zzhirong
Email: [email protected]
Description: Replace "~" in the span elements with the d:title attribute of the enclosing d:entry
"""

from lxml import etree

s = """<?xml version="1.0" encoding="UTF-8"?>
<d:dictionary xmlns="http://www.w3.org/1999/xml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">
<d:entry id="_38ja" d:title="xxx">
<d:index d:value="steal" d:title="steal"/><span class="hw">steal</span><br/>
<span class="ex">~ a visit <span class="tag1">(an interview)</span> </span><span class="ex_c">测试<span class="tag1">(测试)</span></span>
<span class="ex">~ a kiss </span><span class="ex_c">测试</span>
<span class="ex">~ rides on the train </span><span class="ex_c">测试</span>
</d:entry>
</d:dictionary>
"""

xml = etree.XML(s)
D_NS = xml.nsmap["d"]       # namespace URI bound to the "d" prefix
XML_NS = xml.nsmap[None]    # default namespace (used by the span elements)

for entry in xml.xpath("//d:entry", namespaces={"d": D_NS}):
    title = entry.get("{%s}title" % D_NS, "")
    # Only the spans that are direct children of the entry are touched.
    for span in entry.iterfind("./{%s}span" % XML_NS):
        if span.text:  # span.text is None when a span has no leading text
            span.text = span.text.replace("~", title)
print(etree.tostring(xml))
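For anyone who can't install lxml, the same idea can be sketched with Python 3's standard-library `xml.etree.ElementTree`. Note this is a swap of library, not the code above: ElementTree only supports a limited subset of XPath, so the entries are located with `iter()` instead of a `//d:entry` query. The names `replace_tilde`, `SAMPLE` and `X_NS` are made up for illustration, the sample is a trimmed version of the one above with `d:title` set to "steal", and the input is passed as bytes so the encoding declaration is handled by the parser.

```python
import xml.etree.ElementTree as ET

D_NS = "http://www.apple.com/DTDs/DictionaryService-1.0.rng"
X_NS = "http://www.w3.org/1999/xml"  # default namespace used in the sample (assumed fixed here)

# Keep readable prefixes when serializing instead of ns0/ns1.
ET.register_namespace("d", D_NS)
ET.register_namespace("", X_NS)


def replace_tilde(xml_bytes):
    """Return the document as text, with "~" in each entry's direct child
    spans replaced by that entry's d:title attribute."""
    root = ET.fromstring(xml_bytes)
    for entry in root.iter("{%s}entry" % D_NS):
        title = entry.get("{%s}title" % D_NS, "")
        for span in entry.findall("./{%s}span" % X_NS):
            if span.text:  # skip spans with no leading text
                span.text = span.text.replace("~", title)
    return ET.tostring(root, encoding="unicode")


# Trimmed, hypothetical sample based on the XML in the post.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<d:dictionary xmlns="http://www.w3.org/1999/xml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">
<d:entry id="_38ja" d:title="steal">
<span class="ex">~ a kiss </span><span class="ex_c">测试</span>
</d:entry>
</d:dictionary>
""".encode("utf-8")

if __name__ == "__main__":
    print(replace_tilde(SAMPLE))
```

Like the original, this only rewrites the text directly at the start of each top-level span; a "~" that appears after a nested tag (in a child's tail text) would need extra handling.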