需求:

如何将xpath定位到的元素进行转为HTML源码

方法1:使用from lxml.html import tostring的tostring方法功能

from lxml.html import tostring
from lxml import etree

html_get = etree.HTML(resp_text)
div_ok = html_get.xpath('//div[@id="mw-content-text"]')[0]
div_content = tostring(div_ok).decode('utf-8')

方法2(推荐使用,经过我效率测试,使用etree返回的html使用xpath定位到的元素,还使用etree转换为HTML源码效率更快):

from lxml import etree

html_get = etree.HTML(resp_text)
div_ok = html_get.xpath('//div[@id="mw-content-text"]')[0]
print(div_ok,type(div_ok))
div_content = etree.tostring(div_ok, pretty_print=True, method='html').decode('utf-8')  # 转为字符串