parsers module

parsers.parse_xml(x)
    Parse an XML file.

    Parameters: x – [String] Path to an XML file
    Returns: object of class lxml.etree._Element

    Usage:
        from pyminer import fetch
        from pyminer import parsers
        url = "https://peerj.com/articles/cs-23.xml"
        out = fetch(url)
        parsers.parse_xml(out.path)
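
Once a root element is in hand, it can be queried like any other element tree. The following is only a sketch: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml (both expose findtext and iter on elements), and the inline document is a made-up, heavily trimmed JATS-style sample, not the real PeerJ file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names mimic JATS article XML.
SAMPLE_XML = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>An example title</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p></sec>
    <sec><title>Methods</title><p>Second paragraph.</p></sec>
  </body>
</article>"""

# Stand-in for the element that parsers.parse_xml(out.path) would return.
root = ET.fromstring(SAMPLE_XML)

# Look up the first <article-title> anywhere below the root.
article_title = root.findtext(".//article-title")

# Collect the heading of every <sec> element in document order.
section_titles = [sec.findtext("title") for sec in root.iter("sec")]

print(article_title)
print(section_titles)
```

With a real lxml element, root.xpath(...) would also be available for richer queries.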

parsers.parse_xml_string(x, encoding='UTF-8')
    Parse an XML file to a string.

    Parameters: x – [String] Path to an XML file
    Returns: a string

    Usage:
        from pyminer import fetch
        from pyminer import parsers
        url = "https://peerj.com/articles/cs-23.xml"
        out = fetch(url)
        parsers.parse_xml_string(out.path)
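
To illustrate what "parse to a string" can mean in practice, here is a minimal sketch of a comparable helper built on the standard library. The name xml_file_to_string is hypothetical, and this is not necessarily how parse_xml_string is implemented; it only shows one plausible way to turn an XML file into text with a chosen encoding.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def xml_file_to_string(path, encoding="UTF-8"):
    """Hypothetical helper: parse the XML file at `path` and return it as text."""
    root = ET.parse(path).getroot()
    # With a concrete encoding name, ET.tostring returns bytes; decode to str.
    return ET.tostring(root, encoding=encoding).decode(encoding)


# Write a tiny made-up document to a temporary file to stand in for out.path.
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("<article><title>Hello</title></article>")
    sample_path = tmp.name

try:
    xml_text = xml_file_to_string(sample_path)
finally:
    os.remove(sample_path)

print(xml_text)
```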

parsers.parse_plain(x)
    Parse a plain text file.

    Parameters: x – [String] Path to a plain text file
    Returns: a string

    Usage:
        from pyminer import fetch
        from pyminer import parsers
        url = "xx"
        out = fetch(url)
        parsers.parse_plain(out.path)
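
Because the result is an ordinary string, the usual string tools apply to it. The sketch below assumes the file is UTF-8 encoded, which the documentation does not state; read_plain_text is a hypothetical stand-in for parse_plain, and the sample text is invented.

```python
import os
import tempfile


def read_plain_text(path, encoding="utf-8"):
    """Hypothetical helper: return the full contents of a text file as a str."""
    with open(path, encoding=encoding) as handle:
        return handle.read()


# Create a small temporary file to stand in for out.path.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("Title line\n\nBody of the article.\n")
    sample_path = tmp.name

try:
    text = read_plain_text(sample_path)
finally:
    os.remove(sample_path)

# Drop blank lines to get the non-empty lines of the document.
lines = [line for line in text.splitlines() if line.strip()]
print(lines)
```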

parsers.parse_pdf(x)
    Parse a PDF file.

    Parameters: x – [String] Path to a PDF file
    Returns: a string

    Usage:
        from pyminer import fetch
        from pyminer import parsers
        url = "http://www.banglajol.info/index.php/AJMBR/article/viewFile/25509/17126"
        out = fetch(url)
        parsers.parse_pdf(out.path)