- Name: boilerpipe
- Version: 1.2.0
- Release: 9.mga6
- Epoch:
- Group: Development/Java
- License: ASL 2.0
- Url: https://github.com/kohlschutter/boilerpipe
- Summary: Boilerplate Removal and Fulltext Extraction from HTML pages
- Architecture: noarch
- Size: 101811
- Distribution: Mageia
- Vendor: Mageia.Org
- Packager: neoclust <neoclust>
Description:
The boilerpipe library provides algorithms to detect and
remove the surplus "clutter" (boilerplate, templates)
around the main textual content of a web page.
The library ships with ready-made strategies for common tasks
(for example, news article extraction) and can easily be
extended for individual problem settings.
Extraction is very fast (milliseconds), needs only the
input document (no global or site-level information), and
is usually quite accurate.
- OptFlags: -O2 -g -pipe -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fstack-protector --param=ssp-buffer-size=4 -fomit-frame-pointer -march=i586 -mtune=generic -fasynchronous-unwind-tables
- Cookie: rabbit.mageia.org 1456910148
- Buildhost: rabbit.mageia.org