Please Visit: http://ift.tt/1ajReyV
from Public RSS-Feed of Jeffery yuan. Created with the PIXELMECHANICS 'GPlusRSS-Webtool' at http://gplusrss.com http://ift.tt/1haNgjk
via LifeLong Community
Library: boilerpipe
boilerpipe provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page.
It can be used with Nutch and Solr when crawling web pages.
Just call http://ift.tt/1haNgjf to highlight the main content of an arbitrary URL.
http://ift.tt/106wMPw
http://ift.tt/1gX4FwZ
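
To make the idea of separating main text from boilerplate more concrete, here is a minimal sketch in plain Java. It is not boilerpipe's implementation and does not use its API. It is a toy heuristic that assumes a block of text is main content when it has enough words and only a small share of those words sit inside links. The class name BoilerplateSketch, the method names, and both thresholds are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only: NOT boilerpipe's algorithm or API.
class BoilerplateSketch {
    // Hypothetical thresholds, picked for this example only.
    static final int MIN_WORDS = 10;
    static final double MAX_LINK_DENSITY = 0.33;

    // Block-level tags used as split points between text blocks.
    private static final Pattern BLOCK_SPLIT = Pattern.compile(
            "(?is)</?(?:p|div|li|ul|ol|td|tr|table|h[1-6]|br)\\b[^>]*>");
    // Anchor elements, used to measure how much of a block is link text.
    private static final Pattern ANCHOR = Pattern.compile("(?is)<a\\b[^>]*>(.*?)</a>");
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");

    // Counts whitespace-separated words; returns 0 for blank text.
    static int countWords(String text) {
        String t = text.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    // Removes tags and collapses runs of whitespace.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    // Returns the text of blocks that look like main content under the toy heuristic.
    static List<String> extractMainText(String html) {
        List<String> kept = new ArrayList<>();
        for (String block : BLOCK_SPLIT.split(html)) {
            String text = stripTags(block);
            int words = countWords(text);
            if (words == 0) {
                continue; // nothing but markup or whitespace
            }
            int linkWords = 0;
            Matcher m = ANCHOR.matcher(block);
            while (m.find()) {
                linkWords += countWords(stripTags(m.group(1)));
            }
            double linkDensity = (double) linkWords / words;
            if (words >= MIN_WORDS && linkDensity <= MAX_LINK_DENSITY) {
                kept.add(text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String html = "<div><a href='/'>Home</a> <a href='/about'>About</a></div>"
                + "<p>boilerpipe provides algorithms to detect and remove the surplus "
                + "clutter around the main textual content of a web page.</p>"
                + "<div>Copyright 2013</div>";
        for (String block : extractMainText(html)) {
            System.out.println(block);
        }
    }
}
```

With the sample input in main, the navigation links and the short copyright line are dropped and only the paragraph sentence is kept. A real crawl with Nutch and Solr would rely on the library itself rather than on a heuristic this simple.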