Diffstat (limited to 'plugin/readability/README')
| -rwxr-xr-x | plugin/readability/README | 24 |
1 files changed, 24 insertions, 0 deletions
diff --git a/plugin/readability/README b/plugin/readability/README
new file mode 100755
index 0000000..5e08c96
--- /dev/null
+++ b/plugin/readability/README
@@ -0,0 +1,24 @@
+This code is under the Apache License 2.0. http://www.apache.org/licenses/LICENSE-2.0
+
+This is a python port of a ruby port of arc90's readability project
+
+http://lab.arc90.com/experiments/readability/
+
+Given a html document, it pulls out the main body text and cleans it up.
+
+Ruby port by starrhorne and iterationlabs
+Python port by gfxmonk
+
+This port uses BeautifulSoup for the HTML parsing. That means it can be
+a little slow, but will work on Google App Engine (unlike libxml-based
+libraries)
+
+
+**note**: I don't currently have any plans for using or improving this
+library, and it's far from perfect (slow, and almost certainly buggy).
+So if you do something cool with it or have a better tool that does
+the same job, please let me know and I can link to it from here.
+
+If you're looking for alternatives / forks, here's the list so far:
+ - http://www.minvolai.com/blog/decruft-arc90s-readability-in-python/
+ - https://github.com/buriy/python-readability
