16 months ago
nelson : templatemaker - Advanced Python screenscraping; tries to infer document structure from multiple examples
16 months ago
nelson : Screenscraping tips - Some comments I made about liberating data from websites
17 months ago
Andy Baio : Adrian Holovaty releases templatemaker, a Python library for smart screen scraping - given a large set of HTML documents, intelligently extracts the strings that change between them
Matthew M. Boedicker : templatemaker, Python screenscraping library - (via waxy)
joshua : Introducing templatemaker - back out templates from similar documents
Rod Begbie : Introducing templatemaker - Python library that analyses a corpus of web pages, works out where the dynamic values are in the template, then allows you to scrape out the juicy details. I can think of oh, so many uses for this.
philgyford : Introducing templatemaker | Holovaty.com - Python thing. Point it at some HTML files and it will make a template with holes for the unique strings in the pages. (via Daring Fireball)
28 months ago
deusx : More Like This WebLog > The State of Screen Scraping - "During all this recent excitement about using hAtom to generate feeds, I’d forgotten that I wrote about the concept nearly three years ago when I was getting ready to talk about syndication at Seybold SF."
29 months ago
Paul Hammond : Hpricot, a fast and delightful HTML parser - Hpricot is a very flexible HTML parser, based on Tanaka Akira's HTree and John Resig's JQuery, but with the scanner recoded in C
jimray : Hpricot, a fast and delightful HTML parser
55 months ago
Simon Willison : Beautiful Soup - Ultra Liberal Python HTML/XHTML parser.
Nelson Minar : Python screen scrape - Beautiful soup - Python library for screenscraping HTML
deusx : Beautiful Soup - "You didn't write that awful page. You're just trying to get some data out of it. Right now, you don't really care what HTML is supposed to look like."
Paul Hammond : Beautiful Soup - You didn't write that awful page. You're just trying to get some data out of it
Anne van Kesteren : Beautiful Soup - I want this, only with XPath or CSS Selectors. Learning a new selecting language again and again is annoying. Please base your product on standards.
Matthew M. Boedicker : Beautiful Soup, Python lib for screenscraping
Rod Begbie : Beautiful Soup - Python HTML parser which doesn't choke on malformed markup. Handy for screenscraping.
philgyford : Beautiful Soup: We called him Tortoise because he taught us. - "Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping."
factoryjoe : Beautiful Soup: We called him Tortoise because he taught us. - Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping.