blogmarks.net

templatemaker

16 months ago

nelson : templatemaker - Advanced Python screenscraping; tries to infer document structure from multiple examples

Tags : beautifulsoup django parser python screenscraping template

Screenscraping tips

16 months ago

nelson : Screenscraping tips - Some comments I made about liberating data from websites

Tags : beautifulsoup python robot screenscraping web

Adrian Holovaty releases templatemaker, a Python library for smart screen scraping

17 months ago

Andy Baio : Adrian Holovaty releases templatemaker, a Python library for smart screen scraping - given a large set of HTML documents, intelligently extracts the strings that change between them

Matthew M. Boedicker : templatemaker, Python screenscraping library - (via waxy)

joshua : Introducing templatemaker - back out templates from similar documents

Rod Begbie : Introducing templatemaker - Python library that analyses a corpus of web pages, works out where the dynamic values are in the template, then allows you to scrape out the juicy details. I can think of oh, so many uses for this.

philgyford : Introducing templatemaker | Holovaty.com - Python thing. Point it at some HTML files and it will make a template with holes for the unique strings in the pages. (via Daring Fireball)

Tags : dev python web adrianholovaty screenscraping html scraping templates templating top via:daringfireball webdevelopment
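
The comments above describe the general idea: compare similar pages, keep the text they share as a template, and treat whatever differs as the data. A minimal sketch of that idea follows, using only the standard library's difflib. It is not templatemaker's API or algorithm; the function names `learn_chunks` and `extract`, the `min_len` parameter, and the sample strings are all made up for illustration, and it learns from just two examples.

```python
from difflib import SequenceMatcher


def learn_chunks(a, b, min_len=2):
    """Return the literal text shared by both example pages, in order.

    Hypothetical helper: the gaps between the returned chunks are the
    "holes" where the two pages differ.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        a[m.a:m.a + m.size]
        for m in matcher.get_matching_blocks()
        if m.size >= min_len  # skip tiny accidental matches
    ]


def extract(chunks, page):
    """Return the strings in `page` that sit between the shared chunks."""
    values, pos = [], 0
    for chunk in chunks:
        idx = page.find(chunk, pos)
        if idx == -1:
            raise ValueError("page does not match the learned template")
        if idx > pos:
            values.append(page[pos:idx])
        pos = idx + len(chunk)
    if pos < len(page):
        values.append(page[pos:])  # trailing hole after the last chunk
    return values


# Two sample "pages" that share a layout but differ in their data.
chunks = learn_chunks("<b>Alice</b> is 30", "<b>Bob</b> is 25")
print(chunks)                                  # ['<b>', '</b> is ']
print(extract(chunks, "<b>Carol</b> is 41"))   # ['Carol', '41']
```

A real tool would need many more examples and a more careful matching strategy, since two pages can share text by coincidence inside the parts that are supposed to vary.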

More Like This WebLog > The State of Screen Scraping

28 months ago

deusx : More Like This WebLog > The State of Screen Scraping - "During all this recent excitement about using hAtom to generate feeds, I’d forgotten that I wrote about the concept nearly three years ago when I was getting ready to talk about syndication at Seybold SF."

Tags : hatom microformats screenscraping semweb webdev xslt

Hpricot, a fast and delightful HTML parser

29 months ago

Paul Hammond : Hpricot, a fast and delightful HTML parser - Hpricot is a very flexible HTML parser, based on Tanaka Akira's HTree and John Resig's JQuery, but with the scanner recoded in C

jimray : Hpricot, a fast and delightful HTML parser

Tags : programming ruby screenscraping webdev

Beautiful Soup

55 months ago

Simon Willison : Beautiful Soup - Ultra Liberal Python HTML/XHTML parser. (via)

Nelson Minar : Python screen scrape - Beautiful soup - Python library for screenscraping HTML

deusx : Beautiful Soup - "You didn't write that awful page. You're just trying to get some data out of it. Right now, you don't really care what HTML is supposed to look like."

Paul Hammond : Beautiful Soup - You didn't write that awful page. You're just trying to get some data out of it

Anne van Kesteren : Beautiful Soup - I want this, only with XPath or CSS Selectors. Learning a new selecting language again and again is annoying. Please base your product on standards.

Matthew M. Boedicker : Beautiful Soup, Python lib for screenscraping

Rod Begbie : Beautiful Soup - Python HTML parser which doesn't choke on malformed markup. Handy for screenscraping.

philgyford : Beautiful Soup: We called him Tortoise because he taught us. - "Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping."

factoryjoe : Beautiful Soup: We called him Tortoise because he taught us. - Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping.

Tags : python scraping html screenscraping beautifulsoup top webdevelopment
