The htmlSQL project is a PHP class that lets developers access HTML tags and attributes using SQL-like syntax. The project has apparently been around for a few years, but I only just heard about it. I have worked on several personal data-scraping projects that use HTML as the source. My normal route is to use regular expressions to pull out the data I need and then dump it into a local database. htmlSQL could really cut the time it takes to select the right data, and I see big potential for it in those types of projects.
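For context, here is a rough sketch of the regex-then-database route described above. It is written in Python rather than PHP, and the sample HTML, the pattern, and the `links` table are all made up for illustration; a real scraper would fetch the page over HTTP first.

```python
import re
import sqlite3

# Hypothetical sample page standing in for a fetched HTML document.
HTML = """
<ul>
  <li><a href="/posts/1" class="post">First post</a></li>
  <li><a href="/posts/2" class="post">Second post</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""

# Match anchors that carry class="post", capturing the href and the link text.
# Assumes href comes before class, which is exactly the kind of fragility
# that makes hand-written patterns tedious to maintain.
LINK_RE = re.compile(r'<a\s+href="([^"]+)"\s+class="post"\s*>([^<]*)</a>')


def extract_links(html):
    """Return a list of (href, title) tuples for matching anchors."""
    return LINK_RE.findall(html)


def store_links(conn, links):
    """Dump the extracted rows into a local SQLite table."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (href TEXT, title TEXT)")
    conn.executemany("INSERT INTO links (href, title) VALUES (?, ?)", links)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
links = extract_links(HTML)
store_links(conn, links)
rows = conn.execute("SELECT href, title FROM links ORDER BY href").fetchall()
```

The pattern only matches when the attributes appear in one fixed order, so every markup change on the source page means another round of regex tweaking. That is the step where a query-style selector could save time.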
I haven’t had a chance to play with htmlSQL yet, but I plan to write a tutorial using it soon. If you are interested in htmlSQL, you can check out the htmlSQL Live Example or the htmlSQL site.