Sunday, December 25, 2011

Task: News scraper/tracker

I want to scrape the Reuters news feeds (later, others) into a database for various analytical purposes [eg]. That's going to consist of a daemon on my fileserver that checks the feeds on a periodic basis and loads any new stories into a database. Then we'll do other analysis on that database. I'm most interested in linking stories and identifying trends.
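Here's a rough sketch of what the polling side of that daemon might look like. It assumes Python with only the standard library, RSS 2.0 feeds, and SQLite as the database. None of that is decided. The feed URL is a placeholder (not a real Reuters address), and the table layout, polling interval, and function names are all made up for illustration.

```python
import sqlite3
import time
import urllib.request
import xml.etree.ElementTree as ET

# Assumptions for illustration only: placeholder URL and polling interval.
FEED_URLS = ["https://example.com/feed.rss"]
POLL_SECONDS = 15 * 60


def open_db(path):
    """Open (or create) the story database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS stories (
               guid TEXT PRIMARY KEY,
               title TEXT,
               link TEXT,
               published TEXT,
               fetched_at REAL
           )"""
    )
    return conn


def parse_feed(xml_text):
    """Pull the <item> entries out of an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        # Fall back to the link when the feed gives no <guid>.
        guid = item.findtext("guid", default="") or link
        if not guid:
            continue  # nothing to deduplicate on, so skip it
        items.append({
            "guid": guid,
            "title": item.findtext("title", default=""),
            "link": link,
            "published": item.findtext("pubDate", default=""),
        })
    return items


def store_items(conn, items):
    """Insert unseen stories; return how many rows were actually added."""
    before = conn.total_changes
    now = time.time()
    for it in items:
        # The primary key on guid makes repeat polls harmless.
        conn.execute(
            "INSERT OR IGNORE INTO stories VALUES (?, ?, ?, ?, ?)",
            (it["guid"], it["title"], it["link"], it["published"], now),
        )
    conn.commit()
    return conn.total_changes - before


def run_daemon(db_path="news.db"):
    """Poll every feed forever, sleeping between rounds."""
    conn = open_db(db_path)
    while True:
        for url in FEED_URLS:
            try:
                with urllib.request.urlopen(url, timeout=30) as resp:
                    added = store_items(conn, parse_feed(resp.read()))
                print(f"{url}: {added} new stories")
            except Exception as exc:  # keep running if one feed fails
                print(f"{url}: fetch failed ({exc})")
        time.sleep(POLL_SECONDS)
```

Calling run_daemon() starts the loop. Keying on the feed's guid means the same story seen on two polls is stored once, which should keep the later linking and trend analysis from double-counting.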

Yeah, OK, I know this isn't groundbreaking research. It's new for me, though. And it will be a good microcosm of scraping tasks for declaratization as well as a valuable component for all kinds of things. So ... it's a task.
