Hi -
I'm trying to identify some patterns where I can essentially spider the contents of a NATS-powered site. I'm talking about the intros or short descriptions on a photo set or gallery.
We'll use http://innocenthigh.com as an example because I know they're fucking rad (and no, Billy, I'm not going to scrape your site's content). For example, hit up their main intro page:
http://www.innocenthigh.com/t1/
As of this writing, their most recent update is for "Bree Olsen". It's that "Bree is a really nice girl that ..." text that I'm after. According to Web Developer Tools, this textual content can be found inside of:
html > body > div > table > tbody > tr > td > table #Table_01 > tbody > tr > td > table #Table_01 > tbody > tr > td > table #Table_01 > tbody > tr > td > table > tbody > tr > td > span .student_id_story1
I think I can isolate this text based on whether it sits in a span, in a paragraph, or in any element assigned a particular class. Hell, I can use any combination of those, but I know that people template the fuckall out of their sites, so even that is not a sure-fire way to identify this content.
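To make the class-based idea concrete, here is a rough Python sketch using only the standard library's html.parser. It ignores the long table path entirely and just collects the text of any element carrying a given class, such as student_id_story1 from the path above. The names ClassTextExtractor and extract_by_class are made up for illustration, and the sketch assumes the markup closes its tags properly; heavily broken templates would need a more forgiving parser.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given CSS class.

    Hypothetical helper for illustration; assumes reasonably well-formed HTML.
    """

    # Void elements never get a closing tag, so they must not affect depth.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.chunks = []    # text pieces of the current match
        self.results = []   # finished matches, one string per element

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            # Nested tag inside a match: track it so we know when the match ends.
            self.depth += 1
        elif self.target_class in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in self.VOID or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Collapse runs of whitespace left over from the template layout.
            self.results.append(" ".join("".join(self.chunks).split()))
            self.chunks = []

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_by_class(html, target_class):
    """Return the text of each element whose class list includes target_class."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    return parser.results
```

Feeding it a page's HTML along with "student_id_story1" should hand back a list of the description strings. Matching on the class alone, rather than the full nesting of tables, means a layout tweak is less likely to break it, though a renamed class still would.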
Anyone got any tips/tricks? How about from the NATS guys themselves, do you guys check out these posts? If I can get this hammered out, I think I'll be on to something big - unfortunately in the beginning it will only support sites generated via NATS or any other system where content is easily machine-identifiable.
I guess on that note, how many people would care that I was doing this? The only way I'd be doing this is to use that same content to promote said sponsor. I would not be doing this otherwise. If you have a problem with me doing that, then you have a problem with me converting sales for you.
Thanks!