IRobotSoft Visual Web Scraping & Web Automation Forum
http://irobotsoft.org/bb/YaBB.pl
http://irobotsoft.org/bb/YaBB.pl?num=1229897099

Message started by IRobotSoft Administrator on 12/21/08 at 17:04:59

Title: Is it possible to scrape data from RSS feed?
Post by IRobotSoft Administrator on 12/21/08 at 17:04:59

Yes, you can scrape RSS data.  Add a schedule action, then add events to it that complete the following steps:  
1.  Assign variable 'rss_url' as the URL of the RSS feed
2.  Assign variable 'query' as a string:  
<item>{
    title=<title>:tx;  
    link=<link>:tx;  
    description=<description>:tx &html_decode;  
    keywords=<media:keywords>:tx;
    pubdate=<pubDate>:tx;
    }
3.  Load the RSS feed into a dataset variable:  
rss.loadData(rss_url,'link is not null','xml',query)

Then you can do anything you want with the 'rss' dataset, for example, repeat an action based on rss and go to each URL using rss.link.  
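
For readers who want to check what the query should return for a given feed, here is a rough Python sketch of the same extraction, done outside IRobotSoft. It is not how loadData works internally, only an analogue: the function names, the sample feed, and the Media RSS namespace URI used for <media:keywords> are assumptions made for illustration, and html.unescape stands in loosely for &html_decode.

```python
import html
import xml.etree.ElementTree as ET

# Namespace commonly bound to the "media:" prefix; an assumption about the feed.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}


def text_of(item, path, namespaces=None):
    """Return the stripped text of the first matching child, or None."""
    node = item.find(path, namespaces)
    if node is None or node.text is None:
        return None
    return node.text.strip()


def load_rss_items(xml_text):
    """Build one dict per <item>, keeping only rows where link is not null."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        description = text_of(item, "description")
        row = {
            "title": text_of(item, "title"),
            "link": text_of(item, "link"),
            # Loose analogue of &html_decode: unescape HTML entities.
            "description": html.unescape(description) if description else None,
            "keywords": text_of(item, "media:keywords", MEDIA_NS),
            "pubdate": text_of(item, "pubDate"),
        }
        if row["link"]:
            rows.append(row)
    return rows


# Hypothetical feed used only to exercise the sketch.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Tom &amp;amp; Jerry</description>
      <media:keywords>cat, mouse</media:keywords>
      <pubDate>Sun, 21 Dec 2008 17:04:59 GMT</pubDate>
    </item>
    <item>
      <title>No link here</title>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    # Mirrors "repeat an action based on rss and go to each URL using rss.link".
    for row in load_rss_items(SAMPLE_FEED):
        print(row["link"], row["title"])
```

The second item in the sample has no <link>, so it is dropped, which is the effect the 'link is not null' condition has in step 3.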

Title: Re: Is it possible to scrape data from RSS feed?
Post by IRobotSoft Administrator on 10/14/17 at 00:08:53

Note that loading the feed directly is tricky if the URL is an HTTPS URL.  You can save the feed to a local file first and load it from there, using steps like the following:

1.  Assign variable 'rss_url' as the URL of the RSS feed  
2.  Use SaveUrlFile to save the HTTPS content to a local file 'local_feed.txt'
   SaveUrlFile(rss_url, 'local_feed.txt')

3.  Assign variable 'query' as a string:  
<item>{
    title=<title>:tx;  
    link=<link>:tx;  
    description=<description>:tx &html_decode;  
    keywords=<media:keywords>:tx;  
    pubdate=<pubDate>:tx;  
    }  
4.  Load the RSS feed into a dataset variable:  
rss.loadData('local_feed.txt','link is not null','xml',query)  
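
The same "save first, then load from the local file" pattern can be sketched in Python, for comparison only. This is not IRobotSoft code: save_url_file, load_links, and fetch_and_load are hypothetical helper names, and only the link field is extracted to keep the sketch short.

```python
import shutil
import urllib.request
import xml.etree.ElementTree as ET


def save_url_file(url, path):
    """Download the content at url and write it unchanged to a local file."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


def load_links(path):
    """Parse the saved feed and return every non-empty <link> found in an <item>."""
    root = ET.parse(path).getroot()
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


def fetch_and_load(rss_url, local_path):
    """Save the feed locally first, then load it from the local copy."""
    save_url_file(rss_url, local_path)
    return load_links(local_path)
```

Calling fetch_and_load(rss_url, 'local_feed.txt') with the feed's HTTPS address follows the same order as steps 2 and 4 above: the download and the parsing are kept separate, so the parser only ever reads a local file.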



