IRobotSoft Robot
Is it possible to scrape data from RSS feed?
IRobotSoft Administrator
Is it possible to scrape data from RSS feed?
12/21/08 at 17:04:59
 
Yes, you can scrape RSS data.  Add a schedule action, then add events that carry out the following steps:  
1.  Assign the URL of the RSS feed to the variable 'rss_url'  
2.  Assign the following string to the variable 'query':  
<item>{      title=<title>:tx;  
     link=<link>:tx;  
     description=<description>:tx &html_decode;  
     keywords=<media:keywords>:tx;  
     pubdate=<pubDate>:tx;  
     }  
3.  Load the RSS feed into a dataset variable:  
rss.loadData(rss_url,'link is not null','xml',query)  
 
You can then use the 'rss' dataset however you like; for example, repeat an action over the rss dataset and go to the URL in rss.link.  
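For readers who want to see what the query extracts, here is a rough Python analogue of the steps above. It is only a sketch, not IRobotSoft code: the function name parse_rss_items is made up for illustration, the media namespace URI is an assumption about the feed, and html.unescape stands in for what &html_decode presumably does.

```python
import html
import xml.etree.ElementTree as ET

# Assumption: the feed binds the "media" prefix to the Media RSS namespace.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}


def parse_rss_items(xml_text):
    """Return one dict per <item>, skipping items without a link
    (a stand-in for the 'link is not null' filter)."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link:
            continue  # drop rows whose link is missing or empty
        rows.append({
            "title": item.findtext("title"),
            "link": link,
            # Decode HTML entities left in the description text.
            "description": html.unescape(item.findtext("description") or ""),
            "keywords": item.findtext("media:keywords", namespaces=MEDIA_NS),
            "pubdate": item.findtext("pubDate"),
        })
    return rows
```

Each returned dict corresponds to one row of the 'rss' dataset, so looping over the list and reading row["link"] mirrors repeating an action and going to rss.link.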
IRobotSoft Administrator
Re: Is it possible to scrape data from RSS feed?
Reply #1 - 10/14/17 at 00:08:53
 
Note that this is trickier when the URL is an HTTPS URL.  In that case, save the feed to a local file first and load it from there, like this:  
 
1.  Assign the URL of the RSS feed to the variable 'rss_url'  
2.  Use SaveUrlFile to save the HTTPS content to a local file 'local_feed.txt':
    SaveUrlFile(rss_url, 'local_feed.txt')
 
3.  Assign the following string to the variable 'query':    
<item>{      title=<title>:tx;    
     link=<link>:tx;    
     description=<description>:tx &html_decode;    
     keywords=<media:keywords>:tx;  
     pubdate=<pubDate>:tx;  
     }  
4.  Load the RSS feed from the local file into a dataset variable:    
rss.loadData('local_feed.txt','link is not null','xml',query)  
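
The download-then-load pattern above can be sketched in Python as follows. This is only an analogue, not IRobotSoft code: save_url_file is a hypothetical helper named after SaveUrlFile, and it uses the standard library's urllib to fetch the URL and write the bytes to disk.

```python
import shutil
import urllib.request


def save_url_file(url, local_path):
    """Download the content at `url` and write it unchanged to `local_path`."""
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        # Stream the body to disk without holding it all in memory.
        shutil.copyfileobj(response, out)
    return local_path
```

Once the feed is on disk, it can be parsed from the local file exactly as in step 4, without the loader having to handle HTTPS itself.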
 
 
 