IRobotSoft Robot
05/27/17 at 19:07:50

1  General Category / Newcomers / Re: Need a sample robot file for extracting data
 on: 05/20/17 at 23:51:22 
Started by vakkenapally | Post by IRobotSoft Administrator
You will need to change the target format.  
 
Under the "save" task, find the last action, "Schedule task", right-click it, and select "Save variable...".  Then, on the Save Variables page, change the target type from "XML file" to "CSV file".  Rerun the robot, and the data will be saved in CSV format.  You can change the "Save to" file name so that it ends in .csv, or just rename the output file to .csv afterwards.  
 

2  General Category / Newcomers / Re: Need a sample robot file for extracting data
 on: 05/20/17 at 16:42:50 
Started by vakkenapally | Post by 20GT
Thanks, I found literature.txt, but how can I get it to export as CSV?

3  General Category / Newcomers / Re: Need a sample robot file for extracting data
 on: 05/18/17 at 21:12:54 
Started by vakkenapally | Post by IRobotSoft Administrator
Sorry, the PubMed page has changed recently.  We have made a new release with a fix.  
 
Please download the new package at:
http://irobotsoft.com/irobot-eval.zip
 
or the pubmed robot at:  
http://irobotsoft.com/robots/pubmed.irb
 
Data should be scraped to literature.txt.  Let us know if you still cannot find the scraped data.

4  General Category / Newcomers / Re: Need a sample robot file for extracting data
 on: 05/18/17 at 17:09:00 
Started by vakkenapally | Post by 20GT
I ran the "pubmed.irb" robot but have no idea what it was doing.
I saw it browsing but didn't see what it was scraping.  
What is it doing?
Where is it saving the data it's scraping?

5  General Category / Newcomers / Re: callParallel browser limitation...
 on: 10/11/16 at 21:49:17 
Started by BrentH | Post by BrentH
I tried testing using: Advanced -> Server Test Mode -> Test Server-mode Full  
...this did not work.
 
Is a 'user agent' string being passed with the website request? Something is telling yellowpages.com that the request is coming from an unsupported browser.  What is it?
 
Thanks
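One way to check, outside IRobotSoft, whether the site's response depends on the user-agent string is to send the same request with an explicit User-Agent header and compare the pages that come back. The snippet below is only a rough sketch using Python's standard library, not anything IRobotSoft itself does; the helper name build_request and the IE 11 user-agent string are illustrative assumptions.

```python
import urllib.request

def build_request(url, user_agent):
    """Return a Request carrying an explicit User-Agent header (nothing is sent yet)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# Illustrative IE 11 user-agent string (an assumption, not taken from IRobotSoft).
IE11_UA = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"

req = build_request("http://www.yellowpages.com/", IE11_UA)

# Inspect the header that would be sent; urllib stores header names capitalized.
print(req.get_header("User-agent"))

# To actually send the request and read the page:
#     html = urllib.request.urlopen(req).read()
```

If the page returned with this header differs from the "unsupported browser" page, the user-agent string is the likely cause.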

6  General Category / Newcomers / Re: CrawlWebsites() using https...
 on: 10/11/16 at 21:19:10 
Started by BrentH | Post by BrentH
I just mistyped it in the blog ;)  I am definitely using crawlWebsite() correctly.
 
I am testing with a mix of http and https urls...
 
http sites are crawled and data is returned without issues.
https sites show "0 of 0 bytes..." for each https site in the download popup.
 
The log shows each https site as a tuple...but no data is returned for them.  It seems that they are being skipped.
 
A bug?
 
Thanks

7  General Category / Newcomers / Re: CrawlWebsites() using https...
 on: 10/11/16 at 09:40:05 
Started by BrentH | Post by IRobotSoft Administrator
Are you sure it is not a typo?  The function is crawlWebsite(); note there is no "s".  

8  General Category / Newcomers / Re: callParallel browser limitation...
 on: 10/11/16 at 09:28:50 
Started by BrentH | Post by IRobotSoft Administrator
Socket Browser does not support any browser version.   You can use the menu Advanced -> Server Test Mode -> Test Server-mode Full to see if the socket browser works for the website.   Otherwise, you have to use the embedded IE browser.   You can parallelize by dividing the job and running multiple irobot instances on your computer.  
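As a rough illustration of dividing a job across several instances, the input list can be cut into near-equal chunks, one per instance. This is only a sketch in Python, done outside IRobotSoft; the function name split_job and the example.com URLs are hypothetical placeholders, and how each chunk is fed to a robot is left open.

```python
def split_job(items, instances):
    """Split items into `instances` contiguous chunks whose sizes differ by at most one."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    base, extra = divmod(len(items), instances)
    chunks = []
    start = 0
    for i in range(instances):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

# Hypothetical input list; replace with the real URLs or search terms.
urls = [f"http://example.com/page{i}" for i in range(10)]

for n, chunk in enumerate(split_job(urls, 3), start=1):
    print(f"instance {n}: {len(chunk)} URLs")
```

Each chunk can then be saved to its own input file and handed to a separate running instance.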

9  General Category / Newcomers / Re: callParallel browser limitation...
 on: 10/10/16 at 21:58:14 
Started by BrentH | Post by BrentH
yellowpages.com only supports IE 9 and above (I tested with IE emulation mode).
 
I now believe this is a browser version issue and has nothing to do with JavaScript.
 
What version is the embedded socket browser? Is there a method to switch versions?
 
I also tried to use Advanced --> Main Browser --> IE Browser (run as admin).
With IE 11 installed, I still get the "browser not supported" page. This should work, right?
 
Thanks
 

10  General Category / Newcomers / Re: CrawlWebsites() using https...
 on: 10/10/16 at 21:21:49 
Started by BrentH | Post by BrentH
Ok that worked!
 
callParallel now works with https sites after installing the 'Windows x86 MSI installer'. Thanks for the fix!
 
However, I retested the crawlWebsites() function for crawling https sites and found that it does not work.  
crawlWebsites() only seems to work with http sites.
 
Is there a fix for that?
 
Thanks again
 