Is it possible to automatically download data from a website?

Hi,

I know that in SAS it is possible to download data from a website automatically. For instance, suppose you wanted to use GAUSS to download the unemployment rate from, say, the Federal Reserve Bank of St. Louis. The URL for the text data file is: http://research.stlouisfed.org/fred2/series/UNRATE

Does anyone know if this can be accomplished inside of a program?

Thanks for your assistance.

 

2 Answers






You can have GAUSS execute system commands and use the cURL executable. Do you want to just download the data to an XLS or CSV file and then use that downloaded file for your analysis, or do you want your program to have the flexibility to pull down a different date range each time?

After you respond, I will post an example.
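(For a quick sense of what that looks like before the full example below, here is a minimal sketch, assuming curl.exe is on your system PATH; it simply fetches the URL from the question and saves the response to a local file.)

//Minimal sketch: shell out to cURL from GAUSS and save the response locally
//Assumes curl.exe is on the system PATH
cmd = "curl \"http://research.stlouisfed.org/fred2/series/UNRATE\" -o unrate.txt";
dos ^cmd;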

aptech






Here are the steps to automatically download data from the St. Louis Fed site:

  1. Download the cURL executable program from here
  2. Extract the cURL zip file and place the file curl.exe into your GAUSS home directory (a quick check that GAUSS can call it is sketched just after these steps).
  3. The version of cURL from the link above requires the Visual Studio 2010 x64 redistributable. If your machine does not already have it installed, you can download it from Microsoft here for free.
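As a quick sanity check (this snippet is not part of the original steps, and it assumes curl.exe was placed in C:\gauss13), you can ask cURL for its version number from inside GAUSS; if the redistributable from step 3 is missing, the call will fail:

//Verify that GAUSS can call the cURL executable
chk = "C:\\gauss13\\curl.exe --version";
dos ^chk;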

Once these steps are completed, you can use this procedure to download the data. NOTE: if you place curl.exe in a location other than C:\gauss13, the variable curlpath needs to be changed to that location.

proc (0) = downloadData(url, outfile);
	local curlpath, cmds, curlexe, filetype;
	
	//NOTE: obs_start and obs_end must be defined as global strings
	//(e.g. "1948-01-01") before this procedure is called, as in the examples below
	
	//Find the desired file type from the output file name
	if (strindx(outfile, "xls", 1));
		filetype = "xls";
	elseif (strindx(outfile, "csv", 1));
		filetype = "csv";
	else;
		errorlog "downloadData: Warning: No valid file extension found";
		retp;
	endif;
	
	//Specify location of 'curl' executable
	curlpath = "C:\\gauss13\\";
	curlexe = curlpath$+"curl.exe";

	//Create full system call: a POST of the form fields the FRED download page
	//expects, with the response redirected into 'outfile'
	cmds = curlexe$+" \""$+url$+"\""$+" --data \"_qf__mainform=&native_frequency=Monthly&download_data=Download+Data&units=lin&frequency=Monthly&aggregation=Average&obs_start_date="$+obs_start$+"&obs_end_date="$+obs_end$+"&file_format="$+filetype$+"\" > "$+outfile;
	
	//Execute system call
	dos ^cmds;
endp;

Example 1

url = "http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata";
obs_start = "1948-01-01";
obs_end = "2013-03-01";
outfile = "unemployment.xls";

downloadData(url, outfile);

Example 2

url = "http://research.stlouisfed.org/fred2/series/LNS14200000/downloaddata?cid=32447";
obs_start = "1968-01-01";
obs_end = "2013-03-01";
outfile = "unemployment_part_time.xls";

downloadData(url, outfile);
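As a small usage sketch (not from the original answer, and reusing only the series URLs and file names already shown above), the same procedure can be called in a loop to download several series with one date range:

//Series download pages and output file names (from the two examples above)
urls = "http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata" $|
       "http://research.stlouisfed.org/fred2/series/LNS14200000/downloaddata?cid=32447";
outfiles = "unemployment.xls" $| "unemployment_part_time.xls";

obs_start = "1948-01-01";
obs_end = "2013-03-01";

//Download each series in turn
for i(1, rows(urls), 1);
	downloadData(urls[i], outfiles[i]);
endfor;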

Note that the first 18 or so rows of these downloaded XLS files contain descriptive text rather than data, so they should be skipped when reading the file.
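If you want to pull the numbers into GAUSS, here is a hedged sketch (not from the original answer) that skips those descriptive rows; the starting cell "B19" assumes the observations begin on row 19 with values in column B, so check your file and adjust as needed:

//Read the data portion of the downloaded file, skipping the descriptive rows
unrate = xlsReadM("unemployment.xls", "B19", 1, "");
print "Observations read:";
print rows(unrate);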

aptech

