Getting Historical Data from Yahoo Finance in R - DataScienceCentral.com (2024)

Norton Trevisan Roman

Yahoo Finance is a website that offers a good deal of information about the financial market, including information about stock trading.

To get the numbers for a specific stock, all you have to do is search for its Yahoo code (e.g., VALE). This is not necessarily the company's trading code on the exchange, but its representation in Yahoo Finance.

The platform will then show you some free information, both technical and fundamental, about the desired company and stock (naturally, the best pieces are kept for Yahoo Finance’s premium accounts).

Still, what interests us here is the historical price series, which is free. Before going any further, however, it's worth mentioning that, even though Yahoo Finance did at some point have an API of its own, it was discontinued, apparently due to legal problems. There are some other APIs out there, such as the one provided by RapidAPI, but these are unofficial and limited in their free accounts. I know of no one providing a free API specifically for R.

The idea here is to download the time series returned by Yahoo Finance when, on its webpage, we choose a time period and frequency and then hit the Apply button.

The returned time series can be downloaded (as a .csv file) by clicking the Download button. What we are going to do is simulate this click by sending Yahoo's servers the same request this button would.

But before that, there is one odd behaviour you should notice on Yahoo Finance's website, and it is related to the time period you choose. Suppose you wish to change this period. You click on it and choose a start and an end date. In this case, you want the period to span from 15/06/2020 to 19/06/2020.

What happens then is very odd. After clicking on Done, you'll see a page showing a different period, in this case 14/06/2020 to 18/06/2020 (each date one day before the one you set).

And to make matters even more confusing, after clicking on Apply, Yahoo will show you data from 15/06/2020 to 18/06/2020.

Our mission is then to map the desired time period (15/06/2020 to 19/06/2020) into the URL generated by the Download button which, in this example, is

query1.finance.yahoo.com/v7/finance/download/VALE?period1=1592179200&period2=1592524800&interval=1d&events=history

Assembling the Query

As you may have already noticed, Yahoo Finance’s query follows the pattern

query1.finance.yahoo.com/v7/finance/download/STOCK_CODE?period1=START_DATE&period2=END_DATE&interval=INTERVAL&events=history

where STOCK_CODE is Yahoo’s stock code (in our example, VALE), START_DATE and END_DATE are the start and end dates, respectively, of the retrieved period, and INTERVAL is the time interval of each record (in this case, we have worked with daily records – 1d).

STOCK_CODE and INTERVAL are not so hard to get. The problem is how to get from 15/06/2020 to 1592179200 and from 19/06/2020 to 1592524800. Let’s forget, for the moment, we wanted data to span from 15/06 to 19/06, and focus on the data returned by the website.

As it turns out, Yahoo Finance represents time as the number of seconds since the beginning of 1970, UTC. In R, that corresponds to the POSIXct class. So let’s try it out:

t1 <- ISOdate(2020,6,15)
as.integer(t1)
## [1] 1592222400

Not quite. What happened is that the ISOdate function is defined as

ISOdate(year, month, day, hour = 12, min = 0, sec = 0, tz = "GMT")

That is, by default it gives us the number of seconds at noon. So let's set it to the beginning of 15/06:

t1 <- ISOdate(2020,6,15,hour=0)
as.integer(t1)
## [1] 1592179200

And voilà. What about 19/06/2020? Well, it turns out that both dates are encoded the same way, as the number of seconds at 00:00 of that day:


t1 <- ISOdate(2020,6,15,hour=0)
t2 <- ISOdate(2020,6,19,hour=0)
as.integer(t1)
## [1] 1592179200
as.integer(t2)
## [1] 1592524800

Now we see both 1592179200 and 1592524800 from the URL. This also explains why we asked for data up to 19/06 and got only up to 18/06: the period ends at 00:00 on 19/06/2020, before that day's trading session had even opened. This is something you must correct for if you want to stay faithful to your original query. Fortunately, the correction is easy: all you have to do is end the period at midnight at the end of 19/06, that is, at 00:00 of the next day. So both

as.integer(ISOdate(2020,6,19,hour=24))
## [1] 1592611200
as.integer(ISOdate(2020,6,20,hour=0))
## [1] 1592611200

will return the desired number. Or you might just forget about it, as I did in this example.
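
The conversions above can be wrapped into small helpers. What follows is a minimal sketch, not part of the original recipe: the names to_epoch, end_epoch and from_epoch are hypothetical, and dates are assumed to be given as "YYYY-MM-DD" strings (or Date objects) and interpreted as UTC.

```r
# Hypothetical helpers (illustrative names, not from any package).

# Seconds since 1970-01-01 00:00:00 UTC at the start of the given day.
to_epoch <- function(date) {
  # A Date is stored as the number of days since 1970-01-01.
  as.numeric(as.Date(date)) * 86400
}

# Start of the following day, so that the given day falls inside the period.
end_epoch <- function(date) {
  to_epoch(as.Date(date) + 1)
}

# Decode an epoch value back into a UTC timestamp, e.g. to check a URL.
from_epoch <- function(seconds) {
  as.POSIXct(seconds, origin = "1970-01-01", tz = "UTC")
}

to_epoch("2020-06-15")    # 1592179200, the period1 value in the URL
end_epoch("2020-06-19")   # 1592611200, i.e. 00:00 on 20/06/2020
format(from_epoch(1592524800), "%d/%m/%Y %H:%M", tz = "UTC")  # "19/06/2020 00:00"
```

Working from Date objects rather than ISOdate avoids the noon default altogether, at the cost of assuming every date is meant as a UTC day.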

So, the URL in our example may finally be assembled with

stock <- "VALE"
url <- paste("https://query1.finance.yahoo.com/v7/finance/download/",
             stock,
             "?period1=",
             as.integer(t1),
             "&period2=",
             as.integer(t2),
             "&interval=1d&events=history",
             sep="")
url
## [1] "https://query1.finance.yahoo.com/v7/finance/download/VALE?period1=1592179200&period2=1592524800&interval=1d&events=history"

(here I kept the 1d interval, but obviously you can change it at will).
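
To reuse this assembly for other stocks and periods, it can be wrapped in a function. This is only a sketch: yahoo_history_url is a hypothetical name, the inclusive_end argument is my own addition to apply the one-day correction discussed earlier, and the URL pattern is simply the one observed above, which Yahoo may change at any time.

```r
# Hypothetical helper: builds the download URL following the pattern above.
yahoo_history_url <- function(stock, from, to, interval = "1d",
                              inclusive_end = TRUE) {
  # Epoch seconds (UTC) at the start of a day given as "YYYY-MM-DD" or Date.
  day_start <- function(d) as.numeric(as.Date(d)) * 86400
  period1 <- day_start(from)
  # Optionally move the end to the start of the next day so `to` is included.
  period2 <- day_start(as.Date(to) + if (inclusive_end) 1 else 0)
  stopifnot(period1 <= period2)
  paste0("https://query1.finance.yahoo.com/v7/finance/download/",
         utils::URLencode(stock, reserved = TRUE),  # escapes characters such as "^"
         "?period1=", sprintf("%.0f", period1),
         "&period2=", sprintf("%.0f", period2),
         "&interval=", interval,
         "&events=history")
}

# Reproduces the URL of the example (end date left uncorrected):
yahoo_history_url("VALE", "2020-06-15", "2020-06-19", inclusive_end = FALSE)
```

With the default inclusive_end = TRUE, period2 becomes 1592611200, so the data for 19/06 would be requested as well.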

Downloading the .csv File

Now that we have the URL, all we have to do is download the file. One option is to read it straight from the URL:


dataset <- read.csv(url)

and you’ll have your data frame with Yahoo Finance’s data.


str(dataset)

## 'data.frame': 4 obs. of 7 variables:
## $ Date : Factor w/ 4 levels "2020-06-15","2020-06-16",..: 1 2 3 4
## $ Open : num 10.1 10.8 10.6 10.5
## $ High : num 10.6 10.9 10.8 10.6
## $ Low : num 10.1 10.4 10.5 10.4
## $ Close : num 10.6 10.7 10.7 10.6
## $ Adj.Close: num 10.6 10.7 10.7 10.6
## $ Volume : int 33837300 43970200 34886400 34436100

But that comes at a price:

  • Every time we run our code, we download the very same dataset; and
  • We must save dataset ourselves if we wish to use it offline in the future

Alternatively, we could download a local copy of the data:


fileName <- "my_dataset.csv"
download.file(url, fileName)

and read it whenever we like:


dataset2 <- read.csv(fileName)

str(dataset2)
## 'data.frame': 4 obs. of 7 variables:
## $ Date : Factor w/ 4 levels "2020-06-15","2020-06-16",..: 1 2 3 4
## $ Open : num 10.1 10.8 10.6 10.5
## $ High : num 10.6 10.9 10.8 10.6
## $ Low : num 10.1 10.4 10.5 10.4
## $ Close : num 10.6 10.7 10.7 10.6
## $ Adj.Close: num 10.6 10.7 10.7 10.6
## $ Volume : int 33837300 43970200 34886400 34436100
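
The two steps can be combined so that the file is downloaded only when no local copy exists yet. The sketch below is one possible way to do it: read_cached_csv is a hypothetical name, and the fetch argument is an assumption of mine that lets the download step be replaced (it defaults to download.file). It also assumes the server still answers the request as described above.

```r
# Hypothetical helper: download only when there is no local copy yet.
# `fetch` defaults to download.file; it is a parameter so that the
# download step can be swapped out (for instance, when testing offline).
read_cached_csv <- function(url, file, fetch = utils::download.file) {
  if (!file.exists(file)) {
    fetch(url, file)
  }
  # Keep text columns (such as Date) as character instead of factor.
  read.csv(file, stringsAsFactors = FALSE)
}

# Usage, assuming `url` was assembled as shown earlier:
# dataset <- read_cached_csv(url, "my_dataset.csv")
```

Deleting the local file forces a fresh download on the next call.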

There you are. Your dataset is ready for action.
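
One further step that often helps: in the str() output above, the Date column is read as a factor (or as plain text in more recent versions of R), not as a date. The sketch below converts it to the Date class. To stay self-contained it reads an inline stand-in for the downloaded file: the prices are the rounded values printed by str() above, and the last two dates (17/06 and 18/06) and the "Adj Close" header are my assumptions.

```r
# Inline stand-in for the downloaded file; values are the rounded ones
# printed by str() above, and the last two dates are assumed.
csv_text <- "Date,Open,High,Low,Close,Adj Close,Volume
2020-06-15,10.1,10.6,10.1,10.6,10.6,33837300
2020-06-16,10.8,10.9,10.4,10.7,10.7,43970200
2020-06-17,10.6,10.8,10.5,10.7,10.7,34886400
2020-06-18,10.5,10.6,10.4,10.6,10.6,34436100"

# read.csv turns the "Adj Close" header into the column name Adj.Close.
dataset <- read.csv(text = csv_text, stringsAsFactors = FALSE)
dataset$Date <- as.Date(dataset$Date)   # text -> Date class

# Dates can now be compared, which allows filtering and ordering by date.
subset(dataset, Date >= as.Date("2020-06-17"))
range(dataset$Date)
```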

And that’s all for the moment. Hope you find this small contribution useful.
