Wouldn’t it be awesome if you could somehow predict tomorrow’s Bitcoin (BTC) price? As you know, the cryptocurrency market has experienced tremendous volatility over the last year. The value of Bitcoin peaked on December 16, 2017, climbing to nearly $20,000, and then declined steeply at the beginning of 2018. Not long ago, though, a year ago to be precise, its value was almost half of what it is today. So if we look at the yearly BTC price chart, we can easily see that the price is still high. Even more striking, two years ago BTC was worth only one-tenth of its current value. You can explore the historical BTC prices in the plot below:
There are several conspiracy theories about the precise reasons behind this volatility, and these theories are also used to justify predictions of crypto prices, particularly BTC’s. Although such subjective arguments can be valuable for anticipating the future of cryptocurrencies, our approach takes a different perspective: that of algorithmic trading. We simply plan to use numerical historical data to train a recurrent neural network (RNN) to predict BTC prices.
Obtaining the Historical Bitcoin Prices
There are quite a few resources we can use to obtain historical Bitcoin price data. Some let users manually download CSV files, while others provide an API that you can hook up to your code. Because we want a model trained on time series data to make up-to-date predictions, I prefer to use an API so that we always obtain the latest figures whenever we run our program. After a quick search, I decided to use CoinRanking.com’s API, which provides up-to-date coin prices that we can use on any platform.
Recurrent Neural Networks
Since we are using a time series dataset, a feedforward-only neural network is a poor fit: it keeps no memory of earlier inputs, while tomorrow’s BTC price is most strongly correlated with today’s price, not with the price a month ago.
A recurrent neural network (RNN) is a class of artificial neural network where connections between nodes form a directed graph along a sequence. [2]
An RNN exhibits temporal dynamic behavior over a time sequence and can use its internal state to process sequences. In practice, this can be achieved with LSTM and GRU layers.
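To make the idea of an "internal state" concrete, here is a minimal sketch of a single-unit vanilla RNN in plain Python. It is not the LSTM used later in this tutorial, and the function name and weights are arbitrary values chosen for illustration, not trained parameters.

```python
import math

def rnn_forward(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Run a one-unit vanilla RNN over a sequence and return every hidden state.

    The weights are illustrative assumptions, not trained values.
    """
    hidden = 0.0  # internal state, empty before the first input
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states

# Both sequences end with the same input (0.2) but have different histories.
rising = rnn_forward([0.0, 0.1, 0.2])
falling = rnn_forward([0.4, 0.3, 0.2])
print(rising[-1], falling[-1])  # the final states differ
```

Although both sequences end on the same value, the final states differ because each state carries information about what came before. A feedforward network that only saw the last value could not tell the two apart.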
Here you can see the difference between a regular feedforward-only neural network and a recurrent neural network (RNN):
Our Roadmap
To create a program that trains on historical BTC prices and predicts tomorrow’s BTC price, we need to complete the following tasks:
1 — Obtaining, Cleaning, and Normalizing the Historical BTC Prices
2 — Building an RNN with LSTM
3 — Training the RNN and Saving The Trained Model
4 — Predicting Tomorrow’s BTC Price and “Deserialize” It
BONUS: Deserializing the X_Test Predictions and Creating a Plot.ly Chart
Obtaining, Cleaning, and Normalizing the Historical BTC Prices
Obtaining the BTC Data
As I mentioned above, we will use CoinRanking.com’s API for the BTC dataset and convert it into a pandas dataframe with the following code:
By default, this function retrieves five years of BTC/USD prices. However, you can always change this by passing in different parameter values.
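As a rough sketch of the conversion step, the snippet below turns an already-decoded JSON payload into chronological (date, price) rows. The payload shape, the field names, and the helper name `history_to_rows` are assumptions made for illustration; check the API documentation for the real schema. The sample values are made up.

```python
import json
from datetime import datetime, timezone

def history_to_rows(payload):
    """Convert a decoded price-history payload into (date, price) rows.

    ASSUMPTION: the payload looks like
    {"data": {"history": [{"price": "100.0", "timestamp": 1514764800000}, ...]}}
    with timestamps in milliseconds. This is a hypothetical shape.
    """
    rows = []
    for point in payload["data"]["history"]:
        if point.get("price") is None:
            continue  # skip gaps instead of guessing a value
        seconds = point["timestamp"] / 1000
        day = datetime.fromtimestamp(seconds, tz=timezone.utc).date()
        rows.append((day.isoformat(), float(point["price"])))
    rows.sort(key=lambda row: row[0])  # keep chronological order
    return rows

# Made-up sample payload, deliberately out of order and with one gap.
sample = json.loads("""
{"data": {"history": [
    {"price": "110.5", "timestamp": 1514851200000},
    {"price": null, "timestamp": 1514937600000},
    {"price": "100.0", "timestamp": 1514764800000}
]}}
""")
rows = history_to_rows(sample)
print(rows)  # [('2018-01-01', 100.0), ('2018-01-02', 110.5)]
```

The resulting rows can then be handed to `pandas.DataFrame(rows, columns=["date", "price"])` to get the dataframe used in the rest of the tutorial.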
Cleaning the Data with Custom Functions
After obtaining the data and converting it to a pandas dataframe, we can define custom functions to clean our data, normalize it (a must for accurate results with a neural network), and apply a custom train-test split. We wrote our own train-test split function instead of using scikit-learn’s because we need to preserve the time series order to train our RNN properly. We may achieve this with the following code, and you may find further explanations of each function in the code snippet below:
After defining these functions, we may call them with the following code:
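The normalization and order-preserving split described above can be sketched in plain Python as follows. The function names, the [-1, 1] target range, the window length, and the sample prices are assumptions for illustration, not the original implementation.

```python
def normalize(prices):
    """Min-max scale prices into [-1, 1]; also return the (min, max) used."""
    low, high = min(prices), max(prices)
    if high == low:
        raise ValueError("cannot scale a constant series")
    scaled = [2 * (p - low) / (high - low) - 1 for p in prices]
    return scaled, (low, high)

def make_windows(series, window):
    """Build (inputs, targets): `window` past values predict the next value."""
    count = len(series) - window
    inputs = [series[i:i + window] for i in range(count)]
    targets = [series[i + window] for i in range(count)]
    return inputs, targets

def chronological_split(inputs, targets, test_ratio=0.2):
    """Split without shuffling so the test set is strictly later in time."""
    cut = int(len(inputs) * (1 - test_ratio))
    return inputs[:cut], targets[:cut], inputs[cut:], targets[cut:]

# Made-up prices, oldest first.
prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 112.0, 115.0, 111.0, 120.0]
scaled, bounds = normalize(prices)
X, y = make_windows(scaled, window=3)
X_train, y_train, X_test, y_test = chronological_split(X, y, test_ratio=0.25)
print(len(X_train), len(X_test))  # 5 2
```

Keeping `bounds` around matters: the same minimum and maximum are needed later to turn normalized predictions back into prices.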
Building an RNN with LSTM
After preparing our data, it is time to build the model that we will later train on the cleaned and normalized data. We will start by importing our Keras components and setting some parameters with the following code:
Then, we will create our Sequential model with two LSTM and two Dense layers with the following code:
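A minimal sketch of such a model, assuming the Keras API bundled with TensorFlow, could look like this. The window length, layer sizes, activation, optimizer, and loss are illustrative assumptions, not necessarily the values used for the results in this post.

```python
from tensorflow import keras

def build_model(window=30, units=32):
    """Two stacked LSTM layers followed by two Dense layers.

    `window` and `units` are assumed values chosen for illustration.
    """
    model = keras.Sequential([
        keras.Input(shape=(window, 1)),                   # `window` days, 1 feature (price)
        keras.layers.LSTM(units, return_sequences=True),  # hand the full sequence to the next LSTM
        keras.layers.LSTM(units),                         # keep only the final state
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1),                            # next day's normalized price
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_model()
model.summary()
```

The first LSTM layer needs `return_sequences=True` because the second LSTM layer expects a sequence as input rather than a single vector.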
Training the RNN and Saving The Trained Model
Now it is time to train our model with the cleaned data. You can also measure the time spent on training. Use the following code:
Don’t forget to save it:
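Training, timing, and saving can be sketched together as below, assuming a recent TensorFlow release that supports the `.keras` file format. The helper name `train_and_save`, the tiny stand-in model, the random data, and the file name are all assumptions so that the example runs on its own; in the tutorial you would pass in the real model and training arrays.

```python
import os
import tempfile
import time

import numpy as np
from tensorflow import keras

def train_and_save(model, X_train, y_train, path, epochs=2, batch_size=16):
    """Fit the model, measure wall-clock training time, and save it to `path`."""
    start = time.perf_counter()
    model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size,
              shuffle=False, verbose=0)  # no shuffling: keep the time order
    elapsed = time.perf_counter() - start
    model.save(path)
    return elapsed

# Tiny stand-in model and random data (assumptions for a runnable example).
model = keras.Sequential([
    keras.Input(shape=(5, 1)),
    keras.layers.LSTM(8),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
X_train = np.random.uniform(-1, 1, size=(32, 5, 1)).astype("float32")
y_train = np.random.uniform(-1, 1, size=(32, 1)).astype("float32")

path = os.path.join(tempfile.mkdtemp(), "btc_rnn.keras")
elapsed = train_and_save(model, X_train, y_train, path)
restored = keras.models.load_model(path)  # reload to confirm the file is usable
print(f"trained in {elapsed:.2f}s, saved to {path}")
```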
I am keen to save the model and load it later because it is quite satisfying to know that you can actually save a trained model and reload it for use next time. This is basically the first step toward machine learning applications integrated into web or mobile apps.
Predicting Tomorrow’s BTC Price and “Deserialize” It
After we train the model, we need to obtain the current data for predictions. Since we normalize our data, the predictions will be normalized as well, so we need to de-normalize them back to their original values. First, we will obtain the data in a similar, though slightly different, manner with the following code:
We will only have the normalized data for prediction; there is no train-test split. We will also reshape the data manually so that we can feed it to our saved model.
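The manual reshape can be sketched as follows, assuming the model expects input of shape (samples, timesteps, features) with a single price feature. The helper name and window length are hypothetical, and the sample values are made up.

```python
def last_window_as_batch(scaled_prices, window):
    """Shape the most recent `window` values as (1, window, 1) nested lists."""
    if len(scaled_prices) < window:
        raise ValueError("not enough history for one window")
    recent = scaled_prices[-window:]
    return [[[value] for value in recent]]  # 1 sample, `window` steps, 1 feature

batch = last_window_as_batch([-1.0, -0.4, 0.1, 0.6, 1.0], window=3)
print(batch)  # [[[0.1], [0.6], [1.0]]]
```

Passing `batch` through `numpy.asarray` yields an array of shape (1, 3, 1), which is the layout an LSTM input layer expects.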
After cleaning and preparing our data, we will load the trained RNN model for prediction and predict tomorrow’s price.
However, our results will be between -1 and 1, which will not make a lot of sense. Therefore, we need to de-normalize them back to their original values. We can achieve this with a custom function:
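A minimal sketch of such a function, assuming the prices were min-max scaled into [-1, 1] and that the minimum and maximum used for scaling were kept, is shown here. The function name and the sample bounds are assumptions for illustration.

```python
def denormalize(scaled_value, low, high):
    """Invert min-max scaling from [-1, 1] back to the original price range."""
    return (scaled_value + 1) / 2 * (high - low) + low

# Made-up bounds: the lowest and highest prices seen during normalization.
low, high = 100.0, 120.0
print(denormalize(-1.0, low, high))  # 100.0
print(denormalize(0.0, low, high))   # 110.0
print(denormalize(1.0, low, high))   # 120.0
```

Note that a prediction slightly outside [-1, 1] maps to a price outside the historical range, which is expected when the market sets a new high or low.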
After defining the custom function, we will call it and extract tomorrow’s BTC price with the following code:
With the code above, you can get the model’s prediction for tomorrow’s BTC price.
Deserializing the X_Test Predictions and Creating a Plot.ly Chart
You may also be interested in the overall performance of the RNN model and prefer to see it as a chart. We can achieve this by using the X_test data from the training part of the tutorial.
We will start by loading our model (consider this an alternative to the single-prediction case) and making predictions on the X_test data, which gives us predictions for enough days to plot, with the following code:
Next, we will import Plotly and set the properties for a good plotting experience. We will achieve this with the following code:
After setting all the properties, we can finally plot our predictions and observation values with the following code:
When you run this code, you will get an up-to-date version of the following plot:
How Reliable Are These Results?
As you can see, it does not look bad at all. However, you need to know that even though the patterns match pretty closely, the predicted and observed values are still dangerously far apart if you inspect the results on a day-to-day basis. Therefore, the code must be developed further to get better results.
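One way to put a number on that day-to-day gap, rather than judging the chart by eye, is to compute an error metric over the test period. The sketch below uses mean absolute error and mean absolute percentage error; the function names and sample values are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and predicted prices, in USD."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """Average absolute gap as a percentage of the observed price."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)

# Made-up de-normalized prices for three consecutive days.
actual = [100.0, 110.0, 120.0]
predicted = [104.0, 107.0, 125.0]
mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
print(mae)  # 4.0
print(round(mape, 2))
```

Two curves can look nearly identical on a multi-month chart while still being several percent apart on any given day, which is a large gap for a trading decision.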
Further Improvements
I would love to hear alternative ideas on how we can further improve this model. In addition, a completely different approach would be to develop a model that parses the latest tweets about cryptocurrencies and predicts their prices by taking advantage of Natural Language Processing (NLP) techniques.
Source: Crypto New Media