Blockchain – An Introduction
August 30, 2018
Author: Akarsh Agarwal and Editor: Michael Gord
The word internet may sound simple, but it encompasses far more than we could ever have imagined. It has brought information, knowledge and connectivity to our entire planet. It has made it easier to answer almost any question or learn almost any subject.
The information available on the Internet is nothing but data hosted somewhere in the world, which we can access using either IP addresses or domain names (URLs). This data takes varied forms: public or private, document or video. Anything that is binary (1s and 0s) is considered part of this interconnected web of data. This data needs to be protected from malicious use and access to prevent catastrophic consequences.
Until the past decade, the solution to this problem was to host data in datacenters in geographically separate locations, so that the data stays accessible if a single datacenter goes down. The data is also protected to prevent others from breaking into the system and causing harm. Still, it was never totally safe, because hackers get better day by day and keep learning new techniques to break into systems.
An industry that would be particularly affected by this type of data intrusion is the banking sector. The world works with currency. If you have more, you can buy more, and it is very important these numbers are not fabricated.
What if your entire net worth was stolen in an instant because there was a security breach in a datacenter?
How do we prevent this?
Satoshi Nakamoto came up with a new idea: “Bitcoin”, the digital currency for the digital era. While bitcoin was revolutionary in itself, the greater innovation was not the currency but the technology and architecture it uses to store its transactions: the blockchain.
A blockchain is, in a layman’s language, a chain of blocks; in an accountant’s language, a distributed ledger; and in a technophile’s language, a peer-to-peer database. It has brought a new layer to the Internet. It takes away a defining feature of a traditional database, being hosted on a central server, and replaces it with a distributed environment. Every peer connected to the network hosts a copy of this shared database and communicates changes to it via P2P protocols.
This drastically decreases the probability of all the peers going down simultaneously, as they are situated across the world. It ensures that the network is constantly up and running.
This still might not sound different from what we already have with distributed datacenters. How is the blockchain any different? What else does it have to offer? How is it unique? We shall answer all these questions in the paragraphs to come, to help you understand the true capability of a blockchain.
As mentioned, a blockchain is literally a chain of blocks, where each block contains some sort of data, such as transactions, that is cryptographically signed by the users who create it. They sign the data using a pair of keys, one public and one private, which are cryptographically linked. The private key can be used to generate the public key, but the private key cannot feasibly be derived from the public key.
Nodes that contribute computing resources to secure the network also compete to create the next block and to be awarded the tokens released in that block. These nodes are referred to as miners. In the bitcoin blockchain, each block that is mined today releases 12.5 bitcoin, which has a market value of over $13,000 USD at the time of writing. The mining reward halves roughly every four years: when the first blocks were mined in 2009, 50 tokens were released in each block; in 2012 the reward dropped to 25, and in 2016 it halved to the current amount of 12.5 tokens per block.
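The halving schedule can be written as a few lines of arithmetic. The sketch below assumes two details that are not stated above but are part of the bitcoin protocol: the reward halves every 210,000 blocks (which works out to roughly four years), and amounts are counted in the smallest unit, the satoshi (one hundred-millionth of a bitcoin). The function name block_subsidy is our own choice for illustration.

```python
COIN = 100_000_000            # satoshis per bitcoin
INITIAL_SUBSIDY = 50 * COIN   # reward of the earliest blocks, in satoshis
HALVING_INTERVAL = 210_000    # blocks between halvings (roughly four years)


def block_subsidy(height: int) -> int:
    """Return the newly released coins for the block at `height`, in satoshis."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        # After this many halvings the integer reward has reached zero.
        return 0
    # Each halving divides the reward by two (integer right shift).
    return INITIAL_SUBSIDY >> halvings


for height in (0, 210_000, 420_000):
    print(height, block_subsidy(height) / COIN)
```

Running it prints rewards of 50.0, 25.0 and 12.5 bitcoin for the three eras mentioned above.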
After each block has been constructed and added to the chain, the miner of the block broadcasts it to the other peers in the network to announce the change in the height of the blockchain (the height refers to the number of blocks that have been mined). As the data is not hosted on a central server, every peer has its own copy of the chain and updates it when it receives a message from another node in the network. This ensures that even if one copy of the chain is corrupted, there are other copies to replace the corrupted one and keep the correct data available. It also makes sure that whenever other peers add to the chain, they build on top of the updated chain and not on the old copy they had. Each block is cryptographically linked to the previous blocks in the chain, so if a change is made to a block that has already been mined into the chain, the cryptographic link breaks and the nodes become aware of the alteration.
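To make the broadcast step concrete, here is a minimal sketch of a few peers that each keep their own copy of the chain. Everything in it is a simplifying assumption for illustration: the Peer class, the create_and_broadcast helper and the dictionary layout of a block are hypothetical, there is no real networking, and no mining work is performed.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents (a stand-in for a real block hash)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Peer:
    """A hypothetical node that keeps its own full copy of the chain."""

    def __init__(self, name: str):
        self.name = name
        self.chain = [{"prev_hash": None, "data": "genesis"}]

    @property
    def height(self) -> int:
        # Same definition as in the text: the number of blocks mined so far.
        return len(self.chain)

    def receive(self, block: dict) -> bool:
        """Append a broadcast block only if it links to our current tip."""
        if block["prev_hash"] != block_hash(self.chain[-1]):
            return False  # built on an outdated or different copy
        self.chain.append(block)
        return True


def create_and_broadcast(miner: Peer, peers: list, data: str) -> dict:
    """Build a block on the miner's tip, then announce it to every other peer."""
    block = {"prev_hash": block_hash(miner.chain[-1]), "data": data}
    miner.chain.append(block)
    for peer in peers:
        if peer is not miner:
            peer.receive(dict(block))  # each peer stores its own copy
    return block


peers = [Peer("a"), Peer("b"), Peer("c")]
create_and_broadcast(peers[0], peers, "alice pays bob 1")
create_and_broadcast(peers[1], peers, "bob pays carol 1")
print([p.height for p in peers])
```

After the two broadcasts every peer reports the same height, and a block that does not link to a peer's latest block is simply refused.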
After it is constructed, each block is hashed to get a hash value, which represents that block. This hash value is shared with every peer connected to the network. A node must have the hash value of the previous block to add a new block to the chain, because the hash value of the previous block is an integral part of the data of the next block. This ensures that each newly mined block is attached to the last mined block, so the chain can be traversed backwards from any block. As mentioned, this also ensures that if someone has tampered with the data in a previous block, its hash will differ from the value stored in the next block, and the network will become aware that there is an integrity issue with this copy of the chain. This means that in addition to acting as a link between blocks, the hash values of blocks act as a verification mechanism for the blockchain.
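The linking and verification described above can be sketched in a few lines. This is an illustration under our own assumptions, not bitcoin's actual block format: blocks are plain dictionaries hashed with SHA-256 over their JSON form, and the helper names hash_block, append_block and first_broken_link are hypothetical.

```python
import hashlib
import json


def hash_block(block: dict) -> str:
    """Return the SHA-256 hash of a block's JSON representation."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, data: str) -> None:
    """Add a block that stores the hash of the block before it."""
    prev_hash = hash_block(chain[-1]) if chain else None
    chain.append({"index": len(chain), "prev_hash": prev_hash, "data": data})


def first_broken_link(chain: list):
    """Return the index of the first block whose stored prev_hash
    no longer matches the block before it, or None if all links hold."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != hash_block(chain[i - 1]):
            return i
    return None


chain = []
for entry in ("genesis", "alice pays bob 5", "bob pays carol 2"):
    append_block(chain, entry)

print(first_broken_link(chain))            # None: every link matches

chain[1]["data"] = "alice pays bob 500"    # tamper with an old block
print(first_broken_link(chain))            # 2: block 2 no longer matches block 1
```

Editing block 1 changes its hash, so the value stored in block 2 stops matching and the tampering is detected at the very next link.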
For a hacker to completely modify the data, they would have to rewrite every block that builds on the altered one (in the extreme case, every block after the first block of the chain, the genesis block), and then make sure that every other node on the network accepts the same rewritten chain. This becomes increasingly difficult as the number of blocks, the number of nodes and the mining power of the chain increase. Today, there are approximately 1 million peers connected to the bitcoin blockchain network, so to make sure everyone has the edited copy, the hacker would need to change 1 million copies of the chain simultaneously. This is very computationally expensive, especially considering that the network infrastructure is spread across many different countries.
Signing transactions with keys also makes the data of a transaction resistant to editing, as the attacker would need access to the private key(s) of the peer they were attacking. Still, even if a hacker had access to the key(s), there is the computationally expensive problem mentioned above, which adds to the difficulty of tampering with the chain.
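A toy example helps show where the computational expense comes from. The sketch below assumes a simplified proof-of-work rule, in which a block is only valid if its SHA-256 hash starts with a fixed number of zero digits, so creating a block means trying many nonces. The difficulty of 3 hex digits, the text-based block header and the helper names mine and rewrite_from are illustrative assumptions; real networks use far harder targets and a binary header format.

```python
import hashlib

DIFFICULTY = 3           # leading zero hex digits required; a toy value
GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def mine(prev_hash: str, data: str) -> dict:
    """Try nonces until the block hash starts with DIFFICULTY zeros."""
    nonce = 0
    while True:
        header = f"{prev_hash}|{data}|{nonce}".encode("utf-8")
        digest = hashlib.sha256(header).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return {"prev_hash": prev_hash, "data": data,
                    "nonce": nonce, "hash": digest}
        nonce += 1


def rewrite_from(chain: list, index: int, new_data: str):
    """Replace block `index`, then redo the work for it and all later blocks.

    Returns the forged chain and the number of hashes that were computed.
    """
    forged = chain[:index]
    prev_hash = forged[-1]["hash"] if forged else GENESIS_PREV
    attempts = 0
    for i in range(index, len(chain)):
        data = new_data if i == index else chain[i]["data"]
        block = mine(prev_hash, data)
        attempts += block["nonce"] + 1
        forged.append(block)
        prev_hash = block["hash"]
    return forged, attempts


# Build an honest chain of four blocks.
chain = []
for entry in ("genesis", "alice pays bob 5", "bob pays carol 2", "carol pays dave 1"):
    prev = chain[-1]["hash"] if chain else GENESIS_PREV
    chain.append(mine(prev, entry))

forged, attempts = rewrite_from(chain, 1, "alice pays bob 500")
print(f"re-mined {len(forged) - 1} blocks using {attempts} hash attempts")
```

Changing one old block forces the attacker to redo the work for that block and every block after it, and on a live network the honest miners keep extending the real chain in the meantime.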
Current databases support edit functionality. After data is edited, the old data is lost or logged into a separate file. It can be very difficult to piece together the exact changes in a relational database with hundreds or thousands of tables. This problem is eliminated with a blockchain: once something is added to the chain, it cannot be edited without it becoming clear exactly where the change was made. It is an “append only database”, where peers keep adding to the chain without removing any previous data. This ensures that old values of the data remain and are accessible by searching past blocks on the chain. It is much easier to track the history this way than with the logs that other databases generate.
The above description shows how a blockchain supports immutability to prevent tampering with the data. Everyone works on a verified copy and adds to the correct blockchain’s datastore.
Every blockchain should have a sufficient number of peers to make sure that this holds, as more nodes on the chain mean greater security and resistance to hacking. You don’t secure a blockchain by adding complex protection algorithms, but rather by adding more peers, blocks and hash power to the network, which makes sure that it is not tampered with or altered for an individual’s own interests.
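The difference between editing in place and appending can be shown with a small sketch. The AppendOnlyLedger class below is a hypothetical illustration of the “append only” idea, not a blockchain: an update is recorded as a new entry, so earlier values are never overwritten and the full history of a key can be read back.

```python
class AppendOnlyLedger:
    """A toy store where updates are new entries, never edits."""

    def __init__(self):
        self._entries = []  # only ever appended to

    def set(self, key, value):
        """Record a new value for `key` without touching older entries."""
        self._entries.append((key, value))

    def get(self, key):
        """Return the most recent value recorded for `key`."""
        for k, v in reversed(self._entries):
            if k == key:
                return v
        raise KeyError(key)

    def history(self, key):
        """Return every value ever recorded for `key`, oldest first."""
        return [v for k, v in self._entries if k == key]


ledger = AppendOnlyLedger()
ledger.set("alice", 10)
ledger.set("bob", 3)
ledger.set("alice", 7)

print(ledger.get("alice"))      # 7
print(ledger.history("alice"))  # [10, 7]
```

The current balance comes from the latest entry, while the earlier value of 10 is still there for anyone who wants to audit how the balance changed.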