Blockchain

a philosophical view for software developers

Sep 30, 2021

Intro

For almost my entire career I’ve been passionate about databases and the ways we are manipulating data using electronic tools.

I was constantly reading about and implementing projects that rely on databases. I wrote my diploma thesis on database-related topics, and I even supervised a few bachelor's degree projects on database topics for students of the ETTI Faculty. More than that, two of them were built using blockchain technology.

So I decided to write an article that summarises my knowledge in this area. It may be helpful for anyone who wants to start learning blockchain concepts and also needs some good recommendations for reading materials. Lately, I have also started some formal studies of blockchain and even obtained a certification from the Blockchain Council.

So, let’s get started.

What is blockchain?

You will find on Google many definitions of ‘blockchain’, so many that I will only try to summarise some of them.

Blockchain is a system of recording information in a way that makes it difficult or impossible to change, hack, or cheat the system. A blockchain is essentially a digital ledger of transactions that is duplicated and distributed across the entire network of computer systems that take part in the blockchain.

(Note: this is a widely accepted definition, but “impossible” is not actually the exact word to use. Maybe we can agree the statement is true in production implementations, where there are many contexts and also a lot of physical limitations. However, it is not necessarily mathematically impossible to cheat; there’s a good description here by J.H. White in chapter 4.3.)

You can think of the blockchain as a specific type of database. It stores data in blocks that are chained together. New data is entered into a new block, and once the block is filled with data, it is chained onto the previous block, which keeps the data in chronological order.

Simplified, it is a list of timestamped data blocks that are linked together.

Whereas common databases can hold many kinds of data, the most common use for blockchain so far has been as a ledger for transactions.

Decentralised blockchains are immutable, which means that the data entered is irreversible. For Bitcoin, this means that transactions are permanently recorded and viewable by anyone.

The main difference between a blockchain and a database

Actually, it depends on what you mean by “a database”. Of course, we can refer to any organised group of files linked by some logic, and maybe an API, as a database, right? We can even consider an Excel document, with formulas and links between sheets, as a database as well. Technically speaking, both of these are correct. In the literature, there are many opinions regarding classification models, so I can recommend this article to you; it should help you get a better idea about this.

However, in this comparison, we will focus more on the differences between a common SQL/NoSQL DBMS and a blockchain.

A key difference is in the way data is structured. A blockchain collects and groups data in blocks that have a certain storage capacity. When a block is filled, it is linked to the previous block by including a hash of the previous block's content.
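The grouping and linking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular blockchain is implemented: the names `Block`, `Ledger` and `BLOCK_CAPACITY`, the JSON serialisation and the capacity of three records per block are all made up for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field

BLOCK_CAPACITY = 3  # hypothetical capacity: records per block


@dataclass
class Block:
    index: int
    previous_hash: str
    records: list = field(default_factory=list)

    def hash(self) -> str:
        # Hash a canonical serialisation of the whole block content.
        payload = json.dumps(
            {
                "index": self.index,
                "previous_hash": self.previous_hash,
                "records": self.records,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


class Ledger:
    def __init__(self) -> None:
        # The first block has no predecessor, so it points at a dummy hash.
        self.blocks = [Block(index=0, previous_hash="0" * 64)]

    def add_record(self, record: str) -> None:
        current = self.blocks[-1]
        if len(current.records) >= BLOCK_CAPACITY:
            # The current block is full: open a new block that stores
            # the hash of the block that was just filled.
            current = Block(index=current.index + 1, previous_hash=current.hash())
            self.blocks.append(current)
        current.records.append(record)


ledger = Ledger()
for i in range(7):
    ledger.add_record(f"record-{i}")
```

With seven records and a capacity of three, the ledger ends up with three blocks, and each block after the first carries the hash of its predecessor.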

Another key difference is that the blockchain is distributed: it does not store data in a single place and does not have a single point of failure. Of course, there are also other distributed DBMSs, such as Cassandra, which stores data on multiple nodes and provides a lot of support for data consistency.

But there is one more key difference; an important one that makes blockchain technology so famous and revolutionary — it makes data immutable. There is no DML that allows you to modify the data, no matter what permission levels you have. You cannot cheat, you cannot root.

And what is more fascinating is that the solution was there with us all the time and it is a mathematical one. We do not need to struggle to make systems more secure, to hide sensitive data, to use more and more powerful servers to handle security and block attacks. We actually only needed to do the reverse: to share them with everyone and to create links between blocks of data, so if anyone changed anything, all the others would know. It is security by transparency and hiding things where everyone can see them.
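Why a change becomes visible to everyone can be shown with a small sketch. It assumes a toy chain in which each block is a dictionary holding `data` and `previous_hash`; the helper names `block_hash`, `build_chain` and `first_broken_link` are invented for this illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    # Deterministic hash of a block's content (toy serialisation via JSON).
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def build_chain(items: list) -> list:
    chain = []
    previous = "0" * 64  # placeholder hash for the first block
    for data in items:
        block = {"data": data, "previous_hash": previous}
        chain.append(block)
        previous = block_hash(block)
    return chain


def first_broken_link(chain: list):
    # Return the index of the first block whose stored previous_hash no longer
    # matches the recomputed hash of its predecessor, or None if all links hold.
    for i in range(1, len(chain)):
        if chain[i]["previous_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
intact = first_broken_link(chain)

# Anyone holding a copy can detect a silent edit to an earlier block.
tampered = [dict(block) for block in chain]
tampered[0]["data"] = "alice pays bob 500"
broken_at = first_broken_link(tampered)
```

Here `intact` is `None`, while `broken_at` points at block 1: the edit to block 0 changes its hash, so the link stored in the next block no longer matches.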

Some history

The history of blockchain as a popular trend started with the publication of the whitepaper Bitcoin: A Peer-to-Peer Electronic Cash System, written by someone calling themselves Satoshi Nakamoto in 2008. The bitcoin project started the cryptocurrency era being the first implementation of a peer-to-peer network as a solution to the double-spending problem. Even though this project continues to function and circulate successfully to date, Nakamoto’s ideas behind the creation of bitcoin have exceeded the original use case. Now, these ideas are known collectively and independently as the blockchain.

However, the underlying mathematical concepts are older than the bitcoin implementation. The P2P concept was already known and other implementations existed before blockchain, but they did not become as popular, nor did they deliver the same technical benefits.

So, are there predecessors of blockchain? The answer is yes. There were some:

DigiCash — founded in 1989 by David Chaum, an American cryptographer. It was based on the mechanism named “Blind Signature”, an innovation that eventually led to the development of blockchain technology. The whitepaper describing it was written even earlier, in 1983.
HashCash — an email PoW system designed to limit spam email and DoS attacks. It was developed in 1997 by Adam Back, but the documentation was released 5 years later.
B-Money — an early distributed cash system proposed in 1998 by Wei Dai, who also developed Crypto++. The smallest subunit of Ether, the wei, is named after him.
E-Gold Ltd — digital gold currency founded in 1996 by Douglas Jackson and Barry Downey. They were using precious metals as the underlying currency as these are globally acceptable.
Bitgold — although not implemented, it was proposed by Nick Szabo in 1998. It is one of the best known decentralised virtual currency projects undertaken before blockchain. Bitgold and bitcoin are so similar that at one point, people thought Nick Szabo actually was the anonymous Satoshi Nakamoto.

Use cases

The application of Blockchain is not limited to cryptocurrencies. There are many other implementations of it, such as:

Banking and Finance — although this was the first sector to benefit from the advantages brought by the blockchain, it is somehow also the most endangered. Currently, the cryptocurrency market is constantly growing, and centralised banking systems are in a position to adopt the new technology, or to be taken over by it.
Currency — it’s not only about some cryptocurrencies, or some countries that are acting cool and adopting their “official” cryptocurrency. One day one currency may emerge as the only currency. And we do not even know if “currency” will be a proper name for that representation of value, or “that currency” may be the value itself.
Healthcare — healthcare and data privacy are becoming more and more closely linked. Systems worldwide rely on electronic storage, which is vulnerable to attacks or to data loss. A decentralised approach may even have implications for the integrity of priority lists for vaccinations, access to medication, and so on.
Data privacy — probably one of the hottest topics of the moment that is changing because of the blockchain, and this is only the beginning. There will be an increasing need to store data securely when healthcare services and DNA processing become widespread.
Records of Property Ownership and Transfer — public notary, lost or burned papers, or forged documents must become history soon, and blockchain has the capability to change all of these.
Smart Contracts — when it comes to contracts we can even think further, about laws, constitutions of sovereign states, public declarations or international agreements.
Supply Chains — humans are moving objects daily from one place to another, from country to country and keeping track of all of these may be challenging and vulnerable to many types of fraud.
Voting — although voting is the basis of democracy, there are still no bulletproof electronic systems that may prevent fraudulent voting. This is a very complex topic, and probably deserves a dedicated article on its own. Fortunately I can recommend two very well made videos discussing this: Why electronic voting is a bad idea and why electronic voting is still a bad idea.
and many others.

I found a more comprehensive view of how blockchain is used today in this interesting article.

Types of blockchains

Public blockchain — permissionless; anybody can access and read, write or participate without explicit permission or authorization. A public blockchain is decentralised, and no single entity controls the network. It has more complex rules and consensus algorithms for better security. It is computationally expensive to mine and add a new block, because the computational power is distributed globally. Examples are Bitcoin and Ethereum.

There are also other types of blockchain that play an important role in replacing some classical models of centralised networks. Private institutions are leveraging the main idea behind the blockchain and applying it to upgrade their legacy systems:

Permissioned (private) blockchain — only the members of the network can read/write/audit the blockchain. Consensus is based on a multi-party consensus algorithm. There are critics who do not consider a private blockchain to be “real” blockchain technology: with its centralised and exclusive nature, it defeats the purpose of the original blockchain idea. This model is faster and more cost-effective, because only a limited set of known nodes has to validate transactions. Nodes need permission to become part of the network. It is also generally considered less secure than a public blockchain, because fewer nodes take part in validation. A fine representation of a private blockchain is the Linux Foundation’s implementation, Hyperledger Fabric.
Federated/Consortium blockchain — this is also a private and permissioned blockchain where entities can become members of the network by prior approval or voting. This type provides all the benefits of a private blockchain but adds another major one: removing the consolidation of power to only one company. This is a good model for organisational collaboration.

Components of the blockchain ecosystem

Like any other software platform, blockchain has an ecosystem where components are dependent on each other.

Projects — The Blockchain ecosystem is currently running with some major projects and more are in the pipeline. Bitcoin, Neo, Stellar are some popular names.
Users — Blockchain users are ordinary people, who make use of the blockchain or cryptocurrency to achieve some results.
Exchanges — Every Blockchain project has a robust ecosystem working under it, that includes a decentralised exchange. Exchanges are developed by the Blockchain team or the community of other developers.
Miners — Blockchain requires a large network of independent nodes around the world to maintain it continuously. In private blockchains, a central organisation has authority over every node on the network. In public blockchains, anyone can set up their computer to act as a node, and the nodes that create new blocks are called miners.
Developers — Currently there are two types of developers in the blockchain ecosystem: blockchain developers and dApp developers.
Applications — Industries, developers and communities build blockchain applications to serve a specific purpose. They are called dApps and are the foundation of Web3.0.

Architectural view

These are the most common layers, available with almost every platform.

App layer — most commonly contains legacy and enterprise apps.
Integration platform — layers with different kinds of protocols, like REST, governance and API management.
Blockchain access layer — features to fetch and write data to the blockchain
Analytics — reporting, dashboard or analytics-based system

Some technical concepts for blockchain components.

Merkle tree

A Merkle tree is a fundamental concept of blockchain. It is a structure that allows efficient and secure verification of content in a large body of data. It helps to verify the consistency and content of the data. It summarises all the transactions in a block by creating a fingerprint of the entire set of transactions, which allows a user to verify whether a transaction is included in a block.

A Merkle tree is created by repeatedly hashing pairs of nodes until there is only one hash left, known as the Root Hash or Merkle Root.
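The pairwise hashing can be sketched as follows. This is a simplified illustration: it uses a single SHA-256 per node and, when a level has an odd number of nodes, pairs the last node with itself (the rule Bitcoin uses, although Bitcoin hashes twice). The function names are chosen for this example only.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list) -> bytes:
    if not transactions:
        raise ValueError("at least one transaction is required")
    # Leaves: hash every transaction on its own.
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last one with itself.
            level.append(level[-1])
        # Parents: hash each pair of adjacent nodes.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d"]
root = merkle_root(txs)
```

Changing any single transaction changes the root, which is what makes the root usable as a fingerprint of the whole set.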

Using a Merkle tree, we significantly reduce the data that a trusted authority has to store for verification purposes. It helps in verifying data consistency efficiently, since we do not check the full amount of data, which reduces the computational effort. It is also more practical than a single hash over all transactions, because the presence of one transaction can be proven with only a handful of hashes (one per tree level) instead of the complete list of transactions.

An example of a transaction is shown below. Instead of storing all hashes for all transactions, we are creating the hash of hashes and so on, till the root hash.

After we calculate a hash of the entire block, we take into account the difficulty, which is set by the blockchain algorithm.

The important role of the Merkle tree is that it greatly compresses the amount of data we have to store for verification. Otherwise, all nodes of a blockchain would have to keep a copy of every single transaction that has ever occurred and compare them line by line in order to validate the integrity of the data.

This actually separates the proof of the data from the data itself.
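To make the idea of separating the proof from the data more concrete, here is a hedged sketch of an inclusion proof. A verifier who knows only the root can check one transaction using one sibling hash per tree level. The helpers `build_levels`, `merkle_proof` and `verify` are hypothetical names, and the tree uses a single SHA-256 with the last node duplicated on odd levels, which is a simplification.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(transactions: list) -> list:
    # levels[0] holds the leaf hashes, levels[-1] holds only the root.
    if not transactions:
        raise ValueError("at least one transaction is required")
    levels = [[sha256(tx) for tx in transactions]]
    while len(levels[-1]) > 1:
        level = list(levels[-1])
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
            levels[-1] = level
        parents = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(parents)
    return levels


def merkle_proof(transactions: list, index: int) -> list:
    # Collect the sibling hash at every level, with a flag telling whether
    # the sibling sits to the left of the running hash.
    proof = []
    for level in build_levels(transactions)[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(transaction: bytes, proof: list, root: bytes) -> bool:
    # Rebuild the path from the leaf up to the root using only the proof.
    current = sha256(transaction)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            current = sha256(sibling + current)
        else:
            current = sha256(current + sibling)
    return current == root


txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root = build_levels(txs)[-1][0]
proof = merkle_proof(txs, 2)
```

For five transactions the proof holds three hashes, so the verifier needs the root, the transaction and three siblings, and none of the other transactions.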

Hashing

This is a function that converts an input item of any length into an output item of a fixed length. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.

Note that a hash function is not injective (different inputs can produce the same output) and not necessarily surjective. It returns the same output every time you apply it to the same input, but given only an output, you cannot feasibly recover the original input.

Source: https://en.wikipedia.org/wiki/Hash_function
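These properties are easy to observe with Python's standard `hashlib` module. SHA-256 is used here only as an example of a hash function, and the `digest` helper is a name chosen for this sketch.

```python
import hashlib


def digest(text: str) -> str:
    # Hex representation of the SHA-256 hash of a text.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short = digest("a")            # one-character input
long = digest("a" * 10_000)    # much longer input, same output length
again = digest("a")            # same input, same output
nearby = digest("b")           # slightly different input, unrelated output
```

Both `short` and `long` are 64 hexadecimal characters (256 bits), `again` equals `short`, and `nearby` differs from `short`.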

Nonce value

Miners try to find a valid block hash, based on the difficulty. The difficulty can be expressed as the number n of zeros that the hash of a block must start with; it increases with the number of zeros. Miners have all the transaction data, but there is one value they do not know: the so-called nonce. They have to find a nonce value that makes the block hash start with the specified number of zeros. Because the number space is enormous and the hash output is unpredictable, the only way to find such a nonce is trial and error.
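The search for a nonce can be sketched as a simple loop. This is a toy version under stated assumptions: the block is reduced to a string, the nonce is appended to it, and difficulty is counted in leading zeros of the hexadecimal hash. Real networks compare the hash against a numeric target instead. The function name `mine` is made up for the example.

```python
import hashlib


def mine(block_data: str, difficulty: int):
    # Try nonce values one by one until the hash starts with `difficulty` zeros.
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if candidate.startswith(prefix):
            return nonce, candidate
        nonce += 1


nonce, block_hash = mine("alice pays bob 5", difficulty=4)
```

In this hexadecimal representation, each extra leading zero multiplies the expected number of attempts by 16, which is why the difficulty grows so quickly with the number of zeros.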

Let’s finally see what a block is

The block is the building block of any blockchain.

It has 2 main components:

Block header
Block body

The Block header has 6 sub-components:

Block version — does not matter in most cases, but it can signal which protocol decisions it supports
Merkle tree root hash — summarises the transactions of the block in a secure manner. It enables nodes to quickly check the current block for integrity. Simplified, it serves the same role as a hash of archive files downloaded from an internet repository.
Previous block hash — this is the hash of the previous block, without which there would be no connection between blocks and no chronology.
nBits — this is the encoded target threshold under which a block hash is considered valid. The lower the target, the greater the difficulty of generating a block. Generating a block is more like a lottery: each attempt yields a 256-bit number, and the hash is considered valid if it is under the target. Please note that as the target decreases (so the number is smaller), the number of leading zeros increases, so the difficulty increases.
Nonce — the value that miners vary in order to satisfy the PoW (Proof of Work).
Timestamp — this not only adds some randomness to the block hash, but also makes it difficult for an adversary to manipulate the blockchain. A timestamp is considered valid if it is greater than the median timestamp of the previous 11 blocks.
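Two of the header rules above can be sketched in code: decoding nBits into a target and checking a timestamp against the median of the previous 11 blocks. The decoding follows the compact format as it is commonly described for Bitcoin (one exponent byte and a three-byte mantissa); the function names are invented here, and details such as the sign bit and overflow checks are ignored.

```python
from statistics import median


def nbits_to_target(nbits: int) -> int:
    # Compact encoding: the high byte is an exponent, the low three bytes
    # a mantissa (the sign bit is dropped in this simplified version).
    exponent = nbits >> 24
    mantissa = nbits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def hash_meets_target(block_hash_hex: str, nbits: int) -> bool:
    # A block hash is valid when, read as a number, it does not exceed the target.
    return int(block_hash_hex, 16) <= nbits_to_target(nbits)


def timestamp_is_valid(timestamp: int, previous_timestamps: list) -> bool:
    # Rule quoted in the text: greater than the median of the previous 11 blocks.
    return timestamp > median(previous_timestamps[-11:])


# 0x1d00ffff is the nBits value used by Bitcoin's first block.
easy_target = nbits_to_target(0x1D00FFFF)
harder_target = nbits_to_target(0x1C00FFFF)  # smaller exponent, lower target
```

Lowering the exponent by one divides the target by 256, so `harder_target` leaves far fewer valid hashes than `easy_target`.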

So based on these components of Block Header, this is how blocks are linked:

In this scenario, if a hacker wants to alter a specific block, they will have to recompute not only that block but also every block that follows it, while other miners are constantly checking hashes and extending the longest chain of blocks.

Wallets

A wallet is a program that allows users to buy, sell or monitor balances. This only makes sense when we are storing value in the digital ledger of the blockchain.

Wallets use a private/public key pair to authenticate users.

There are 3 types of wallets, available to store/reflect the transactions on blockchain:

software — mobile, desktop or web apps
hardware — stores private keys on a hardware device, such as a USB device
paper — the key pair is generated by software and then printed, which makes transactions possible (by sending funds to the address on the paper)

How a transaction is executed

Transactions are part of the block body and are the most important component of a blockchain. All other components contribute to the safety of transactions.

Transactions are data structures that encode the transfer of value between participants in the blockchain system. Once a transaction has been verified and recorded in a block, the record is permanent. The transaction is approved through a process known as consensus.

A transaction is committed in 4 stages:

Initiation of transaction proposal. At the initial stage, the transaction is created and signed by the owner.
The transaction is broadcast. At this stage, the transaction is broadcast to the network.
The transaction is verified. Once the transaction is broadcast to the network, other authorized nodes verify it. If the transaction is valid, it is added to a block; if not, nodes reject the transaction.
The transaction is committed. Finally, the Block is added to the Blockchain, and the transaction is committed.
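The four stages can be walked through with a small simulation. Everything in it is an assumption made for illustration: the `Transaction` and `Node` classes, the `signed_by` field that stands in for a real digital signature, and the balance check used as the only verification rule.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    sender: str
    receiver: str
    amount: int
    signed_by: str  # stand-in for a real digital signature


@dataclass
class Node:
    balances: dict
    mempool: list = field(default_factory=list)  # verified, waiting for a block
    chain: list = field(default_factory=list)    # committed blocks

    def receive(self, tx: Transaction) -> bool:
        # Stage 3: verify the broadcast transaction before accepting it.
        valid = (
            tx.signed_by == tx.sender
            and tx.amount > 0
            and self.balances.get(tx.sender, 0) >= tx.amount
        )
        if valid:
            self.mempool.append(tx)
        return valid

    def commit_block(self) -> None:
        # Stage 4: put the pending transactions in a block and apply them.
        for tx in self.mempool:
            self.balances[tx.sender] -= tx.amount
            self.balances[tx.receiver] = self.balances.get(tx.receiver, 0) + tx.amount
        self.chain.append(list(self.mempool))
        self.mempool.clear()


def broadcast(tx: Transaction, nodes: list) -> list:
    # Stage 2: send the transaction to every node and collect their verdicts.
    return [node.receive(tx) for node in nodes]


nodes = [Node(balances={"alice": 10, "bob": 0}) for _ in range(3)]
ok = Transaction("alice", "bob", 4, signed_by="alice")  # Stage 1: created and signed
forged = Transaction("alice", "bob", 4, signed_by="mallory")
accepted = broadcast(ok, nodes)
rejected = broadcast(forged, nodes)
for node in nodes:
    node.commit_block()
```

Every node accepts the properly signed transaction and rejects the forged one, so after the block is committed all three nodes hold the same balances.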

Mining

This is the process of recording new transactions on the blockchain ledger. When two users make a transaction, it is not recorded on the ledger until a miner puts it in a block. That happens only after the miner has found a nonce value that produces a valid hash (one under the target encoded in nBits).

For example, if the difficulty is 4, a hash that starts with four zeros is valid.

As described above, a hash function always generates the same output for the same input. So, the miners have to change something in the content of the block in order to find a valid hash, and that something is the nonce value. They keep trying values until one produces a hash that meets the target. Also, note that the timestamp is constantly changing during the miner’s effort to find (to mine) the nonce value.

By creating a block, a miner receives a monetary reward, so they will have an interest in investing effort in this process.

The mining process is not essential for a blockchain to exist. This is a way of validating transactions especially used for public blockchains. In private blockchains, there can be customised rules to validate transactions.

Longest chain rule

Adding a new block to the blockchain takes a lot of effort. As a rule, nodes will always select the longer chain over the shorter one.

Adopting the longest chain rule allows every node on the network to agree on what the blockchain looks like. So, they agree in this way on the same transaction history. This means that nodes that are acting independently can maintain a globally shared view of a file.

Let’s assume there are 100 blocks on the chain and a malicious node tampers with block 23. Altering block 23 in its local copy also breaks the link to the next block, so if this node tries to broadcast its blockchain to the network, the other nodes will reject it, as they already have longer valid chains locally.

However, the longest chain is not necessarily the one that required the most energy to create. This is the case when nodes have to compare two versions of the blockchain that span periods of different difficulty. In this case, nodes will select the one with the most cumulative chainwork (the total number of hashes that are expected to have been necessary to produce the chain).
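The difference between "longest" and "most work" can be illustrated with a short sketch. It assumes that a chain is represented simply as a list of per-block targets, and it estimates the work of a block as the expected number of hash attempts for its target. The targets and function names are hypothetical.

```python
def block_work(target: int) -> int:
    # Expected number of hash attempts needed to find a hash at or below `target`.
    return 2**256 // (target + 1)


def chain_work(targets: list) -> int:
    # Cumulative chainwork: the sum of the expected work of every block.
    return sum(block_work(t) for t in targets)


def select_chain(chain_a: list, chain_b: list) -> list:
    # Prefer the chain with more cumulative work, not the one with more blocks.
    return chain_a if chain_work(chain_a) >= chain_work(chain_b) else chain_b


easy, hard = 2**240, 2**236  # hypothetical targets; a lower target is harder
long_easy_chain = [easy] * 5
short_hard_chain = [hard] * 3
winner = select_chain(long_easy_chain, short_hard_chain)
```

Although `long_easy_chain` has more blocks, `short_hard_chain` wins, because each of its blocks is expected to take about 16 times more hashing.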

Types of mining

Solo mining — each miner sets up their own hardware and registers on the network for mining; rewards are not shared.
Pool mining — when a single miner does not have the resources to mine alone, several miners combine their mining capacity. This comes with sharing the rewards, but also with higher income potential.

Mining algorithm

Consensus mechanisms

Sometimes the blockchain type is confused with its consensus mechanism. This is why it is important to understand what consensus is and its role in a blockchain.

The consensus mechanism makes sure all nodes are synchronised with each other and agree on which transactions are legitimate and added to the blockchain. It makes sure that everyone in the network uses the same blockchain.

In a centralised system, a central administrator has the authority to maintain and update the database. A decentralised blockchain has no such authority, which is why it needs a consensus mechanism.

Consensus assures that the protocol rules are being followed and guarantees that all transactions occur in a trustless way. There are some specific goals in the Blockchain consensus protocol, such as agreement, collaboration, cooperation, equal rights for each node, and mandatory participation of each node in the consensus process.

Types of consensus algorithms:

Proof of Work (PoW) — is a decentralised consensus mechanism that requires members of a network to expend effort solving an arbitrary mathematical puzzle to prevent anybody from gaming the system. Proof of work is used widely in cryptocurrency mining, for validating transactions and mining new tokens.
Proof of Stake (PoS) — protocols are a class of consensus mechanism for blockchains that work by selecting validators in proportion to their quantity of holdings in the associated cryptocurrency. Unlike a proof of work (PoW) protocol, PoS systems do not incentivise extreme amounts of energy consumption.
Delegated Proof of Stake (DPoS) — is a popular evolution of the PoS concept, whereby users of the network vote and elect delegates to validate the next block. Delegates are also called witnesses or block producers. Using DPoS, you can vote on delegates by pooling your tokens into a staking pool and linking those to a particular delegate. You do not physically transfer your tokens to another wallet, but instead, utilise a staking service provider to stake your tokens in a staking pool.
Proof of Importance (PoI) — is a blockchain consensus technique used by some cryptocurrencies. Essentially, proof of importance works by proving the utility of nodes in a cryptocurrency system, so that they can create blocks.
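As one concrete example from the list above, the core idea of Proof of Stake, selecting validators in proportion to their holdings, can be sketched with a weighted random choice. This is a toy model: the stakes are invented, `pick_validator` is a hypothetical helper, and real protocols add many rules on top (randomness sources, penalties, committees).

```python
import random


def pick_validator(stakes: dict, rng: random.Random) -> str:
    # Each participant's chance of being picked is proportional to its stake.
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


stakes = {"alice": 60, "bob": 30, "carol": 10}  # hypothetical holdings
rng = random.Random(42)  # seeded only to make the demo repeatable
picks = [pick_validator(stakes, rng) for _ in range(10_000)]
share = {name: picks.count(name) / len(picks) for name in stakes}
```

Over many rounds, the share of blocks each participant validates approaches its share of the total stake, so no hashing race is needed to decide who creates the next block.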

Other consensus mechanisms:

Proof-of-Capacity (PoC) — PoC is another consensus algorithm used in blockchains. It allows mining devices in the network to use their available hard drive space to decide mining rights and validate transactions.
Proof-of-Activity (PoA) — PoA is a consensus algorithm used in Blockchain technology that ensures that all transactions occurring on the network of Blockchain are genuine and authentic. PoA consensus, which is a combination of proof-of-work and proof-of-stake, ensures that all miners arrive at a consensus. In other words, PoA is an attempt to consolidate the best features of PoW and the PoS systems.

Recommendations — materials

3b1b — But how does bitcoin actually work?
Investopedia — Blockchain

Certification programs

There are some online academies where you can learn more about blockchain. I personally tried the Blockchain Council.

Contact me

Give me feedback, help me improve this article.

Message me on LinkedIn.