Understanding Blockchain — How it works!

By Wimal Perera on The Capital

The Dark Side
Published in
16 min read · May 13, 2020

“Bitcoin” and “Blockchain Technology” are nowadays two of the most frequently heard terms within technical communities, especially amongst those working in banking and finance. However, without knowledge of the foundational concepts behind “blockchain technology,” understanding “blockchain” is difficult.

A predecessor article (termed “Understanding Blockchain — Foundation”) explains several prerequisite concepts helpful for understanding the internals of “blockchain technology,” as listed below.

  1. Ledger (simplified in-line with the context of understanding blockchain)
  2. Cryptographic Hash Functions
  3. Symmetric & Asymmetric Key Cryptography
  4. Digital Signatures
  5. De-centralized Network

The above 5 terms will be used throughout this article while explaining “how blockchain works.” Please look them up under the relevant sections of that predecessor article if you haven’t heard about any of them before, or whenever you need more clarity on them while reading this article.

Figure 1 — Blockchain Overview

As per the conclusion of the predecessor article, “Blockchain” is a set of tools that can be used to implement a “distributed ledger” that can be “de-centrally verified” and “de-centrally computed” (see Figure 1). So how can we make sure that everyone stays consistent and non-fraudulent within a “decentralized ledger network?” That is why we need “cryptographic hash functions” and “digital signatures.”

Our story starts from a “centralized ledger” between 4 friends Alice, Bob, Casey, and Daniel (similar to left-hand side of Figure 1. See also Figure 2). So far there are 4 transactions in our “centralized ledger.” Once a transaction happens between 2 people, anyone can write the transaction in the public area for others to know. We can see that Alice, Bob, Casey, and Daniel have $30, $35, $65, and $65 in their accounts as their latest balances upon completion of the final (i.e. 4th) transaction.

Figure 2 — A Non-Secure Centralized Ledger

Although both the “traditional centralized bank system ledger” and the “distributed blockchain ledger” (which we’re going to read about here) share the common goal of providing security, the significance of blockchain is its ability to make secure transactions even in the absence of a third-party regulatory body (such as a bank).

In the following sections of this article, let us investigate how we can transform this “un-secure centralized ledger” into a “decentralized secure ledger” that “does NOT depend upon the control of any third party” (such as a bank/regulatory body).

Introducing Digital Signatures

Let’s focus on simple things first. The ledger in Figure 2 has the problem of validating whether the “payer” of each transaction was actually responsible for;

a) doing the payment related with a given transaction for the “payee”

b) recording the transaction in the ledger after doing the payment

For instance, considering the transaction “Alice pays Bob $20,” Alice (who is the “payer”) could later claim that she never paid Bob $20 and that she wasn’t the one who recorded that transaction in the ledger.

Figure 3 — Ledger with Traditional Signatures for Each Transaction

The immediate solution to fix this problem would be introducing signatures as in Figure 3.

But what if Casey practiced Alice’s signature and forged it to create a new transaction the way Alice did? Also, is it possible for Daniel to insert his transaction into the middle of the ledger, between Bob’s and Casey’s transactions, and be dishonest about it?

The solution for solving the above problems is;

a) modify the format of the transaction to ->

{sequence_id} {transaction_contents}

e.g. “01” “Alice pays Bob $20”

b) introduce “asymmetric key cryptography” into the ledger in which “each person” participating in the ledger must have their “own key pair” (i.e. own public key and private key)

c) the “payer” “digitally signs” the contents of each transaction; that is, a hash generated from the “transaction format proposed in a)” is signed using the payer’s private key

Figure 4 — Ledger with Digital Signatures for Each Transaction

Elaborating more on the solution illustrated in Figure 4, we can see the below characteristics.

  1. Casey can’t deny that she did transaction 03, since only Casey knows her “private key”
  2. Anyone can check transaction 02’s contents on the ledger against Bob’s “digital signature” (using Bob’s “public key”) and verify that the contents of transaction 02 were not fraudulently modified by someone other than Bob
  3. The “digital signature” annexed to transaction 01 (i.e. “AliceDS_01”) is different from the “digital signature” annexed to transaction 05 (i.e. “AliceDS_05”), although both signatures belong to Alice. The reason is that, although Alice uses the same private key in both cases, the contents of the transactions are different. Hence each “unique digital signature” in “each transaction” verifies not only the identity of the person who signed but also the total content of the transaction being signed.
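The three characteristics above can be tried out with a small sketch. Please note the assumptions: this is a toy “textbook RSA” built from two well-known Mersenne primes and SHA-256, chosen only so that the example runs with the Python standard library. The helper names (make_keypair, sign, verify), the key sizes, and the contents of transaction 05 are hypothetical, and keys this weak must never be used for real signatures.

```python
import hashlib

E = 65537  # public exponent (a common choice); toy setup, not secure


def make_keypair(p, q):
    """Return (public_key, private_key) as ((n, e), (n, d)) from two primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(E, -1, phi)  # modular inverse of E (needs Python 3.8+)
    return (n, E), (n, d)


def digest(sequence_id, contents, n):
    """Hash '{sequence_id} {transaction_contents}' to an integer below n."""
    data = f"{sequence_id} {contents}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(sequence_id, contents, private_key):
    """Sign the hash of the transaction with the payer's private key."""
    n, d = private_key
    return pow(digest(sequence_id, contents, n), d, n)


def verify(sequence_id, contents, signature, public_key):
    """Anyone holding the public key can check the signature."""
    n, e = public_key
    return pow(signature, e, n) == digest(sequence_id, contents, n)


# Each participant has their own key pair (primes picked for illustration).
alice_public, alice_private = make_keypair(2**61 - 1, 2**89 - 1)
casey_public, casey_private = make_keypair(2**107 - 1, 2**127 - 1)

alice_ds_01 = sign("01", "Alice pays Bob $20", alice_private)
alice_ds_05 = sign("05", "Alice pays Casey $5", alice_private)  # hypothetical transaction

# Casey tries to forge transaction 01 using her own private key.
forged = sign("01", "Alice pays Bob $20", casey_private)

print(verify("01", "Alice pays Bob $20", alice_ds_01, alice_public))   # genuine signature
print(verify("01", "Alice pays Bob $200", alice_ds_01, alice_public))  # modified contents
print(verify("01", "Alice pays Bob $20", forged, alice_public))        # forged signature
```

Only the first check passes. Since the sequence id is part of the signed text, the same signature also can’t be reused for a transaction at another position in the ledger.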

Our solution in Figure 4 has the below 2 problems.

  1. Our ledger is still centralized. Hence we need some common place to store it and it needs to be looked after by some regulatory body (such as a bank). How do we de-centralize our ledger?
  2. As of Figure 4, Alice has the lowest account balance so far. So how do we prevent a situation in which Alice tries to pay $30 to Bob, resulting in her overspending?

De-Centralizing the Ledger

Figure 5 — Each Participant Syncing his/her own copy of the Ledger

Now we’re one step closer to the world of blockchain; in which our latest solution adheres to the same rules of the ledger described in Figure 4, but each individual will have his/her “own ledger” instead of having a “centralized ledger.” So in this system, when Alice wants to pay $10 to Bob, she will create a transaction with a unique id, include her digital signature (as in Figure 4), and then “broadcast” this information (i.e. digitally signed transaction) to all the participants in the system. How does “each recipient” get to know that the transaction they received via “broadcast” was actually “broadcast”ed by the person described in the transaction? By verifying the “digital signature…”

However, the main problem in this system is how would all participants “synchronize” their own individual ledgers up to the latest transaction that occurred within the system?

This was the exact problem addressed in the creation of Bitcoin. But just before looking at the solution (i.e. the anatomy of a “blockchain”) we need to look more deeply into an associated concept called “proof of work.”

Proof of Work

A “proof of work” in “blockchain context” is a “valid answer” (usually a string in most cases) for any computational problem having the below characteristics.

a) “Finding an answer for the problem” shouldn’t be straightforward and should require,

  1. a “significantly large amount of time” (at least the “average guaranteed time” taken to solve a single such problem has to be significantly large compared to the average time taken to execute a single transaction in the ledger, measured over a given time span such as a day) and
  2. a “significant computational effort” (the required effort has to be large enough that “NO non-genuine shortcuts” exist, so that anyone willing to solve the problem can do so “only by genuine means” and “obtain a reward” upon solving it in the “genuine” way)

b) However, “once a valid answer is found for the given problem, verifying that existing answer” can be done by anybody “effortlessly in seconds.”

Proof of Work: Concept — A Working Example

Given that we have the below transaction content in our ledger,

{ “06” “Alice pays Bob $10” “AliceDS_06” }

what is the “string” we need to append to the above transaction content so that we can get an MD5 cryptographic hash having 20 zeros in front? (Note that “AliceDS_06” is Alice’s digital signature, which is unique for this 06th transaction)

In other words, what is the value of “x” in the below expression?

MD5[ {“06” “Alice pays Bob $10” “AliceDS_06”} “x” ]

= {20 zeros}{rest of the MD5 hash}

Figure 6 — Finding a Solution as “Proof of Work”

Finding the value of “x” in the above problem is computationally expensive, since the only way to do it is to keep trying different “arbitrary string”s until an “MD5 hash” is found with “20 zeros in front” (see Figure 6 and the properties of a “useful cryptographic hash function” described in the predecessor article). As illustrated in Figure 6 (in our example the hash function H is MD5), if the 20 zeros are counted as bits of the hash, each attempt succeeds with a probability of 1 in 2²⁰, so in the average case we would need about 2²⁰ attempts with different “arbitrary string”s to find an “answer” as a valid “proof of work.”

Hence the “average guaranteed time” taken to solve the above problem is significantly large (of course, you could be lucky and succeed with the 1st random string you try, but the chance of that is less than 1 in a million!).

Some of the blockchain literature refers to this “valid answer for x” as “cryptographic nonce” or “nonce.”

However, once a “valid answer for x” (i.e. a “valid nonce”) is known, its verification is straightforward. All we have to do is generate the “MD5 hash” with “x” (i.e. the known “valid nonce”) appended and see if the “generated hash” has 20 zeros in front.
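The asymmetry between finding and verifying a nonce can be sketched as below. The assumptions here are mine: the zeros are counted as leading zero bits of the MD5 hash, the candidate strings are simply the numbers 0, 1, 2, … written as text, and the demo uses a reduced difficulty of 12 bits so that it finishes quickly (raising zero_bits to 20 gives the problem from the example, at roughly a million attempts on average). The function names are hypothetical.

```python
import hashlib
from itertools import count


def has_leading_zero_bits(data, zero_bits):
    """True if MD5(data) starts with `zero_bits` zero bits."""
    value = int.from_bytes(hashlib.md5(data).digest(), "big")
    return value >> (128 - zero_bits) == 0  # MD5 hashes are 128 bits long


def find_nonce(content, zero_bits):
    """The hard part: try candidate strings one by one until one works."""
    for attempt in count():
        nonce = str(attempt)
        if has_leading_zero_bits((content + nonce).encode("utf-8"), zero_bits):
            return nonce, attempt + 1


def verify_nonce(content, nonce, zero_bits):
    """The easy part: a single hash computation."""
    return has_leading_zero_bits((content + nonce).encode("utf-8"), zero_bits)


content = "06 Alice pays Bob $10 AliceDS_06"
nonce, attempts = find_nonce(content, zero_bits=12)

print("nonce:", nonce, "found after", attempts, "attempts")
print("valid:", verify_nonce(content, nonce, zero_bits=12))
```

Finding the nonce takes thousands of hash computations on average even at this reduced difficulty, whereas verify_nonce needs exactly one.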

Using Proof of Work within a De-Centralized Ledger

Different “blockchain”-based implementations (we will see what a “blockchain” is in the sections that follow) might enforce the usage of this “proof of work” in different ways. Some systems might perform “a proof of work per every ledger transaction”, some may perform “a proof of work per a batch of transactions” etc. For simplicity of explanation, I’ll be referring to the scenario of “one proof of work per every single ledger transaction,” which is the least efficient option.

Also as we discussed in the previous section, solving a “proof of work”-based problem genuinely requires a “significant effort” and this effort needs to be rewarded. This “rewarding upon obtaining a valid nonce for a given proof of work” too is done in different ways in different blockchain implementations. Some systems will reward the problem solver a percentage commission, out of the total amount in the transaction.

For instance going back to our latest ledger solution in Figure 5, since all transactions are “broadcast”ed anyone can do the “proof of work” for anyone else’s transaction and upon providing a “valid nonce” can get a reward for doing it.

Proof of Work: Rewarding — A Working Example

Imagine that Alice has “broadcast” (as in Figure 5) a new transaction;

{ “06” “Alice pays Bob $10” “AliceDS_06” }

and that Casey is willing to find a “valid nonce” by doing the “proof of work” for Alice’s transaction (“AliceDS_06” is the digital signature annexed by Alice to her transaction contents).

Let’s assume that our hypothetical distributed ledger system has a mutual agreement among all of its users to pay 1% of commission out of the actual transaction amount for the person doing “proof of work.”

First Casey will verify Alice’s transaction by using Alice’s “digital signature.”

Once verified, since Casey needs to collect her reward upon successful completion of “proof of work;”

  1. she will split the original transaction from Alice into 2 split-transactions
  2. append the split-transactions to the original transaction content
  3. digitally sign the total transaction content from step 2

{

{ “06” “Alice pays Bob $10” “AliceDS_06” },

{ “Alice pays Bob $9.90”, “Alice pays Casey $0.10” }

} “CaseyDS_06”

Then Casey will find a “valid nonce” by doing “proof of work.” Casey will use the “total contents of the original and resulting split transactions” signed with Casey’s digital signature, for her “proof of work” and will end up with a “valid nonce” (such as “345abc12tuv3”) upon successful completion.

Since the split transaction content has 2 unique “digital signatures” from Alice (which is “AliceDS_06”) and Casey (which is “CaseyDS_06”), it proves that;

  1. Casey has used the original transaction “broadcasted” by Alice
  2. The “valid nonce” (i.e “345abc12tuv3”) being discovered upon successful completion of “proof of work”, belongs to Casey

So ultimately,

  1. Bob will receive only $9.90 out of $10.
  2. Alice and Bob will witness $0.10 as a “charge made by the system” for the transaction.
  3. Whereas for Casey, the $0.10 is her reward in appreciation of the “significant effort” she made to find a “valid nonce”.
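The arithmetic behind this split can be written down as a small sketch. The function names are hypothetical, and the 1% rate is the one our hypothetical ledger agreed upon above. Amounts are kept in whole cents (an assumption of mine) to avoid floating-point rounding surprises.

```python
def split_with_commission(payer, payee, miner, amount_cents, rate_percent=1):
    """Split a payment into the payee's share and the miner's commission."""
    commission = amount_cents * rate_percent // 100
    return [
        (payer, payee, amount_cents - commission),
        (payer, miner, commission),
    ]


def describe(split):
    """Render each split-transaction in the ledger's text format."""
    return [f"{src} pays {dst} ${cents / 100:.2f}" for src, dst, cents in split]


split = split_with_commission("Alice", "Bob", "Casey", amount_cents=1000)
print(describe(split))
# ['Alice pays Bob $9.90', 'Alice pays Casey $0.10']
```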

Once again please note that the above is only a high-level example to understand the concepts in general, whereas the actual implementations of “blockchain-based distributed ledger” systems would vary from one to another.

Bitcoin Mining

In terms of “Bitcoin” (or “cryptocurrency” in general, which is only one practical application of “blockchain”), earning rewards by supplying “nonce”s after solving “proof of work”s for transactions done by somebody else is called “Bitcoin Mining.”

From Distributed Ledger to BlockChain

I work for a software company named “Lexicon Digital.” Based on all the information we’ve discussed so far, I’m going to define a data structure which I will call a “hypothetical lexicon block” (see Figure 7).

Figure 7 — The Hypothetical “Lexicon Block”

Note that the keyword “block” is very important. I call it the “hypothetical lexicon block” because different blockchain implementations use slightly different design models for defining the anatomy of their block. We use this “hypothetical lexicon block” for the sole purpose of understanding how blockchain works at a high level.

As you can see the right hand side of Figure 7 defines the generic data structure of our “lexicon block” and left hand side illustrates an example. My assumptions related to the example on the left, are as below.

  1. We do a “proof of work per each transaction” in our “lexicon blockchain”
  2. Prior to getting involved in the transaction encapsulated inside “Block 10451”; Alice, Bob and Casey had $20, $20 and $25 in their wallets
  3. Casey did the “proof of work” for Alice’s transaction
  4. The reward (i.e. “mining” in terms of “cryptocurrency” world) for “proof of work” in our “Lexicon Blockchain System” is a flat rate of 1% out of the total transaction amount (in this case 1% out of $10 is $0.10)
  5. The valid answer (i.e. “nonce”) found by Casey after successfully completing the “proof of work”, for the transaction in “Block 10451” is; “35467xyz89abc024”

Figure 8 — Formation of the “Lexicon BlockChain” with Multiple “Lexicon Blocks”

I hope it's clear to you by now why we call it “blockchain.” It is because each block is associated with its previous block (similar to a LinkedList), and this entire “chain of blocks” together is what we use to reveal all the information in the ledger at a given snapshot of time (see Figure 8).
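A minimal sketch of this “LinkedList-like” association is shown below. The field names, the use of SHA-256, the all-zero hash for the very first block, and the second and third transactions are all assumptions made for illustration; the actual fields of the “lexicon block” are the ones defined in Figure 7.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Block:
    block_id: int
    prev_hash: str     # hash of the previous block's total contents
    transaction: str
    nonce: str

    def content_hash(self):
        text = f"{self.block_id}|{self.prev_hash}|{self.transaction}|{self.nonce}"
        return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append_block(chain, transaction, nonce="(nonce omitted)"):
    """Return a new chain whose last block points at the previous last block."""
    prev_hash = chain[-1].content_hash() if chain else "0" * 64
    block_id = chain[-1].block_id + 1 if chain else 1
    return chain + [Block(block_id, prev_hash, transaction, nonce)]


def is_linked(chain):
    """Check that every block stores the hash of the block before it."""
    return all(cur.prev_hash == prev.content_hash()
               for prev, cur in zip(chain, chain[1:]))


chain = []
for tx in ["Alice pays Bob $20", "Bob pays Casey $15", "Casey pays Daniel $5"]:
    chain = append_block(chain, tx)

# Someone rewrites the middle block without touching the block after it.
tampered = [chain[0], replace(chain[1], transaction="Bob pays Casey $1"), chain[2]]

print(is_linked(chain), is_linked(tampered))  # True False
```

Because the third block still carries the hash of the original second block, the modification is detected.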

Adjusting the Contents Used for finding a Nonce during Proof of Work

Also, note that the last section of the “block structure definition” (see Figure 7 right) is very important which states “finding a valid nonce for the proof of work using all contents included in the block.”

Figure 9 — “Proof of Work Problem Definition” for the “Lexicon Blockchain”

Compared with the “proof of work” example we discussed in the previous section (with regards to “rewards” for “proof of work”), in this “lexicon blockchain” we will additionally add;

  1. the “current block id” and
  2. the hash derived from the “total contents of the previous block”

when finding a “valid nonce” while doing the “proof of work” for each “block” (see Figure 9).
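As a rough sketch, the input of the “proof of work” now looks like the below. The field order, the “|” separator, the use of MD5 with a reduced difficulty of 12 zero bits, and the placeholder for the previous block’s contents are assumptions of mine; Figure 9 holds the actual problem definition.

```python
import hashlib
from itertools import count


def block_pow_input(block_id, prev_block_hash, signed_contents, nonce):
    """Everything in the block goes into the hash, not just the transaction."""
    return f"{block_id}|{prev_block_hash}|{signed_contents}|{nonce}".encode("utf-8")


def meets_target(data, zero_bits):
    """True if MD5(data) starts with `zero_bits` zero bits."""
    value = int.from_bytes(hashlib.md5(data).digest(), "big")
    return value >> (128 - zero_bits) == 0


def mine_block(block_id, prev_block_hash, signed_contents, zero_bits):
    """Try nonces until the hash over the whole block meets the target."""
    for attempt in count():
        nonce = format(attempt, "x")
        data = block_pow_input(block_id, prev_block_hash, signed_contents, nonce)
        if meets_target(data, zero_bits):
            return nonce


# Placeholder standing in for the hash of the previous block's total contents.
prev_hash = hashlib.md5(b"total contents of the previous block").hexdigest()
contents = ("06 Alice pays Bob $10 AliceDS_06 | "
            "Alice pays Bob $9.90, Alice pays Casey $0.10 | CaseyDS_06")

nonce = mine_block(10451, prev_hash, contents, zero_bits=12)
print("nonce for Block 10451:", nonce)
```

Since the previous block’s hash is part of the input, a nonce found for one position in the chain will almost certainly be invalid if the previous block changes.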

Preventing Overspending

For instance, before Alice makes the transaction of paying $10 to Bob, we can traverse back through the previous blocks of the “blockchain” until we reach Alice’s most recent transaction and read her current balance. Hence we can “prevent Alice from overspending” if she doesn’t have at least $10. If Alice has less than $10, no one will do the “proof of work” for her transaction. Even if Alice does the “proof of work” for her own transaction, no one will accept a “block” having negative account balances. Also, Alice is unable to fraudulently update her previous balance after doing a transaction and add it as a block in the middle of the blockchain, since the “next block” holds the “hash of the previous block contents,” the “next-next block” holds the “hash of the next block contents,” and so on; changing one block would invalidate every block after it.
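The balance lookup can be sketched as below, assuming each block records the balances of the people involved right after its transaction. The dictionary layout and Block 10450 are hypothetical; the balances in Block 10451 follow from the assumptions listed for Figure 7 ($20, $20 and $25 before the transaction, with 1% going to Casey). Amounts are in cents.

```python
def latest_balance(chain, person):
    """Walk back from the newest block to the most recent one recording `person`."""
    for block in reversed(chain):
        if person in block["balances_after"]:
            return block["balances_after"][person]
    return 0  # assumption: no record at all means an empty wallet


def can_spend(chain, payer, amount_cents):
    """A miner would refuse the proof of work when this returns False."""
    return latest_balance(chain, payer) >= amount_cents


chain = [
    {"id": 10450, "transaction": "Casey pays Daniel $5",   # hypothetical earlier block
     "balances_after": {"Casey": 2500, "Daniel": 500}},
    {"id": 10451, "transaction": "Alice pays Bob $10",
     "balances_after": {"Alice": 1000, "Bob": 2990, "Casey": 2510}},
]

print(can_spend(chain, "Alice", 1000))  # True: Alice has exactly $10 left
print(can_spend(chain, "Alice", 1001))  # False: this would be overspending
```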

Distributed Blockchain-based Ledgers

Figure 10 — Each User has his/her own copy of the Ledger represented as a “Lexicon Blockchain”

Everyone in the lexicon network will have their “own copies of lexicon blockchain-based Ledgers” (see Figure 10 versus Figure 5). Having the latest block in your own copy of the “blockchain-based ledger” is not required for you to do a new transaction, since you know how much actual balance you’ve got in your wallet. But if you try to “overspend,” others in the network will get to know your financial status from their own “blockchain-based ledger”s and no one will be willing to do a “proof of work” for your transaction. Thus your transaction won’t be included within the “distributed blockchain.” So I leave it up to you to think critically about several edge case scenarios in this regard and see how “blockchain” could come to the rescue.

Bitcoin Wallet Address

One last thing to note is that we used names like Alice, Bob, Casey, etc. to identify individuals in our network. But in the real world, since names might not be unique, most blockchain implementations use a globally unique id to identify the wallet of each individual. In the Bitcoin context, this is referred to as your “bitcoin wallet address.”

Preventing/Detecting Fraud in the “Lexicon BlockChain”

Scenario 1 — Fraud Prevention: Alice is trying to steal money from Bob’s Wallet

Suppose Alice wants to “broadcast” a fraudulent transaction to the system saying;

{ “01 Bob pays Alice $10 BobDS_01” }

to steal from Bob’s wallet; she can’t do it since Bob needs to “digitally sign” his transaction and only Bob has access to his own “private key”.

Scenario 2 — Fraud Detection: Alice is trying to insert a fake block by saying she paid money to Bob which she actually didn’t

Figure 11 — Casey’s Intelligent Copy of “Lexicon Blockchain” with “Candidate Future Blocks Buffer”; identifying Alice’s fraud transaction-based Blocks

Assume that Alice attempts to deceive the “blockchain,” by appending a “fraud block” with a “fake transaction” which she didn’t actually do (i.e. which she didn’t “broadcast”) by saying “01 Alice pays Bob $10 AliceDS_01” (such as introducing “Block 11351” in Figure 11 at time t = t1).

In a scenario like this there will be 2 continuations (i.e. a “fork”) in the “future blockchain” (see blocks proceeding from “Block 11351” in Figure 11 at t = t1 and t = t2);

  1. “Fraudulent Blockchain” formed to continue deception in which Alice is doing the “proof of work” for each “block” (shown as “red” in Figure 11)
  2. “Genuine Blockchain” formed with the “collective effort” of all “miners” in the “lexicon network” (shown as “green” in Figure 11)

Since Alice created her “fraud transaction” “under the hood” without “broadcast”ing she has to;

  1. do the “proof of work” for the “fraud block” she’s going to create to represent her “fraud transaction” and
  2. continue with “creating” and “doing proof of work” for all the “next blocks” to represent “all the genuine transactions” happening in the “lexicon blockchain” after her “fraud transaction.” This is because no one else’s “copies of blockchains” are aware of Alice’s fraud transaction. Note that the “proof of work” required to create each block is related to the “hash of the total contents from the previous block.”

But as we learned earlier “proof of work” is “time-consuming” and “effort-consuming,” and this scenario is one of the reasons WHY IT HAS TO BE.

Because the time and effort required to complete each “proof of work” is “significantly high,” Alice will run out of resources to keep up with the “speed at which the genuine blockchain of the network is getting built” by the “collective effort of all miners” in the network.

Hence during a “blockchain fork” due to fraud, the real “genuine blockchain” will end up as the “longest blockchain” after a significant amount of time has passed (see time t = t2 in Figure 11), as long as Alice controls less computing power than the rest of the network combined.
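To get a feel for why Alice falls behind, here is a deliberately simplified simulation. It assumes that each new block is found by Alice with a probability equal to her share of the network’s total computing power, and otherwise by the honest miners. The 10% share, the 1000 blocks, and the fixed random seed are hypothetical values picked for the demo.

```python
import random


def simulate_race(attacker_share, total_blocks, seed=0):
    """Count how many blocks each side mines under the simplified model."""
    rng = random.Random(seed)  # fixed seed so the demo is repeatable
    attacker_blocks = sum(
        1 for _ in range(total_blocks) if rng.random() < attacker_share
    )
    return attacker_blocks, total_blocks - attacker_blocks


attacker_len, honest_len = simulate_race(attacker_share=0.1, total_blocks=1000)
print("Alice's chain:", attacker_len, "blocks")
print("Genuine chain:", honest_len, "blocks")
```

Under this model the gap between the two chains keeps widening as more blocks are mined, which is what Casey relies on when she waits until time t = t2.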

According to Figure 11, Casey won’t be able to identify the real versus fake block out of the 2 duplicate “Block 11351”s at time t = t1 (which is why both “Block 11351”s in Figure 11 at time t = t1 are marked in “green”). However as time passes and reaches time t = t2 (in Figure 11), Casey becomes aware that the much “shorter blockchain” having only 2 blocks is the “fraud” one. Hence she marks these “2 red blocks 11351, 11352” in the “shorter blockchain” as “fake blocks” (in “red”) and dumps them from “her own maintained copy of blockchain”. Further, Casey appends the “longer green blockchain from Block 11351 to 11360” (which is the “genuine blockchain”) to “her own copy of blockchain.”

So every time when anyone in the network adds new blocks to their own copy of “lexicon blockchain” they need to follow Casey’s approach (see Figure 11) in which;

  1. he/she needs to maintain a “buffer” of potential next blocks (i.e. “candidate future blocks”) to be added to his/her “own copy of blockchain” WITHOUT immediately adding new blocks upon receipt from a “miner broadcast” and
  2. in case of a “blockchain fork” (i.e. there are “multiple potential blockchains” in the “candidate future blocks buffer”) wait for some significant time, find the “fastest-growing longest blockchain” which is the “genuine” one and discard all other “fraudulent blocks” belonging to “fake blockchains”.
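Casey’s approach could be sketched as below. Blocks are represented by plain labels, and the rule for “waiting some significant time” is approximated by requiring the longest branch to lead every rival by a minimum number of blocks; that rule, the value 6, and the function names are assumptions of mine rather than part of the “lexicon blockchain” definition.

```python
def longest_branch(branches):
    """Pick the longest candidate branch; None while the top two are tied."""
    ranked = sorted(branches, key=len, reverse=True)
    if len(ranked) > 1 and len(ranked[0]) == len(ranked[1]):
        return None  # can't tell genuine from fake yet, keep waiting
    return ranked[0] if ranked else None


def resolve_fork(own_chain, candidate_branches, min_lead):
    """Append the longest branch once it leads every rival by `min_lead` blocks."""
    best = longest_branch(candidate_branches)
    if best is None:
        return own_chain
    rivals = [branch for branch in candidate_branches if branch is not best]
    if all(len(best) - len(rival) >= min_lead for rival in rivals):
        return own_chain + best  # rival branches are simply discarded
    return own_chain


own_chain = ["11350"]

# t = t1: two competing versions of Block 11351, nothing can be decided.
at_t1 = resolve_fork(own_chain, [["11351 (A)"], ["11351 (B)"]], min_lead=6)

# t = t2: the fraudulent branch has 2 blocks, the genuine branch has 10.
fraud = ["11351 (fraud)", "11352 (fraud)"]
genuine = [str(block_id) for block_id in range(11351, 11361)]
at_t2 = resolve_fork(own_chain, [fraud, genuine], min_lead=6)

print(at_t1)      # ['11350']
print(at_t2[-1])  # 11360
```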

Conclusion

The world-famous “Bitcoin” (in general “cryptocurrency”) is only a single application of “blockchain technology.” I leave it up to you to think about other avenues in which we could use “blockchain;”

  1. How about developing a federal voting system for Australia using “blockchain?” Can it be cheated? Take the simplest case, a presidential election, in which every voter has $1 which is their vote and can only spend this $1 on one of the presidential candidates. Can this system be trusted?
  2. How about using “blockchain” to maintain B2B and B2C transactions happening within a business? All stakeholders in the business can be considered as all parties in its “supply-chain based blockchain network.”

Finally, congratulations on reaching the end of this article! Having reached this point, you will be in a better position to understand how the internals of blockchain work and how they are applied. Do think about different edge cases that could happen within a “blockchain network” and critically argue how “blockchain technology itself” would come to the rescue.

Interested in taking that initial step on transforming your supply chain or banking network by applying blockchain? Contact Us.

Wimal Perera
The Dark Side

A Software Engineer with 12+ years of development experience; from frontend web to backend IT infrastructure. (https://www.linkedin.com/in/wimalperera/)