How is Blockchain revolutionizing digital infrastructure?
On June 3rd, 2024, Berkshire Hathaway's stock briefly plummeted in value by almost 100%, taking the company's market cap from roughly $900 billion to below $1 billion. This turned out to be a technical glitch caused by the Consolidated Tape Association, the organization responsible for providing live ticker data for NYSE-listed securities.
A technical miscommunication made by one company caused hundreds of billions of dollars to briefly vanish into thin air.
This is not the first instance of such an event occurring, and it will likely not be the last.
How can blockchain be used to fix critical Internet infrastructure issues that plague the Web today?
Decentralized Data Oracles
In the case of the NYSE, the Consolidated Tape Association (CTA) acts as a data oracle, which is a data provider that can bring real-time information to a platform. In the case of blockchain, an oracle is crucial for bringing in external data, since the blockchain network is not directly connected to the Internet. This system lets smart contracts execute their commands based on internal and external inputs such as cryptocurrency prices, financial markets, sports results, or any other external piece of data that exists online.
Blockchain oracles are different from traditional ones because they are decentralized - instead of relying solely on one company to act as a data provider (like the NYSE relies on the CTA), they take in multiple data points from a Decentralized Oracle Network (DON) to ensure that the data is correct.
This removes trust from the equation - if only one company provides all of your data, you have to trust that its data is correct, whereas if an entire network of independent providers reports the same value, you can be far more confident that the data is accurate.
Source: Chainlink
This image is an example of ETH price data being provided by Chainlink, one of the largest projects in the blockchain oracle space. At the bottom of the image, there’s a section titled ‘Oracle Network’. This network is the DON responsible for providing real-time price data of ETH.
Here’s how it works:
- Each node within the DON provides its real-time price of ETH
- Chainlink aggregates all of the ETH prices provided by these nodes (typically by taking the median, which discards outliers) - if 9 nodes say the price is $2,900 and 1 node says the price is $2,905, the aggregated price is $2,900
- That price is then reflected as the officially verified price of ETH and sent out to all the applications using Chainlink for price data
- dApps that utilize Chainlink’s price data reflect this price as the current price of ETH
Source: Chainlink
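The aggregation step described above can be sketched in a few lines of Python. This is a simplified illustration, not Chainlink's actual node software; taking the median is one common way a DON discards outlier reports from faulty or malicious nodes.

```python
from statistics import median

def aggregate_price(reports: list[float]) -> float:
    """Aggregate independent node reports into one verified price.

    Taking the median means a minority of faulty or malicious
    nodes cannot move the final answer.
    """
    if not reports:
        raise ValueError("no node reports received")
    return median(reports)

# 9 honest nodes report $2,900; 1 outlier reports $2,905.
reports = [2900.0] * 9 + [2905.0]
print(aggregate_price(reports))  # 2900.0
```

With the median, the single outlier at $2,905 has no effect on the final answer; an average would have nudged the verified price slightly.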
Currently, Chainlink is one of the largest contributors to the blockchain oracle ecosystem, handling the vast majority of external data inputs into the blockchain world. Chainlink acts as the bridge between the Internet and blockchain networks, letting data flow in and out of a blockchain network. To bring external data on-chain, it is taken from the Internet, filtered for accuracy through Chainlink's DON, and then provided to the blockchain network.
Aside from pricing data for cryptocurrencies, blockchain oracles can bring a whole suite of data on-chain. For example, if you decide to build out a sports betting dApp, you could utilize a blockchain oracle to provide the outcome of sporting events. A smart contract could take in funds and escrow them until the event is over, at which point the oracle provides the results of the event and the smart contract automatically releases the payout to the winners.
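A rough sketch of that escrow logic, written in Python for readability - a real implementation would be a smart contract in a language like Solidity, with the oracle result delivered on-chain rather than passed as an argument, and every name here is hypothetical:

```python
class BettingEscrow:
    """Toy model of an escrow contract settled by an oracle result.

    Illustrative only: real contracts run on-chain and receive the
    result from an oracle network, not a Python caller.
    """
    def __init__(self):
        self.stakes = {}     # bettor -> (team, amount)
        self.settled = False

    def place_bet(self, bettor: str, team: str, amount: float):
        assert not self.settled, "event already settled"
        self.stakes[bettor] = (team, amount)

    def settle(self, oracle_result: str) -> dict:
        """Release the full pot to winners, split pro rata by stake."""
        self.settled = True
        pot = sum(a for _, a in self.stakes.values())
        winners = {b: a for b, (t, a) in self.stakes.items()
                   if t == oracle_result}
        winning_total = sum(winners.values())
        if winning_total == 0:
            return {}        # no winners; a real contract would refund
        return {b: pot * a / winning_total for b, a in winners.items()}

escrow = BettingEscrow()
escrow.place_bet("alice", "Eagles", 100.0)
escrow.place_bet("bob", "Giants", 100.0)
print(escrow.settle("Eagles"))  # {'alice': 200.0}
```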
A more business-oriented example would be using oracles to connect to the Internet of Things (IoT) for the purpose of tracking goods throughout a supply chain. Sensors attached to a shipment could provide real-time data such as location, temperature, and condition of goods that can all be added to a blockchain ledger to act as immutable proof.
Source: Chainlink
To learn more in-depth about how oracles work on the blockchain and their different use cases, check out this article on What is a Blockchain Oracle by Chainlink.
Cross-Chain Communication
Just as critical as access to external data is the ability for different networks to communicate with one another. Just as existing development standards for the Internet ensure websites work across all browsers and devices, blockchain needs systems in place that allow dApps and cryptocurrencies to be supported across different networks. While cross-chain communication is not yet fully standardized, multiple mediums exist and new ones are constantly being developed.
Wrapped Tokens
Traditionally, cryptocurrencies that are available on one chain but not on another are represented as wrapped tokens - tokens that mimic the price of the original asset but live on chains the original asset does not support. A wrapped token is usually denoted with a 'W' prefix, which signifies that it's not the real asset but a representation of that asset on another chain. Some of the most popular wrapped tokens include WBTC (wrapped BTC) and WETH (wrapped ETH), where chains that don't support BTC or ETH directly can create wrapped versions pegged to the original token's price.
Using a wrapped token can make it easier to bridge, or cross, cryptocurrencies between chains since the same wrapped token can be used across both chains. For example, let's say you have $1,000 worth of ETH on the Ethereum network which you want to bridge to the Avalanche network. Avalanche doesn't support ETH directly because it does not follow the ERC token standard that Ethereum does (discussed in Section 3 on the role of cryptocurrency). Instead, the Avalanche network uses a version of WETH that is compatible with the chain.
You would therefore be able to convert your ETH into WETH and bridge that WETH from the Ethereum network to the Avalanche network. You’ve successfully brought your $1,000 over from the Ethereum network to Avalanche in the form of WETH which you can now use on the Avalanche network.
Source: Tangem Wallet
For some further reading on how wrapped tokens are custodied and created, check out this article by Coin Telegraph.
Bridges
Bridges are dApps designed to send assets between chains. They act as decentralized exchanges with the capability of transferring an asset from one chain to another. Most bridges support multiple networks and will let you bridge an asset directly from one to another. They contain liquidity pools similar to normal decentralized exchanges, the only difference being that those pools are spread across multiple networks.
If you were to bridge WETH from the Ethereum network to the Avalanche network, your WETH on the Ethereum network would be burned (removed from circulation) on Ethereum and a matching amount of WETH on the Avalanche network will be minted (created) and added to your wallet.
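That burn-and-mint flow can be modeled in a few lines. This is a toy ledger, not a real bridge - actual bridges enforce these rules with smart contracts and validator sets:

```python
class WrappedTokenBridge:
    """Toy burn-and-mint bridge between two chains.

    A hypothetical in-memory ledger: real bridges do this with
    on-chain contracts, not a Python dict.
    """
    def __init__(self):
        # balances[chain][wallet] -> WETH amount
        self.balances = {"ethereum": {}, "avalanche": {}}

    def bridge(self, wallet: str, amount: float, src: str, dst: str):
        held = self.balances[src].get(wallet, 0.0)
        if held < amount:
            raise ValueError("insufficient balance to bridge")
        # Burn on the source chain...
        self.balances[src][wallet] = held - amount
        # ...and mint a matching amount on the destination chain,
        # so total supply across both chains stays constant.
        self.balances[dst][wallet] = self.balances[dst].get(wallet, 0.0) + amount

bridge = WrappedTokenBridge()
bridge.balances["ethereum"]["you"] = 1000.0   # $1,000 of WETH
bridge.bridge("you", 1000.0, "ethereum", "avalanche")
print(bridge.balances)
```

The key invariant is that bridging never creates or destroys value overall - every minted token on the destination chain corresponds to a burned token on the source chain.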
One popular bridge is Stargate, although some DEXes, such as Sushiswap, have begun directly integrating cross-chain bridging capabilities. This can eliminate the need to use a DEX and then a bridge separately by combining them, allowing users to conduct cross-chain transfers directly through an exchange.
Cross-Chain Protocols
A more modern way of establishing cross-chain communication is through the use of cross-chain protocols (CCPs). These protocols can pass arbitrary data between chains and are not restricted solely to tokens. They are critical to cross-chain operations because they act as a middleware solution - without them, any dApp that wants to work on multiple chains would have to build out its own system, which would be time-consuming, expensive, and complicated.
In addition to its role as a data oracle provider, Chainlink offers its own CCP solution called the Cross-Chain Interoperability Protocol (CCIP). CCIP lets developers embed an interface directly into their dApps, giving them a secure way to access other chains. CCIP has 3 main capabilities:
- Arbitrary messaging - allows you to send any encoded data to another chain to be received by a smart contract on that chain
- Token transfer - lets you directly transfer tokens to a smart contract or wallet on another chain
- Programmable token transfer - lets you send both tokens and encoded data to another chain at the same time
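The three capabilities differ only in whether a message carries data, tokens, or both. A toy model of that distinction - the field names are illustrative, not Chainlink's actual interface:

```python
from dataclasses import dataclass, field

@dataclass
class CrossChainMessage:
    """Hypothetical model of the three CCIP message shapes."""
    dest_chain: str
    receiver: str
    data: bytes = b""                            # arbitrary encoded payload
    tokens: dict = field(default_factory=dict)   # token symbol -> amount

    @property
    def kind(self) -> str:
        if self.data and self.tokens:
            return "programmable token transfer"
        if self.tokens:
            return "token transfer"
        return "arbitrary messaging"

msg = CrossChainMessage("avalanche", "0xReceiver", data=b"hello")
print(msg.kind)  # arbitrary messaging
```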
For a more in-depth look at how CCIP works, check out Chainlink’s CCIP documentation that covers how the protocol is used and how to embed it into your own dApp.
Circle, the company behind the USDC stablecoin, also has its own CCP called the Cross-Chain Transfer Protocol (CCTP). The USDC stablecoin is available across almost 70 different chains, making it widely compatible for cross-chain transfers. CCTP allows users to transfer any amount of USDC from one blockchain to another by burning USDC on the source chain and minting it on the destination chain after every transfer. This protocol is particularly useful for decentralized exchanges, which can use USDC as the intermediary token to move assets across chains for a smooth user experience.
Sushiswap is a popular DEX that uses CCTP to conduct cross-chain trades. This integration allows users to trade any token for any other token on a different network - unlike most bridges, which require a specific token (usually a wrapped token like WETH) to conduct the transfer.
Let’s say you had $1,000 worth of AAVE tokens on the Ethereum network that you wanted to convert to the DAI token on the Base network. Using a normal bridge you’d likely have to: convert your AAVE into a wrapped token like WETH > bridge the WETH from Ethereum to Base > convert your WETH to DAI.
With the CCTP integration on Sushiswap, you skip this entire process and can directly swap your AAVE to DAI across chains. Here’s how the process would look on the back-end:
- The exchange would automatically convert your $1,000 of AAVE tokens into USDC tokens
- The CCTP mechanism would then burn $1,000 worth of USDC on the Ethereum network and mint $1,000 worth of USDC on the Base network
- The exchange would automatically convert your $1,000 of USDC to DAI and deposit it into your wallet
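The back-end route above can be traced step by step. A minimal sketch - real swaps route through DEX liquidity pools and Circle's attestation service rather than building strings:

```python
def cctp_swap(amount_usd: float, src_token: str, src_chain: str,
              dst_token: str, dst_chain: str) -> list[str]:
    """Return the four back-end steps of a CCTP-based cross-chain
    swap, as a human-readable trace (illustrative only)."""
    return [
        f"swap ${amount_usd:,.0f} of {src_token} -> USDC on {src_chain}",
        f"burn ${amount_usd:,.0f} USDC on {src_chain}",
        f"mint ${amount_usd:,.0f} USDC on {dst_chain}",
        f"swap ${amount_usd:,.0f} of USDC -> {dst_token} on {dst_chain}",
    ]

for step in cctp_swap(1000, "AAVE", "Ethereum", "DAI", "Base"):
    print(step)
```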
Feel free to take a look at Sushiswap’s documentation as well as Circle’s documentation for more information on how these systems operate.
BYOB - Build Your Own Bridge
Since blockchain is a constantly evolving technology, many developers choose to build their own cross-chain communication methods for their apps. One popular method is to utilize Wormhole, a platform that has its own Software Development Kit (SDK) for building bridges directly into your own application. Wormhole's SDK works on principles similar to those of the cross-chain protocols, but building your own bridge gives you far more customization than using a pre-built one.
Wormhole offers both automatic relaying, which is done fully on-chain, as well as specialized relaying, which may require some off-chain code.
Automatic relaying: handled entirely by Wormhole - you create a smart contract that sends all the data to a Wormhole relayer contract, which then transfers that data to the relayer contract on the receiving chain.
Specialized relaying: gives you more control and lets you create a custom relayer contract instead of using Wormhole’s standard one, which lets you add in any additional logic you’d like to use.
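The difference between the two modes boils down to whether custom logic runs before delivery. A minimal sketch, with function names that are illustrative rather than taken from Wormhole's SDK:

```python
def automatic_relay(payload: bytes, deliver) -> None:
    """Automatic relaying, modeled as a callback: the sender hands
    the payload to a standard relayer, which forwards it unchanged
    to the receiving chain's contract."""
    deliver(payload)

def specialized_relay(payload: bytes, deliver, transform=None) -> None:
    """Specialized relaying: a custom relayer may run additional
    logic (here, an optional transform) before delivery."""
    if transform:
        payload = transform(payload)
    deliver(payload)

received = []
specialized_relay(b"data", received.append, transform=lambda p: p.upper())
print(received)  # [b'DATA']
```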
For more information on how these systems work, check out Wormhole’s documentation.
Decentralized Data Storage
While DeFi focuses primarily on the movement and storage of cryptocurrencies, blockchain networks are capable of storing more advanced data mediums including files, images, and videos. Instead of relying on a huge centralized server powered by the likes of Amazon or Google, decentralized data storage works by creating an open market for providing and using storage, where users are able to contribute to the network by providing storage space in exchange for payment.
Decentralized data storage is possible through technology such as Merkle Trees, special data structures that allow users to efficiently verify and retrieve data across a network. Hashing each transaction into a Merkle Tree makes it possible to prove that a given transaction is included in a specific block. Merkle Trees are also found outside of the blockchain world in various databases; version control systems such as Git (the foundation of GitHub) use Merkle tree structures to track code repositories.
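A minimal Merkle root computation shows why the structure is so useful: changing any single leaf (transaction) changes the root, so one small hash can fingerprint an entire block of data.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute a Merkle root by pairwise hashing up the tree.
    Simplified: duplicates the last node on odd-sized levels."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

txs = [b"tx1", b"tx2", b"tx3", b"tx4"]
root = merkle_root(txs)
tampered = merkle_root([b"tx1", b"tx2", b"tx3", b"tx5"])
print(root != tampered)  # True - any change to a leaf changes the root
```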
To learn more about Merkle Trees, check out this GeeksForGeeks article as well as the Ethereum documentation on Merkle Patricia Trie, which is a version of a Merkle Tree that Ethereum uses.
InterPlanetary File System (IPFS)
IPFS is the base layer for operating a decentralized storage network. It is responsible for distributing and identifying data across the network and making it available to everyone - it is not the data storage medium itself but rather a protocol designed to replace ineffective centralized systems of finding and retrieving data.
IPFS acts as a decentralized alternative to HTTP, the current standard for retrieving data from centralized servers. With HTTP, your device sends a request to a centralized server, using the website's URL to locate the content on that server. Websites on the Internet operate by this standard: when a user navigates to a site, an HTTP call retrieves the site from the server.
Source: CCNA
IPFS, on the other hand, works by using content addressing to retrieve data stored on nodes across a peer-to-peer network. Each piece of data has its own unique identifier (a content identifier, or CID) that lets the system find and retrieve it from the network and present it to the requestor. Instead of using the URL of a site to determine which server to retrieve data from, IPFS uses this identifier to validate and pull data from the network.
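The core idea - content addressing - fits in a few lines. Real IPFS CIDs wrap the hash in multihash/multibase encoding, but the principle is the same: the identifier is derived from the data itself, so whatever comes back from the network can be verified against it.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Simplified content identifier: the hash of the data itself.
    (Real IPFS CIDs add multihash/multibase encoding on top.)"""
    return hashlib.sha256(data).hexdigest()

# Location addressing asks "what is at this URL?" and must trust the
# server. Content addressing asks "who has data with this hash?" and
# can verify whatever any node returns.
network = {}                        # cid -> data, spread across nodes
data = b"<html>my site</html>"
cid = content_id(data)
network[cid] = data

retrieved = network[cid]
print(content_id(retrieved) == cid)  # True: the data verifies itself
```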
Here is how IPFS stacks up to HTTP:
| | HTTP and Centralized Servers | IPFS and Decentralized Servers |
| --- | --- | --- |
| Speed of Access | A high number of simultaneous requests can overwhelm a server and slow response times (laggy websites); if the server goes down completely, the site can't be accessed at all | Since data is stored across a network of nodes rather than on one server, it can be accessed quickly and smoothly even under high demand, because requests are spread across different nodes instead of all going to the same location |
| Data Retrieval | Location-based: HTTP has to reach a specific server and find the data using the URL, and these servers are frequently located on remote server farms far from actual users | Content-based: if requests come frequently from a specific location, the node closest to that location can hold a copy of the data, making the physical distance between the user and the stored data much shorter |
| Scaling | Centralized servers are difficult and expensive to scale because they are complex and require a great deal of energy to operate | Decentralized nodes can be added and removed as needed, with users incentivized to offer spare storage on their own devices in exchange for payment via crypto |
| Censorship | Whoever owns the server can manipulate the data on it and add/remove any data they'd like | Data can't be censored as long as more than one node is in operation, because there is no single point of control over it |
To learn more in-depth about how IPFS works, check out the IPFS documentation.
What are some applications built on IPFS?
Since IPFS is just the system responsible for routing requests, actual applications need to be built in order to harness its true potential.
One such application is Filecoin, a fully decentralized storage network. Filecoin works by creating a blockchain that operates as a huge decentralized database, with each node contributing storage space to the network for consumers to use. In return for providing storage space on their devices, node operators are paid in FIL, the network’s token.
When a user wants to store data on the network, they are able to view the available providers and their prices to decide which best suits their needs. Once a provider is chosen, the user sends them the data and they store it in exchange for payment in FIL. This system operates purely peer-to-peer, with no intermediaries - the client picks a provider, sends them their data to store, and pays for it in FIL.
Even though the data is stored directly on a provider's device, the provider can't view it because it is encrypted. To earn money for their work, storage providers must prove to the network that the data is being securely held. They do this by periodically submitting cryptographic storage proofs in new blocks added to the Filecoin network, validating that the data is still intact and properly stored. Once they prove that the data is safe, they earn FIL from the deal they made with the client, as well as block rewards for operating the network.
Source: Filecoin
Imagine you are a small business that has 10 TB of important data. You need a cost-effective and secure solution to store your data, so you decide to use the Filecoin network.
- You would find a storage provider on Filecoin that matches your storage and cost needs
- You would agree to a deal and then securely send your data for storage to the node
- You would pay for the storage in FIL for as long as you need it based on the agreed upon rate
- You would be able to retrieve the data at any time and terminate the deal
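Step one of that flow - browsing the market for a suitable provider - might look like the sketch below. The provider names, prices, and selection rule are made up for illustration; they are not Filecoin's actual market mechanics.

```python
from dataclasses import dataclass

@dataclass
class StorageProvider:
    name: str
    price_per_tb_month: float   # asking price in FIL (illustrative)
    free_tb: float              # spare capacity on the node

def choose_provider(providers, need_tb, budget_fil):
    """Pick the cheapest provider with enough space within budget -
    a toy model of browsing the Filecoin storage market."""
    candidates = [p for p in providers
                  if p.free_tb >= need_tb
                  and p.price_per_tb_month * need_tb <= budget_fil]
    return min(candidates, key=lambda p: p.price_per_tb_month,
               default=None)

providers = [
    StorageProvider("node-a", price_per_tb_month=2.0, free_tb=50),
    StorageProvider("node-b", price_per_tb_month=1.5, free_tb=8),   # too small
    StorageProvider("node-c", price_per_tb_month=1.8, free_tb=100),
]
best = choose_provider(providers, need_tb=10, budget_fil=25)
print(best.name)  # node-c
```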
Filecoin operates by minimizing the distance between the user and the data being stored. With a global network of nodes, clients are able to store data on the node closest to them, which makes it much faster and cheaper to send and retrieve data to and from storage.
Multiple nodes can also contain copies of the same data, making it even more accessible to everyone. For example, if you are based in Philadelphia and create a website, you could host it on a Filecoin node located in or near Philadelphia. If the website becomes popular with users in London, London-based nodes could replicate it so the data is physically near the new users, instead of every request traveling to the Philadelphia nodes.
This maximizes the efficiency of the network while also creating a scalable economy that optimizes the flow of data worldwide.
Check out the Filecoin website and documentation for a more detailed and technical understanding of how the system operates.
Another popular protocol for data storage is Arweave, which functions in a similar manner by sending requests and data to nodes across its network. This lets users utilize Arweave for a whole variety of tasks, including file storage, site hosting, and creating shareable content. For a more complete overview of available applications, check out this list by Alchemy, which covers a whole suite of decentralized storage tools and how they differ from one another in efficiency and capability.
EVM and other Virtual Machines
Ever wondered how smart contracts are created, stored, and executed on-chain? This is possible thanks to the use of a Virtual Machine, the mechanism responsible for operating smart contracts. The most prominent of the blockchain VMs is the Ethereum Virtual Machine (EVM), which is responsible for operating the Ethereum network and all of the smart contracts deployed onto it.
Traditional blockchains that do not operate with a virtual machine, such as Bitcoin, act solely as distributed ledgers; they simply keep track of how much cryptocurrency each address on the chain holds. The ledger follows a strict, predefined set of rules, such as that a Bitcoin wallet can’t send more BTC than it holds. This is simple and effective, yet it doesn’t let users execute more complicated transactions.
A chain such as Ethereum that operates using the EVM is more complex. The Ethereum network does not act solely as a distributed ledger, but rather as a distributed state machine. Each transaction made on the Ethereum network changes the state of the chain, and the EVM processes each of these changes to maintain balances and keep all the smart contracts functioning correctly.
This system lets the EVM track the entire state of the Ethereum network including wallet balances, smart contracts, and the global state of the entire blockchain.
The EVM gives developers an easy way to write and deploy their own code onto the Ethereum blockchain. They simply create a smart contract using a programming language such as Solidity (a popular language for developing smart contracts) and then compile it into bytecode the EVM can execute. The code is then deployed onto the blockchain as a transaction that includes the contract itself, as well as the contract's initial state - for example, if the contract is meant to hold ETH, it will initially have a balance of 0 ETH until ETH is added to it. Once it is deployed on-chain, users are free to interact with the smart contract by sending transactions directly to it, and each state change made by those transactions is handled by the EVM.
The EVM is also responsible for dictating how much in gas (fees) users have to pay per transaction. Gas cost primarily depends on the type of transaction, where more complex ones that require additional computational power will cost more than simple ones.
For example, a transaction with a smart contract that has to call on an external data source or send an external message will cost more than a transaction that is simply sending money from one wallet to another because it requires more computing power to process. The EVM is responsible for determining an appropriate gas cost based on these types of factors in order to maximize the efficiency of the network.
This image illustrates the process that a transaction goes through in the EVM. An operation, such as a smart contract, costs a certain amount of gas to execute. If the operation:
- Creates a message to call another contract or external data source
- Retrieves data stored on-chain
- Sends data to be stored on-chain
…it will cost extra gas to create that transaction. The amount of gas needed for each of these operations is determined by the EVM's gas schedule, which reflects the amount of computation required; the price paid per unit of gas then rises and falls with how busy the network is.
Source: Ethereum.org
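A toy gas estimator makes the pricing idea concrete. The 21,000 base cost for a plain transfer is Ethereum's actual intrinsic transaction cost; the other numbers here are illustrative stand-ins, not the real gas schedule.

```python
# Illustrative operation costs - only "base_transfer" matches
# Ethereum's real schedule; the rest are made-up round numbers.
GAS_COSTS = {
    "base_transfer": 21_000,   # Ethereum's actual intrinsic tx cost
    "external_call": 2_600,
    "storage_read":  2_100,
    "storage_write": 20_000,
}

def estimate_gas(operations: list[str]) -> int:
    """Sum the base cost plus the cost of each extra operation
    a transaction performs - more work, more gas."""
    return GAS_COSTS["base_transfer"] + sum(GAS_COSTS[op] for op in operations)

simple = estimate_gas([])                                         # plain transfer
complex_tx = estimate_gas(["external_call", "storage_read", "storage_write"])
print(simple, complex_tx)  # 21000 45700
```

This is why a contract call that reads, writes, and messages another contract costs more than a simple wallet-to-wallet transfer: every extra operation adds to the gas bill.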
To learn more about the technical aspects of the EVM, check out the official Ethereum documentation.
The EVM is the most popular virtual machine, so many other existing blockchain networks are built to be EVM compatible. The main benefit of having an EVM compatible chain is that it lets developers deploy their smart contracts on the network without having to rewrite or edit them - they can immediately be launched on-chain just like they would be on the Ethereum network, which saves a lot of time and effort for developers deploying a dApp on multiple chains. Many other EVM compatible chains also tend to be much cheaper and faster to transact on than Ethereum, which can appeal to dApps that want to avoid having users pay high transaction fees. Check out this article by Unchained for a more comprehensive list of pros and cons of EVM compatible chains.
What are some other virtual machines popular in the blockchain space?
Solana, the 5th largest blockchain by market cap, has surged in popularity because of its speed and low transaction cost. It runs on its own virtual machine called the Solana Virtual Machine (SVM) that operates a bit differently than the EVM.
Solana is able to operate at a much faster rate than the Ethereum network because the SVM has a unique capability:
it can process multiple transactions at the same time.
This is because the SVM utilizes parallel processing, which lets multiple smart contracts be executed simultaneously. Ethereum, by contrast, processes transactions sequentially, one at a time, making it far less efficient in raw throughput.
Solana is able to do this because, in addition to Proof of Stake as its consensus mechanism, it utilizes Proof of History (PoH), which timestamps each transaction to help put transactions in order. The SVM then utilizes Sealevel, the engine responsible for processing transactions, to identify non-overlapping transactions that can be executed simultaneously.
This image illustrates how parallel processing on Solana works. Transactions arrive and are put into queues, where each transaction is timestamped so they can be sorted in order of arrival. Validators then take in transactions from this queue and process them simultaneously, instead of all the validators processing one transaction at a time. This system makes Solana so efficient and cheap because incoming transactions are processed so quickly.
Source: Squads.so
If multiple transactions are going to or from the same wallet(s) and need to be executed in a specific order, they can’t be executed in parallel to one another. However, if these transactions are completely unrelated and going to different places, then they can be executed in parallel because the state of one does not impact the state of the other. This allows the SVM to be able to settle the states of multiple transactions at the same time, making it far more efficient than the EVM.
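That scheduling rule - batch together transactions whose account sets don't overlap - can be expressed as a short greedy algorithm. This is a simplified model of Sealevel's idea, not Solana's actual runtime:

```python
def schedule_batches(txs: list[tuple[str, set[str]]]) -> list[list[str]]:
    """Greedy scheduler: transactions whose account sets don't
    overlap go into the same parallel batch; conflicting ones wait
    for a later batch."""
    batches: list[tuple[list[str], set[str]]] = []
    for tx_id, accounts in txs:
        for batch_ids, batch_accounts in batches:
            if not (accounts & batch_accounts):   # no shared wallets
                batch_ids.append(tx_id)
                batch_accounts |= accounts        # mutates the set in place
                break
        else:
            batches.append(([tx_id], set(accounts)))
    return [ids for ids, _ in batches]

txs = [
    ("tx1", {"alice", "bob"}),
    ("tx2", {"carol", "dave"}),   # disjoint from tx1 -> same batch
    ("tx3", {"alice", "erin"}),   # touches alice -> must wait
]
print(schedule_batches(txs))  # [['tx1', 'tx2'], ['tx3']]
```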
While the SVM makes Solana far faster and more efficient than Ethereum, it is also more complex and requires much more computing power. This has been evident on the Solana blockchain, which has experienced several instances of downtime. While it is built to handle a high number of transactions per second, large transaction loads can still overwhelm the system and take the network down. The most recent outage occurred in February 2024, when the Solana network went completely down for almost 5 hours.
Ethereum, while not as efficient, has a higher level of safety and security because of the more limited transaction throughput. As a result, it is far less likely to experience an outage and is considered one of the most secure chains, which has contributed to its popularity.
As blockchain knowledge evolves and the development of these blockchains continues, these issues should become less and less prevalent. As of June 2024, Solana has only experienced one outage since February 2023, compared to incidents occurring almost monthly in 2022. Check out this tracker of Solana's downtime incidents to see in-depth details about what went wrong during each outage.
For a more in-depth look at how the SVM operates, Squads offers a comprehensive overview in this article.
Other chains also have their own virtual machines that are custom built to propagate certain tasks: Filecoin has its Filecoin Virtual Machine (FVM) that is custom built to tailor to its storage solution, Cosmos utilizes the CosmWasm Virtual Machine for advanced smart contract development, and Neo uses the Neo Virtual Machine (NeoVM) to operate its enterprise infrastructure offerings.
Scalability
To effectively compete with traditional data and finance systems, blockchain needs to be scalable. Currently, Visa claims to have the capability of processing over 65,000 transactions per second (TPS). Ethereum currently processes a measly 15 TPS, while Bitcoin only performs around 7 TPS. This is what makes fees on these networks so high - when the network is congested, fees spike to reduce demand and keep the number of pending transactions at a minimum.
How can blockchain solutions be scaled effectively to compete with corporate systems processing tens of thousands of transactions per second?
Networks such as Bitcoin, Ethereum, Avalanche, Binance Smart Chain, and Solana are called Layer Ones (L1s). They are called L1s because they serve as the infrastructure layer for decentralized systems. L1s operate through consensus mechanisms (Proof of Work, Proof of Stake, etc.) and, as a result of their decentralization, are considered to be the most secure layer. They are responsible for preventing attacks on the network, as well as for settling all transactions that occur.
Because of their security, most of these chains tend to be limited in their TPS rate. As a result, a new solution was introduced:
Layer 2 blockchains
A Layer Two (L2) blockchain is a network built on top of an L1 network for the purpose of scalability. L2s operate by bundling incoming transactions and then sending all of them as a single transaction to the L1 for settlement. This system drastically reduces settlement time for transactions, as well as the cost of processing them; while transactions on Ethereum (an L1 blockchain) average around $10-$20 during peak times and can take several minutes to settle, transactions on Base (an L2 built on top of Ethereum) average fees of under $0.01 and settle nearly instantly.
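The economics behind that fee drop are simple amortization: one L1 settlement fee is split across every transaction in the bundle. A back-of-the-envelope sketch with illustrative numbers:

```python
def cost_per_tx(l1_settlement_fee: float, txs_per_batch: int,
                l2_overhead: float = 0.001) -> float:
    """Amortize one L1 settlement fee across a whole L2 batch.
    All numbers are illustrative, not live fee data."""
    return l1_settlement_fee / txs_per_batch + l2_overhead

# One $15 Ethereum settlement shared by 5,000 bundled transactions:
print(round(cost_per_tx(15.0, 5_000), 4))  # 0.004
```

The larger the batch, the closer each user's share of the L1 fee gets to zero - which is why L2 fees can sit well under a cent while still inheriting L1 settlement.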
L2s operate on a variety of different scaling solutions, but there are three main systems that make up a majority of the L2s in the crypto ecosystem:
Side Chains, Optimistic Rollups, and Zero-Knowledge Rollups.
Side Chains
Side chains are blockchains that essentially run in parallel to L1s. These chains have their own consensus mechanisms (usually Proof of Stake) that let them process and confirm all transactions that occur on the chain before sending them to an L1 to be permanently settled.
The Polygon network is an example of an L2 that runs on a side chain. It utilizes this system by having its own side chain that is responsible for processing all transactions that occur on the network and then periodically sending them as a single transaction to the Ethereum network for settlement. As a result, Polygon can, theoretically, process up to 65,000 TPS, with each transaction costing pennies.
Even though Polygon is built on top of Ethereum, it still operates entirely as its own blockchain network with its own gas token, MATIC. All transactions that occur on the chain are paid for in this token and not in ETH. Polygon simply leverages the security and decentralization of the Ethereum network to settle transactions while providing a scalable environment. The result of this system is that Polygon has much cheaper transactions and a faster settlement time than Ethereum.
Check out this article by Coin Telegraph for a beginner's guide to the Polygon ecosystem and its token MATIC.
Source: IQ.wiki
Optimistic Rollups
Optimistic Rollups operate by bundling transactions together and sending them directly to an L1 for validation and settlement. Unlike side chains, these systems do not have their own consensus mechanism to verify transactions. Instead, they operate by assuming that all transactions are correct.
Optimistic rollups follow these steps to settle transactions on the L1:
- Once a block is full, all transactions are submitted in a compressed format (to take up less data) to the L1 chain.
- The L1 validators then have a challenge period to look through the new block and submit fraud proofs, or proofs of invalid transactions.
- If the sequencer that is responsible for bundling transactions and sending them to the L1 submits an invalid transaction, then it gets penalized.
- If no invalid transactions are found, then the block is settled on the L1.
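The essence of a fraud proof is re-executing the batch and comparing the result against the state the sequencer claimed. A toy state-transition check (a real fraud proof is executed on-chain, often over a single disputed step):

```python
def check_batch(pre_balances: dict, batch: list[tuple[str, str, int]],
                claimed_post: dict) -> bool:
    """Re-execute a batch of (sender, receiver, amount) transfers and
    compare against the sequencer's claimed post-state."""
    balances = dict(pre_balances)
    for sender, receiver, amount in batch:
        if balances.get(sender, 0) < amount:
            return False                  # invalid transaction in the batch
        balances[sender] -= amount
        balances[receiver] = balances.get(receiver, 0) + amount
    return balances == claimed_post       # mismatch -> fraud proof succeeds

pre = {"alice": 10, "bob": 0}
batch = [("alice", "bob", 4)]
honest_post = {"alice": 6, "bob": 4}
fraudulent_post = {"alice": 6, "bob": 400}    # sequencer lies

print(check_batch(pre, batch, honest_post))      # True -> block settles
print(check_batch(pre, batch, fraudulent_post))  # False -> sequencer penalized
```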
This system of penalizing the sequencer creates a secure method of sending transactions to an L1 because it disincentivizes the sequencer from submitting fake transactions at the risk of losing money.
Optimism, aptly named for its use of optimistic rollups, is a popular L2 built on top of Ethereum. The Optimism sequencer is responsible for collecting all the transactions that occur on the chain and sending them to Ethereum to be validated and settled. This image illustrates the blocks being sent from Optimism to Ethereum, where each transaction listed represents a block full of transactions being delivered. The blocks are sent as Binary Large Objects (blobs), a data format that attaches large chunks of data directly to a block at low cost.
For a more detailed look into how Optimism works, check out the Optimism Docs.
Check out the address of the Optimism batch inbox on Etherscan.
Zero Knowledge (ZK) Rollups
Zero Knowledge (ZK) Rollups operate similarly to Optimistic Rollups with one key difference: they can prove that a transaction is valid without revealing any of the data inside it.
ZK Rollups use a tool called a Zero-Knowledge (ZK) Proof, which relies on advanced cryptography to verify that a transaction is legitimate without referencing any of the data the transaction holds. This makes ZK Rollups ideal for maintaining a private network for sensitive data.
Similar to Optimistic Rollups, ZK Rollups have a block sequencer that collects all the transactions occurring on the chain. Instead of sending the block directly to the L1 to be verified, however, the sequencer creates a cryptographic proof that all the transactions within the block are valid and then sends the block, along with the proof, to the L1.
ZKsync is an ecosystem built on top of Ethereum that utilizes ZK rollups to create scalable blockchains with faster and cheaper transactions. This image represents the lifecycle of transactions that go through ZKsync:
- Users submit their transactions to the ZKsync sequencer, which creates the blocks on the L2
- The sequencer sends the block to a Prover, which creates the ZK Proof that verifies the transactions in the block are correct
- The Proof is then sent to the L1 (Ethereum), where a smart contract verifies it to be correct
- Once the proof is verified, the block is settled on the L1
Check out the ZKsync documentation to learn the technical details of this process and how Zero Knowledge Rollups work.
Source: ZKsync
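The lifecycle above can be sketched in a few lines. Important caveat: a real prover outputs a succinct cryptographic proof (e.g. a SNARK) that reveals nothing about the transactions; the keyed hash used below is only a stand-in to show the shape of the sequencer → prover → L1-verifier pipeline and has no zero-knowledge property. All names here are illustrative, not ZKsync's actual API:

```python
import hashlib

# Toy sketch of the ZK-rollup lifecycle. NOTE: the "proof" below is just a
# keyed hash with NO zero-knowledge property; it only illustrates the
# pipeline shape. Names are illustrative, not ZKsync's real API.

def sequence(transactions):
    """L2 sequencer: order incoming transactions into a block."""
    return {"txs": transactions}

def prove(block, proving_key):
    """Prover: emit a short 'proof' for the whole block. A real prover
    outputs a succinct SNARK/STARK instead of this hash."""
    data = repr(block["txs"]).encode()
    return hashlib.sha256(proving_key + data).hexdigest()

def l1_verify(block, proof, proving_key):
    """L1 verifier contract: accept the block iff the proof checks out,
    without re-executing any of the transactions."""
    return proof == prove(block, proving_key)

key = b"toy-proving-key"
block = sequence([{"from": "a", "to": "b", "amount": 5}])
proof = prove(block, key)
print(l1_verify(block, proof, key))  # True
```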
While L1s and L2s are the most common layers, L3s have begun to arise as well. L3s are built on top of L2s and are generally referred to as App Chains: specialized, customizable chains built for specific ecosystems. App chains today are mainly being developed to host on-chain games, along with some DeFi applications.
L0s are the base of the entire blockchain world. They provide the underlying infrastructure for L1s and are also what make inter-blockchain communication possible. You can't interact directly with an L0, only with the L1 chains built on top of it.
What are some more advanced scaling mechanisms?
While L2s offer a great solution to the scalability issue, developers are also looking to scale L1s directly in order to benefit from their stronger security. One solution that arose is the idea of Sharding.
Sharding consists of breaking up a blockchain into smaller groups called shards which would work in parallel to process transactions. Each shard would essentially operate as its own chain with its own validators and block sequencer for processing incoming transactions. The shards communicate with one another, and a main shard would monitor the network and maintain integrity.
Through sharding, a blockchain would be able to process incoming transactions in parallel. Instead of all the nodes operating the network having to verify one block at a time, the shards would verify blocks side by side, drastically increasing the number of transactions per second the network is capable of handling.
The Ethereum network currently has over 900,000 node operators contributing to the network through its Proof of Stake system. If it were to integrate a traditional sharding system, this upgrade would split the node operators into 64 separate shards, with each shard holding about 14,000 node operators. This would increase the TPS rate of the network by 64 times, since all 64 shards would process transactions in parallel, bringing it from 15 TPS to 960 TPS.
Source: Blockchain Lab
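The arithmetic behind those numbers is straightforward to spell out:

```python
# The back-of-the-envelope sharding math from above, spelled out.
total_nodes = 900_000   # approximate Ethereum node-operator count cited above
shards = 64
base_tps = 15           # Ethereum L1 throughput figure used above

nodes_per_shard = total_nodes // shards
sharded_tps = base_tps * shards

print(nodes_per_shard)  # 14062 (roughly the ~14,000 per shard quoted above)
print(sharded_tps)      # 960
```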
How can Sharding become even more efficient?
While 960 TPS is a lot better than 15 TPS, it’s still not nearly high enough to compete with the likes of Visa or Solana. Furthermore, sharding results in a lot of complications for the chain, including:
- Non-uniform gas fees: if each shard on the network has its own sequencers and validators, they could operate at different speeds and as a result, have differences in fee amounts. This can lead to users utilizing one shard more than the others, creating an imbalance and bringing the TPS rate back down.
- Communication issues: there would be a significant increase in the complexity of the chain, where multiple independent groups of validators would have to agree on the state of the chain. Instead of one large group of validators working together to process transactions, having multiple smaller groups can create disagreement about the correct state of the chain.
As a result, Ethereum developers came up with a solution that could scale the network to process hundreds of thousands of transactions per second: Danksharding.
Danksharding eliminates the traditional sharding mechanism of literally splitting up the network into multiple parts. Instead, it focuses on data-sharding, where the network splits up data from each block into shards to process that data at a significantly faster rate.
Danksharding works by utilizing the previously mentioned Binary Large Objects (blobs) to randomly sample transaction data. Whenever a new block full of transactions is submitted, multiple blobs (up to 64) will be attached to each block produced on the network, and all the transaction data will be saved onto them. Each validator will then sample small parts of each blob to verify the data it holds is correct. Since the validators don’t have to check every single transaction coming in and instead are only sampling a small amount of data, they will operate much faster and therefore create a higher TPS rate for the network.
Here is a step-by-step breakdown of how Danksharding on Ethereum would work:
- Incoming transactions will be bundled into rollups, with each rollup storing all of the associated data from these transactions within blobs attached to the rollup.
- New blocks will consist of multiple rollups, dramatically increasing the number of transactions that can be stored per block. Each new block will have up to 64 attached blobs, all of which contain the associated data of the transactions that need to be verified.
- The validators will all sample random parts of each blob to verify that all transactions are valid. Once the validators verify the data from the blobs to be accurate, the new block is added onto the blockchain.
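The sampling step above can be illustrated with a toy sketch. Real danksharding uses KZG polynomial commitments and erasure coding; plain per-chunk hashes stand in for them here, purely to show why checking a few random chunks is enough when many validators sample independently:

```python
import hashlib
import random

# Toy sketch of data-availability sampling. Real danksharding uses KZG
# polynomial commitments with erasure coding; plain per-chunk hashes stand
# in for them here, purely to illustrate the sampling idea.

CHUNK_SIZE = 32  # bytes per chunk (an arbitrary choice for this sketch)

def make_blob(data: bytes):
    """Split blob data into chunks and commit to each chunk with a hash."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    commitments = [hashlib.sha256(c).hexdigest() for c in chunks]
    return chunks, commitments

def sample(chunks, commitments, samples=4, seed=None):
    """A validator checks a few random chunks instead of the whole blob."""
    rng = random.Random(seed)
    for i in rng.sample(range(len(chunks)), k=min(samples, len(chunks))):
        if hashlib.sha256(chunks[i]).hexdigest() != commitments[i]:
            return False
    return True

chunks, commitments = make_blob(b"transaction data " * 100)
print(sample(chunks, commitments, seed=1))  # True: an intact blob passes
```

A single validator sampling 4 chunks might miss one tampered chunk, but many validators sampling independently make detection overwhelmingly likely, which is what lets each validator do far less work than checking every transaction.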
This system also solves the problem of storing additional transaction data through a very simple solution: deleting the blobs. The blob data is only required to verify that incoming transactions are valid. After all of the transactions are settled, the blob that stores the extra data is deleted after a certain period of time, freeing up space on the network.
While the ultimate goal of the Ethereum network is to reach a full Danksharding state, there are tons of adjustments that need to be made in order to prep the network for this upgrade. One of the biggest ones that was implemented in early 2024 was Proto-Danksharding. This upgrade, also known as EIP-4844, is what introduced the idea of storing rollup data in blobs.
L2s on top of Ethereum are now able to send transaction data to Ethereum by storing it in blobs. However, each block can currently hold only up to 6 blobs, not the 64 envisioned in the full Danksharding system. Rollup-based L2s such as Optimism currently use this system to verify transactions, which serves as a way to test the reliability of blobs before fully integrating them into the Ethereum network.
For further reading and more technical details, check out the Danksharding page on the Ethereum website that goes into much deeper detail about how this system works.
Data Availability
While all of these infrastructure tools are critical to making blockchain achieve advanced functionality, none of this would be possible if all the data stored on-chain was not accessible. Data availability refers to this exact idea: making blockchain data available to everyone at any time.
In a typical database, only those that operate that database can view and verify transactions. If this standard persisted into the blockchain ecosystem, it would mean that only the node validators would be able to view blockchain transactions because they are the ones that are validating them. Instead, the decentralized and open nature of blockchain makes all of this transaction data available to the public. Anyone can go and view all transactions occurring on the network for themselves in order to verify them.
Data availability (DA) in a blockchain network is responsible for providing users with access to this data. This is the reason that blockchains are trustless, since they let users verify for themselves the validity of the data. For the nodes operating the network, this system is particularly important because it lets all the nodes on the network review the data and determine its accuracy.
Data availability does not prove whether transactional data is accurate, it simply makes it available for anyone to view. It is still up to the nodes that operate the network to verify that incoming transactions are correct.
Here is a breakdown of the role of DA within block verification:
- Propagation: when a new block full of transactions is suggested to be added to the chain, DA lets all the nodes access that new block
- Validation: once a node gets access to the new block, DA lets each node have access to the transaction data on it in order to verify that the transactions are correct
- Header Verification: since each new block is connected to the previous block by containing that block's hash, DA gives nodes access to the previous block's hash to ensure the chain follows the proper order
- Consensus Compliance: nodes have to verify new blocks were properly agreed upon by the network, so DA lets a node check whether the new block reached that threshold
- Updated Chains: each node must ensure it’s up to date with the current state of the chain, so DA lets a node check to see whether any new blocks were added
Source: Celestia
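As a small illustration of the header-verification step above, here is a minimal hash-chain check, assuming DA gives a node access to every header (the field names are illustrative, not any chain's actual format):

```python
import hashlib
import json

# Minimal sketch of header verification: each block header embeds the hash
# of the previous header, so a node with access to the data can confirm the
# chain is in the proper order. Field names are illustrative.

def header_hash(header: dict) -> str:
    return hashlib.sha256(json.dumps(header, sort_keys=True).encode()).hexdigest()

def verify_chain(headers) -> bool:
    """Check that every header's prev_hash points at the previous header."""
    for prev, curr in zip(headers, headers[1:]):
        if curr["prev_hash"] != header_hash(prev):
            return False
    return True

genesis = {"height": 0, "prev_hash": "0" * 64}
block1 = {"height": 1, "prev_hash": header_hash(genesis)}
block2 = {"height": 2, "prev_hash": header_hash(block1)}

print(verify_chain([genesis, block1, block2]))  # True
block1["height"] = 99  # tamper with a settled block
print(verify_chain([genesis, block1, block2]))  # False: the link to block2 breaks
```

Because each hash depends on the previous one, editing any historical block silently invalidates every link after it, which is exactly what makes public access to this data so powerful.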
Without having data availability, a blockchain risks becoming centralized and controlled by a single party; if no one except the person that operates the database can see the transactions occurring on it, then that person is free to manipulate the database without anyone knowing it was edited. Blockchain prevents this issue from occurring by creating an open database that is run by a group of nodes where if any user attempts to submit a false transaction, that transaction will automatically be detected and removed.
When it comes to electing a leader, a democratic voting process is critical to ensuring that a winner is selected fairly by a majority. Unfortunately, countries all over the world see instances of voter fraud occurring, where a less popular candidate somehow manages to win an election despite evidence suggesting that they did not receive enough votes. Vote tallying is a secretive process that is usually closed off to the public and is reliant on a small group of people to correctly count up the votes. This can result in manipulation and bribery, where officials are coerced into letting the losing party become the winner of the election, and no one can prove it occurred because they don’t have access to voter data.
By moving voting onto the blockchain, voters can rely on the data availability of the chain to clearly determine a winner. DA lets everyone directly view how many votes each candidate received, where each vote can be listed as a transaction and verified by the nodes operating the chain to be accurate. This creates a fair and trustless system that can determine a clear winner by verifying all votes to be legitimate.
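As a toy illustration of this idea: if every vote is a publicly visible transaction, anyone can re-run the tally themselves (the data shapes here are hypothetical):

```python
from collections import Counter

# Sketch of on-chain voting: every vote is a public transaction, so anyone
# can independently tally the result. Data shapes are hypothetical.

votes = [  # each entry is a (voter, candidate) transaction visible on-chain
    ("0xA1", "alice"), ("0xB2", "bob"), ("0xC3", "alice"),
    ("0xA1", "alice"),  # duplicate vote from the same address
]

def tally(votes):
    """Count one vote per unique address; duplicates are publicly visible
    and simply ignored."""
    seen = {}
    for voter, candidate in votes:
        if voter not in seen:
            seen[voter] = candidate
    return Counter(seen.values())

print(tally(votes).most_common(1)[0])  # ('alice', 2)
```

Since every observer runs the same deterministic tally over the same public data, no small group of officials can quietly change the outcome.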
Data Availability Layers
DA layers are special data availability solutions where instead of storing all transaction data directly on the chain itself, transaction data is stored in a blockchain separate from the main chain. This layer is responsible for providing users access to transactional data while at the same time saving storage space on the chain by storing this data separately.
Using a DA layer instead of having full data availability on a blockchain network can significantly increase transaction speed and decrease storage cost. In fact, rollups and sharding both utilize DA layers to operate.
Rollups use a DA layer to store all of the blobs created by transactions. This system creates much less of a strain on the blockchain network by only requiring the cryptographic proofs to be stored on-chain instead of all the transaction data - the blobs that contain all the actual data are stored in the DA layer, and the proofs that prove these transactions to be valid are maintained by the chain. This way, the blockchain network still has evidence directly on it of all transactions being valid, and if anyone wants to see the actual transaction data, they can do so on the DA layer. This lets blockchains maintain full data availability while also cutting down on the cost of storing data.
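A minimal sketch of this split, with a hypothetical DA-layer service standing in for something Celestia-like: the chain keeps only a small commitment, while the full blob lives off-chain:

```python
import hashlib

# Toy sketch of the chain / DA-layer split: the chain stores only a small
# commitment to the blob, while the DA layer stores the full data.
# Class and method names are illustrative assumptions.

class DALayer:
    """Stands in for an external DA layer (e.g. a Celestia-like service)."""
    def __init__(self):
        self._store = {}

    def publish(self, blob: bytes) -> str:
        commitment = hashlib.sha256(blob).hexdigest()
        self._store[commitment] = blob
        return commitment

    def retrieve(self, commitment: str) -> bytes:
        return self._store[commitment]

class Chain:
    """Stores only the short commitments, not the data itself."""
    def __init__(self):
        self.commitments = []

    def settle(self, commitment: str):
        self.commitments.append(commitment)

da, chain = DALayer(), Chain()
blob = b"full rollup transaction data..." * 50
chain.settle(da.publish(blob))

# Anyone can later fetch the data from the DA layer and check it against
# the on-chain commitment:
data = da.retrieve(chain.commitments[0])
print(hashlib.sha256(data).hexdigest() == chain.commitments[0])  # True
```

The design choice this illustrates: the chain pays to store a fixed-size commitment regardless of blob size, while anyone who wants the full data can fetch it from the DA layer and verify it matches.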
Having a DA layer can also be useful for blockchains that utilize sharding because it creates a shared space for storing transaction data. Since each individual shard essentially acts as its own mini chain, having a shared DA layer where all the shards can store transaction data would make it significantly easier to manage data availability across the entire network.
One popular DA layer is Celestia. This protocol acts as its own blockchain network that provides a DA layer as a service to other chains. Celestia offers economies of scale to these chains: instead of having to build data availability into their own chain, they can utilize Celestia and save money as they scale. Celestia also has its own native token, TIA, which acts as the platform's main form of payment - blockchains that utilize Celestia for data storage can pay for that storage directly in TIA. Users can also stake TIA directly on Celestia to help secure the network and participate in its consensus mechanism while earning interest on their holdings.
Source: Celestia
To learn more about Celestia and its inner workings, check out the Celestia guide.
When a network processes all transaction data directly on a single chain with no reliance on external tools, it is referred to as a monolithic chain. Bitcoin, for example, is a monolithic chain because transactions are created, settled, and stored directly on the Bitcoin network. There is no reliance on external data or any other tools - everything is done directly on Bitcoin's chain. While this system makes the chain simple and secure, it creates challenges with scaling or upgrading the network because of the constraints on the amount of data that can be stored and how fast transactions can be settled.
Modular chains, on the other hand, utilize external means of executing, settling, or storing data. Celestia, for example, is a modular chain because transactions are executed via rollups and then stored in a DA layer. Separating the different operations of a blockchain can create effective scaling solutions because it can outsource each operation. A blockchain network that chooses to become modular would not have to worry about creating an effective data availability system because it can outsource that responsibility to Celestia, which specializes in this exact operation. However, modularity can increase complexity, so modular chains have to ensure their security is well maintained in order to avoid any issues.
To learn more about the differences between modular and monolithic chains, check out Celestia’s guide on modular and monolithic blockchains.
Data availability is crucial to keep blockchains decentralized and open. As soon as transaction data becomes controlled and hidden by a single party, a database can easily be manipulated and distorted. By having a system in place that lets anyone verify for themselves that all transactions are accurate instead of simply trusting that they are correct, blockchain creates a trustless and secure environment for users to operate freely within.
To learn more about data availability, check out this article by CoinGecko on what is data availability, as well as this YouTube video on data availability and DA layers.
Works Consulted on this page:
Alchemy. (n.d.). Best Decentralized Storage Tools. Retrieved from https://www.alchemy.com/best/decentralized-storage-tools
Arweave. (n.d.). Arweave. Retrieved from https://www.arweave.org/
Celestia. (n.d.). Basics of Modular Blockchains: Modular and Monolithic Blockchains. Retrieved from https://celestia.org/learn/basics-of-modular-blockchains/modular-and-monolithic-blockchains/
Celestia. (n.d.). Celestia. Retrieved from https://celestia.org/
Celestia. (n.d.). What is Celestia? Retrieved from https://celestia.org/what-is-celestia/
Chainlink. (2024). Blockchain Oracles. Retrieved from https://chain.link/education/blockchain-oracles
Chainlink. (n.d.). CCIP Documentation. Retrieved from https://docs.chain.link/ccip
Chainlink. (n.d.). Chainlink. Retrieved from https://chain.link/
CoinGecko. (2024). Data Availability in Blockchain and Crypto. Retrieved from https://www.coingecko.com/learn/data-availability-blockchain-crypto
Cointelegraph. (2024). A Beginner's Guide to Understanding Wrapped Tokens and Wrapped Bitcoin. Retrieved from https://cointelegraph.com/learn/a-beginners-guide-to-understanding-wrapped-tokens-and-wrapped-bitcoin
Cointelegraph. (2024). Polygon Blockchain Explained: A Beginner's Guide to Matic. Retrieved from https://cointelegraph.com/learn/polygon-blockchain-explained-a-beginners-guide-to-matic
CosmWasm. (n.d.). CosmWasm. Retrieved from https://cosmwasm.com/
Ethereum. (n.d.). Ethereum. Retrieved from https://ethereum.org/en/
Ethereum. (2024). EVM Documentation. Retrieved from https://ethereum.org/en/developers/docs/evm/
Ethereum. (2024). Patricia Merkle Trie Documentation. Retrieved from https://ethereum.org/en/developers/docs/data-structures-and-encoding/patricia-merkle-trie/
Filecoin. (n.d.). Filecoin. Retrieved from https://filecoin.io/
Filecoin. (2024). Filecoin Documentation. Retrieved from https://docs.filecoin.io/
Filecoin. (2024). Filecoin Virtual Machine (FVM). Retrieved from https://fvm.filecoin.io/
GeeksforGeeks. (2024). Introduction to Merkle Tree. Retrieved from https://www.geeksforgeeks.org/introduction-to-merkle-tree/
IPFS. (n.d.). How IPFS Works: How IPFS Represents and Addresses Data. Retrieved from https://docs.ipfs.tech/concepts/how-ipfs-works/#how-ipfs-represents-and-addresses-data
Medium. (2024). What is Sharding in Blockchain? Retrieved from https://medium.com/@theblockchains/what-is-sharding-in-blockchain-a11dc10c54cf
NEO. (n.d.). NEO. Retrieved from https://neo.org/
Optimism. (2024). Rollup Overview - Optimism Documentation. Retrieved from https://docs.optimism.io/stack/protocol/rollup/overview
Solana. (2024). Solana Status - Uptime. Retrieved from https://status.solana.com/uptime
Squads. (n.d.). Solana SVM (Sealevel Virtual Machine). Retrieved from https://squads.so/blog/solana-svm-sealevel-virtual-machine
Stargate. (n.d.). Stargate Finance - Bridge. Retrieved from https://stargate.finance/bridge
SushiSwap. (n.d.). SushiSwap. Retrieved from https://www.sushi.com/swap
SushiSwap. (2023). SushiXSwap V2 - SushiSwap Blog. Retrieved from https://www.sushi.com/blog/sushixswap-v2
Unchained Crypto. (2023). EVM Compatibility. Retrieved from https://unchainedcrypto.com/evm-compatibility/
Wormhole. (2024). Wormhole Documentation. Retrieved from https://docs.wormhole.com/wormhole
YouTube. (2023). Data Availability & DA Layers EXPLAINED (Animated). Retrieved from https://www.youtube.com/watch?v=Ku8MwyUbOH8
zkSync. (n.d.). zkSync Transaction Lifecycle Documentation. Retrieved from https://docs.zksync.io/zk-stack/concepts/transaction-lifecycle