"Bitcoin is a trustless digital currency built on cryptography." I think this statement neatly captures the relationship between cryptography and blockchain.
How should we understand the causal relationship between them? Some may think that blockchain needed features like traceability and immutability, and that Satoshi Nakamoto therefore designed several cryptographic algorithms to support his vision. But this is completely backwards: Satoshi did not invent new cryptographic algorithms; he assembled long-established ones, and it is those algorithms that make features like traceability and immutability possible.
You only need to understand two basic types of algorithms: hash algorithms and asymmetric encryption algorithms. Learning cryptography is different from learning other technologies: understanding its principles requires a solid mathematical foundation. In practice, though, engineers can treat cryptography as a black box; you don't need to dwell on the internal mathematics, only on each algorithm's characteristics and application scenarios.
Hash Algorithms#
Let's first look at the hash algorithms that have been mentioned frequently before. Hash algorithms, also known as hash functions, convert data of any length into a short, fixed-length data fingerprint.
In fact, there isn't a single fixed implementation of hash algorithms; the term covers a whole class of algorithms, with MD5 and SHA-256 among the most common. To judge whether a hash algorithm is good enough, we have four criteria: fast in the forward direction, difficult in the reverse direction, input sensitivity, and collision avoidance.
Fast in the Forward Direction#
What does it mean to be fast in the forward direction? This is easy to understand: given the input and the algorithm, the hash of the input can be computed in a very short time. If you claim to have designed an excellent hash algorithm but a single computation takes several seconds, the design is simply not good enough; the time may technically be finite, but the algorithm is unusable in real scenarios. To give you a more intuitive feel, I wrote a program to test how fast the commonly used MD5 and SHA-256 hash algorithms run.
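The original test program is not reproduced here; below is a minimal Python sketch of such a benchmark using the standard hashlib module (the idea, not my exact numbers, is the point):

```python
import hashlib
import time

def benchmark(name, hash_ctor, payload=b"Geek Time", seconds=1.0):
    """Count how many digests hash_ctor can produce in roughly `seconds`."""
    count = 0
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        hash_ctor(payload).hexdigest()
        count += 1
    print(f"{name}: ~{count} hashes in {seconds:.0f}s")

benchmark("MD5", hashlib.md5)
benchmark("SHA-256", hashlib.sha256)
```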
As you can see, the results are impressive: in about one second, MD5 was computed roughly 10 million times, while SHA-256 managed about 5 million. I used an ordinary computer here; on a server, the numbers would be even more striking. This test should make the standard of being fast in the forward direction concrete. Of course, the raw computation speed of a hash algorithm is meaningless in isolation and must be judged within a specific business scenario; this is only meant to give you an intuitive impression.
Difficult in the Reverse Direction#
Next, let's look at the second criterion: given a hash result, it should be practically impossible to recover the original information in a limited time. In other words, an excellent hash algorithm does not allow the input to be reconstructed from the digest; this is the root of its irreversibility.
This is a command I executed on my computer's command line to obtain the MD5 hash of "Geek Time." Imagine I hadn't told you that e0aac893629b048e8797800294f55004 is the MD5 hash of "Geek Time"; could you recover the plaintext it represents? Not only would you be at a loss, a computer would be equally stuck, since it could only guess by exhaustive search. That said, this doesn't mean hash algorithms cannot be attacked. On the contrary, MD5 can be said to have been cracked indirectly: since reversing is hard, an attacker can precompute the hashes of common inputs and store them, so that "cracking" a digest becomes a simple table lookup.
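Here is a minimal Python sketch of that lookup-table idea (the input list is hypothetical; real attacks use enormous precomputed tables):

```python
import hashlib

# Precompute digests of common inputs once (a toy "rainbow table")
common_inputs = ["123456", "password", "qwerty", "Geek Time"]
table = {hashlib.md5(s.encode()).hexdigest(): s for s in common_inputs}

# "Cracking" a digest is then just a dictionary lookup, no reversal needed
digest = hashlib.md5(b"password").hexdigest()
print(table.get(digest))  # -> password
```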
Input Sensitivity#
Let's look at another criterion: even a slight change in the original information should produce a hash value that differs dramatically from the previous one. This criterion mainly reduces the risk of partially inferring the original information. If a small change in input produced only a small change in output, a computer could use the differences to infer the change, which would be far easier than exhaustive search, and that must not be allowed. Let's look at an example:
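A minimal Python illustration of this avalanche effect, assuming SHA-256 (appending a single character changes the entire digest):

```python
import hashlib

# One extra character yields a completely unrelated digest
print(hashlib.sha256(b"Geek Time").hexdigest())
print(hashlib.sha256(b"Geek Time!").hexdigest())
```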
Collision Avoidance#
You might wonder, taking MD5 as an example: any input yields a 32-character hash value, so wouldn't repetition be inevitable, since 32 characters can represent only a limited number of values? In fact, you are only half right. Those 32 characters are hexadecimal digits encoding a 128-bit binary number, so an MD5 hash can take any of 2^128 possible values; work that out and you will find it is an astronomically large number. If you still think that isn't safe enough, there are algorithms with longer outputs to choose from, such as SHA-256 and SHA-512. Of course, a huge output space does not mean two different inputs can never yield the same hash value: collisions must exist in principle, and for MD5 they have indeed been found in practice. Therefore, a hash algorithm should be designed so that finding two different plaintexts with the same hash value is computationally infeasible.
Hashes in Blockchain#
Now that we understand the four criteria for judging hash algorithms, you should have a feel for their characteristics. So how are hash algorithms applied in blockchain? Since a hash is essentially a summary of information, acting as a data fingerprint, hash algorithms are often used for data integrity verification, and blockchain extends this idea. The main places hash algorithms appear in blockchain are transaction hashes and block hashes, which verify the integrity of transactions and blocks and serve as their unique identifiers in the blockchain network. Speaking of which, did you think of the second discussion topic I left in the previous lecture? That's right: when blockchain data is written to disk, we can use transaction hashes and block hashes as the primary keys for transaction and block data. As long as the correspondence between data and identifiers is maintained, transactions and blocks can be stored in any database.
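A minimal Python sketch of this, assuming a simplified block structure; note the prev_hash field, which the next paragraph explains:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """The block's hash doubles as its identifier and integrity check."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

genesis = {"height": 0, "prev_hash": "0" * 64, "txs": ["coinbase"]}
block_1 = {"height": 1, "prev_hash": block_hash(genesis), "txs": ["tx-a"]}

# Tampering with an earlier block changes its hash and breaks the link
genesis["txs"] = ["forged"]
assert block_1["prev_hash"] != block_hash(genesis)
```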
Block hashes have another function: each block's attributes include the hash of the previous block, which links the blocks into a data chain held together by hashes, as the sketch above illustrates. It is therefore no exaggeration to call the blockchain a hash chain. Thanks to the input sensitivity of hash algorithms, an attacker who wants to tamper with blockchain data must rework every block from the point of modification onward. Hash algorithms thus raise the cost of tampering; it is fair to say they provide the immutability of a single blockchain node.
Asymmetric Encryption Algorithms#
After discussing hash algorithms, let's turn to asymmetric encryption algorithms. Where there is "asymmetric," there must be a corresponding symmetric encryption algorithm, and as the names suggest, the difference lies in how keys are handled. Symmetric means the same key is used for both encryption and decryption. Asymmetric encryption instead uses a pair of keys, a public key and a private key: what is encrypted with the public key can only be decrypted with the corresponding private key, and vice versa. So if symmetric encryption already existed, why invent asymmetric encryption? Let me use an express delivery example to clarify.
You are working hard in a different city and send a precious gift to your parents back home, packaged in a beautiful box. To prevent the gift from being damaged, the best method would be to deliver it personally, but that is not realistic, so you have to send it by express delivery.
When your parents receive your gift, they are delighted and immediately send you some local specialties using the same box. After several rounds back and forth, the once beautiful box is battered; you can even see the items through it, and they could easily be lost or stolen. This example vividly captures the shortcomings of symmetric encryption. The gift and the local specialties represent the information being exchanged, while the initially beautiful box represents the key. To ensure the key is not leaked, the safest approach for a symmetric algorithm is face-to-face negotiation. Yet even a key negotiated face-to-face cannot be reused indefinitely: the more times it is used, the higher the risk of leakage, and there is no guarantee that no one has tampered with it along the way.
Using the express delivery example again: to ensure items don't get lost, this New Year you buy your parents a delivery strongbox with two keys, one for your parents and one for you. From then on, items travel inside the strongbox, so even in transit you needn't worry about them being swapped, because no one except you and your parents can open it. The strongbox here corresponds to the public key and its key to the private key; this scheme mirrors the idea of asymmetric encryption. The keys come as a pair, a public key and a private key: the public key can be shared openly, while the private key must be kept secret. As long as the private key is not leaked, the exchange is secure. Of course, the analogy is imperfect; a true private key should exist in only one copy.
Although asymmetric encryption solves the problems of symmetric encryption, its encryption and decryption are far slower. In practice a hybrid approach is therefore common: the plaintext is encrypted with a symmetric algorithm, and then the symmetric key, together with a hash of the plaintext, is encrypted with an asymmetric algorithm. On receiving the message, the other party uses the paired key to decrypt and then verifies the plaintext hash. This hybrid approach balances security and efficiency. Like hash algorithms, asymmetric encryption is a general term for a class of algorithms; the most common are RSA and ECC (Elliptic Curve Cryptography). RSA is widely used in internet transmission, such as HTTPS communication, while blockchain mainly uses ECC variants such as ECDSA and Ed25519. Compared to RSA, ECC offers shorter keys, higher security per bit, and better performance.
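Below is a minimal sketch of that hybrid approach using the third-party Python `cryptography` package (my choice for illustration; blockchains and TLS use their own machinery). The fast symmetric cipher encrypts the payload, while the slow asymmetric cipher encrypts only the small symmetric key:

```python
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Recipient's asymmetric key pair
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# 1. Encrypt the (possibly large) plaintext with a fast symmetric key
sym_key = Fernet.generate_key()
ciphertext = Fernet(sym_key).encrypt(b"a large plaintext payload")

# 2. Encrypt only the small symmetric key with the asymmetric cipher
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(sym_key, oaep)

# Receiver: unwrap the symmetric key, then decrypt the payload
recovered_key = private_key.decrypt(wrapped_key, oaep)
assert Fernet(recovered_key).decrypt(ciphertext) == b"a large plaintext payload"
```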
Asymmetric Encryption in Blockchain#
Now that we understand these characteristics, let's look back at how asymmetric encryption is used in blockchain. If you have studied blockchain before, you may have noticed that asymmetric encryption hardly ever appears as direct data encryption and decryption. Indeed, blockchain does not use it to encrypt data; it uses its ability to establish identity, namely digital signatures. What is a digital signature? As the name suggests, it is analogous to a handwritten signature: it proves your identity and confirms your authorization of the signed document or data.
Public chains like Bitcoin and Ethereum have concepts of addresses and accounts, which we can think of as ID cards in the blockchain. Unlike ID cards issued uniformly by a government, this one is decentralized: it is derived from the public key of an asymmetric key pair. Here is a small side note: when you create an account in a digital currency mobile wallet, the blockchain network knows nothing about it until the account receives a transfer. So how do we confirm that an account belongs to you? The reasoning is simple: it is enough to prove that you can spend the account's balance, and the key to that proof is the digital signature. Since the account represents the public key, there is a corresponding private key, and it exists only in your possession.
The transaction structure can be divided into two parts: the left part holds the basic attributes of the transaction, while the right part holds only the signature. Without the signature, the left part already represents the content of the transaction; it just lacks proof. So we treat the left part as a whole, hash it to obtain the transaction digest, and then sign the digest with your private key (often loosely described as "encrypting with the private key"); the result is the transaction signature. Verification follows the reverse process: use the corresponding public key to recover the digest contained in the signature, independently recompute the digest from the plaintext transaction content, and compare the two. If they match, the transaction is verified.
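A minimal sign/verify sketch with the third-party `ecdsa` package (an assumption for illustration; real chains add address derivation, recoverable signatures, and canonical transaction encoding):

```python
import hashlib
from ecdsa import SigningKey, SECP256k1

sk = SigningKey.generate(curve=SECP256k1)   # private key: keep secret
vk = sk.get_verifying_key()                 # public key: share freely

tx_body = b'{"from": "alice", "to": "bob", "amount": 10}'
digest = hashlib.sha256(tx_body).digest()   # sign the digest, not the data

signature = sk.sign_digest(digest)

# Verification: anyone holding the public key can check the signature
assert vk.verify_digest(signature, digest)
```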
A digital signature can thus be loosely viewed as encryption applied to the hash digest of the data rather than the data itself. What are the benefits? There are two main ones: first, it proves that you constructed this transaction, since no one else holds your private key and no one can impersonate you; second, because digital signatures involve hash algorithms, they inherit their characteristics, guaranteeing the integrity of the transaction and ensuring it cannot be silently tampered with.
Summary#
In this lecture, I analyzed the cryptographic algorithms commonly used in blockchain, starting from hash algorithms and asymmetric encryption algorithms. Hash algorithms produce transaction hashes and block hashes; besides ensuring data integrity, they link blocks into a hash chain, raising the cost of malicious tampering.
The use of asymmetric encryption in blockchain is not direct data encryption but mainly transaction signing. Digital signatures combine the characteristics of hash algorithms and asymmetric encryption, ensuring transaction integrity while proving the identity of the transaction's initiator. Blockchains do not stop at hashes and asymmetric encryption; almost all of them build further on this base. ZCash, which pursues private transactions, introduces zero-knowledge proofs; blockchains focused on privacy-preserving computation introduce homomorphic encryption; and so on. Cryptography is a vast field, and what blockchain currently uses is just the tip of the iceberg. Through today's learning, we can see that it is precisely the support of cryptography that lets blockchain stand on the shoulders of giants.
- For the overall historical development of cryptography, you can refer to "An Introduction to Cryptography."
- This lecture focuses on the most common and basic cryptographic algorithms in blockchain. If you want to learn more about the categories of cryptography involved in blockchain, you can read the knowledge list compiled by Vitalik.
- If you still feel unsatisfied and want to understand the principles of the cryptographic algorithms introduced in the text, you can refer to these articles: "Principles of Hash Algorithms," "Principles of RSA Encryption Algorithms," and "Introduction to ECC Elliptic Curve Encryption Algorithms."
The main thing to understand is that the private key represents ownership; it is the cornerstone of proving identity. The public key is given to others (Zhang San, Li Si, and Wang Wu all know it), so encrypting a data hash with the public key reveals nothing about whether it was Zhang San, Li Si, or Wang Wu who performed the operation; such a result has no significance as a signature.
Network Model#
Why does blockchain adopt a peer-to-peer network model as the link for data transmission between nodes? To truly understand this choice, let's make a comparison. Whether ordering takeout or buying train tickets, most network applications we encounter in daily work and life follow the client-server model: clients send requests to a server, and the server receives and processes the requests, ultimately returning the results to the clients. To give you an intuitive picture, I drew a schematic diagram.
From the diagram we can clearly see that this is a centralized network architecture whose service capability depends entirely on the central server. If the central server crashes unexpectedly, the entire service collapses: the application's availability rests solely on the server, regardless of the clients.
In contrast, a peer-to-peer network is completely different; it is a distributed network architecture without a single central server. Each node in the network has equal rights and obligations. Each node has the right to initiate requests to other nodes in the network and also has the obligation to respond to requests from other nodes.
What does the equality of rights and obligations among nodes mean? This means that the availability of a peer-to-peer network increases with the number of nodes; the operation of the network does not depend on any single node, and nodes can join or leave at will. Even if some nodes crash or are attacked, as long as there are still normally operating nodes, the entire system can continue to function as usual.
How can we vividly grasp the difference between the two? Take the banking system as an example. Under the current centralized architecture, bank branches might one day all be unable to provide services; excluding network problems, the most likely reason is that the bank's main server room has failed. This is the single point of failure inherent in a centralized architecture.
In contrast, the blockchain architecture is different; one bank is equivalent to one blockchain network, and each bank branch represents a blockchain node. If one node cannot provide services, we can switch to another node at any time. The problems of one or two nodes do not affect the entire network. Of course, this example is described in a very extreme and crude way, just to help you understand; the actual banking system architecture would not be so fragile.
Network Topology#
From a certain perspective, chicken or egg aside, peer-to-peer networks and blockchain are a perfectly matched, mutually complementary pair.
In previous lectures, I mentioned that the decentralization of blockchain is relative, and absolute decentralization is only an ideal state. Mapped onto peer-to-peer networks, this is determined by the network's topology. How so? A network with only one node is essentially centralized, and the blockchain on it behaves as centralized. If every node connects to every other node, the peer-to-peer network is perfectly symmetrical and the blockchain exhibits absolute decentralization. Between these two theoretical extremes, a blockchain network lives in a dynamic balance between centralization and absolute decentralization, and its degree of decentralization is determined mainly by the chosen topology of the peer-to-peer network.
You might still be a bit confused; don't worry, let's work through the logic together. Compared to centralized networks, peer-to-peer networks have obvious advantages, but also an obvious shortcoming: a new node must know at least one existing node before it can join, otherwise it remains disconnected from the network. This is easy to see. In a centralized network, a client that wants a response must first send its request to the server, which means it must know where the server is, just as you must know a website's address to visit it. In a peer-to-peer network, every node acts as both server and client, so a node must learn the addresses of other nodes to join. Different designs of this node discovery mechanism lead to different network topologies, which I analyze below.
The first structure introduces a central index node that stores the access information of all other nodes. A new node first sends its own information to the central node, obtains the connection information of existing nodes, and then forms a peer-to-peer network with them. This resembles using a search engine for retrieval, but it carries a single point of failure: if the central index node crashes, new nodes cannot join the network.
The second structure can be understood as a hands-off structure. The new node chooses to connect to an existing node in the network, and the connected node can inform the new node of the information of other nodes it is connected to, allowing the new node to randomly choose to connect to other nodes, thus forming a network topology without a specific pattern.
The third structure combines the characteristics of the previous two structures. The entire network is formed by multiple seed nodes creating a backbone network, while other ordinary nodes connect to a seed node, thus forming a mixed structure that is overall decentralized but locally centralized.
The last network topology is structured, which differs from the three topologies mentioned above. The connections between nodes follow a certain algorithm, forming an ordered structured network, such as a tree network structure. Most structured network algorithms are implemented based on distributed hash table algorithms, which are indexing algorithms used to quickly locate target nodes in systems with a large number of nodes. You can refer to the extended reading for specific principles.
Blockchain Network Topology#
Now that we understand the differences between these four network topologies, how are nodes organized in blockchain? Looking at the Bitcoin and Ethereum networks mentioned earlier, you will find their choices differ. Bitcoin adopts a hybrid topology: a joining node must be given the addresses of several existing nodes in the Bitcoin network, which can be ordinary nodes or seed nodes. Once connections are established, the new node sends its own information to its neighbors, which forward it in turn to their own neighbors, so the new node's information spreads widely through the network.
Bitcoin's peer-to-peer network is simple and easy to understand, but its information transmission is inefficient and demands more network bandwidth; you will see why after the transaction diffusion process below. Ethereum makes a different choice: a structured topology built on the Kademlia algorithm (hereafter the Kad algorithm), which is based on the distributed hash table idea. The Kad algorithm's internals are fairly complex, but first understanding what it does will greatly reduce the difficulty of learning it.
In simple terms, the Kad algorithm defines a way to compute the distance between nodes, which is used when discovering new nodes. How is this distance calculated? In Kademlia it is the XOR of the two node IDs, interpreted as an integer. Let's continue.
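A minimal sketch of that XOR distance, using hypothetical 4-bit node IDs for readability (Ethereum's IDs are far longer):

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's logical distance: bitwise XOR of node IDs."""
    return a ^ b

node_a, node_c, node_d = 0b0001, 0b1001, 0b0011

# C may sit physically next to A, yet its logical distance is larger
print(xor_distance(node_a, node_c))  # 8
print(xor_distance(node_a, node_d))  # 2
```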
Node A first finds, from the node information stored in its k-buckets, the two nodes closest to itself, then asks those two nodes to return, from their own k-buckets, two nodes even closer to A. A thus learns of up to 2 × 2 new nodes. It then asks the two closest of these new nodes for still closer nodes, iterating until no new nodes are discovered.
In this way, Node A connects only to the nodes closest to itself, keeping the Ethereum network ordered. Note, however, that the distance between nodes is logical distance, not physical distance: two Ethereum nodes running on the same computer may still be logically very far apart. This follows from the algorithm itself; in the diagram, for example, Node C appears close to Node A, yet logically A is closer to D and E.
By comparing Bitcoin's and Ethereum's topology choices, we can see there is no fixed pattern for choosing a network structure in blockchain, no unique standard. If you are familiar with consortium chains, you will find they place much less emphasis on peer-to-peer networking than public chains, for two main reasons: the nodes in a consortium chain are fixed, so nodes rarely need to join or leave at will; and the number of nodes is small, so no node discovery mechanism is needed to guide new nodes in.
Transaction Diffusion#
With the foundations laid, we can finally set transactions flowing through the network; let's deepen our understanding through transaction diffusion. As mentioned earlier, blockchain networks do not produce transactions, they merely transport them. The schematic of a blockchain network shows many external devices on its periphery, such as mobile phones, computers, and cars. These devices can connect to any blockchain node to obtain information relevant to themselves, such as balances and historical transactions.
At the same time, transactions constructed by external devices are sent to the nodes they connect to, making those devices the source of transactions in the blockchain network.
The propagation process itself is straightforward, but note how different topologies handle duplicate receptions of the same transaction. In the Ethereum network, nodes connect only to their nearest neighbors, so transaction diffusion cannot loop. The diffusion path resembles a one-way broadcast; like water always flowing downhill, transactions move steadily away from the originating node, making diffusion more efficient.
Bitcoin handles transaction diffusion differently. Its topology is random: a node spreads a transaction to its connected peers without knowing in advance whether they have already received it, so diffusion is less efficient. As the network grows, the connections between nodes become more disordered and a great deal of bandwidth is wasted on repeatedly diffusing the same transactions; a node receiving a duplicate simply discards it, so the effort is wasted. Bitcoin's topology is simple, but it pays this price for the simplicity. Block diffusion follows the same general logic as transaction diffusion, except that it is initiated not by external devices but by miner nodes that have satisfied the consensus conditions. Keep this impression in mind; I will explain it in detail when we study consensus algorithms in the next lecture.
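Before summarizing, here is a toy Python simulation (an assumption, for intuition only) of the duplicate-delivery waste just described: each node relays a transaction to all its neighbors and discards copies it has already seen.

```python
from collections import deque

# A small random-looking topology: node -> its connected peers
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}

def diffuse(origin):
    seen, wasted = {origin}, 0
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            if peer in seen:
                wasted += 1          # duplicate delivery: discarded effort
            else:
                seen.add(peer)
                queue.append(peer)
    return len(seen), wasted

reached, duplicates = diffuse(1)
print(f"reached {reached} nodes, {duplicates} duplicate deliveries")
```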
Summary#
The peer-to-peer network is the network interaction model most commonly used in blockchain. Compared to the centralized client-server model, every node has equal rights and obligations, which complements blockchain's decentralization. However, node discovery in a peer-to-peer network is far more complex than in a centralized one, giving rise to different topologies: centralized, random, mixed, and structured. Different blockchains make different choices, which in turn affect how they handle transaction and block diffusion. Whether the idea of decentralization preceded the inspiration drawn from peer-to-peer networks is not the most important question; studying the relationship between the two simply helps you understand blockchain networks more deeply. To state my own inclination: I believe the idea of peer-to-peer networks inspired Satoshi Nakamoto to use their non-absolute centrality to build a digital currency system free of third-party intermediaries.
Basics of Smart Contracts#
Did you know that smart contracts are not exclusive to blockchain? In fact, the concept of a "smart contract" was proposed by Nick Szabo as early as the 1990s and can be summarized as: "a set of promises, specified in digital form, including protocols within which the parties perform on these promises."
However, lacking an effective carrier at the time, smart contracts did not develop further; it wasn't until Ethereum combined blockchain technology with smart contracts that they finally reached us. How should we understand the concept? We can start from the word "contract" and compare with what we already know. Think of the contracts and agreements in our daily work and life: a smart contract is conceptually similar, except that it exists in digital form. Of course, smart contracts differ from traditional contracts in the mechanism of their binding force. A traditional contract's binding force comes from the endorsement of authoritative institutions: if one party fails to fulfill its obligations, the other participants have no right to privately punish the breaching party; they must pursue reasonable, legal claims through judicial channels such as the courts.
Smart contracts are different: part of their binding force comes from the contract itself. A smart contract is essentially a piece of program code, so it has strong logical rigor and is absolute in its execution; the code describes each participant's rights and obligations and the processing logic under every condition. In this sense, code is law.

You might ask: if code is law, why can't the ordinary software applications we use every day be called smart contracts? This leads to the other part of a smart contract's binding force: the blockchain network that provides its operating environment. How does that differ from the environment an ordinary application runs in? Recalling the earlier sections on the nature and basic technologies of blockchain, the difference lies in blockchain's characteristics. A traditional application is a personalized service provided by a single enterprise or individual, and the final interpretation of its behavior rests with the developer, so in sensitive scenarios it is hard for users to develop trust; moreover, using such an application is equivalent to signing a traditional contract with the developer, whose binding force still comes from authoritative institutions. A blockchain network, by contrast, is maintained jointly by all participants, and the blockchain protocol that all members follow supplies the trust and binding force behind smart contracts.
The execution of a smart contract is triggered by a transaction initiated off-chain. Once the contract is on-chain, its operation excludes interference from any third party, and even if a dispute arises over how the contract executed, blockchain's traceability allows the execution process to be audited.
It is precisely this difference in binding force that creates obstacles to adoption. Because smart contracts lack the backing of authoritative law and are constrained only by code and blockchain trust, the general public still hesitates to rely on them, especially for important agreements, where trust in authoritative institutions prevails.
Of course, this is just a transitional phase that anything new must pass through on its way into the market. As blockchain and smart contracts spread, public trust will deepen, and the popularization of smart contracts is only a matter of time. Why am I so optimistic? Because Ethereum not only introduced smart contracts into blockchain and gave them a real operating environment; it also provided a channel for their promotion and standardization: the Ethereum Improvement Proposal (EIP).
Ethereum thus offers not only technical support but also reusable templates, lowering the entry threshold with a profound impact on the ecosystem. Users worldwide can submit Ethereum improvement drafts, and if the community accepts a proposal, a subsequent version of Ethereum implements it. This amounts to a direct communication channel between the Ethereum community's maintainers and its users: the thoughts of a few are limited, but the wisdom of the crowd is boundless, and so is the platform's potential.
Among these improvement proposals, a significant portion are suggestions for smart contract standards. Compared with blockchain's support for smart contracts, I actually place more weight on smart contract standards. If you know some object-oriented programming, you will appreciate interfaces, because they constrain and standardize a class of related behaviors. Smart contract standards are similar: guiding contract development through universal behavior standards is a shortcut, and the landing of the value network is inseparable from their formulation. Pure theory may not leave a deep impression, so next I will use a currently popular standard, EIP-721 (721 is the proposal number; the standard it defines is better known as ERC-721), to illustrate the impact of smart contract standards on value networks.
This standard defines the Non-Fungible Token, abbreviated NFT. Since there are non-fungible tokens, there must be fungible ones: fungible tokens are currencies that can substitute for one another and can be divided. The 100 yuan you hold and the 100 yuan I hold have identical purchasing power, and 100 yuan can be split into two 50s or ten 10s. Non-fungible tokens are different: each NFT is unique and indivisible. Recall the earlier example of a property mortgage; NFTs seem to offer a solution. Property information is unique and cannot be replicated or split, so if we mark property ownership with an NFT, we anchor the property in the virtual world. Once that anchoring is established, the property can be transferred on the blockchain network as conveniently as making a payment. And NFT applications go well beyond this: any entity with asset attributes, such as a photo, a painting, a song, a piece of text, or a ticket, can be registered as an NFT. Twitter's CEO, for example, auctioned the first tweet he posted as an NFT.
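A toy Python sketch of the non-fungible idea (deliberately not the ERC-721 interface, which is richer): each token ID is unique and indivisible, and ownership maps one-to-one.

```python
owners = {}  # token_id -> owner (hypothetical names for illustration)

def mint(token_id, owner):
    assert token_id not in owners, "each NFT is unique"
    owners[token_id] = owner

def transfer(token_id, sender, receiver):
    assert owners[token_id] == sender, "only the owner can transfer"
    owners[token_id] = receiver

mint("deed-42", "alice")             # anchor an asset to an ID
transfer("deed-42", "alice", "bob")  # ownership moves whole, never split
print(owners["deed-42"])             # -> bob
```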
With NFTs, we can anchor value between the virtual and the real. Of course, NFTs are not the only means of achieving a value network; its construction relies on the full set of smart contract standards. And formulating standards is not something accomplished overnight: first a blockbuster application appears, its innovative model attracts imitators, the genre is crudely pushed to a peak, and then the bubble bursts. When market sentiment calms, some people reflect on the deeper logic behind the bubble, abstract the common behavioral patterns of those applications, and distill them into EIP proposals, ultimately providing standardized solutions for similar scenarios in the future.
The Metaverse and Value Networks#
If, like me, you were once a rebellious teenager who loved reading, you must have read fantasy novels. In middle school I was obsessed with online-gaming novels, deeply attracted by the immersive gaming pods they depicted, sincerely hoping to experience one in my lifetime. I once thought this was just a beautiful fantasy, but unexpectedly the metaverse might make it real. The first time I heard the term "metaverse," I assumed it was coined by some overly imaginative young person; after learning more, I found it actually comes from the multiplayer online virtual world depicted in the science fiction novel "Snow Crash." That sparked my interest. The metaverse is an immersive virtual world where players can engage in cultural, social, entertainment, and other activities. Compared to traditional games, it features a reliable economic system, virtual identities and assets, stronger social interaction, immersive experiences, and open content creation. One could say that in the metaverse, apart from material things being virtual, nothing differs from the real world.
Like blockchain technology, the metaverse is not constructed from a single technology but is a virtual world formed by the integration of four major technologies: blockchain, gaming, networking, and display.
Blockchain technology provides a decentralized asset trading platform for the metaverse, while NFTs/DeFi and other smart contracts serve as the medium for players' virtual assets; games provide interactive content for the metaverse; and 5G networks provide reliable guarantees for data transmission.
If we set aside blockchain technology, the metaverse merely represents a more realistic immersive gaming environment than traditional ones and cannot be considered a virtual world. However, with the support of blockchain technology, it can ensure the security of players' virtual assets and identities, thus achieving open and transparent asset trading in this world, realizing the transfer of value. In this sense, blockchain is the key to making the metaverse a virtual world.
Our obsession with the metaverse is not only because it points the way for the next stage of the gaming industry but also because it provides a reference for the value network in the real world. Exploring the metaverse is essentially exploring value networks. It can be said that, just like the process of formulating smart contract standards, the current metaverse is the blockbuster application we are looking forward to. As it continues to develop, bubbles will inevitably follow, and after the calm, abstract thinking will eventually provide standardized solutions for building value networks.
Summary#
Without blockchain technology, smart contracts would still just be a concept. Without the support of smart contracts, blockchain would be unable to fully showcase its capabilities. Ethereum is the one that "introduced" smart contracts to blockchain, and it is not only the talent scout for smart contracts but also provides standardized ideas for their promotion. It can be said that the standardization of smart contracts is the prerequisite for the true landing of value networks. NFTs provide a feasible solution for anchoring real assets and virtual values, achieving a key step in the landing of value networks. Although we are still far from value networks at this stage, we can glimpse the ideas for the landing of value networks through the development process of the metaverse.
Blockchain is a chain structure formed by linking block hashes, whose main purpose is to maintain the continuous integrity of data, giving the blockchain traceability while retaining immutability. Data archiving clears the historical data up to a certain block, while subsequent blocks keep their original order; its essence is therefore to sacrifice some traceability while preserving immutability. If we can additionally provide a way to read archived blocks back after the historical data has been cleared, data archiving becomes quite an excellent solution.
If I am merely a holder of Bitcoin, I care only about my balance and historical transaction records; nothing else matters to me, so synchronizing 350 GB of historical data for such a small requirement is pointless. Large nodes, by contrast, can synchronize the historical data and use it as raw material for related services, as mining nodes and the nodes behind blockchain explorers do. Public chains generally adopt this kind of node role classification.
The Bitcoin network currently has roughly 10,000 full nodes maintaining complete blockchain data, while ordinary holders only need to run a wallet node on their phones to transact. Node role classification is thus an effective way to relieve storage redundancy, but it comes at the cost of some decentralization; it is a compromise.
After interpreting two different solutions for storage redundancy, we can see that the problem cannot be solved fundamentally; we can only slow data growth through different technical means. Consortium chains break through by sacrificing traceability, public chains by sacrificing decentralization. There is also a third route: reduce the amount of data itself, for example by storing only the hash of the original data on-chain rather than the data. This, of course, is a business-level measure rather than a technical one.
Quantum Computing Threats#
After storage redundancy, let's look at another much-discussed topic: the threat of quantum computing to blockchain. The general feeling is that once quantum computers mature, blockchain systems led by Bitcoin will collapse, because the computing power of quantum computers dwarfs that of classical ones. But is that really so? To answer, it helps to first explain what quantum computing is. I cannot explain deep features like superposition and quantum entanglement here, so let me borrow an example from the "Science Voice" public account. With one hand, how many numbers can you represent at the same time? Clearly only one of the ten numbers from 1 to 10 at any moment, which matches how classical computers store data: at any given time a bit stores exactly one binary value, 0 or 1. But if you put your hand in your pocket, how many numbers can it express before you take it out? All ten are possible, and until you withdraw your hand, the number is undetermined.
The example illustrates the difference between classical and quantum computers in data storage: one stores specific values, while the other stores probabilities of values.
What does storing probabilities of values mean? It means holding all possible values at once. A 5-bit classical register expresses one number at a time, while 5 qubits can express all 32 possibilities simultaneously, and this advantage grows exponentially with the number of bits. Storage is only half the story; what really demonstrates the overwhelming advantage of quantum computers is parallel computation. How dramatic is it? An example: suppose you have two columns of 64 pipes each, and exactly one pipe on the left connects to one on the right. How many attempts do you need to find that unique connecting pair? A classical computer can only try one pair at a time: compare the first pipe on the left with the first on the right, then with the second on the right, and so on; in the worst case it takes 64 × 64 tries. A quantum computer, by contrast, can represent all 64 pipes at once, so in principle a single attempt finds the unique match.
Given these interpretations of quantum storage and computation, are you now worried about blockchain in the quantum era? I would say such concerns are alarmist, and not only because quantum computers still exist mainly in laboratories, far from commercialization, perhaps further than our generation will see. As quantum computing develops, will blockchain technology stand still? Hardly. The main present worries are that quantum computers could instantly mine all remaining rewards in PoW blockchains led by Bitcoin, leaving other miners nothing, or that they could crack Bitcoin's public-private key pairs and steal other users' Bitcoins.
You can pause and think: are these concerns justified? In my view they too are unnecessary. Setting aside whether today's quantum computers can even perform the hash computations or break the asymmetric encryption involved, this reasoning also overlooks the role of the blockchain protocol itself. Let's analyze Bitcoin as an example, and you will understand.
The Bitcoin protocol stipulates that no matter how the network's total computing power changes, blocks should be produced on average every 10 minutes or so: if blocks arrive too quickly, mining difficulty increases; if too slowly, it decreases. If quantum computers joined mining, block intervals would shorten and difficulty would rise accordingly. Even if the raised difficulty could not defeat quantum miners, the outcome would only be that subsequent block rewards all go to them; the scenario of all Bitcoins being mined instantly cannot occur.
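A simplified Python sketch of this retargeting rule (the real Bitcoin rule adjusts every 2016 blocks and clamps the correction factor to 4x, as assumed here):

```python
TARGET_INTERVAL = 10 * 60   # target seconds per block
WINDOW = 2016               # blocks per adjustment period

def retarget(difficulty, actual_seconds):
    """Raise difficulty when blocks came too fast, lower it when too slow."""
    expected = TARGET_INTERVAL * WINDOW
    factor = expected / actual_seconds
    return difficulty * min(max(factor, 0.25), 4.0)  # clamp like Bitcoin

# Blocks arrived 10x too fast over the window (e.g. quantum miners joined):
print(retarget(1.0, (TARGET_INTERVAL * WINDOW) / 10))  # -> 4.0 (clamped)
```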
Furthermore, Bitcoin could always switch its cryptographic algorithms to quantum-resistant cryptography through a hard fork. As quantum computing develops, blockchain technology will not remain stagnant; I believe solutions will appear as the problems do.
Smart Contract Security#
Finally, I want to discuss a regret concerning smart contract security. Unlike the previous two issues, which stem more or less from the technology itself or external threats, smart contract security is entirely a human problem.
Smart contracts are program code written independently by engineers on blockchain platforms such as Ethereum, and we all know there is almost no bug-free software in the world; smart contracts are no exception. Attacks caused by badly written smart contracts occur almost daily, varying only in severity. Take the DAO attack that directly caused the Ethereum split: the vulnerable code in that contract is actually quite simple.
```solidity
function withdrawBalance() {
    uint amountToWithdraw = userBalances[msg.sender];
    // The external call hands control to msg.sender, who can re-enter
    // withdrawBalance() before the balance below is ever cleared.
    if (!(msg.sender.call.value(amountToWithdraw)())) { throw; }
    userBalances[msg.sender] = 0;
}
Let me walk through the code. The intent is: when a user withdraws funds, first pay the user, then clear the user's balance. The problem is that the payment step hands control to the recipient, whose code can recursively call this method again; on every re-entry the balance has not yet been cleared, so funds can be withdrawn over and over until the contract is drained, leading to tragedy. The fix is equally simple: swap the order of the two actions, clearing userBalances[msg.sender] before making the external call.
Therefore, if you ever have the chance to write smart contracts, always make sure you understand every line of code you write; otherwise a vulnerable smart contract is a generous gift to hackers. In summary, these three examples show that although blockchain technology has developed for more than a decade, immature aspects remain. But the emergence of problems only describes the current predicament; it does not mean things cannot change. Technology keeps iterating forward, and each update exists to solve practical problems. Today's regrets do not rule out a better future.
Currently, in the blockchain industry, I believe the market is focusing on the following directions:

- More powerful performance: Bitcoin's transaction costs keep rising, which is essentially the underlying "database" performing worse and worse under load.
- Truly solving more practical problems: people's understanding of blockchain is still limited to Bitcoin, and no second phenomenal application has yet appeared.
- More user-friendly: using blockchain applications still carries a learning cost that keeps novices out.
Blockchain + E-commerce#
Through the previous interpretations we have built a preliminary understanding of the industry chain concept. Next, I will imagine how an industry chain might operate, from the perspective of blockchain + e-commerce. Before the specifics, a caveat: what follows is purely my own brainstorming, and a decentralized e-commerce industry chain may well never emerge in this form, since it touches too many aspects. I still want to discuss it because the development and promotion of blockchain cannot rely on sticking to convention; it should dare to break through patterns of thinking, and I hope you too will not be bound by present reality.
What is the main difference between the e-commerce industry chain and traditional e-commerce? In one sentence: using the standardization mindset of smart contracts to regulate the processes of the e-commerce industry. How do we understand this? We can abstract the e-commerce industry's processes into smart contract standards, each standard having different implementations. The end effect is that participants in the industry can freely assemble different smart contract implementations according to their roles.
For example, users could search and order through Alibaba's product aggregation service, pay with WeChat Pay, and specify JD Logistics for delivery. Are you looking forward to it a little? To make this concrete, I will take books as an example and explore the design of an e-commerce industry chain. The most successful blockchain application in e-commerce so far is undoubtedly the product traceability platform: using blockchain's immutability and traceability, a product's full lifecycle, from raw materials to the user's hands, can be traced. Building on product traceability, we can analyze the characteristics of the industry chain along that same line.
The book-printing process can be abstracted into a product production smart contract. If you remember the NFT from the smart contract chapter, you will see an implementation path already exists. In the industry chain, every time a producer makes a product, it uploads the product's asset information (including but not limited to basic product data such as asset code, factory price, and category) to the blockchain, after which the product may enter circulation. This is the fundamental difference from the existing production model: current production is batch production on assembly lines, so 10,000 copies of a book may all share one product code, whereas what circulates on the industry chain is not product categories but independent, unique products, one product, one code.
What is the benefit? The industry chain is a precursor of the value network: if what circulates on it are genuine assets, the later popularization of the value network becomes easier. Simply put, an industry chain is the value network of one specific industry, and the value network is the union of all industry chains.
Channel Distribution#
After the publisher prints the books, it can distribute different quantities to different e-commerce channels. In the industry chain, this step is not strictly necessary. Why? As interpreted earlier, the industry chain weakens the presence of enterprises, which means diminishing the concept of channels. Channels mainly exist to build a trading platform between producers and users, while blockchain's decentralization works precisely to suppress intermediaries.
At this stage we can abstract a product distribution smart contract, which can be understood simply as wholesale: off-chain, products move from producers to channel warehouses; on-chain, ownership of the products transfers from producers to channel merchants. Note, however, that the off-chain movement of goods must stay coordinated with the on-chain transfer of ownership. What does this mean? Every product in the industry chain has asset attributes and cannot simply be allocated by quantity; each on-chain asset must match an off-chain product one for one, otherwise there is a risk of products getting mixed up.
After the publisher uploads the books' asset information, those books exist in the industry chain; but for users, finding a particular book among the chain's vast catalog would be like finding a needle in a haystack. We therefore need to abstract a product aggregation search smart contract standard to help users browse and search. A platform's role is to present the industry chain's products to users on top of the aggregation search contract; users can also customize contracts to their needs, for example following only certain categories of books.
At this point you may have doubts: this seems little different from today's e-commerce platforms and offers no new experience? Let's illustrate with the earlier issue of big-data price discrimination against repeat customers. Today's platforms can exploit loyal users mainly because they infer purchasing intent from users' historical purchase records, then use the information asymmetry between platform and user, plus the user's reliance on the platform, to steer users into paying higher prices without realizing it.
An e-commerce platform built on the industry chain is different. First, a user's purchase history is private data; users can selectively authorize access to it via smart contracts. Until a user grants that authorization, the platform cannot know the user's purchasing tendencies in advance and therefore cannot target recommendations. Second, product information is public on-chain and anyone can inspect it, so there is no information asymmetry between platform and user; the platform merely has the technical ability to aggregate products out of a sprawling product database. If users are dissatisfied with a platform's aggregation service, they can switch to another smart contract implementation at any time and abandon the original platform. In other words, the e-commerce industry chain is harsh on platforms but delivers a better experience for users.
Book Ordering Once users find the books they want, it is time to place an order and pay. At this stage we can abstract many smart contract standards, such as inventory deduction contracts, payment contracts, and logistics contracts; in fact, all of these can be grouped under a single order contract. Logically, the essence of purchasing a product is that the user signs a contractual agreement with the other participants. The contract's content can be read as: once ownership of the product has been transferred to the user and the designated logistics company has delivered it, the funds the user locked in the contract are released according to the product's price and distributed to the platform, the merchant, and the other participants.
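Here is a minimal sketch of that order contract as an escrow state machine, with hypothetical names and a deliberately simplified two-condition settlement rule.

```python
# Order contract as an escrow state machine: the buyer locks funds; only
# when ownership transfer and delivery are both confirmed is the payment
# split among the participants. All names and shares are hypothetical.
class OrderContract:
    def __init__(self, buyer, price, payees):
        self.buyer = buyer
        self.locked = price          # funds the buyer locks on signing
        self.payees = payees         # e.g. {"merchant": 0.85, "platform": 0.10}
        self.ownership_transferred = False
        self.delivered = False

    def confirm_ownership(self):
        self.ownership_transferred = True
        self._maybe_settle()

    def confirm_delivery(self):
        self.delivered = True
        self._maybe_settle()

    def _maybe_settle(self):
        if self.ownership_transferred and self.delivered and self.locked:
            for payee, share in self.payees.items():
                print(f"pay {self.locked * share:.2f} to {payee}")
            self.locked = 0  # contract fully executed

order = OrderContract("alice", 45, {"merchant": 0.85, "platform": 0.10, "logistics": 0.05})
order.confirm_ownership()
order.confirm_delivery()  # second condition met, triggers settlement
```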
Through this walkthrough, I have briefly traced the process from book production to user ordering and described my understanding of the e-commerce industry chain ecosystem. The description is only a sketch, so don't get caught up in details such as how products are allocated or how returns and refunds are handled. What matters is this: in an e-commerce industry chain, product information is shared within one blockchain network, and competition among platforms gradually shifts to the aggregation, classification, and differentiated marketing of on-chain data.
- Currently, enterprises that already do business with one another are gradually forming small-scale consortium chain pilots. Once new business models take shape, other enterprises will inevitably join, and these small alliances will grow into business consortium chains led by core enterprises.
- As the business alliances of core enterprises expand further, the multiple consortium chains surrounding several core enterprises will, driven by business needs and regulatory requirements, merge into a single super industry chain. Its final form is standardized business processes with differentiated services, and competition among enterprises gradually becomes competition over differentiated services on top of the same data.
ERC-20, ERC-721, ERC-1155, ERC-4626 Token Standards and Composability
ERC-20, ERC-721, ERC-1155, ERC-4626, and the other Ethereum token standards: do you know what all of these are? Why do they matter, and what is each one for? Want the big picture? This long thread will answer those questions.
-
Ethereum is a world computer: a shared resource maintained by a network of anonymous, mutually untrusting nodes, on which consensus is reached and security is guaranteed economically. The Ethereum network provides credible neutrality, allowing anyone to build on it, independently or collaboratively.
-
An Application Programming Interface (API) is a mechanism that lets different programs communicate and lets developers coordinate. Developers hide the internal workings of their programs as much as possible, so that communication stays as simple and efficient as possible.
Basic Computer Science: API Explained
From the perspective of abstraction, an API is the most common manifestation of abstraction in the real world. An API is a set of defined rules that explains how programs/applications communicate with each other.
For example, imagine an e-commerce website has a price bot: users give the bot the characteristics of a fruit, and the bot returns a price.
To interact with the price bot, you give it information about an object (a fruit) and receive a price back. First, you package up all the object's information: fruit_a = [apple, red, 200g, harvested 3 days ago].
Next, we feed this information to the price bot: we invoke price_bot and have it run its calculate_price function on fruit_a, i.e., price_bot.calculate_price(fruit_a).
The price bot calculates the price as best it can. As users, we neither know nor care what computation happens behind the screen; we only know the bot will eventually hand us a price: price_bot.calculate_price(fruit_a) = price_fruit_a.
This is the price bot's API: a list of the functions the bot supports and instructions on how to use them. It is a blueprint that lets developers integrate with the bot without having to master the application itself.
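Here is the price-bot example written out as a Python sketch (the bot and its pricing rule are invented for illustration): the caller sees only the one public method, never the logic behind it.

```python
# The price bot as an API: callers only need to know what to pass in and
# what comes back; the pricing logic is hidden. All names hypothetical.
from dataclasses import dataclass

@dataclass
class Fruit:
    name: str
    color: str
    weight_g: int
    days_since_harvest: int

class PriceBot:
    def calculate_price(self, fruit: Fruit) -> float:
        # Internal logic the caller never sees; any rule would do here.
        freshness = max(0.5, 1.0 - 0.05 * fruit.days_since_harvest)
        return round(fruit.weight_g * 0.01 * freshness, 2)

price_bot = PriceBot()
fruit_a = Fruit("apple", "red", 200, 3)
print(price_bot.calculate_price(fruit_a))  # the only thing the caller cares about
```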
Assuming this example is real, the documentation for this API should look like this:
- Integration Protocols
- Transfer Assets
- Build Composable Investments
- Borrow, Lend, and Mortgage Assets
Basically, everything that happens on-chain is either an API or an integration of an API. In fact, you can view each token standard as a template for a specific API: if a smart contract's code follows a particular template, then it is a token of that type. (https://t.co/GoMlbfN9Vq)
ERC-20
This is the ERC-20 token template. To create an ERC-20 smart contract, a developer must write code implementing all of the methods and events below. Because every ERC-20 contract supports these functions, any developer can rely on them to interact with any ERC-20 contract.
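The actual template is a Solidity interface (EIP-20); as a sketch, here is the same required method list mirrored as a Python abstract class so you can see the shape of the template.

```python
# The ERC-20 method list (per EIP-20), sketched as a Python abstract
# class. Any contract implementing all of these is an ERC-20 token.
from abc import ABC, abstractmethod

class ERC20(ABC):
    @abstractmethod
    def total_supply(self) -> int: ...
    @abstractmethod
    def balance_of(self, owner: str) -> int: ...
    @abstractmethod
    def transfer(self, to: str, amount: int) -> bool: ...
    @abstractmethod
    def transfer_from(self, sender: str, to: str, amount: int) -> bool: ...
    @abstractmethod
    def approve(self, spender: str, amount: int) -> bool: ...
    @abstractmethod
    def allowance(self, owner: str, spender: str) -> int: ...
    # Plus two events: Transfer(from, to, amount) and
    # Approval(owner, spender, amount).
```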
ERC-20 is the most basic token standard and covers the majority of tokens in circulation today, including governance tokens, ve-tokens (vote-escrowed tokens), stablecoins, and so on. (Note that ETH itself is not an ERC-20 token.)
ERC-721
ERC-721 tokens are generally referred to as NFTs (Non-Fungible Tokens). These tokens usually represent unique or individually identifiable items, such as PFPs, art collectibles, and virtual real estate.
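A partial sketch of the EIP-721 interface in the same style; the key difference from ERC-20 is that methods are keyed by a unique token_id rather than a fungible amount.

```python
# Partial ERC-721 (EIP-721) interface: every token is individually
# identified by token_id, so ownership is per-item, not per-amount.
from abc import ABC, abstractmethod

class ERC721(ABC):
    @abstractmethod
    def owner_of(self, token_id: int) -> str: ...   # who owns this one token?
    @abstractmethod
    def balance_of(self, owner: str) -> int: ...    # how many NFTs does owner hold?
    @abstractmethod
    def transfer_from(self, sender: str, to: str, token_id: int) -> None: ...
    @abstractmethod
    def approve(self, approved: str, token_id: int) -> None: ...
```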
ERC-1155
The ERC-1155 token standard combines the features of ERC-20 and ERC-721 token standards, providing a single interface to manage any combination of these token types. This can serve as a more modern alternative to ERC-20 and ERC-721 and has unique functionalities for gaming.
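Sketching EIP-1155's core idea in the same style: balances are keyed by both account and token id, so one contract can manage fungible and non-fungible tokens side by side.

```python
# Partial ERC-1155 (EIP-1155) interface: balances are indexed by
# (account, token_id), and transfers carry both an id and an amount.
from abc import ABC, abstractmethod

class ERC1155(ABC):
    @abstractmethod
    def balance_of(self, account: str, token_id: int) -> int: ...
    @abstractmethod
    def safe_transfer_from(self, sender: str, to: str,
                           token_id: int, amount: int, data: bytes) -> None: ...
    # amount == 1 with a unique id behaves like an NFT;
    # amount >> 1 with a shared id behaves like a fungible token.
```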
ERC-4626
ERC-4626 is the latest token standard, describing yield-bearing vaults. It provides a common interface for vaults into which ERC-20 tokens are deposited (or from which they are redeemed) to earn yield. This covers liquidity mining and aggregators, but it can also be applied to other fields.
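A partial sketch of EIP-4626's core methods in the same style: the vault wraps an underlying ERC-20 asset and issues shares against deposits.

```python
# Partial ERC-4626 (EIP-4626) interface: the vault's shares are
# themselves an ERC-20, backed by an underlying ERC-20 asset.
from abc import ABC, abstractmethod

class ERC4626(ABC):
    @abstractmethod
    def asset(self) -> str: ...  # address of the underlying ERC-20
    @abstractmethod
    def deposit(self, assets: int, receiver: str) -> int: ...  # returns shares minted
    @abstractmethod
    def redeem(self, shares: int, receiver: str, owner: str) -> int: ...  # returns assets
    @abstractmethod
    def convert_to_shares(self, assets: int) -> int: ...
    @abstractmethod
    def convert_to_assets(self, shares: int) -> int: ...
```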
ERC-777
ERC-777 is a highly configurable but rarely used token standard. It offers an upgrade over ERC-20, letting developers attach code that runs whenever tokens are sent and/or received. Although it is documented on https://ethereum.org, ERC-777 is rarely seen in practice.
Computer science is magic, developers are magicians, and abstraction is the spell. Composability is the goal.
The concept of abstraction supports the world computer in showcasing its most important capabilities.
DeFi is a better way of building because of composability: the ability to combine two independent things so that their combination exceeds the sum of their parts.
Native ETH, then ERC-20, 721, 777, 1155, 4626... with each new ERC we grow more sophisticated. Each token type can do more, and each money Lego carries more value.
Every new ERC standard brings us closer to programmable money, a genuinely new concept. If money is tangible and programmable, like Lego, then each protocol absorbs raw plastic (value) and produces Lego bricks (usually tokens), and those bricks can be combined with other bricks to build customized, brand-new things.
The token standards show how composability manifests on Ethereum. Oh, and we are all builders! What else do you think we would be building? The goal: composability. The reason we have composability: abstraction. The purpose of abstraction: managing complexity. And the point of managing complexity? Changing the world.
- Original link: https://twitter.com/salomoncrypto/status/1553982842232717313?s=21&t=xhVu0_-rNMJ-q4nojhiyMQ
The most powerful cryptographic technology to emerge in the past decade is arguably general-purpose succinct zero-knowledge proofs, commonly known as zk-SNARKs (zero-knowledge Succinct Non-interactive ARguments of Knowledge).
A zk-SNARK lets you generate a proof that some computation produced a particular output, in such a way that the proof can be verified quickly even if the underlying computation took a very long time to run. The "ZK" (zero-knowledge) part adds an extra property: the proof can hide some of the computation's inputs.
For example, you can generate a proof of the statement: "I know a secret value such that if you append it to the word 'cow' and then SHA256-hash the result one hundred million times, the output hash starts with 0x57d00485aa."
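For concreteness, here is the underlying computation itself in Python. The whole point of a SNARK is that the verifier avoids redoing this work; with the real round count it would be very slow, so this demo uses a small count and a made-up secret.

```python
# The computation behind the example statement, written out directly.
# A SNARK would let someone prove they ran this without the verifier
# re-running it.
import hashlib

def iterated_sha256(secret: bytes, rounds: int) -> bytes:
    data = b"cow" + secret  # append the secret to the chosen word
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data

# The real claim uses rounds=100_000_000 and checks for the 0x57d00485aa
# prefix; here we just run a small demo count.
digest = iterated_sha256(b"42", rounds=10_000)
print(digest.hex()[:10])
```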
In the blockchain field, this technology has two major application scenarios:
- Scalability: if verifying a block is very time-consuming, one party can verify it and generate a proof, and everyone else only needs to quickly verify the proof.
- Privacy: you can prove that you have the right to transfer some asset (you received it and have not yet transferred it) without revealing where that asset came from, so the transaction stays secure without leaking either party's information.
However, zk-SNARK is quite complex; in fact, from 2014 to 2017, they were often referred to as "moon math." The good news is that since then, the protocols have become increasingly simplified, and our understanding of them has deepened. This blog post will attempt to explain how ZK-SNARKs work in a way that an average person with a basic math level can understand.
Note that we will focus on scalability; once scalability is in place, the privacy features of these protocols are relatively easy to add, so we will return to privacy at the end.
Why ZK-SNARKs are "hard"
Take the initial example: we have a number (encoding "cow" followed by the secret input as a single integer), we compute the SHA256 hash of that number, then repeat the hashing 99,999,999 more times, and finally check whether the output starts with 0x57d00485aa. The computational load here is enormous.
A "succinct" proof means that the growth of the proof size and verification time is much slower than the growth of the computation that needs to be verified. If we want a "succinct" proof, we cannot require the verifier to perform some computations at each round of hashing (because that would make the verification time proportional to the computation load).
Instead, the verifier must check the entire computation process in some way without peeking at each part of the computation. One natural technique is random sampling: let the verifier check the correctness of the computation at only 500 different points. If all 500 checks pass, we assume that the rest of the computation process is likely correct.
This process can even be turned into a non-interactive proof using the Fiat-Shamir heuristic: the prover computes the Merkle root of the computation trace, pseudo-randomly derives 500 indices from that Merkle root, and supplies the 500 corresponding Merkle branches of the data.
The core idea is that the prover does not know which branches will be revealed until they have already made a "commitment" to the data.
If a malicious prover tries to tamper with the data after learning which indices need to be checked, it will change the value of the Merkle root, leading to a new set of random indices being selected, which will require tampering with the data again... trapping the malicious prover in an endless loop, unable to achieve their goal.
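A toy Python sketch of this commit-then-sample idea (a naive Merkle tree over a stand-in trace, not a real proof system): the check indices are derived from the Merkle root itself, so changing any committed value re-rolls all of them.

```python
# Fiat-Shamir sampling sketch: commit to the trace via a Merkle root,
# then derive the spot-check indices FROM that root, so the prover
# cannot know them before committing.
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Naive Merkle root; assumes a power-of-two number of leaves."""
    layer = [h(x) for x in leaves]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

trace = [f"step-{i}".encode() for i in range(1024)]  # stand-in computation trace
root = merkle_root(trace)

# Derive 500 pseudo-random indices from the root; tampering with any
# leaf changes the root and therefore re-rolls every index.
indices = [
    int.from_bytes(h(root + i.to_bytes(4, "big")), "big") % len(trace)
    for i in range(500)
]
print(indices[:5])
```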
But unfortunately, merely spot-checking the computation at random has a fatal flaw: computation is not inherently robust. If a malicious prover flips a single bit somewhere in the middle of the computation, they can produce a completely different result, and a random-sampling verifier will almost never catch it. One deliberately inserted error yields a completely wrong output, yet is almost never captured by random checks.
Tasked with designing a zk-SNARK protocol, many people reach this point, get stuck, and give up. How can a verifier check every individual piece of the computation without looking at each piece individually? It turns out there is a brilliant solution.
Polynomials
Polynomials are a special class of algebraic expressions of the form $c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$, such as $x + 5$, $x^4$, or $x^3 + 3x^2 + 3x + 1$.
We can simply repeat the above game with $S$, gradually "lowering" the degree of the polynomial we care about, until the degree is low enough for us to check it directly.
The three most popular types of polynomial commitments are FRI, Kate (also known as KZG), and bulletproof-style commitments.
Kate is conceptually the simplest, but it depends on the very complicated "black box" of elliptic curve pairings.
FRI is cool because it relies only on hashes; it works by successively reducing the polynomial to lower and lower degree, performing random sampling checks at each step and using Merkle branches to prove equivalence.
To keep individual values from blowing up in size, we do all of our arithmetic and polynomial operations over a finite field (usually arithmetic modulo some prime).
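For instance, here is a small Python sketch of evaluating a polynomial over a finite field: every intermediate value stays below the modulus, no matter how large the input.

```python
# All arithmetic modulo a prime: intermediate values never exceed p.
p = 2**31 - 1  # a small prime modulus (real systems use larger fields)

def eval_poly(coeffs, x, p):
    """Evaluate c0 + c1*x + c2*x^2 + ... mod p via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

print(eval_poly([5, 0, 3], 10**9, p))  # 3x^2 + 5 at a huge x, still < p
```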
Polynomial commitments lend themselves naturally to privacy protection: because the proof is already far smaller than the polynomial, the commitment can only expose a little information about the polynomial, and by adding some randomness to the polynomial we can reduce the exposed information from "a little" to "zero."
What problems are still under research?
- Optimizing FRI: Starkware and others are researching many FRI optimizations, involving carefully chosen evaluation domains, "DEEP-FRI," and a series of other techniques to make FRI more efficient.
- Better methods for encoding computations as polynomials: finding the most efficient way to encode complex computations involving hash functions, memory access, and other features as polynomial equations remains a challenge. There has been great progress here (e.g., see PLOOKUP), but we need more, especially if we want to encode general-purpose virtual machine execution as polynomials.
- Incremental verifiable computation: it would be great if we could efficiently keep "extending" a proof while a computation continues. This is valuable in the single-prover case, and also in the multi-prover case, especially for blockchains where different participants create blocks; see Halo for recent work in this area.
- Original link: https://vitalik.ca/general/2021/01/26/snarks.html