Bitcoin: A Peer-to-Peer Electronic Cash System (Translation + Annotations)

Author: Satoshi Nakamoto Translator: Jin Xiao Abstract: A purely peer-to-peer version of electronic cash allows online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide a partial solution to this scenario, but if a trusted third party is still needed to prevent double spending, then existing solutions are still lacking in major benefits (not good enough). This network timestamps transactions by hashing them into an ongoing chain of proof-of-work (PoW) based on random numbers. The longest chain not only provides proof of the sequence of events observed but also proves that this chain comes from the largest pool of CPU computing power. As long as the majority of CPU power is not controlled by nodes with malicious network properties, they will grow to become the longest chain and surpass those attackers.

The network itself requires very little structure. Information is broadcast in the most efficient way, and nodes can leave or join the network at any time. These nodes accept the longest PoW chain as proof of what happened while they were away from the network. The abstract points out that existing technology makes sending a transaction via P2P reliable, which is done through digital signatures, implying that Bitcoin uses digital signatures to ensure transaction delivery. However, it then points out that digital signatures cannot solve double spending because a digital signature can only guarantee that the item belongs to the sender, but there is no way to ensure that the item, once sent to one person, will not be sent to others since data can be copied. The existing solution is for both the sender and receiver to trust a third party as an intermediary. This third party records that A has an item and gives it to B, completing a transaction. This means that this third party has absolute authority, and A and B can only transact because they "trust" this third party. It then gives a broad introduction to the blockchain system, highlighting several key points: timestamps, hashes as identifiers, the use of hashes in proof of work, and the chain coming together. Providing the sequence of events indicates that blocks are "records of history," and the chain coming from the largest CPU power indicates that it is unique and cannot be overturned under the rules of CPU proof of work. Finally, it states that the blockchain structure is streamlined, and nodes are flexible, implying that this is a smart system of multi-node intelligent cooperation (even though each node is limited in intelligence, all nodes follow a simple set of rules, allowing the entire system to exhibit tremendous power).

Introduction Most trade on the internet relies on a financial institution to provide third-party trust for electronic payments. Although this system works well for most transactions, it still faces inherent flaws based on trust models. Complete irreversible transactions are not truly possible because financial institutions cannot avoid dispute resolution. The cost of coordinating mediation increases transaction costs, limits the minimum practical transaction size, and cuts off the possibility of small, ad-hoc transactions. Furthermore, there are greater costs associated with irreversible payments for irreversible services. Because of the possibility of revocation, the need for trust is dispersed.

Merchants need to be vigilant about their customers, so they collect more information than necessary. A certain percentage of fraud is seen as possible. These costs and uncertain transactions can be avoided when a person is transacting with physical cash, but there is no existing mechanism to ensure that transactions are conducted through a communication channel without a trusted party.

This section reflects that blockchain is designed to solve the trust issue, as the existence of third-party institutions creates a naturally occurring problem of trust. Existing mechanisms incur significant costs in maintaining this trust. On the other hand, blockchain embodies irreversibility and immutability, indicating that these two characteristics are inherent to blockchain. More precisely, this reflects the disruptive reason for the emergence of blockchain. Looking back at current technological developments, the computer has led the third information revolution, enabling information to flow quickly across networks. However, while the internet allows for rapid information flow, it cannot achieve material transfer similar to that in the real world. This is because information is virtual, while material in reality is a tangible entity. It is precisely because the internet is virtual that if an object is simulated in a virtual world, it can be copied without limits, but it is impossible to simulate the transfer of material in reality. For example, if A wants to give B an item, in the real world, it is a tangible transfer of an item, while in the virtual world, it is a record of a message. The question arises: is this message recorded by A, by B, or by a third party? Clearly, whether recorded by A or B, it is impossible for both A and B to simultaneously agree, as they can both privately alter the record and breach the contract, leading them to seek a third party for assurance (such as a notary or intermediary). A and B both entrust their trust to this third-party institution for assurance, allowing the entire system to operate together.

Thus, current simulations introduce a trusted third-party institution, where people entrust their trust to this third party and believe that this third party guarantees the uniqueness and transferability of the simulated item (non-replicable). Whether this third party is truly trustworthy is the cost of operating this system. This is the "natural flaw" of the current system. Therefore, if one wants to use the internet to simulate the transfer of material in reality, linking real-world items to virtual items without needing a third party as a guarantor of this transfer, then the blockchain mechanism must be introduced. Blockchain allows material transfer to be as quick and convenient as information flow, while being guaranteed by the collective participation of the entire network (the collective assurance of all participants is equivalent to a naturally existing non-falsifiable guarantee, unless 51% of the participants collude to breach the contract). I believe this is the core reason why "blockchain" technology is expected to lead the "fourth technological revolution."

What is now needed is an electronic payment system based on cryptographic proof to replace trust, allowing any two parties to transact directly without needing a trusted third party. Computationally guaranteed irreversible transactions will protect sellers from fraud, and conventional escrow mechanisms can easily be implemented to protect buyers. In this paper, we propose a solution to the double spending problem using a P2P distributed timestamp service to generate computational proof of transaction order. The entire system is secure as long as honest nodes control more CPU computing power than colluding attack nodes. It points out that, in fact, the trust issue has shifted from the original state coercion (banks), corporate sizes (WeChat, Alipay), or other trust transfers to the mathematical system of "cryptographic encryption," becoming a question of whether to believe that mathematics is reliable.

However, the wording in the latter part actually implies that the blockchain system is inherently more advantageous for the payer than for the payee, indicating that the payee needs to be concerned about the payer's attacks (double spending). Finally, it states that blockchain solves double spending, with the basic topology being P2P, and timestamps being the method for confirming transaction order.

Transaction We define an electronic coin as a chain of digital signatures. Each owner transfers the coin to the next person by digitally signing (private key signing) the hash of the previous transaction and the public key of the next person, appending this signed result to the end of the coin. A recipient can verify this signature to ensure the chain of ownership. This statement in the original text is somewhat counterintuitive. The original text states, "We define an electronic coin as a chain of digital signatures," meaning "define coin as chain." This is a strange concept because intuitively, a coin should be individual, while a chain is a series, making it difficult to relate a coin to a chain.

However, Bitcoin is essentially a chain, formed by a series of transactions in the order they occur, while the coin itself does not exist; the coin is inferred from the transactions on the chain and does not appear directly. At the same time, according to the previous description, giving a coin to the next person is done through digital signatures, meaning that the source and destination of the coin are proven through public and private keys. Suppose A wants to give B a coin; this process becomes a transaction (as shown in the middle Tx of the diagram), which records how much coin B receives and B's public key (indicating the destination), while also providing A the ability to operate on the previous transaction (for example, the first Tx in the diagram, which records that X gave A some money, meaning A can operate on the output of the transaction from X to A). After hashing these two together, the payer A signs this hash with their private key and adds it to the transaction. This transaction is then widely broadcast to other places. At this point, as an observer (miner), one has all previous transactions before the first transaction locally, so when seeing this transaction (the middle Tx), one can use the public key of payer A (which refers to the first transaction, A's public key obtained from the first transaction) to verify the signature of this transaction (the middle transaction) to confirm whether this transaction was initiated by payer A (since only A holds the private key, only a signature signed with A's private key can be verified against the previous transaction, proving that A can operate on the first transaction and possesses the money transferred from X to A). This thus proves that this transaction was initiated by payer A.

Let’s take the middle Tx (transaction) as a reference. This Tx is initiated by Owner 1, as can be seen from the diagram, the hash input is the previous Tx (prev Tx) and the public key of the next recipient (next owner's pub key). This hash result is signed by Owner 1's private key and appended to the current transaction. Do not connect this with the next Tx yet; when Owner 1 finishes signing this Tx, they broadcast it. Once a miner packages this transaction into a block and chains it onto the honest chain (the longest chain), it indicates that this Tx is established. Therefore, at this point, Owner 2 will not know immediately whether the transaction was successful, but will need to wait a while to "check" (this is a simplistic way of saying it; it can also mean there is something to poll, and as long as the transaction is packaged, it will notify Owner 2) whether this transaction has been packaged; if so, it represents that the transaction was successful. The problem with this process is that the recipient cannot verify that the payer (one of the owners) has not double spent the currency.

A common solution is to introduce a trusted central authority or mint to check each transaction for double spending. After each transaction, the currency must be returned to the mint to issue a new coin, and only currency issued directly from a trusted mint can ensure it is not double spent. The problem with this solution is that the fate of all currency relies on the operation of this mint, and every transaction must go through them, just like a bank. The term "mint" here is initially difficult to understand; it is actually a very simple phenomenon, but this process is so simplistic that no one notices that the current system operates this way. For example, in today's online banking (or Alipay), when A transfers 100 yuan to B, the bank actually opens a transaction, deducting 100 yuan from A's account and adding 100 yuan to B's account. This -100 and +100 are essentially the bank's "mint," destroying A's 100 yuan and then producing 100 yuan for B (though the entire process is only reflected in the flow of information).

Because of this, it is necessary to ensure that this "currency" can only be issued from this "mint" (bank), and that this "mint" is mutually trusted by both parties, so that every transaction "must" go through this "mint" with no other avenues. The existing system is also guaranteed by the bank's "mint" (the only trusted party) to ensure that when A's account is -100, only B's account is +100 and C's account does not also become +100 (double spending). (Of course, if the bank colludes with A secretly, when A transfers 100 yuan to B, C's account could also be +100, resulting in double spending. While it seems unlikely to happen now, this is because the bank's own "supervision" system ensures it, such as daily reconciliations, internal bank supervision, etc. The cost of this supervision is the maintenance cost of this system. The reason the bank can know whether a sum of money has been double spent is that the bank possesses all "historical transactions" prior to the current transaction, and its method of verifying whether a transaction is legitimate is to check whether the results of all previous transactions meet the requirements of the current transaction. We need to give the payee a way to know that previous owners have not signed earlier transactions.

For this purpose, the earliest transaction is crucial, so we do not need to worry about whether subsequent transactions attempt to double spend. The only way to ensure the existence of a transaction is to have the ability to query all transactions. In a mint model, this mint possesses all transactions and decides which transaction arrived first.

To accomplish this without a trusted party, transactions must be publicly announced, and we need a system that allows all participants to reach consensus on a single chain of historical order (which they were received).

The payee needs to prove that during each transaction period, the majority of nodes agree that this was the first received. It points out that first, the payer is the one who would commit double spending, while the payee is concerned about whether the payer will do this (obviously, because the benefit is transferred from the payer to the payee). "The earliest transaction is the only important one" means that the basis for double spending is that a transaction must have already occurred, and then the payer wants to ignore this transaction and initiate another transaction using the currency that has already been used. Therefore, this earliest transaction refers to "the transaction that has already occurred." The mint does not care which of the two (or more) transactions the payer believes occurred first; it only cares whether the transaction initiated by the payer is "legitimate" (i.e., whether this person has already used this money). Therefore, it is the mint that confirms which transaction is first, not the payer who decides which is first. At this point, if this mint does not belong to any third party, then this transaction must be "widely announced," allowing everyone involved to know that this event has occurred and that everyone knows all previous "historical events" to verify whether the currently broadcast transaction is legitimate. However, we know that achieving consensus among all participants is a very difficult task, which is the problem that blockchain solves. The last sentence states that the payee needs a majority (over 51%) to consider that transaction legitimate for it to be deemed legitimate.

Timestamp Server We propose a solution starting from a timestamp server. The job of a timestamp server is to take a hash of a block formed from a set of data (items), timestamp it, and widely broadcast this hash, similar to posting in a newspaper or global news group (Usenet). This timestamp proves that at that time, this data must have existed; obviously, to obtain this hash (it can only be done at that time). Each timestamp includes the previous timestamp in its hash, forming a chain, with each new timestamp reinforcing all previous timestamps. This section begins to explain the work done by miners. This set of data refers to many transactions, which are then packaged into a block, timestamped, and hashed to ensure the chronological order, which is to complete the "ensuring historical order" introduced in the second part. Because obtaining this hash is linked to the timestamp, this hash indicates time.

Since these blocks form a chain, and the growth of blocks is proof of CPU power (which will be described later), the relationship is that hash -> reflects the timestamp -> hashes are linked together -> a new block on the chain requires CPU -> thus, the previous hash must be correct (constantly reinforcing credibility).

This diagram illustrates the connection between time and blocks, blocks and hashes, and how these hashes are linked together due to the nature of the blockchain, making the previously linked timestamps immutable.

4. Proof of Work#

A P2P-based distributed timestamp server will require a proof-of-work system similar to Adam Back's Hashcash, rather than using newspapers or global news groups. The proof-of-work mechanism introduces scanning for a value when hashing, for example, starting from a string of 0 bits under SHA-256. The average work required is the exponent of the number of required 0 bits, and it can be verified by executing a single hash. This section is another core part of the Bitcoin blockchain, widely known as PoW. In the previous chapters, it was mentioned that first A and B had a transaction, then this transaction was broadcast. Since there is no third party, the participants are all nodes in the network. These nodes, upon receiving this transaction and many others, package them into a block and hash it. Now the question arises: if these observing miners each receive many transaction messages and package their own blocks, how can consensus be reached across the network?

Assume miners C, D, and E all received transaction messages, and because the transaction messages they received are not completely consistent and the times they received them are also not completely consistent, the resulting block hashes will certainly differ greatly. Anyone with P2P experience knows that at this point, it is necessary to ensure that C, D, and E reach consensus so that the entire network recognizes the "historical order of events" that occurred in the blockchain. Satoshi Nakamoto's method here is to introduce PoW to allow C, D, and E to use "CPU power" to probabilistically succeed in competing for the right to record the block (commonly known as "accounting rights"). The term "scan" in the original text refers to brute-force enumeration. For SHA-256, achieving a certain condition can only be done through brute-force enumeration, and clearly, the speed of "brute-force" is determined by CPU computing power, while the scale of brute-force is the "difficulty" of this PoW mechanism.

Enumerating this value is a "probabilistic" event (imagine how one would brute-force crack an unknown password for a compressed file), but because it requires a considerable number of enumerations, the final average result is indeed measurable (this is the expected value of probability; in some mining pools, this is approximately referred to as the lucky value? Not too sure). For our timestamp network, we implement this proof of work by adding a nonce to the block until a value of 0-bits required for the block's hash is found.

Once CPU efficiency is spent as proof of work, this block cannot be changed to redo the corresponding work. After the block is chained to the one following it, changing this block will require the work needed for all blocks following this one. The way to "guess" the value that meets the requirements for this block is to change the nonce here. (Because for hash functions, any small change will result in a completely different outcome. The way to obtain a satisfactory value is to continuously change the nonce so that the entire block's hash meets the difficulty requirement.) Therefore, this nonce represents the labor (CPU power, electricity costs) "proof" expended by this miner to seize the "accounting rights" for this round of blocks. Thus, the nonce can be directly viewed as "proof of labor (work amount)." The latter part explains that if there are changes in the block, then this hash will change, requiring the block's nonce to be recalculated. In other words, to change the previous block, one must compute all the blocks required for this round (exerting all necessary labor) to accomplish this.

This diagram is similar to the previous one, but here it emphasizes the content of each block, indicating that a block contains the hash of the previous one (Prev Hash), and the Nonce is a part of the block. Once the Nonce, Prev Hash, or Tx is changed, the hash of this block will also change, and all subsequent blocks will need to be changed. Proof of work also solves the problem of deciding representation in majority decisions. If the majority only votes based on one IP (one-IP-one-vote), this could be undermined by anyone who can allocate many IPs. Proof of work is essentially one CPU one vote. The main (majority) decisions are represented by the longest chain, which has the most work invested in it. If a majority of CPU power is controlled by honest nodes, then the most honest chain will grow the fastest and surpass any other computational chain.

To change a past block, an attacker would need to redo the proof of work for that block and all blocks following it, and then catch up and surpass the work of all current honest nodes. We will later demonstrate that the likelihood of a slow attacker catching up will decrease exponentially as subsequent blocks are added. This section adds that PoW is essentially a collective decision-making representation problem, equivalent to the problem of seizing accounting rights, because seizing accounting rights means selecting a representative.

This section also mentions why Satoshi Nakamoto did not consider using IP as a decision-making factor: he believed that using IP as voting rights is much easier than using CPU as voting rights (whether this is the best choice is uncertain, but it currently seems to be the best; however, current CPU voting rights are controlled by "mining pools," which has led to some instability to a certain extent (but still much better than controlling IP). Of course, some mining pools publicly declare that their computing power does not exceed a certain value to maintain the stability of the entire system (I will add the link to this news once I find it)).

The article then emphasizes again that to change a block, an attacker will incur a significant cost, and the honest chain, due to the rules it advocates, will naturally become the longest chain in the game (which will be described later). To compensate for the continuously increasing hardware speed and the varying interests of operating nodes over time, the difficulty of work is determined by a moving average target, which is the average number of blocks per hour (this average target). If the increase in computing power is too rapid, the difficulty will increase.

This section points out that the difficulty of PoW increases with the overall system's difficulty because the computing power of computers is constantly improving. Consider the evolution from CPU mining to GPU mining to ASIC mining; this is the brilliance of PoW. However, this also brings a reason for skepticism about Bitcoin: the current computing power is too high due to the large number of participants, and the difficulty has increased significantly, with overall computing power rising sharply in recent years. Therefore, if several years later, the incentives generated by Bitcoin (mentioned later) cannot bear such high computing costs, will it lead to a cliff-like drop in computing power? This could result in a collapse of Bitcoin's credibility (those who can control computing power could attack), leading to a sudden crash of Bitcoin.

5. Network#

The steps to operate this network are as follows:#

New transactions are broadcast to all nodes.
Each node collects the new transactions into a block.
Each node works on its block to find the proof of work.
When a node finds the proof of work for this block, it broadcasts this block to all nodes.
Nodes will only accept this block if all transactions in this block are valid and have not been spent.
Nodes proceed to the next block's proof of work and use the hash of this block as the previous hash for the next block to indicate acceptance of this block.
Nodes always consider the longest chain to be correct and continuously work to extend it. If two nodes simultaneously broadcast different versions of the next block, some nodes will first receive one of the blocks, and in this case, they will work for the first block they received, but will keep the other block in case it becomes longer. This tie will be broken when the next proof of work is discovered, at which point one branch will become longer. Nodes working on other branches will switch to this longest branch.
This section implicitly indicates Bitcoin's "maximum rule," which is that everyone defaults to the longest chain being correct. If this cannot be guaranteed, it is clear that the entire system cannot function. The correctness of this longest chain is guaranteed by the "incentive mechanism" described later, creating a game scenario: if one wants to create a blockchain application, then either everyone must agree on a rule (coercive force), or some method must be used to allow people to voluntarily agree on a rule (incentives). However, because competing for blocks incurs costs, how to conduct a game of "coercive force/incentives" regarding this cost is key to whether a blockchain can grow healthily.
The latter part explains the key to a block being recognized: if miners (the majority) regard this block as the previous block and incorporate its hash into a new block and work for the new block, then this block is recognized. Thus, it indicates that only when the next block is produced is the current block "valid." This is quite critical because all attacks will target this issue, and the imbalance between the payee and payer also arises from this.
This section also reflects the essence of game theory in blockchain applications.
The latter part describes a special case, stating that due to the characteristics of distributed networks, information communication is not real-time, so some nodes may recognize one block while others recognize another block, causing the blockchain to fork. However, it explains that because all nodes believe only the longest chain is the only chain, this situation will be broken after several blocks in multiple chains, resulting in block reorganization.
Broadcasting new transactions to all nodes is unnecessary. As long as transactions reach many nodes, they will enter a block (in the longest block before long). Block broadcasting has the ability to tolerate lost information. If a node does not receive a block, it will discover the missing block when it receives the next block and request the missing block.
This section is a supplement to the previous one, explaining the importance of 51% (the majority). This majority will not be represented by an intuitive number but will gradually emerge as everyone recognizes the longest chain over time.
At the same time, it mentions a key point: "Broadcasting new transactions to all nodes is unnecessary," clearly stating that transactions do not need to be flooded. Because the information of the transaction itself can spread in the blockchain system, not only can transactions be spread, but blocks can also be spread. The spreading of blocks is a "rule" in the system (to reach the longest chain). I believe this can bring some new thoughts in distributed systems.

6. Incentive Mechanism#

According to convention (agreement), the first transaction in a block is a special transaction, where the creator of this block owns a new currency. This provides an incentive mechanism for nodes to support this network and offers a way to initialize the distribution of currency into the entire system. Since there is no central authority to issue them (currency), the stable increase of a certain amount of new currency is similar to gold miners spending resources to mine gold and introduce it into the circulation system. In our context, CPU time and electricity are the costs incurred. This section finally addresses a question that has been avoided: where does the currency come from? In the Bitcoin system, Satoshi Nakamoto has set the total amount to be 21 million bitcoins. The production of these bitcoins is halved every 21,000 blocks (currently approximately every 10 minutes a block is produced, expected to reach 0 around the year 2140). The method of producing bitcoins is to grant a certain amount of currency to the person who seizes the accounting rights, thus simultaneously solving the issuance of currency and the reward for miners (just like gold miners in reality).

This method of granting currency out of thin air is that the first transaction of each block is a special transaction, which has only an input, and the output for this input (the output of the previous transaction) is empty (to be mentioned later). As stated in the notes of Chapter 2, coins do not exist; coins are inferred from transactions. Just like A giving B 100 yuan, when recording this information, it can record A's "account -100, B's account +100" or record "A gave B 100 yuan" as a transaction information.

The blockchain chooses the latter. This currency that appears out of thin air is a special transaction information. This incentive mechanism can also reward miners through transaction fees. If the output value of a transaction is less than its input value, then this difference is the transaction fee, which is added to the reward of the block containing this transaction. Once a fixed amount of currency (all currency) enters this circulation, this incentive mechanism can completely transform into transaction fees, and this system can completely avoid inflation (the total amount of currency is fixed, with no new currency issued). This part explains another aspect of the incentive mechanism, which is that besides the out-of-thin-air reward, transaction fees are also a source of income for miners. The incentive mechanism may help encourage nodes to remain honest.

If a greedy attacker can gather more CPU power than all honest nodes, they face a choice: either use this power to double spend and deceive others, or use the power to generate more currency. They should find that following the rules allows them to gain more benefits, as these rules support them to have more currency than others who join in, rather than destroying the system and harming their own wealth. This reflects the game relationship discussed earlier.

7. Reclaiming Disk Space#

Once a currency's latest transaction is buried into enough blocks, the transactions consumed before this can be discarded to save disk resources. To ensure that the block's hash is not compromised, transactions are hashed into a Merkle Tree, where only the root node is included in the block's hash. Old blocks can be compressed by stubbing off branches of this tree. The internal hashes do not need to be saved.

This section addresses a solution to the problem of continuously generated blocks in the blockchain system. As of the cutoff date (January 28, 2017), the total number of blocks has exceeded 90GB. Although storage is becoming increasingly inexpensive, it is impossible for the average public to use it because information continues to expand. This reflects the brilliance of the blockchain system: it does not store transactions but uses the Merkle Hash Tree method to store the Root Hash, achieving "zero-knowledge proof." Individuals do not necessarily need this block; having the "hash" (index) of this block is sufficient, with IPFS, public nodes, and highly trusted nodes helping to store these blocks. "Zero-knowledge proof" ensures that the block is absolutely correct and not forged.

A block that excludes transactions will be approximately 80 bytes in size. If we assume a block is generated every 10 minutes, then 80 bytes * 6 * 25 * 365 = 4.2MB per year. In 2008, the typical memory capacity of PC systems was 2GB, and according to Moore's Law, it is predicted to grow by 1.2GB each year, making it feasible to store all block headers in memory.

8. Simplified Payment Verification#

Verifying payments does not require running all network nodes.

A user only needs to keep a copy of the block headers of the longest proof-of-work chain. This chain only needs to query network nodes until they are confident they possess the longest chain and can connect through the Merkle branches to the block containing the timestamped transaction (the user's transaction). They cannot verify this transaction themselves (because there is only a hash), but by connecting to the position in the chain, they can see that a network node has previously accepted it (the transaction), and the blocks added afterward further prove that the network has accepted it. The issue brought about by reclaiming disk space leads to the problem of simplified payment verification, as some nodes may no longer hold all block information, creating a game of using space to exchange for verification convenience. However, it will not compromise the security of the information. As long as one holds the hash as an identifier, any node can always request the original information from other nodes.

This diagram indicates that as long as there is a hash chain, it suffices.

In this scenario, if honest nodes control the network, then this verification is reliable, but when a power-dominant attacker controls the network, it becomes easier to attack. Because when nodes in the network can verify themselves, this simplified method can be deceived by fabricated transactions when the attacker can continuously ensure they possess more than the entire network's computing power.

One protective strategy is to accept warnings from network nodes when these nodes detect an illegal block, prompting user software to download the entire problematic block and warn the transaction to check for consistency. Commercial institutions that frequently receive payment information may still run their full nodes to maintain greater independent security and faster verification. This mechanism feels somewhat like a patching method. However, it seems to be a solution to reduce hard disk storage, representing a game result between storage and security convenience. There may be opportunities to write articles here, perhaps indicating an obstacle to the promotion of blockchain applications. Because if too few nodes retain the blocks, it could cause problems. While it cannot steal or tamper with the network, it could lead to a collapse. (Of course, this does not refer to the Bitcoin network but to some smaller blockchain networks.)

9. Combining and Splitting Value#

While it is possible to handle currency independently, it is unwise for each penny in a transfer to become a separate transaction (meaning coins are not combined from unit denominations). To allow value to be split and combined, transactions include multiple inputs and outputs. Normally, there will be one input from a large transaction in the front or include many smaller total inputs, and then there will be two outputs: one for payment and the other for change, regardless of how many, all returned to the purchaser.

This section finally discusses the composition of a transaction (Tx). The previous discussions have elaborated a lot; just grasping one key point is new: a transaction is verified by previous transactions, plus the amount transferred and the destination. Inputs are the outputs of previous transactions. By verifying all inputs, one can determine whether the payer's balance is greater than the amount to be transferred.

All Tx associated with inputs are the previous "historical information," and nodes can retrieve results from their blocks. The first transaction of each block (the incentive transaction) is equivalent to having only an input and no output. An additional point to note here is that each transaction must be fully spent; it is simply the sum of all inputs, with part given to the payee, and the change returned entirely to the purchaser (recalling the workflow of the mint mentioned earlier).

This mechanism has a natural advantage in number splitting. Existing currencies always have a minimum unit as a "base" unit (for example, the Chinese yuan: 1 fen), but Bitcoin has no such restriction; it only cares about the difference and not the smallest unit. Thus, this is the combination and splitting of value.

It should be noted that the entire distribution (fan-out) relies on several transactions, and these transactions rely on more. This is not a problem. There is no need to extract a complete independent copy of transaction history. This part is slightly difficult to translate; I am not very clear on what "fan-out" refers to here.

10. Privacy#

The traditional banking model achieves levels of privacy by restricting access to information for relevant participants and third parties. When all transactions need to be publicly broadcast, this method cannot be used. However, privacy can still be maintained by breaking the information flow elsewhere: by anonymizing public keys. The public can see that one person sent a certain amount to another person, but no information can link the transaction to the individual. This is similar to the level of information released in stock trading; in stock trading, the time and individual transactions publicly released are recorded on the "tape," but it does not disclose who participated.

This section points out that the use of public and private key mechanisms is a form of pseudo-anonymization.

As an additional firewall (preventive mechanism), using a new key pair for each transaction can ensure that these keys are not linked to one person. The connections of some multi-input transactions are still unavoidable because this point (multiple inputs) reveals that these inputs belong to the same person. The risk is that if any one key's owner is discovered, then other transactions associated with this person will also be revealed.

The original text recommends this approach for good reason. Because elliptic curve encryption can be broken by quantum brute force, if the public transfer keys generated in the entire system are disclosed.

Although cryptographically, one should not derive a private key from a public key, there are still exceptions. Therefore, Satoshi Nakamoto strongly recommends using a new pair of keys for each transfer (because once a transaction is used as inputs, the transaction used as inputs is no longer useful (note the change mechanism mentioned in 9, where each must be fully spent)), ensuring that each transaction uses a new public and private key. The latter part addresses the issue of the correspondence between public and private keys and real individuals (loss of anonymity). Changing keys can also prevent this from happening, but it should be noted that because all records are publicly traceable, if any one public-private key is linked to a person, then all associated public-private keys can be linked to this person.

11. Computation#

We consider a scenario where an attacker attempts to generate a replacement chain longer than the current honest chain. Even if this is achievable, it does not imply that the system is under arbitrary control, such as creating value out of thin air or taking money that does not belong to the attacker. Nodes will not accept invalid transactions as payments, and the most honest chain will never accept a block containing invalid information. An attacker can only attempt to change their own transaction to reclaim money they recently spent. The competition between the most honest chain and the attacker's chain can be viewed as a Binomial Random Walk. Successful events occur when the most honest chain expands by one block, leading to a +1 lead, while failure events occur when the attacker's chain expands by one block, resulting in a -1 gap. The probability of the attacker catching up given a loss can be viewed as a Gambler's Ruin problem. Assume an unlimited credit gambler starts from a loss and potentially attempts to catch up to break even an unlimited number of times. We can calculate the probability of them reaching break even, which is the probability of an attacker catching up to the most honest chain, as follows:

P = the probability that the honest chain discovers the next block

q = the probability that the attacker discovers the next block

qz = the number of blocks the attacker has spent catching up

Assuming p > q, the probability of the attacker catching up will decrease exponentially as the number of blocks increases. Because the odds are against them, if they are not lucky enough to catch up early, their chances of success will diminish as they fall further behind. Now we consider how long it takes for a new transaction to be sufficiently confirmed so that the sender cannot change the transaction. We assume the payer is an attacker who wants the payee to believe they have already paid and then reclaim the money after payment; the payee will be notified with a warning when this occurs, but the payer hopes this happens after a long time.

The receiver generates a new key pair and gives the public key to the payer shortly before the payer's payment. This prevents the payer from preparing a blockchain in advance of the time, working continuously until they are lucky enough to reach the front, and then executing the transaction at that time. Once this transaction is sent, this dishonest sender begins secretly working on a parallel chain containing a version that replaces their transaction. The receiver waits until this transaction is added to a block and then z blocks are added afterward. They do not know the exact number of blocks the attacker has completed (how many blocks have been done), but assuming the honest chain takes an expected average time to produce a block, the attacker's potential progress is a Poisson distribution, with the expected value of the distribution being

Convert to C code:#

#include<math.h>
double AttackerSuccessProbability(double q, int z){
 double p= 1.0 - q;
 double lambda = z * (q / p);
 double sum = 1.0;
 int i, k;
 for (k = 0; k <= z; k++)
 {
 double poisson = exp(-lambda);
 // Run to get results, we can see the probability decreases exponentially with z.
 q=0.1
 z=0 P=1.0000000
 z=1 P=0.2045873
 z=2 P=0.0509779
 z=3 P=0.0131722
 z=4 P=0.0034552
 z=5 P=0.0009137
 z=6 P=0.0002428
 z=7 P=0.0000647
 z=8 P=0.0000173
 z=9 P=0.0000046
 z=10 P=0.0000012
 q=0.3
 z=0 P=1.0000000
 z=5 P=0.1773523
 z=10 P=0.0416605
 z=15 P=0.0101008
 z=20 P=0.0024804
 z=25 P=0.0006132
 z=30 P=0.0001522
 z=35 P=0.0000379
 z=40 P=0.0000095
 z=45 P=0.0000024
 z=50 P=0.0000006
 // Solve for z value when P < 0.1%:
 for (i = 1; i <= k; i++)
 poisson *= lambda / i;
 sum -= poisson * (1 - pow(q / p, z - k));
 }
 return sum;
}

P < 0.001 q=0.10 z=5 q=0.15 z=8 q=0.20 z=11 q=0.25 z=15 q=0.30 z=24 q=0.35 z=41 q=0.40 z=89 q=0.45 z=340 This indicates that waiting for 6 blocks is an absolutely safe method. 12. Conclusion We have proposed a non-trust-based electronic transaction system.

We started from the usual currency framework composed of digital signatures, which provides strong control over ownership relationships but does not solve the double spending problem. To address this issue, we proposed a P2P network using a proof-of-work mechanism to record a public transaction history. This mechanism becomes computationally impossible for an attacker to change when the majority of nodes control the main CPU power. The network is robust in its unstructured simplicity.

Nodes require very little collaboration to work together. They do not need to be authenticated because information does not need to be routed to a specific location but only needs to be sent out as widely as possible. Nodes can leave or rejoin the network at will, accepting the proof-of-work chain as proof of what happened while this node was away.

They vote based on CPU power, indicating acceptance of legitimate blocks by extending this block and rejecting illegal blocks by refusing to work after this block. Any necessary rules and incentives can be enforced through this consensus mechanism.