A Brief History of the GFW

The Internet Archive, archived on April 21, 2017: https://web.archive.org/web/20170421092911/http://blog.renren.com/share/200487056/5148854419

Timeline
On September 22, 1998, the Ministerial Office of the Ministry of Public Security studied and approved a decision to carry out the national informationization project for public security work, the "Golden Shield Project," across public security organs nationwide.

On April 20, 1999, the Ministry of Public Security submitted the project initiation report and project proposal for the Golden Shield Project to the National Development and Reform Commission.

In June 1999, the National Computer Network and Information Security Management Center was established as a bureau-level institution.

From 1999 to 2000, Fang Binxing, who had taught for many years at Harbin Institute of Technology, was transferred to serve as the deputy chief engineer of the National Computer Network and Information Security Management Center.

On December 23, 1999, the State Council issued a document establishing the National Informationization Work Leading Group, with Vice Premier Wu Bangguo as the group leader. Its first subordinate agency, the Office for Computer Network and Information Security Management, was located at the already established National Computer Network and Information Security Management Center, replacing the Inter-Ministerial Coordination Group for Computer Network and Information Security Management, to organize and coordinate the network security management of departments such as the Ministry of Public Security, Ministry of State Security, Confidentiality Bureau, Commercial Password Management Office, and Ministry of Information Industry.

From 2000 to 2002, Fang Binxing served as the chief engineer, deputy director, and professor-level senior engineer at the National Computer Network and Information Security Management Center.

On April 20, 2000, the Ministry of Public Security established the leadership group and office for the Golden Shield Project.

In May 2000, Project 005 began implementation.

In October 2000, the Ministry of Information Industry established the Computer Network Emergency Response Coordination Center.

On December 28, 2000, the 19th meeting of the Standing Committee of the Ninth National People's Congress passed the "Decision on Maintaining Internet Security."

In 2001, Fang Binxing's "Computer Virus and Its Prevention Technology" won a third prize in national defense science and technology, with Fang ranked first among the contributors.

In 2001, Fang Binxing received a special government allowance from the State Council and was awarded the title of "Outstanding Individual for Significant Contributions in Key Projects of the Ministry of Information Industry" by the Ministry of Information Industry, along with the title of "Advanced Individual" jointly awarded by the Organization Department, Publicity Department, Central Political and Legal Affairs Commission, Ministry of Public Security, Ministry of Civil Affairs, and Ministry of Personnel.

On January 19, 2001, the Shanghai Branch of the National Computer Network and Information Security Management Center was established, located on the 6th floor of 508 Zhongshan South Road, Huangpu District, Shanghai. The Shanghai Branch of the National Computer Network Emergency Technology Processing Coordination Center is a public institution fully funded by the central budget and directly under the Ministry of Industry and Information Technology.

On April 25, 2001, the "Golden Shield Project" was approved for project initiation by the State Council.

In July 2001, the Office for Computer Network and Information Security Management approved Harbin Institute of Technology to establish the National Key Laboratory for Computer Information Content Security, led by Hu Mingzeng and Fang Binxing.

On July 24, 2001, the Guangzhou Branch of the National Computer Network and Information Security Management Center was established, located at 2 and 4 Jianzhong Road, Yuexiu District, Guangzhou.

On August 8, 2001, the National Computer Network and Information Security Management Center established the National Computer Network Emergency Response Coordination Center, abbreviated as CNCERT/CC.

On August 23, 2001, the National Informationization Leading Group was re-established, with Zhu Rongji, a member of the Standing Committee of the Political Bureau, serving as the group leader.

On November 28, 2001, the Shanghai Internet Exchange Center of the National Computer Network and Information Security Management Center was established. It provides "Internet exchange services, data exchange for the East China region of the Internet backbone, data traffic monitoring and statistics, inter-network communication quality supervision, maintenance and operation of exchange center equipment, inter-network interconnection cost calculation, and inter-network interconnection dispute coordination," located at 508 Zhongshan South Road, Huangpu District, Shanghai.

On November 28, 2001, the Guangzhou Internet Exchange Center of the National Computer Network and Information Security Management Center was established, located at 204 Jianzhong Road, Yuexiu District, Guangzhou.

In December 2001, the comprehensive building of the National Computer Network and Information Security Management Center began construction in Beijing.

On December 17, 2001, the Hubei Branch of the National Computer Network and Information Security Management Center was established.

In 2002, Fang Binxing served as a visiting researcher, doctoral supervisor, and chief scientist of information security at the Institute of Computing Technology, Chinese Academy of Sciences. From 2002 to 2006, Fang Binxing served as the director, chief engineer, and professor-level senior engineer at the National Computer Network and Information Security Management Center, later promoted to honorary director.

On January 25, 2002, it was reported that "the Shanghai Internet Exchange Center of the National Computer Network and Information Security Management Center was recently opened and put into trial operation, with four national-level interconnection units including China Telecom, China Netcom, China Unicom, and China Jitong being the first batch to connect. The access of China Mobile Internet is underway and is expected to become the fifth access unit soon."

On February 1, 2002, the Xinjiang Branch of the National Computer Network and Information Security Management Center was established.

On February 25, 2002, the Guizhou Branch of the National Computer Network and Information Security Management Center was established.

On March 20, 2002, multiple provincial branches of the National Computer Network and Information Security Management Center were established simultaneously.

On September 3, 2002, Google.com was blocked, primarily through DNS hijacking.

On September 12, 2002, the block on Google.com was lifted, but features such as webpage snapshots were subsequently blocked, using TCP session interruption.

In November 2002, the major national information security project "Large-Scale Broadband Network Dynamic Interruption System" (also rendered as "Large-Scale Broadband Network Dynamic Disposal System"), funded at 66 million yuan, won a second prize in national defense science and technology. Yun Xiaochun ranked first, and Fang Binxing ranked second. The Computer Network and Information Content Security Key Laboratory of Harbin Institute of Technology, the Network Technology Research Institute of Tsinghua University, and the Grid Computing Research Department of Tsinghua University participated.

From 2003 to 2007, Fang Binxing served as the director of the Internet Emergency Response Coordination Office of the Ministry of Information Industry.

On January 31, 2003, the major national information security project "National Information Security Management System" (Project 005), funded at 490 million yuan, won the first prize for national scientific and technological progress for 2002. Fang Binxing ranked first, Hu Mingzeng second, Tsinghua University third, Harbin Institute of Technology fourth, Yun Xiaochun fourth, Peking University fifth, and Zheng Weimin seventh; the Institute of Computing Technology, Chinese Academy of Sciences also participated.

In February 2003, the comprehensive building of the National Computer Network and Information Security Management Center in Beijing was completed.

On September 2, 2003, the National Computer Network Emergency Response Coordination Center was renamed the National Computer Network Emergency Technology Processing Coordination Center.

On September 2, 2003, the national "Golden Shield Project" meeting was held in Beijing, marking the full launch of the "Golden Shield Project."

In 2004, the major national information security project "Large-Scale Network Specific Information Acquisition System," funded at 70 million yuan, won a second prize for national scientific and technological progress.

In 2005, Fang Binxing served as a part-time professor, distinguished professor, and doctoral supervisor at the National University of Defense Technology.

In 2005, Fang Binxing was selected as an academician of the Chinese Academy of Engineering.

By 2005, "the system" had established four mirrored main systems in Beijing, Shanghai, Guangzhou, and Changsha, interconnected by a ten-gigabit network. Each system consisted of an 8-CPU multi-node cluster, with the operating system being Red Flag Linux and the database being Oracle RAC. By 2005, the National Computer Network and Information Security Management Center (Beijing) had already established a 384*16 node cluster for network content filtering (Project 005) and SMS filtering (Project 016). This system had mirrors in Guangzhou and Shanghai, interconnected by a hundred-gigabit network, capable of collaborative work or independent takeover.

On November 16, 2006, the first phase of the "Golden Shield Project" was officially accepted by the state in Beijing, designed for the Ministry of Public Security of the People's Republic of China, handling business related to public security management, foreign hotel management, immigration management, and public order management.

On April 6, 2007, the foundation stone for the server building of the Shanghai Branch of the National Computer Network and Information Security Management Center was laid, located at 5788 Yanggao South Road, Kangqiao Town, with an investment of 90.47 million yuan, "... is a national-level major project approved by the National Development and Reform Commission. Currently, only Beijing and Shanghai have established branch centers, and it plays an important role in safeguarding national information security."

On July 17, 2007, a large number of users using domestic email service providers experienced widespread issues with bounced and lost emails when communicating with foreign parties.

In December 2007, Fang Binxing was appointed president of Beijing University of Posts and Telecommunications.

On January 18, 2008, the Ministry of Information Industry decided to relieve Fang Binxing of his position as honorary director of the National Computer Network and Information Security Management Center and director of the Internet Emergency Response Coordination Office of the Ministry of Information Industry, "for other duties."

On February 29, 2008, Fang Binxing was elected as a representative of Anhui Province in the 11th National People's Congress.

On August 10, 2009, Fang Binxing strongly advocated for real-name registration on the internet at the "First China Internet Governance and Law Forum."

Institutional Relationships
The National Computer Network and Information Security Management Center (Security Management Center) is a unit directly under the former Ministry of Information Industry, now the Ministry of Industry and Information Technology.

The Security Management Center, the Office for Computer Network and Information Security Management of the National Informationization Work Leading Group, and the National Computer Network Emergency Technology Processing Coordination Center (CNCERT/CC, Internet Emergency Center) are in effect one organization operating under several names. For example, Fang Binxing's resume states that he "served as deputy chief engineer at the National Computer Network Emergency Technology Processing Coordination Center from 1999 to 2000," which subtly contradicts the later establishment date of the "Computer Network Emergency Processing Coordination Center" given in the timeline. In fact, the personnel of these institutions are basically the same. The Internet Exchange Center under the Security Management Center and the National Internet Exchange Center are different institutions. The provincial branches of the Security Management Center are generally attached to the local communications administration bureaus.

The main research strength of the Security Management Center comes from Harbin Institute of Technology, where Fang Binxing has a group of students, and from the well-connected Institute of Computing Technology of the Chinese Academy of Sciences. These two institutions are the main participants in three major national information security projects, and they continue to attract talent and to supply personnel and technology to the Security Management Center. After Fang Binxing was transferred to Beijing University of Posts and Telecommunications, the proportion of personnel from Harbin Institute of Technology gradually decreased while that from Beijing University of Posts and Telecommunications gradually increased.

CNCERT/CC's domestic "partners" include the China Internet Association, which hosts the China Internet User Anti-Spam Center, a shell organization with no real power; the National Computer Intrusion Prevention and Anti-Virus Research Center and the National Computer Virus Emergency Response Center, which are under the Ministry of Public Security and the Ministry of Science and Technology; the Illegal and Harmful Information Reporting Center, which is within the jurisdiction of the State Council Information Office; and the National Computer Network Intrusion Prevention Center, which is an institution of the Graduate School of the Chinese Academy of Sciences, also directly supporting CNCERT/CC.

Among the emergency support units of CNCERT/CC, the initial leader from the private sector was Green Alliance, which was later replaced by Venustech after Green Alliance's espionage case. The Security Management Center holds some administrative powers over qualification certification and access approval, which may be why private security companies are eager to join. However, private enterprises have not taken part in building the core national information security projects. Many peripheral projects of the Security Management Center are outsourced to private and foreign enterprises: for example, access restriction devices such as isolators are outsourced to Venustech as auxiliary or backup equipment, and the center has exchanges with such companies on network security monitoring.

GFW and the Golden Shield are unrelated
Astute readers should have sensed this from the timeline. In fact, GFW and the Golden Shield are unrelated, and there are many distinctions between the two.

The police system conducts network monitoring through the 11th Bureau of the Ministry of Public Security.

GFW is a sub-project of the "National Information Barrier Project." Its direct superiors are the National Informationization Work Leading Group and the Ministry of Information Industry, and it is a national defense project personally overseen by the Political Bureau. The project mainly covers monitoring and discovering harmful websites and information, locating IP addresses, reporting information on online confrontation, tracking harmful short messages, and blocking in a timely manner. Jiang Zemin, Zhu Rongji, Hu Jintao, Li Lanqing, Wu Bangguo, and others have inspected this project multiple times.

The "National Information Barrier Project" includes the "National Information Security Management System" project code-named 005, as well as the National Information Security 016 project, etc.

GFW mainly serves as a tool of the public opinion intelligence system, while the Golden Shield primarily serves as a tool of the police system. The main supporters of GFW are senior officials in propaganda work, such as Li Changchun and Zhang Chunjiang, with the initial demand coming mainly from the Political Bureau, the Political and Legal Affairs Commission, the Ministry of State Security, and the 610 Office; the main supporters of the Golden Shield are senior officials in the police system, with the demand coming mainly from the police. GFW faces outward, serving as a network customs; the Golden Shield faces inward, serving investigation and evidence collection. GFW was built quickly, at low cost, and works well; the Golden Shield took a long time to build, cost an enormous amount (over ten times as much as GFW), and has shown little effect.

GFW relies on the three national-level international gateway backbone exchange centers: traffic is mirrored by optical splitting from the backbone CRS and GSR routers to its own exchange centers for intrusion detection, and the system then extends to a number of routers placed at ISPs. Its locations are concentrated and its devices are few. The Golden Shield, by contrast, is the internal information network of the police: ubiquitous, with a massive number of devices.

GFW has strong research capabilities, with many of the country's top talents and laboratories in information security research serving it, such as the Information Security Key Laboratory of Harbin Institute of Technology, the Software Institute of the Chinese Academy of Sciences, the High Energy Institute, the Third Department of the General Staff of the National University of Defense Technology, the 9th Bureau of the Ministry of State Security, Beijing University of Posts and Telecommunications, Xidian University, Shanghai Jiao Tong University, North China University of Technology, Beijing Electronic Science and Technology Institute, the Army Engineering University, the Army Armored Corps Engineering Institute, the 30th Institute of the Ministry of Information Industry, and the 56th Institute of the General Staff. In addition, almost all 985 and 211 universities participate in this project. Some commercial companies also take part in certain peripheral engineering projects: Websense, Packeteer, BlueCoat, Huawei, Peking University Founder, Harbor, Venustech, and Digital China provide some auxiliary equipment, and companies such as ZhongSou, Qihoo, Beijing Dazheng, and Yahoo participate in the search engine security management system.

In some provincial and municipal network rooms, the departments involved in access monitoring are varied, including state security, public security, discipline inspection, and the military. The deployed equipment is just as varied, with regular forces, miscellaneous outfits, and outside reinforcements each fighting their own battles.

However, the research strength of the Golden Shield is relatively weak. The Information Network Security R&D Center of the Third Research Institute of the Ministry of Public Security and the National Computer Intrusion Prevention and Anti-Virus Research Center both lack research strength and research results. In August 2008, the Information Network Security Key Laboratory of the Ministry of Public Security was established to compete with the key laboratory of Harbin Institute of Technology, and Fang Binxing was specially invited to the academic committee of the laboratory. However, this laboratory's research direction on electronic data forensics has little prospect and lacks research results. Fang Binxing, the father of GFW, did not participate in the Golden Shield Project, while Shen Changxiang supported the Golden Shield Project in the Academy of Engineering; in fact, the academic committee list of that key laboratory of the Ministry of Public Security is quite interesting, with Shen Changxiang naturally ranking first, and Fang Binxing, due to his recent fame, was also invited, possibly with the intention of establishing a good relationship with the police system.

GFW Development and Status
The hardware used by GFW comes mainly from Sunway and Huawei, with no Cisco or Juniper; most of the software is self-developed. The reason is simple: regarding the construction of national information security infrastructure, Fang Binxing repeatedly emphasized in his recent speech "Five Levels of Interpretation of the National Information Security Guarantee System" that "information security should primarily rely on independent intellectual property rights." Moreover, GFW is a confidential national defense project and has no spare money to feed foreign experts; as the saying goes, the rich water should not flow into outsiders' fields. Li Guojie is the director of the Information Engineering Department of the Academy of Engineering, chairman of Sunway Company, and director of the Institute of Computing Technology of the Chinese Academy of Sciences. A large number of GFW's server equipment orders have gone to Sunway. Fang Binxing also allocated large orders for the mainframes needed by the Security Management Center to Li Guojie, to Lu Xicheng of the National University of Defense Technology, and to Chen Zuoning of the 56th Institute of the General Staff. So why does GFW have so much Sunway equipment, why does so much of its research strength come from the Institute of Computing Technology of the Chinese Academy of Sciences, and why does Fang Binxing hold prominent part-time positions at both the Institute of Computing Technology and the National University of Defense Technology? Because Fang Binxing is flexible in his thinking and keeps everyone happy.

Some people online ridicule GFW as arrogant bluster, but this is blind optimism: the ignorant are fearless. GFW's technology is world-class. It gathers genuine top talent from Harbin Institute of Technology, the Chinese Academy of Sciences, and Beijing University of Posts and Telecommunications, and its research strength is solid. Dynamic SSL, Freenet, VPN, SSH, TOR, GNUnet, JAP, I2P, Psiphon, and Feed Over Email are all trivial to it. Whatever circumvention method someone has thought of, GFW has already studied it and has a countermeasure at the laboratory stage.

For example, in-line (serial) blocking uses a man-in-the-middle attack to replace the digital certificates, signed by an untrusted CA, that the two parties use for encrypted communication; certificates are coordinated between gateways and proxies, and decryption and inspection are performed at the exit gateway. This is known as deep content inspection. Layer-7 filtering of HTTPS depends on certificate verification. When a client accesses a server, the server normally presents a CA-signed certificate, but some implementations do not. For servers that present no CA certificate, the firewall takes the simple route and blocks all requests. The issuing CA is checked against a default list, and if the certificate was not issued by one of these authorities (Verisign, Thawte, Geotrust), the connection is killed without mercy. This happens during the HTTPS handshake between client and server, filtering out every HTTPS request that has no CA certificate or uses an unapproved CA certificate. This step is broad-spectrum filtering and is independent of the server's IP address.
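
The broad-spectrum issuer check described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each handshake exposes only the name of the certificate issuer; the `Handshake` type, the `allow_handshake` function, and the matching by lowercase name are hypothetical choices made for illustration, not a description of the actual system.

```python
from dataclasses import dataclass
from typing import Optional

# Issuers named in the text. Matching on a lowercase name is an
# assumption of this sketch, not a documented behaviour.
TRUSTED_ISSUERS = {"verisign", "thawte", "geotrust"}


@dataclass
class Handshake:
    server_ip: str          # carried along only to show it is never consulted
    issuer: Optional[str]   # None models a server that presents no certificate


def allow_handshake(handshake: Handshake) -> bool:
    """Return True if the handshake passes the issuer whitelist."""
    if handshake.issuer is None:
        # No certificate presented: blocked outright.
        return False
    # The decision depends only on the issuer, never on server_ip.
    return handshake.issuer.strip().lower() in TRUSTED_ISSUERS


if __name__ == "__main__":
    samples = [
        Handshake("203.0.113.5", "VeriSign"),
        Handshake("203.0.113.5", "Example Self-Signed CA"),
        Handshake("198.51.100.9", None),
    ]
    for sample in samples:
        verdict = "allow" if allow_handshake(sample) else "block"
        print(sample.issuer, "->", verdict)
```

Note that two handshakes to the same address can receive different verdicts, which is what "independent of the server's IP address" means in this model.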

GFW is mainly an intrusion prevention system built on a dual detect-and-attack model.

Any circumvention scheme that is plaintext at the transport layer is easy to detect and attack immediately. Even if the transport layer uses encryption such as TLS and cannot be detected in real time, a scheme aimed at end users is necessarily open to everyone, and nothing stops GFW from acting as an end user itself and analyzing the scheme's detectable network-layer features.

Intrusion detection followed by a TCP session reset attack is a clean and tidy method. Failing that, a circumvention method can be examined by hand to find its network-layer characteristics (the destination IP address alone is sufficient) and then eliminated in a targeted way.
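
The two-step approach above (automatic detection followed by a reset, with a manually maintained list of destination addresses as the fallback) can be sketched as follows. This is a toy model under stated assumptions: the `Detector` class, its method names, and the plain substring match are hypothetical, and no real packets are inspected or injected.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Detector:
    keywords: List[bytes]                                # patterns searched for in plaintext payloads
    blocked_ips: Set[str] = field(default_factory=set)   # destinations added after manual analysis

    def inspect(self, dst_ip: str, payload: bytes) -> str:
        """Return "reset" if the session would be torn down, else "pass"."""
        if dst_ip in self.blocked_ips:
            # Network-layer feature: the destination address alone suffices.
            return "reset"
        if any(keyword in payload for keyword in self.keywords):
            # Content feature: only visible when the payload is plaintext.
            return "reset"
        return "pass"

    def block_manually(self, dst_ip: str) -> None:
        """Record a destination found by manual inspection."""
        self.blocked_ips.add(dst_ip)


if __name__ == "__main__":
    detector = Detector(keywords=[b"forbidden"])
    print(detector.inspect("198.51.100.7", b"GET /forbidden HTTP/1.1"))
    print(detector.inspect("198.51.100.7", b"\x17\x03\x03\x00\x20"))
    detector.block_manually("198.51.100.7")
    print(detector.inspect("198.51.100.7", b"\x17\x03\x03\x00\x20"))
```

In this model an encrypted payload passes the keyword check, and is reset only once its destination has been added by hand, which mirrors the fallback the paragraph describes.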

If it comes down to one or two hostile states, GFW can even bring in clusters to compute the keys. GFW is one of the rare research projects with central government funding. Those poor researchers in the basement at Harbin Institute of Technology and the run-down building at the Chinese Academy of Sciences could produce results even without money; now that they have central funding, they are all the more motivated.

GFW can handle everything except P2P, whose anonymity is too good: it cannot be detected in real time, and static analysis finds no fixed or variable network-layer characteristics that can be tracked. Even so, a couple of trap nodes can be set up to do some minor damage, and the Chinese Academy of Sciences' 242 project "P2P Protocol Analysis and Measurement" has never stopped. Whenever an academic conference is held abroad, or someone presents a paper on the security of Tor at Defcon, it is immediately brought back for study, keeping pace with the forefront of academic technology. In reality, however, even a top technology project like GFW cannot escape its imitative nature: producing something that works is easy, but refining it in detail is beyond reach.

Some may wonder: if GFW can block everything, why does it not actually do so? My circumvention method still works fine. The answer is that GFW has its own mode of operation. GFW is by nature a purely research and technical department, and for political forces it is a completely passive tool. Permission management inside GFW is very strict, and technology and politics are thoroughly separated. What to block or unblock is decided entirely by superiors. The party commands the gun: specially authorized personnel operate the keyword list and are thoroughly isolated from the technical implementers, so neither side knows what the other is doing. This is why there are so many inexplicable blocks, such as blocking freebsd.org or freepascal.org (possibly by association with freetibet.org), or listing "package.debian.org/zh-cn/lenny/gpass," which has nothing to do with "the wheel" (slang for Falun Gong), as a keyword. These are all the whims of bureaucrats fiddling with IE6, and the technical staff would be furious if they knew.

Fang Binxing, in his recent speech "Five Levels of Interpretation of the National Information Security Guarantee System," mentioned a principle based on national conditions, saying: "It mainly emphasizes a comprehensive balance of security costs and risks; if the risk is not great, there is no need to spend too much security cost. It is necessary to emphasize ensuring key points, such as hierarchical protection, which is determined based on the importance of the information system, thereby applying appropriate intensity of protection."

Thus, for niche circumvention methods, GFW can only discover them passively and gain a general understanding. The superiors are completely unaware of such methods and so will not order them blocked, and GFW itself lacks the authority to block them; or, if the superiors do know, they cannot be bothered to spend the money and energy to arrange it. The principle has always been to shoot the bird that sticks its head out.

The current situation is that sensitive data can pass through the blockade safely only when it is encrypted; otherwise it is filtered out. Massive volumes of network data cannot be analyzed by hand, so sensitive data can only be found by filtering technology that keys on certain characteristics in the data flow. At present, decryption is not feasible against massive encrypted data flows: as long as the encrypted stream has no identifiable characteristics, decryption methods cannot be applied. Filtering technology alone therefore cannot truly achieve network blocking, and a further parameter has to be added. The one chosen is volume, that is, retaining a segment of your data for a period of time.

Currently, the more commonly used circumvention tools are the "dynamic network" (DynaWeb), "unbounded" (UltraSurf), "garden" (Garden Networks), and the like. Because their connection points are relatively few and already known, retaining a segment of data for a period becomes meaningful. So many people use circumvention software that it is impossible to catch everyone, but the retained data makes it possible to single out key targets and frequent users of circumvention software. Of course, you can solve this problem by reaching the known points through a proxy, and circumvention software offers this option, but requests that reach known points through a proxy may still be intercepted. Fang Binxing turned all the political momentum from the rise of GFW into momentum of his own and then discarded GFW.
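
The idea of retaining traffic to a small set of known connection points and then singling out frequent users can be sketched as a simple counting exercise. The sketch assumes the retained data has already been reduced to (source, destination) pairs; the function name `frequent_users`, the `threshold` parameter, and the example addresses are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def frequent_users(
    records: Iterable[Tuple[str, str]],
    known_points: Set[str],
    threshold: int,
) -> List[str]:
    """Return sources seen at least `threshold` times contacting a known point.

    `records` stands for the connection log retained over some period,
    as (source address, destination address) pairs.
    """
    counts = Counter(src for src, dst in records if dst in known_points)
    return sorted(src for src, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    known = {"192.0.2.10"}  # illustrative "known connection point"
    log = [
        ("10.0.0.1", "192.0.2.10"),
        ("10.0.0.1", "192.0.2.10"),
        ("10.0.0.1", "192.0.2.10"),
        ("10.0.0.2", "192.0.2.10"),
        ("10.0.0.3", "203.0.113.9"),
    ]
    print(frequent_users(log, known, threshold=2))
```

A user who reaches a known point through an intermediate proxy would appear in such a log only as a connection to the proxy, which is why the text notes that proxying sidesteps this kind of counting unless the proxied request is itself intercepted.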

GFW is now in a stable period. It is an office with no perks at all: it has no backers, there are no longer any political or financial interests to be had from it, and it can no longer launch new large-scale projects. Even IPv6 has become a troublesome issue for GFW. Fang Binxing lamented in his recent speech "Five Levels of Interpretation of the National Information Security Guarantee System": "For example, after the concept of Web 2.0 emerged, even issues like viruses spread more easily; for instance, after IPv6 came out, intrusion detection became meaningless because the protocol cannot be understood, so what is there to detect..."

GFW has never had any standing; it has always been the neglected child, with the State Council Information Office, Internet Monitoring, Broadcasting and Television, Copyright, and the Communication Management Bureau all bearing down on it to do this and that. That is why Fang Binxing emphasized a mechanism in his recent speech "Five Levels of Interpretation of the National Information Security Guarantee System": "It requires macro-level support from the competent departments." So if you want a website unblocked, going to GFW itself is useless; you need to find one of GFW's superiors, and any of them will do. The ISP has nothing to do with GFW and does not know what GFW is doing; suing the ISP misses the point entirely.

Nevertheless, GFW is still running well, and its capacity still has great untapped potential; the only thing it has to fear is a DDoS attack. The scale of GFW can be estimated from the figures in the timeline above, and its current website blocking list has tens of thousands of entries. Network monitoring, and the monitoring of IM and short messages such as MSN, YMSG, and ICQ, are also quite mature. GFW has done relatively well in data mining and protocol analysis. Its future focus will be on intelligent recognition and analysis of multimedia data such as audio, video, and images; natural language semantic judgment; pattern matching; P2P, VoIP, IM, and streaming media; recognition and filtering of encrypted content; and in-line (serial) blocking. However, GFW has no self-organizing feedback mechanism, such as machine learning that generates keywords automatically, because it does not have the authority to modify keywords, so such technology is unnecessary. This technology is also overhyped, with many papers published but little that is mature in practice. What GFW and the Golden Shield want most right now is to be able to pick a small number of poisonous weeds out of the vast grassland through data mining and similar artificial intelligence technologies.

Fang Binxing mentioned in his recent speech "Five Levels of Interpretation of the National Information Security Guarantee System" the "core capability of public opinion control," stating that "first, it must be able to discover and acquire, and then it must have the ability to analyze and guide." How to discover? It relies on the 973 project "Text Recognition and Information Filtering" and the 863 key project "Large-Scale Network Security Incident Monitoring" being researched by the Chinese Academy of Sciences.

The Golden Shield Project cost a great deal of money, yet it earns less praise than GFW. The police officers of the 11th Bureau are embarrassed and cannot account for this to their elders. The technical strength of the police system is no match for GFW's, but the police system has money: it casually buys tens of thousands of cameras and thousands of blade servers, connects them to provincial and municipal network centers, and records everything. The problem is that the recordings cannot be put to use; officers can only page through Excel sheets by hand. So although GFW looks riddled with holes and the Golden Shield looks unfathomable, the real difference is that the police are more aggressive than GFW: when they see a poisonous weed, you get not an RST but a detention notice. In practice GFW blocks poisonous weeds most of the time, while the Golden Shield fails to discover most of them.

National Information Security Discourse Paradigm
A large amount of online propaganda from abroad has left the central government, which has never had any experience with networking, at a loss, helpless, and very anxious. These things are unbearable security threats to the central government, and since these threats occur online, national cybersecurity has naturally risen to the top of the agenda. At the same time, with the wave of informationization, the concept of e-government has emerged, and the central government has decided to respond well to the issue of informationization, thus establishing the National Informationization Work Leading Group. We can see that in the first batch of composition lists, security departments and propaganda departments occupy the majority of seats, and its first subordinate agency is responsible for handling security issues, while the second subordinate agency is responsible for informationization reform, indicating the strong demand for security. It was at this time that Fang Binxing, who has always had unique insights into information security, was transferred to the Security Management Center by Zhang Chunjiang of the Ministry of Information Industry to practice. Fang Binxing's insights into information security coincided with the high-level demand for network security.

Fang Binxing said in his recent speech "Five Levels of Interpretation of the National Information Security Guarantee System": "There must be an information security law; with this core law, you can carry out a series of work." The primary core of the national information security system is a legal guarantee system centered on information security, defining what constitutes "information security" through national will - law. Information security is originally a purely technical and completely neutral term, but through the definition of national will, "inciting... inciting... inciting... inciting... fabricating... promoting... insulting... damaging... others..." is defined as so-called network attacks, network garbage, network harmful information, and network security threats, yet at the implementation level, security is viewed completely technically and neutrally, without considering real political issues. This achieves a complete technical encapsulation while providing users with a highly extensible interface for defining security events. The binding of national security and technical security is full of metaphors, and the welding of ideology and information science is unbreakable. This is the pioneering thinking that Fang Binxing brings to the high-level, and this is the national information security discourse paradigm proposed by Fang Binxing.

This discourse paradigm is so natural and thoroughly encapsulated that almost everyone is unaware of the serious problems that have arisen in China's networking development. Almost all netizens are unaware that the GFW, which brings them great trouble and frustration, is actually the national Internet emergency response center that should be responsible for fighting against black and evil; almost all netizens are unaware that their small plot of land on the Internet, where they trim flowers and plants, is actually a network security attack event for the state; almost all decision-makers are unaware of the powerful side effects that the seemingly immediate firewall has and the great harm it brings to the development of the Internet; almost all decision-makers are unaware of what it means to use such a professional security tool as GFW to conduct network blocking. Ideology cannot bear the unpredictable landscape of networking and can only blindfold its eyes.

In the discussion of the theoretical texts of networking in Chinese, the primary position occupying the most space is network security and network threats. The first subordinate agency of the National Informationization Work Leading Group is responsible for handling security issues. Thus, when the network itself has not yet developed, various restrictions and controls are theoretically imposed on the network; after the network has spontaneously grown, systematic demonization of the network is carried out culturally, and geographically, the network in China is closed off. More seriously, using national information security tools without understanding the essence and side effects of technology is like a child playing with firearms without understanding the consequences. Under the discourse of maintaining security, decision-makers are completely unaware that using GFW for network blocking is akin to using military force to suppress their own network territory, and cutting off network cables is akin to using nuclear weapons on their own network territory.

More sadly, most of the builders of GFW are unaware of what they are doing, and after signing confidentiality agreements, they unconsciously devote themselves to the party-state's cause, flowing like the Yangtze River. Those like Yun Xiaochun, who followed Fang Binxing to carve out a territory, have seen Fang Binxing soar, while they can only work tirelessly on technology, and in the Security Management Center, they are surpassed by the likes of Wang Xiujun and Huang Chengqing. Those poor researchers who initially followed Fang Binxing at Harbin Institute of Technology have gradually gone to companies like Baidu. GFW faces an ethical dilemma similar to that of the Manhattan Project. Science is neutral, but scientists are manipulated by politics. Technical workers only care about and are only allowed to care about how to achieve security and cannot care about how security is defined. They lack academic ethics and cannot practice the principle of "examining and evaluating all possible consequences of their work; once they discover defects or dangers, they should change or even interrupt their work; if they cannot make decisions independently, they should postpone or suspend related research and promptly report to society." As a result, even if they work hard on research, they cannot benefit people's livelihoods, but instead are labeled as "suppressing human rights in China" and "Nazi accomplices," which is undoubtedly a historical tragedy.

This discourse paradigm permeates all aspects of society. Under this discourse, China has the world's most powerful firewall, but China's network construction lags far behind the world's advanced level; China has the world's largest internet addiction treatment industry chain, but China's internet industry can only engage in imitation technology; China has the most internet users in the world, but cannot hear China's voice on the Internet. GFW has achieved self-censorship, making it impossible for people to fly even if they regain freedom, fulfilling its fundamental purpose. Now, even though the technology for DDoS against GFW has matured, tearing down the wall has become meaningless, only allowing the police system's Golden Shield to gain power, leading to more netizens being arrested, and ultimately a new wall being erected. All of this stems from the huge rupture between ideological modernity and postmodernity of networking, as well as the deadly taboo of the "national information security discourse."

The entity of GFW is the Security Management Center (CNCERT/CC), which is a public institution. The political status of a public institution can be considered very low; any of its superiors can issue orders to it for network blocking, while it itself has no subjective initiative, focusing on business and diligently ensuring the construction of national security infrastructure. If we shift our perspective and consider GFW from its own viewpoint, then what GFW does is not mysterious at all; it precisely aligns with its own name. Let's take a look at the exquisite "Malicious Code Monitoring System under Broadband Network Environment." If you access blogger or wordpress, that is a URL attack, and it is even more serious than ordinary phishing attacks and viruses because it endangers the safety of the party and the country. GFW's R&D is no different from that of ordinary network security companies, monitoring and analyzing network traffic, reverse engineering harmful software, and blocking malicious attacks. The only difference is that GFW must also deal with the so-called counter-revolutionary attacks for national security; additionally, GFW has unlimited money, unlimited population, and unbeatable secrets.

GFW and Netizens' Ecosystem
The government, GFW, and netizens form an ecosystem, with the government and GFW together at the top of the food chain and netizens at the bottom. Some call it a "cat-and-mouse game." The history of interaction between GFW and netizens can be summed up as an arms race in which each side pushes up the other's technical level. Even so, the relationship has never broken out of one pattern: GFW grows ever more adept at actively hunting netizens, while netizens grow ever more skilled at passive evasion. Netizens have moved from the initial ordinary HTTP and SOCKS proxies to encrypted proxy software, to all kinds of web proxies, to VPNs and SSH proxies, and on to P2P networks and hybrid methods. Yet none of these methods has completely escaped GFW's blocking, because GFW is very good at offense, while netizens have only ever looked for new ways to evade.

The problem with this interactive model is that as the arms race continues, GFW becomes more refined and powerful, while netizens continuously lose their cards, making circumvention increasingly difficult and costly. GFW is a professional in this field (network security), while netizens, despite their collective wisdom, lack effective organization and cannot match GFW. Therefore, if one looks a little further, one will realize that this model is unsustainable for netizens; one day, GFW will surpass the technical baseline of the vast majority of netizens. Thus, the only way out is to change the approach and break through this model.

Responding to GFW

The fundamental principle for netizens to break through the current dynamic situation is to utilize GFW's characteristic of being good at offense but not at defense. Rather than viewing GFW as a national network violence agency, it is better to see GFW as a network security agency; in fact, it is also a network security agency (CNCERT/CC). Any security system inevitably has vulnerabilities and weaknesses, and the GFW security solution (National Information Security Management System) is no exception. Netizens are not lacking in technology; rather, the technology has not been effectively organized and projected in this direction. In fact, there are many vulnerabilities and weaknesses in GFW, some of which are even theoretically unsolvable, which will be discussed in detail later. As mentioned in the article "Burn After Reading," although GFW is one of the few top research forces in China combined with strong national support, "it cannot escape the nature of imitation - it is easy to produce something, but it is not possible to make it detailed and rigorous."

Furthermore, aside from utilizing GFW's inherent problems, netizens can even consider taking a network self-defense approach to stop GFW's illegal actions. Institutions like GFW have no administrative legislation, no public opinion supervision, and no appeal channels, and are abused by a few individuals with ulterior motives, using the name of national security to block websites unrelated to national security, interfering with and attacking legitimate network communications, "tampering with legitimate data being transmitted in computer information systems, attacking computer information systems to render them unable to operate normally," and have repeatedly caused national network failures, violating Article 286 of the Criminal Law of the People's Republic of China, with particularly serious circumstances and particularly severe consequences, etc.

However, on the other hand, a certain steady state has already formed or is about to form between GFW and netizens. This steady state is a dynamic balance under the struggle between the two sides, which needs to be consciously maintained. An uncontrollable network cannot be tolerated by the government; when the network becomes uncontrollable, the government will not hesitate to cut off all networks (you must know what I am talking about), and the destruction of the steady state means the destruction of the environment. An ideal steady state is for the network to be in a "seemingly" controllable state, allowing GFW to feel a false sense of victory in continuously achieving small blocking successes, while individual netizens each master decentralized circumvention methods. A centralized mass circumvention method (the most typical example is setting static resolution in hosts) will inevitably be discovered by the authorities and blocked by GFW. The next generation of circumvention methods should be decentralized (P2P), niche, diverse, hybrid, and dynamically updated.

References

There are two important documents to recommend here.

First, Thomas Ptacek et al. published Insertion, Evasion, and Denial of Service: Eluding Network Intrusion Detection in 1998.

Initial Understanding of DNS Pollution

DNS (Domain Name System) pollution is a method GFW uses to feed ordinary users a false IP for a target host so that they cannot communicate with it; it is a form of DNS cache poisoning. It works as follows: GFW performs intrusion detection on DNS queries to UDP port 53 that pass through it. As soon as it detects a query matching a keyword, it impersonates the name server (NS) for the target domain and returns a false result to the querier. Because ordinary DNS queries have no authentication mechanism, and because DNS queries usually run over UDP, a connectionless and unreliable protocol, the querier can only accept the first correctly formatted result to arrive and discards any later ones. For netizens unfamiliar with this, the consequence is that when the ISP-provided NS queries authoritative servers abroad, the query is hijacked and the ISP's cache is polluted, so querying the ISP's server returns false IPs by default. Users who query a foreign NS directly (such as OpenDNS) may also be hijacked by GFW, and so still cannot obtain the correct IP without preventive measures. There is a very simple and effective countermeasure against this attack: modifying the Hosts file. But Hosts entries generally cannot use wildcards (e.g., *.blogspot.com), whereas GFW's DNS pollution uses partial rather than exact matching, so the Hosts file has its limits, and netizens trying to reach such domains will still run into considerable difficulty.

Observing DNS Pollution
"Know yourself and know your enemy, and you will never be defeated." In this section, we need to use the packet listening tools mentioned earlier and refer to the DNS hijacking diagnosis section. By entering udp.port eq 53 in the filter column of Wireshark, we can conveniently filter out other irrelevant packets. To further reduce interference, we choose a foreign IP that does not provide domain name resolution services as the target domain name resolution server, such as 129.42.17.103. Run the command nslookup -type=A www.youtube.com 129.42.17.103. If there is a response, it can only indicate that this is GFW's forged response, which is the object we want to observe and study.

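The nslookup experiment above can also be scripted. The following is a minimal sketch using only the Python standard library; the function names build_query and collect_replies are our own, and the fixed transaction ID is an arbitrary choice. Unlike an ordinary resolver, it keeps listening for the whole time window instead of stopping at the first reply, so every forged packet can be recorded.

```python
import socket
import struct
import time

def encode_name(name: str) -> bytes:
    """Encode a dotted domain name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query (QTYPE 1 = A, QCLASS 1 = IN)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack("!HH", qtype, 1)

def collect_replies(name: str, server: str, window: float = 3.0):
    """Send one UDP query and return (elapsed_seconds, payload) for every reply."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    replies = []
    try:
        start = time.monotonic()
        sock.sendto(build_query(name), (server, 53))
        while True:
            remaining = window - (time.monotonic() - start)
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _addr = sock.recvfrom(4096)
            except socket.timeout:
                break
            replies.append((time.monotonic() - start, data))
    finally:
        sock.close()
    return replies
```

Calling collect_replies("www.youtube.com", "129.42.17.103") from a network behind GFW would be the scripted equivalent of the nslookup command; what comes back, if anything, depends on GFW's behavior at the time.
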
Characteristics of Forged Packets
After a series of queries in quick succession, we find that the IPs returned by GFW all come from the following list:

4.36.66.178
203.161.230.171
211.94.66.147
202.181.7.85
202.106.1.2
209.145.54.50
216.234.179.13
64.33.88.161
Regarding these eight special IPs, readers are encouraged to explore two questions: 1. Why specific IPs instead of random IPs, what are the disadvantages of fixed IPs and random IPs; 2. Why these 8 IPs instead of others, why did these 8 IPs fall under GFW's influence? For searching this type of information, in addition to www.google.com, www.bing.com has a dedicated function for searching IP corresponding websites, the method is to enter ip address to search. www.robtex.com is a website dedicated to collecting domain name resolution information. Readers are welcome to leave their thoughts and discoveries.

From the results collected by Wireshark (in fact, a better method is to save the results as a pcap file, or directly use tcpdump, then extract the data to obtain statistics), we categorize the DNS pollution packets sent by GFW based on the fingerprint characteristics of the IP header into two types:

Type One:
ip_id == ____ (is a fixed number, the specific value is left as an exercise).
No "Don't Fragment" option is set.
No service type is set.
For the same source IP and destination IP pair, the polluted IP returned by GFW cycles through the above 8 in the order given. The position in the cycle is independent of the source port and depends on the source IP and destination IP pair.
TTL return value is relatively fixed. TTL is the "Time to Live" value in the IP header, which decreases by 1 for each router it passes through; IP packets with a TTL of 1 will no longer be forwarded by routers, and most routers will return a message to the source IP stating "ICMP time to live exceeded in transit."
Type Two:
Each packet is sent three times.
No "Don't Fragment" option is set.
The "high throughput" service type is set.
(ip_id + ? * 13 + 1) % 65536 == 0, where ? is an interesting unknown. The ip_id in the same source IP and destination IP pair decreases in units of 13 between consecutive queries, with the minimum and maximum observed ip_id values being 65525 (i.e., -11, overflowed!) and 65535.
For the same source IP and destination IP pair, the polluted IP returned by GFW cycles through the above 8 in the order given. The position in the cycle is independent of the source port and depends on the source IP and destination IP pair.
For the same source IP and destination IP pair, the TTL return value increases by 1 in sequence. The TTL value when sent by GFW has 64 possible values. Note: The TTL of the packet received by the source IP has been modified by routing, so the observed TTL may not only have 64 possible values; this is due to changes in network topology. The "relatively fixed" in Type One is also added considering occasional changes in network topology; perhaps it can be considered that the initial value when GFW sends is constant.
(The above results guarantee authenticity but do not guarantee timeliness; GFW's characteristics may change at any time, especially in terms of timing characteristics and transmission layer characteristics correlation. In the past six months, GFW's characteristics have changed increasingly frequently in many aspects, which will be mentioned later when discussing TCP blocking.)
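
The IP-header traits listed above can be pulled out of a captured packet with a few lines of code. This is a rough sketch: parse_ipv4_header and type_two_counter are names we made up, and we assume the "high throughput" service type means the 0x08 bit of the TOS byte. Note that because 13 is odd, the congruence (ip_id + ? * 13 + 1) % 65536 == 0 has a solution for every ip_id, so the formula alone does not identify a Type Two packet; the sketch only solves for the unknown, which behaves like a counter if ip_id really drops by 13 per query. The fixed ip_id of Type One is deliberately not filled in.

```python
import struct

def parse_ipv4_header(raw: bytes) -> dict:
    """Extract the fingerprint fields from the first 10 bytes of an IPv4 header."""
    _ver_ihl, tos, _total_len, ip_id, flags_frag, ttl, _proto = struct.unpack(
        "!BBHHHBB", raw[:10]
    )
    return {
        "tos": tos,
        "high_throughput": bool(tos & 0x08),        # assumed meaning of the service type
        "ip_id": ip_id,
        "dont_fragment": bool(flags_frag & 0x4000),  # DF bit of the flags field
        "ttl": ttl,
    }

# Modular inverse of 13 modulo 65536 (requires Python 3.8+).
INV13 = pow(13, -1, 65536)

def type_two_counter(ip_id: int) -> int:
    """Solve (ip_id + k * 13 + 1) % 65536 == 0 for k."""
    return (-(ip_id + 1) * INV13) % 65536
```

For example, an ip_id of 65535 gives k = 0 and 65522 gives k = 1, so successive values of k across consecutive replies would show whether the counter advances one step per query.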

Further experiments can be conducted: since the current Type Two's TTL range is an integer multiple of the number of IPs, by controlling the DNS query's TTL to ensure that GFW's return is exactly met (avoiding the irregular changes in TTL observed by the receiver caused by dynamic routing), we can observe whether there is a corresponding relationship between the IP and the remainder of TTL divided by 8, and whether this relationship still holds after changing the source IP and destination IP pair. This relates to GFW's load balancing algorithm and the independence and consistency of the response counter (hit counter). In fact, exhaustively providing all results about GFW is also meaningless; here we only propose such a research method, and if readers are interested, they can continue to explore.

Each query usually receives one Type One packet and three identical Type Two packets. If we change the query from type=A to type=MX, type=AAAA, or another type, nslookup reports that it received a corrupted reply packet. This is because GFW's DNS pollution module is crudely built. The ANSWER section of a DNS response forged by GFW usually contains only one RR (i.e., one record), whose RDATA is one of the 8 polluted IPs. For Type Two, the TYPE value of the RR is copied directly from the user's query, so the user receives this peculiar, malformed packet. The UDP payload of the forged DNS responses has the following characteristics:

Type One
In the ANSWER section of the DNS response packet, the name of the RR is the compression pointer 0xc00c, which refers back to the queried domain name.
The TTL in the RR record is set to 5 minutes.
Regardless of the TYPE of the user's query, the TYPE of the response packet is always set to A (meaning IPv4 address), and CLASS is always set to IN.
Type Two
In the ANSWER section of the DNS response packet, the name of the RR is the queried domain name written out in full.
The TTL in the RR record is set to 1 day.
The TYPE and CLASS values of the RR record are copied from the query sent by the source IP.
Terminology explanation: RR = Resource Record: a record in the DNS data packet; RDATA = Resource Data: the data part of a record; TYPE: the type of query, which can be A, AAAA, MX, NS, etc.; CLASS: generally IN[ternet].
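
To check these payload traits against captured replies, the first answer record can be parsed by hand. The sketch below is illustrative only: first_answer and guess_kind are hypothetical helpers, and the 300-second and 86400-second TTLs are simply the "5 minutes" and "1 day" values given above, which may no longer hold.

```python
import struct

def skip_name(msg: bytes, pos: int) -> int:
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[pos]
        if length & 0xC0 == 0xC0:   # compression pointer: 2 bytes, ends the name
            return pos + 2
        if length == 0:             # zero-length label ends the name
            return pos + 1
        pos += 1 + length

def first_answer(msg: bytes) -> dict:
    """Extract name form, TYPE, CLASS, TTL and RDATA of the first answer RR."""
    qdcount, ancount = struct.unpack("!HH", msg[4:8])
    pos = 12                                  # DNS header is 12 bytes
    for _ in range(qdcount):
        pos = skip_name(msg, pos) + 4         # skip QNAME, QTYPE, QCLASS
    if ancount == 0:
        raise ValueError("no answer records")
    uses_pointer = msg[pos:pos + 2] == b"\xc0\x0c"
    pos = skip_name(msg, pos)
    rtype, rclass, ttl, rdlength = struct.unpack("!HHIH", msg[pos:pos + 10])
    rdata = msg[pos + 10:pos + 10 + rdlength]
    return {"uses_pointer": uses_pointer, "type": rtype, "class": rclass,
            "ttl": ttl, "rdata": rdata}

def guess_kind(answer: dict) -> str:
    """Label a reply using the traits listed above (they may be outdated)."""
    if answer["uses_pointer"] and answer["ttl"] == 300:
        return "type one"
    if not answer["uses_pointer"] and answer["ttl"] == 86400:
        return "type two"
    return "unknown"
```

A genuine reply can of course also use the 0xc00c pointer, so this labeling only makes sense for replies already known to be forged, such as those sent back for queries to an IP that runs no name server.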

Trigger Conditions
In fact, DNS can also run over TCP. Experiments show that GFW does not yet hijack or pollute DNS queries over TCP. As for matching rules, GFW does substring matching rather than exact matching, and it matches after first converting the domain name to a string. This deserves special mention because of how domain names are represented in DNS: an integer n1 gives the length of the first "."-separated label, followed by the n1 characters of that label, then the next length and its characters, and so on until a length of 0 ends the name. For example, www.youtube.com is represented as "\x03www\x07youtube\x03com\x00". As a result, queries for www.youtube.coma can be observed to be hijacked as well.
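
The matching behavior described above can be imitated in a few lines. This is only a sketch of the idea, not GFW's code: the keyword "youtube.com" is a stand-in, since the real keyword list is not public, and decode_name handles only uncompressed names.

```python
def decode_name(wire: bytes) -> str:
    """Convert length-prefixed DNS labels (no compression) to a dotted string."""
    labels = []
    pos = 0
    while True:
        length = wire[pos]
        if length == 0:
            break
        labels.append(wire[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    return ".".join(labels)

# Hypothetical keyword list, for illustration only.
KEYWORDS = ["youtube.com"]

def is_polluted(wire: bytes, keywords=KEYWORDS) -> bool:
    """Substring match on the dotted string form of the queried name."""
    name = decode_name(wire)
    return any(keyword in name for keyword in keywords)
```

With this, the wire form of www.youtube.coma, b"\x03www\x07youtube\x04coma\x00", matches the keyword, even though the raw bytes never contain the text "youtube.com" (the dot is replaced by a length byte). That is why the conversion to a string matters.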

Current Situation Analysis
4.36.66.178, keyword. whois: Level 3 Communications, Inc. located in Broomfield, CO, U.S.
203.161.230.171, keyword. whois: POWERBASE-HK located in Hong Kong, HK.
211.94.66.147, whois: China United Network Communications Corporation Limited located in Beijing, P.R. China.
202.181.7.85, keyword. whois: First Link Internet Services Pty Ltd. located in North Rocks, AU.
202.106.1.2, whois: China Unicom Beijing province network located in Beijing, CN.
209.145.54.50, reverse resolution to dns1.gapp.gov.cn, the domain resolution server of the General Administration of Press and Publication? Currently, dns1.gapp.gov.cn is now 219.141.187.13 in bjtelecom. whois: World Internet Services located in San Marcos, CA, US.
216.234.179.13, keyword. reverse resolution to IP-216-234-179-13.tera-byte.com. whois: Tera-byte Dot Com Inc. located in Edmonton, AB, CA.
64.33.88.161, reverse resolution to tonycastro.org.ez-site.net, tonycastro.com, tonycastro.net, thepetclubfl.net. whois: OLM, LLC located in Lisle, IL, U.S.
It can be seen that most of these IPs are not in China. If a website were hosted on one of them, every request from China to Twitter and Facebook would be directed there. Fortunately, GFW also applies keyword-based TCP blocking to HTTP URLs, so only HTTPS requests put real load on the target IP; even so, this amounts to a DDoS attack launched by Chinese netizens against that IP. It is unknown whether the affected websites or ISPs have any grounds for a claim.

We attempted to use Bing.com's IP reverse search function to search for the above DNS pollution-specific IPs and found some interesting domain names. Obviously, these domains are all victims of DNS pollution.

For example, the unfortunate edoors.cn.china.cn, Ningbo China Door Industry Network, was hit because edoors.cn was DNS polluted. Other victims include chasedoors.cn.china.cn, American Chase Door Industry (Shenzhen) Co., Ltd.
Also, *.sf520.com seems to be a domestic game private server website. www.sf520.com is also a private server website. It can be seen that the collusion between the domestic administrative system and business is serious; a "national information security infrastructure" can even be used to protect the interests of some online game companies.
In addition, there are some personal blogs. www.99tw.net is also a game website.
There is also www.why.com.cn, which has a good name.
And www.999sw.com Guangdong Shangjiu Biodegradable Plastics Co., Ltd. Biodegradable Resin | Tackifier Masterbatch | High-Efficiency Water Retention Agent | Flood Prevention Zip Code: 523128... What is going on here? It doesn't seem to be linked to any reactionary website. Some people even ask what happened to have so many IP results.
www.facebook.com www.xiaonei.com, what is going on? In fact, it is because someone accidentally connected the two addresses, and the search engine thought this was a link, but in fact, this domain does not exist, but was polluted during resolution, leading to the belief that this domain existed.
The unfortunate www.xinsheng.net.cn—Wuhan Xinsheng Computer Co., Ltd., because www.xinsheng.net was implicated.
Prevention and Utilization of DNS Hijacking
As discussed earlier, GFW is an intrusion detection system that only monitors traffic and currently cannot cut off network transmission; its "blocking" merely exploits weaknesses in network protocols that make them easy to hijack (session hijacking). With connectionless UDP DNS queries, GFW simply answers first, and the real answer arrives afterwards. A natural countermeasure therefore suggests itself:

Determine the authenticity based on timing characteristics, ignoring early replies.
Typically, for two IPs on opposite sides of GFW, the RTT (round-trip time) between them should be greater than the RTT from the source IP to GFW, so we can try to determine statistically a suitable average of these two RTTs to use as the threshold for judging authenticity. In addition, since GFW does not process DNS requests over TCP, we can resolve names over TCP instead of UDP. We can also query an unpolluted authoritative NS over a path that does not pass through GFW, as in the "remote resolution" mentioned at the beginning of the article. Both conditions (a path that avoids GFW and an NS that has not been polluted) are indispensable; the widely circulated claim that OpenDNS resists DNS hijacking, for example, is a misconception, because the path to the OpenDNS servers passes through GFW.
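
The timing idea can be sketched as a small filter over the replies to a single query. The names midpoint_threshold and pick_genuine are hypothetical, and the RTT figures used to exercise them would have to come from the reader's own measurements; with heavy jitter a fixed threshold like this will misjudge some replies.

```python
def midpoint_threshold(rtt_to_gfw: float, rtt_to_server: float) -> float:
    """Average of the two measured RTTs, used as the cut-off between forged and real."""
    return (rtt_to_gfw + rtt_to_server) / 2

def pick_genuine(replies, threshold: float):
    """Return the payload of the first reply arriving at or after the threshold.

    replies: iterable of (elapsed_seconds, payload) tuples for one query.
    Returns None when every reply arrived suspiciously early.
    """
    late = [r for r in sorted(replies, key=lambda r: r[0]) if r[0] >= threshold]
    return late[0][1] if late else None
```

For instance, with a measured 30 ms RTT to GFW and 180 ms to the real server, the threshold is 105 ms, so replies arriving at 30 ms and 31 ms are discarded and the one arriving at 180 ms is kept.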

The fundamental solution is to add a verification mechanism to the DNS protocol, such as DNSSEC (Domain Name System Security Extensions), and to have clients perform recursive resolution themselves rather than query recursive/caching name servers that are already polluted. The drawback is that not all authoritative name servers currently support DNSSEC. Unbound is a recursive resolver that supports DNSSEC validation.

Additionally, GFW's DNS hijacking may also be exploited by hackers, causing serious damage to both international and domestic internet. On one hand, GFW may pollute all DNS queries in some emergencies according to "national security" needs, and may designate the polluted IP to a specific IP, causing a portion of global network traffic to be directly transferred to the target network, instantly paralyzing the target network. Of course, our great motherland solemnly promises "not to be the first to use nuclear weapons"... On the other hand, GFW sends the forged DNS return packets to the source IP address's source port; what if the attacker forges the source IP? This would lead to a famous amplification attack: ten times the traffic of the DNS query sent by the attacker will return to the forged source IP. If the port on the forged source IP has no service running, many systems with poor security configurations will need to return an ICMP Port Unreachable message, and the received information will be appended to this ICMP message; if the port on the forged source IP has a service running, a large amount of illegal UDP data will flood in, causing the service provided by that port on the forged source IP to crash. If an attacker queries at a speed of 1Gbps, a small IDC (like the DNSpod attack incident) or even a regional ISP could be paralyzed (the Storm Video incident). The attacker may also set the TTL so that this traffic just passes through GFW to generate hijacking responses, and is discarded by routing before reaching the actual target, achieving "empty to empty without landing." The attacker may also set the target IP of the attack traffic to be forged as an IP that has normal communication or other associations with the forged source IP, making it more difficult to identify. This effectively turns a national firewall into a national-level reflection amplification denial-of-service attack launchpad.

Most seriously, this type of attack has a very low entry threshold; anyone who can program in C language can write such a program in a few days after reading the documentation for libnet or libpcap. As a set of intrusion prevention systems, GFW is destined to lack the ability to specifically prevent this type of attack because if GFW selectively ignores some DNS queries without hijacking them, netizens will have the opportunity to take advantage of the traffic cover to ensure that genuine DNS communications are not polluted by GFW. Especially with UDP, a connectionless protocol, GFW finds it even more difficult to analyze and respond. "The reverse is the movement of the way, the weak is the use of the way."

References

Yan Boru, Fang Binxing, Li Bin, Wang Yao. "Detection and Prevention of DNS Spoofing Attacks." Computer Engineering, 32(21):130-132,135. 2006-11.
Graham Lowe, Patrick Winters, Michael L. Marcus. "The Great DNS Wall of China."
KLZ Graduation. "Evaluation and Issues of Intrusion Prevention Systems."

One of GFW's important working methods is blocking at the network layer by IP address. In fact, GFW uses an access control method far more efficient than traditional access control lists (ACL): routing diffusion technology. Before analyzing this new technology, let's first look at the traditional one and introduce a few concepts.

Access Control Lists (ACL)

ACL can work at layer 2 (link layer) or layer 3 (network layer). Taking ACL working at layer 3 as an example, the basic principle is as follows: If you want to control (for example, cut off) access to a certain IP address on a router, you just need to add this IP address to the ACL through configuration and specify a control action for this IP address, such as simply dropping it. When a packet passes through this router, before forwarding the packet, it first matches the ACL; if the destination IP address of the packet exists in the ACL, it will operate according to the control action defined for that IP address, such as dropping the packet. In this way, access to this IP can be cut off. ACL can also control packets based on the source address. If ACL works at layer 2, then the object controlled by ACL changes from layer 3 IP addresses to layer 2 MAC addresses. From the working principle of ACL, it can be seen that ACL inserts a matching operation into the normal packet forwarding process, which will definitely affect the efficiency of packet forwarding. If the number of IP addresses to be controlled is relatively large, the ACL list will be longer, and the time to match the ACL will also be longer, which will lower the efficiency of packet forwarding, which is unacceptable for some backbone routers.
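
The cost argument is easy to see in a toy model. The sketch below is our own illustration, not router code: the Acl class and its method names are made up, and real routers use specialized hardware or data structures rather than a Python list. It only mirrors the point that every forwarded packet pays for a scan whose length grows with the number of controlled addresses.

```python
import ipaddress

class Acl:
    """A toy layer-3 ACL: an ordered list of (network, action) rules."""

    def __init__(self):
        self.rules = []

    def add(self, cidr: str, action: str) -> None:
        self.rules.append((ipaddress.ip_network(cidr), action))

    def decide(self, dst_ip: str) -> str:
        """Match the destination against every rule before forwarding."""
        addr = ipaddress.ip_address(dst_ip)
        for network, action in self.rules:   # linear scan: cost grows with list length
            if addr in network:
                return action
        return "forward"                      # no rule matched: forward normally
```

After acl.add("203.0.113.5/32", "drop"), a packet to 203.0.113.5 is dropped, while a packet to any other address is forwarded, but only after it has been compared against the whole list.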

Routing Protocols and Route Redistribution#

Before discussing route redistribution, let's briefly introduce dynamic routing protocols. Under normal circumstances, the routing protocols running on a router, such as OSPF, IS-IS, and BGP, each calculate and maintain their own routing table, and all the routing entries they generate are ultimately collected in a routing management module. For a given destination IP address, several routing protocols may each calculate a route. When forwarding packets, the routing management module decides which protocol's route to use according to certain algorithms and principles, and selects a single route as the actual routing entry.

Static Routing#

In contrast to the dynamic routes calculated by routing protocols, there is a type of route that is not calculated by any protocol but is manually configured by an administrator; this is called a static route. By default, static routes take precedence over routes learned from dynamic routing protocols: when a static route exists for a destination, the routing management module selects it in preference to the dynamic route.

Route Redistribution#

As mentioned earlier, under normal circumstances each routing protocol maintains only its own routes. In some cases this is not enough. For example, suppose there are two ASes (Autonomous Systems), both using OSPF internally. OSPF does not exchange routes between ASes, so hosts in one AS have no routes to the other. To let the two ASes reach each other, an inter-domain routing protocol, BGP, must run between them, and through configuration the routes calculated by OSPF inside each AS can be redistributed into BGP. BGP then announces each AS's internal routes to the other, achieving mutual reachability. This is redistribution of OSPF routes via BGP.

Another situation is that an administrator configures a static route on a router, but this static route can only take effect on this router. If you want it to take effect on other routers, the clumsiest way is to manually configure a static route on each router, which is cumbersome. A better way is to let dynamic routing protocols like OSPF or IS-IS redistribute this static route, thus distributing this static route to other routers, saving the trouble of manually configuring each router.

GFW Routing Diffusion Technology Working Principle#

Earlier, we mentioned "misuse." Under normal circumstances, static routes are configured by administrators based on network topology or for other purposes; such a route must at least be correct, guiding the router to forward packets to the correct destination. In GFW's routing diffusion technology, however, the static routes are intentionally configured incorrectly. The purpose is to steer packets originally destined for a certain IP address to a "black hole server" rather than to the correct destination. The black hole server may do nothing at all, so that packets are silently discarded. More often, it analyzes the packets and gathers statistics on them, or even returns a false response.

With this new method, every IP address that would have been configured in the ACL can be converted into an intentionally misconfigured static route. This static route steers packets for the corresponding IP address to the black hole server, and through the route redistribution function of the dynamic routing protocols, the erroneous routing information can be published to the entire network. For the routers, the result is just routine packet forwarding based on a routing entry, with no ACL matching required, which greatly improves forwarding efficiency. Yet this routine forwarding delivers the packets to the black hole server, so the method achieves both efficiency and control over packets, making it a more sophisticated means.
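The mechanism can be sketched in a few lines. This is a toy model under stated assumptions, not GFW's implementation: `Router`, `diffuse`, and the `BLACKHOLE` next hop name are hypothetical, redistribution is simulated by copying entries, and the lookup is a simple scan standing in for the longest-prefix match that real routers perform anyway.

```python
import ipaddress

BLACKHOLE = "blackhole-server"  # hypothetical next hop name


class Router:
    """Toy router holding a routing table of network -> next hop."""

    def __init__(self, name):
        self.name = name
        self.routes = {}

    def add_route(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, dst_ip):
        """Return the next hop of the longest matching prefix, or None."""
        addr = ipaddress.ip_address(dst_ip)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


def diffuse(blocked_ips, sample_router, other_routers):
    """Turn each blocked IP into a /32 static route to the black hole on the
    sample router, then copy those routes to the other routers (a stand-in
    for redistribution by a dynamic routing protocol)."""
    for ip in blocked_ips:
        sample_router.add_route(f"{ip}/32", BLACKHOLE)
    for router in other_routers:
        for net, hop in sample_router.routes.items():
            if hop == BLACKHOLE:
                router.routes[net] = hop


# Usage: one border router with a default route, one blocked address.
sample = Router("sample")
border = Router("border")
border.add_route("0.0.0.0/0", "upstream")
diffuse(["203.0.113.7"], sample, [border])
```

After `diffuse`, the border router needs no separate filtering step: the /32 entry is more specific than the default route, so the ordinary lookup already sends the blocked address to the black hole.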

This technology is not used in normal network operations, because erroneous routing information disrupts the network. The needs of normal network operations and of a control system differ greatly. In normal network operations, ACL entries are generally fixed, change little, and are few in number, so they have no significant impact on forwarding. In a control system, the number of IP addresses to be blocked keeps growing. Moreover, this technology frequently modifies the backbone routing table directly; if anything goes wrong, it causes backbone network failures.

GFW has therefore misused route redistribution: normally, no operator would spread erroneous routing information everywhere, and the idea is thoroughly crooked. Put another way, compared with normal network operations, GFW's application of routing diffusion is a clever trick. That a normal routing protocol function can be abused to this extent and still be practical and efficient shows that the "Celestial Empire" (Tianchao) is indeed full of talent in this regard.

Measurement#

In summary, GFW's dynamic routing system works as follows: an administrator manually configures (c) a static route (r) on a sample router (sr); this route (r) is redistributed to the entry routers (or) of each ISP, which then direct the specified traffic to the black hole server (fs) for recording. Therefore, measurable items include:

The list of blocked IPs: can be collected from user reports through a collaborative reporting mechanism, or obtained by scanning well-known sites; (rumor: the capacity of GFW's dynamic routing system is tens of thousands of rules)
ISP entry routers affected by GFW: can be measured through traceroute collaboration across multiple ISPs within a wide area;
The delay from keyword effectiveness to dynamic routing effectiveness: can be observed by establishing a honeypot and submitting it to GFW, then observing its response;
The robustness of the black hole server: filling the black hole server with pseudo-source noise traffic and observing its response.

References#

Liu Gang, Yun Xiaochun, Fang Binxing, Hu Mingzeng. "A Large-Scale Network Control Method Based on Routing Diffusion." Journal of Communications, 24(10): 159-164. 2003.

Li Lei, Qiao Peili, Chen Xunxun. "Implementation of an IP Access Control Technology." Information Technology, (6). 2001.

In-Depth Understanding of GFW: Internal Structure#

Previously, we conducted a great deal of black box testing on GFW. Although most of the experimental data has been well explained, some data and patterns (or irregularities) still lack a reasonable explanation. One example is the various timeout durations of TCP connections, such as how long the follow-on blocking state tied to the source IP lasts after port 443 on Google is blocked statelessly. Another is the continuous variation of window sizes and TTLs during ordinary TCP connections. These questions go beyond pure protocol analysis and require a further understanding of GFW's internal structure to explain. Therefore, this chapter introduces the implementation and internal structure of GFW.
In general, GFW is a large-scale distributed intrusion detection system built on high-performance computing clusters. Its distributed architecture brings high scalability, successfully transforming the issue of handling massive traffic at backbone points into a problem of purchasing supercomputers to stack processing power. It currently has the capability to perform complex and deep detection on all international network traffic in mainland China, and its processing capacity "still has great potential."

Line Access#

As for GFW's position in the network, there is a vague common understanding: "bypass monitoring at the three international exits." However, we would like a detailed picture of what actually happens before the last hop on the way abroad.

GFW needs to accommodate the heterogeneity of different kinds of links, and coupling technologies have been studied for various link types, including Fast Ethernet, low-speed WAN, optical fiber, and dedicated signals. According to the "Management Measures for International Communication Entry and Exit Bureaus," the major ISPs each have their own international entry and exit bureaus, which converge at the public international optical cables, for example before the landing station of a submarine cable. According to existing information, the Security Management Center (CNNISC) has independent exchange centers, and reports indicate that each ISP connects to its exchange center separately. These materials together form a consistent explanation: to adapt to the different link specifications of different ISPs, GFW's own exchange center has to integrate different link types, with each ISP running a bypass line to GFW. Lines that do not connect to GFW are referred to as "defensive lines" [unreliable source] and are unaffected by GFW. The access lines are presumably mostly fiber optic, so this access method is usually called optical splitting; hence the term "bypass optical splitting." Additionally, experiments have found that GFW's access points are not necessarily close to the last hop, so the diagram represents them with a dashed line. Note that the location where GFW's response traffic re-enters the network is difficult to confirm; here it is only assumed to be the same as the access point.

Load Balancing#

Faced with the enormous and uneven traffic from multiple monitored backbone lines, the traffic cannot be fed directly to the processing cluster; it must first be aggregated and then load-balanced into uniform small flows, which are sent to the processing cluster for parallel processing. First, the communication interfaces of network devices (POS, ATM, E1, etc.) must be converted into host communication interfaces (FE, GE, etc.) usable by the nodes. The load balancing algorithm has been carefully designed to achieve three goals: uniform traffic distribution, preservation of connection constraints for connection-oriented protocols, and algorithmic simplicity. The connection constraint means that all communication between a given pair of address-port endpoints must be scheduled to the same node.

The GFW-related paper on load balancing mainly proposes two algorithms. One is round-robin scheduling: for TCP, when a SYN arrives, the new connection is assigned the most recently allocated node number plus 1, modulo the number of nodes, and the connection is stored in a hash table. When subsequent traffic arrives, the hash table is queried to obtain the target node number. The other is a hash over the connection parameters: for N output ports, the output port number is H(source address, destination address, source port, destination port) mod N, where the function H can be as simple as XOR.
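The two schemes can be sketched as follows. This is a minimal illustration under assumptions, not the paper's code: the names `RoundRobinBalancer` and `xor_hash_node` are hypothetical, addresses are reduced to integers before XOR, and the round-robin table uses a direction-independent key so that both directions of a connection satisfy the connection constraint.

```python
import ipaddress


class RoundRobinBalancer:
    """Assign each new TCP connection (seen via SYN) to the next node in turn
    and remember the assignment in a hash table."""

    def __init__(self, n_nodes):
        self.n_nodes = n_nodes
        self.last = -1    # most recently allocated node number
        self.table = {}   # connection key -> node number

    @staticmethod
    def _key(src, dst, sport, dport):
        # Sort the two endpoints so both directions share one key.
        return tuple(sorted([(src, sport), (dst, dport)]))

    def dispatch(self, src, dst, sport, dport, syn=False):
        key = self._key(src, dst, sport, dport)
        if syn and key not in self.table:
            self.last = (self.last + 1) % self.n_nodes
            self.table[key] = self.last
        # Non-SYN packets of unknown connections return None.
        return self.table.get(key)


def xor_hash_node(src, dst, sport, dport, n_nodes):
    """Stateless alternative: H(...) mod N with H chosen as XOR.

    XOR is symmetric in its arguments, so swapping source and destination
    yields the same node without any table lookup.
    """
    h = (int(ipaddress.ip_address(src)) ^ int(ipaddress.ip_address(dst))
         ^ sport ^ dport)
    return h % n_nodes
```

The trade-off in this sketch is state versus evenness: round-robin spreads new connections perfectly evenly but must keep a table entry per connection, while the hash needs no state but distributes only as evenly as the traffic's parameters allow.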

In a previous experiment, we encountered a special pattern where load balancing played an important role in explaining its phenomenon, which is detailed below.

An experiment on window values
Experimental steps: Send a specially crafted packet containing keywords through GFW and receive the blocking response packet returned by GFW. Because once the blocking is triggered, connections with the same address pair and destination port will be affected by subsequent blocking, we generally adopt a sequential scanning method by changing the target port. Through some preliminary experiments, we have discovered and confirmed that a certain type (Type Two) of blocking response packets has a linear relationship between TTL and id, and that the window size has a linear relationship with the TTL.

However, in sequential scanning, there is a special pattern that cannot be explained by existing evidence. The further experimental steps are: keeping the source and destination addresses unchanged while sequentially scanning the target port, recording the window of the blocking response packets received during the blocking trigger events. The data is shown in the figure below, with the horizontal axis representing time (seconds) and the vertical axis representing port numbers, with each point representing the window value observed in a blocking trigger event.

A linearly increasing trend is clearly visible. The next image is a zoomed-in view of a local region:

It can be seen that there are 13 relatively continuous lines at the same time. This raises several questions: Why are there independent distinguishable lines? What do these lines represent? Why are there 13? Why does each line increase?

Why are there independent, distinguishable lines? The phenomenon has clear sub-patterns into which it can be divided, rather than being a single random quantity, and each sub-pattern increases smoothly and continuously. It can therefore be inferred that the internal mechanism producing this phenomenon is not a single entity but multiple independent entities. Further experiments show that if the sequential port scan advances by 13 each time, only one relatively continuous line is produced and the other lines disappear. This directly demonstrates that the results for each residue class of the port number modulo 13 come from a single, indivisible entity, and that the different congruence classes are independent of one another.

What do these lines represent? We speculate that these 13 lines represent 13 independent entities behind them, each generating blocking responses based on some internal state, and the window value is a direct manifestation of its internal state.

Why are there 13 instead of 1 or 2? At this point, load balancing serves as a good model to explain this fact. If GFW has 13 nodes online, and wishes to evenly distribute traffic to each node, then according to the previous paper, it adopts a modulo method, distributing traffic based on the target port modulo 13, so that packets with the same target port modulo 13 will enter the same node. In fact, an earlier experiment found 15 lines, and similarly, it can be speculated that there are 15 nodes online.
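The modulo model makes a concrete prediction that matches the scanning experiment, as the following sketch shows. It simply assumes, as speculated above, that the node is chosen by the target port modulo 13; `node_for_port` and `nodes_hit` are hypothetical helper names.

```python
N_NODES = 13  # number of online nodes inferred from the 13 lines


def node_for_port(dport, n_nodes=N_NODES):
    """Assumed dispatch rule: target port modulo the number of nodes."""
    return dport % n_nodes


def nodes_hit(start_port, count, step, n_nodes=N_NODES):
    """Set of nodes reached by scanning `count` ports from `start_port`,
    advancing by `step` each time."""
    return {node_for_port(start_port + i * step, n_nodes)
            for i in range(count)}


# A step of 1 cycles through every node; a step of 13 stays on one node.
all_nodes = nodes_hit(1000, 100, 1)
one_node = nodes_hit(1000, 100, 13)
```

Under this assumption a scan with step 1 interleaves responses from all 13 nodes (13 lines), while a scan with step 13 only ever reaches one node (one line), which is what the experiment observed.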

Why does each line increase? During the experiment, it was found that each time blocking occurs, GFW sends two groups of blocking packets with increasing window values to both parties in the connection. Thus, for each party, each blocking event increases the window value by 2. The rise of each line indicates that the nodes are continuously generating blocking packets and thereby increasing the window value; part of this is triggered by the observer's own probing, while the rest is caused by normal network traffic. If the data is differenced over time and the observer's own influence is subtracted, it may even be possible to estimate the rate at which each node generates blocking.
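The estimate suggested above is simple arithmetic and can be sketched as follows. The function name `estimate_background_rate` and the sample values are hypothetical, and the wrap-around at 65536 is an assumption based on the TCP window being a 16-bit field.

```python
def estimate_background_rate(samples, window_step=2, modulus=65536):
    """Estimate blocking events per second caused by other traffic on one node.

    `samples` is a time-ordered list of (time_seconds, window_value) pairs,
    one per blocking response that the observer triggered on the same node.
    Each blocking event is assumed to advance the window by `window_step`.
    """
    total_events = 0.0
    for (_, w0), (_, w1) in zip(samples, samples[1:]):
        # Difference successive window values, allowing for wrap-around.
        total_events += ((w1 - w0) % modulus) / window_step
    elapsed = samples[-1][0] - samples[0][0]
    own_events = len(samples) - 1  # one observer-triggered event per interval
    return (total_events - own_events) / elapsed


# Made-up numbers: the window rises by 30 over 20 seconds, i.e. 15 events,
# of which 2 were triggered by the observer.
example_rate = estimate_background_rate([(0, 100), (10, 110), (20, 130)])
```

With these made-up samples, 13 events over 20 seconds are attributed to other traffic on that node.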

However, why should the window increase? The motivation behind this is difficult to find a reasonable explanation; perhaps this window value serves as a counter, or perhaps it is to distinguish packets generated by different nodes based on ip.id. In fact, the window value in Type One is almost random but fixed for ip.id, and the increase of the window is not a necessity.

However, further experiments found that if the target port and source address remain unchanged while the target address changes sequentially, the image appears relatively chaotic, and no pattern can be found. Nevertheless, it is still possible to identify the presence of 13 lines simultaneously in a local area, further confirming the speculation of "13 nodes online." The significance of this experiment lies in the decomposition and reduction of phenomena, separating some independent entity structure within GFW, providing further practical evidence for the load balancing algorithm proposed in the paper, and gaining a deeper understanding of GFW's internal structure.

Data Processing#

When data flows through the data bus to reach the terminal node, it must be lifted off the physical layer for analysis by the upper layers; this is called packet capture. The ordinary approach is for the network card to raise an interrupt telling the kernel to fetch each packet, after which DMA transfers the data into kernel space, and the user program calls read(), causing the kernel to copy the sk_buff data to user space with copy_to_user(). This copying brings unnecessary overhead. GFW therefore designs a circular queue buffer, uses a half-polling, half-interrupt mechanism to reduce the overhead of frequent interrupts and system calls, and uses mmap to achieve zero-copy, transferring data from the network card's DMA directly to user space. This greatly improves performance (and coupling).

With the link layer data in hand, the next step is to pass it to a TCP/IP stack. The papers repeatedly mention libnids (a library we also noticed early on, but later found not useful for diagnosis), taking it as a benchmark (and possibly improving it in a way that "conforms to national conditions") to develop a multi-threaded TCP/IP stack built as an automaton. Further optimization by decomposing the automaton was considered later. After that, a two-level connection state table was proposed: a lightweight first-level circular hash table absorbs the large number of invalid connections and SYN floods, while the second-level table stores the real connection information. Experimental results are consistent with this: the timeout after sending a SYN is much shorter than the timeout after sending the first ACK. The literature also mentions libnids' half_stream; in practice, GFW's TCP stack does show a distinct half-connection characteristic, meaning that a given TCP stack instance inspects data in only one direction, either client to server or server to client. A direct consequence is that even if the server is offline and never responds, the client can still pretend to complete a three-way handshake and trigger a burst of RSTs. Charitably, perhaps this is because the multi-threaded TCP stack does not want to deal with shared-data control issues when reconstructing full connections. In summary, GFW has a very lightweight TCP/IP stack that can handle most RFC-compliant connections. If users are a little clever, they can slip through, and GFW must then either stand by or rewrite its TCP stack.
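The two-level table idea can be sketched as follows. This is an illustrative model only: the class name `TwoLevelConnTable`, the slot count, and both timeout values are assumptions, not figures from the papers or from measurement.

```python
class TwoLevelConnTable:
    """First level: small fixed-size table for connections that have only
    sent a SYN, with a short timeout. Second level: table of connections
    promoted after the first ACK, with a longer timeout."""

    def __init__(self, pending_slots=1024, syn_timeout=5.0, est_timeout=60.0):
        self.pending = [None] * pending_slots  # overwritten on collision
        self.established = {}                  # key -> time of promotion
        self.syn_timeout = syn_timeout
        self.est_timeout = est_timeout

    def _slot(self, key):
        return hash(key) % len(self.pending)

    def on_syn(self, key, now):
        # A SYN flood only overwrites cheap first-level slots.
        self.pending[self._slot(key)] = (key, now)

    def on_ack(self, key, now):
        """Promote the connection if its SYN is still pending and fresh."""
        slot = self._slot(key)
        entry = self.pending[slot]
        if entry and entry[0] == key and now - entry[1] <= self.syn_timeout:
            self.pending[slot] = None
            self.established[key] = now
            return True
        return False

    def is_tracked(self, key, now):
        """True while the promoted connection has not timed out."""
        started = self.established.get(key)
        if started is None:
            return False
        if now - started > self.est_timeout:
            del self.established[key]
            return False
        return True
```

In this model, a connection that stalls after its SYN is forgotten quickly, while one that sends its first ACK in time is tracked for much longer, which is the asymmetry in timeouts that the experiments observed.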

The TCP/IP stack segments and reassembles the data, and after the flow is reassembled, it is handed over to the application layer for parsing. The application layer consists of many plugin modules, loosely coupled and easy to deploy. Its application layer plugins include "HTTP, TELNET, FTP, SMTP, POP3, FREENET, IMAP, FREEGATE, TRIBOY."

Interestingly, this is the first official confirmation of the adversarial relationship between GFW and Freegate, Freenet, and Triboy. The application layer protocols are all familiar and do not need much explanation, but application layer issues are more numerous than transport layer issues. Several modules have some small problems, such as a certain type of HTTP module only recognizing CRLF as EOL, while LF causes it to freeze. For example, a certain type of DNS module sends DNS interference packets, with 15 or 16 of them having checksum errors, and querying AAAA also returns A, which is worse than turning it off. Most modules are just getting by, just working, and not perfect at all. The problems listed here, according to general software design rules, are just the tip of the iceberg. This leads to the inference that GFW's design philosophy is: better is worse.

However, when it comes to topics that can produce papers, GFW is never vague, especially in pattern matching. The application layer module parses the application layer protocol well, and then checks whether there are keywords somewhere, performing string matching. A lot of papers have been produced, improving the AC algorithm and BM algorithm, just short of doing the assembly work, arriving at a multi-pattern matching algorithm based on finite state automata, particularly suitable for GFW's demand for predefined keywords. In summary, the