Internet Censorship in China

Wikipedia offers an excellent technical overview of Internet Censorship in China, of the technical features of the underlying “firewall” (the Golden Shield Project) that is used to censor internet content, and of its principal architect, Fang Binxing.

Overview

This article concentrates on the effects of internet censorship on work-related aspects, and so it currently does not cover:

political motivations, justifications or evaluations of the internet censorship
advanced strategies to circumvent internet censorship
internet censorship in Chinese social media and in the tainted TOM-Skype (which has since been phased out)

Seasonal and Geographic Dependencies

It is important to know that the degree of internet censorship in China is not uniform, but depends on:

Public holidays: During public holidays like Chinese New Year or the Golden Week, more restrictions seem to apply and encrypted connections to overseas hosts are slower than usual.
Important political events: During power transitions, Politburo sessions or important trials of cadres of the Chinese Communist Party (CCP), tighter restrictions apply, more sites may be blocked, and the overall speed to overseas hosts is throttled.
Network type: Mobile networks often have tighter restrictions than wired networks at home, and VPN clients that work well from a residential connection might be blocked entirely on a mobile network.
Operator: Some operators (typically smaller ones) have a more “lenient” approach than others. Even the same operator can have different rules depending on whether an ADSL or an Ethernet connection in a residential apartment is used.
Provinces: Some Chinese provinces have a more “lenient” approach than others.
Entity: Some entities get “special attention” from the Chinese authorities. The Goethe Institut in Beijing, for example, reportedly experienced tighter restrictions than residential internet connections in Beijing during the transition from Hu Jintao (胡锦涛) to Xi Jinping (习近平) in 2012.

Approach

Internet censorship in China is not the “all or nothing” approach that one might initially expect. Rather, it is characterized by the following:

Some pages and services are completely blocked. Examples: [1], [2], [3], [4]
Some pages and services are hampered: they work at some times but not at others, or they work on a smartphone but not to the same degree on a desktop. Examples: [5], [6]
Some pages and services work without problems. Examples: [7], [8]
Overseas VPN providers usually work (although using them in China is a “legal grey area”), but selected ones may experience temporary problems, especially if they have accumulated many clients from within China.
Some hosts may be blocked because they have services that contravene the ideas of the Chinese authorities, but they may be unblocked when the related service has been shut off.

Goal

The goal of this somewhat ambiguous approach is to make it troublesome for users in China to access certain services and to steer Chinese internet users towards the domestic counterparts of internationally known services, for example:

Baidu (百度) Search rather than Google Search
Baidu Maps (电子地图-百度) rather than Google Maps
Sina Weibo (新浪微博), Tencent Weibo (腾讯微博), Ren Ren (人人网), etc. rather than Google+, Facebook, Twitter

Technical Realization

Basically, there are three layers of filtering which are applied and which have different advantages and disadvantages:

DNS Spoofing

When a uniform resource locator (URL) containing a host name (e.g. “plus.google.com”) is entered into the address bar of a web browser, the user’s computer has to resolve this name into an IP address. This task is performed by a name server, which is queried via UDP or TCP port 53. The web browser then contacts the destination (as referenced by its IP address) and requests a web page. The whole procedure of address resolution is hidden from the user. Of course, the name server itself must also be known to the user’s machine by its IP address, either because the machine has been configured to use a specific name server (static IP configuration) or because the address has been handed to the user’s machine via the Dynamic Host Configuration Protocol (DHCP). When a user connects a computer to an internet connection inside China using DHCP, the computer is assigned a Chinese name server, usually one operated by the Internet Service Provider (ISP). All Chinese ISPs are given a list of domains which they must not resolve correctly, and their name servers typically hand back bogus or incorrect IP addresses for these domains. The following example illustrates this:

Asking for address resolution of “www.facebook.com” with a Chinese name server (202.96.69.38) results in an incorrect address (93.46.8.89) belonging to an ISP in Milan (Italy):

caipirinha:~ # nslookup www.facebook.com 202.96.69.38
Server:         202.96.69.38
Address:        202.96.69.38#53

Non-authoritative answer:
Name:   www.facebook.com
Address: 93.46.8.89

caipirinha:~ # whois 93.46.8.89
% This is the RIPE Database query service.
% The objects are in RPSL format.
%
% The RIPE Database is subject to Terms and Conditions.
% See http://www.ripe.net/db/support/db-terms-conditions.pdf

% Note: this output has been filtered.
%       To receive output for a database update, use the "-B" flag.

% Information related to '93.46.0.0 - 93.46.15.255'

% Abuse contact for '93.46.0.0 - 93.46.15.255' is 'abuse@fastweb.it'

inetnum:        93.46.0.0 - 93.46.15.255
netname:        FASTWEB-DPPU
descr:          Infrastructure for Fastwebs main location
descr:          NAT POOL 6 for residential customer POP 3602, public subnet
country:        IT
admin-c:        IRSN1-RIPE
tech-c:         IRSN1-RIPE
status:         ASSIGNED PA
mnt-by:         FASTWEB-MNT
remarks:        In case of improper use originating from our network,
remarks:        please mail customer or abuse@fastweb.it
source:         RIPE # Filtered

person:         IP Registration Service NIS
address:        Via Caracciolo, 51
address:        20155 Milano MI
address:        Italy
phone:          +39 02 45451
fax-no:         +39 02 45451
nic-hdl:        IRSN1-RIPE
mnt-by:         FASTWEB-MNT
remarks:
remarks:        In case of improper use originating
remarks:        from our network,
remarks:        please mail customer or abuse@fastweb.it
remarks:
source:         RIPE # Filtered

% Information related to '93.44.0.0/14AS12874'

route:          93.44.0.0/14
descr:          Fastweb Networks block
origin:         AS12874
mnt-by:         FASTWEB-MNT
source:         RIPE # Filtered

% This query was served by the RIPE Database Query Service version 1.70.1 (WHOIS1)

The next example shows the correct address resolution obtained by asking the Google public name server (8.8.8.8). The resolved address is 31.13.73.65 and does indeed belong to Facebook.

caipirinha:~ # nslookup www.facebook.com 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
www.facebook.com        canonical name = star.c10r.facebook.com.
Name:   star.c10r.facebook.com
Address: 31.13.73.65

caipirinha:~ # whois 31.13.73.65
% This is the RIPE Database query service.
% The objects are in RPSL format.
%
% The RIPE Database is subject to Terms and Conditions.
% See http://www.ripe.net/db/support/db-terms-conditions.pdf

% Note: this output has been filtered.
%       To receive output for a database update, use the "-B" flag.

% Information related to '31.13.64.0 - 31.13.127.255'

% Abuse contact for '31.13.64.0 - 31.13.127.255' is 'domain@fb.com'

inetnum:        31.13.64.0 - 31.13.127.255
netname:        IE-FACEBOOK-20110418
descr:          Facebook Ireland Ltd
country:        IE
org:            ORG-FIL7-RIPE
admin-c:        RD4299-RIPE
tech-c:         RD4299-RIPE
status:         ALLOCATED PA
mnt-by:         RIPE-NCC-HM-MNT
mnt-lower:      fb-neteng
mnt-routes:     fb-neteng
source:         RIPE # Filtered

organisation:   ORG-FIL7-RIPE
org-name:       Facebook Ireland Ltd
org-type:       LIR
address:        Facebook Ireland Ltd Hanover Reach, 5-7 Hanover Quay 2 Dublin Ireland
phone:          +0016505434800
fax-no:         +0016505435325
admin-c:        PH4972-RIPE
mnt-ref:        RIPE-NCC-HM-MNT
mnt-ref:        fb-neteng
mnt-by:         RIPE-NCC-HM-MNT
abuse-mailbox:  domain@fb.com
abuse-c:        RD4299-RIPE
source:         RIPE # Filtered

role:           RIPE DBM
address:        1601 Willow Rd.
address:        Menlo Park, CA, 94025
admin-c:        PH4972-RIPE
tech-c:         PH4972-RIPE
nic-hdl:        RD4299-RIPE
mnt-by:         fb-neteng
source:         RIPE # Filtered
abuse-mailbox:  domain@fb.com

% This query was served by the RIPE Database Query Service version 1.70.1 (WHOIS3)
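
The comparison of the two answers can also be done mechanically. The following is a minimal Python sketch that is not part of the original measurements: it takes the two addresses returned above and tests whether each one lies inside the range that the whois output lists for Facebook (31.13.64.0 - 31.13.127.255). The function name answer_in_expected_range is made up for illustration, and the sketch assumes that the expected range is already known from a trusted source.

```python
import ipaddress

# Address range taken from the whois output above (Facebook Ireland Ltd).
FACEBOOK_RANGE = list(ipaddress.summarize_address_range(
    ipaddress.ip_address("31.13.64.0"),
    ipaddress.ip_address("31.13.127.255"),
))

def answer_in_expected_range(answer, expected_networks):
    """Return True if a resolved address lies in one of the expected networks."""
    address = ipaddress.ip_address(answer)
    return any(address in network for network in expected_networks)

# Answers from the two nslookup runs shown above (resolver -> returned address).
answers = {
    "202.96.69.38": "93.46.8.89",   # Chinese name server
    "8.8.8.8": "31.13.73.65",       # Google public name server
}

for resolver, answer in answers.items():
    ok = answer_in_expected_range(answer, FACEBOOK_RANGE)
    print(f"{resolver} -> {answer}: {'plausible' if ok else 'suspicious'}")
```

Such a check only flags answers that fall outside a known range; it cannot tell who produced the bogus answer.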

The obvious way to circumvent DNS spoofing by the Chinese ISP seems to be to use Google’s public name server instead. However, the Golden Shield clandestinely reroutes all DNS requests addressed to servers outside China back to Chinese name servers. Hence, even if you try to query one of Google’s public name servers, you end up at a Chinese name server that answers in its place.

From a censor’s viewpoint, this approach has the advantage that it is easy to implement and does not require additional infrastructure. It also does not slow down connections between China and overseas hosts. The disadvantage is that it is easy to overcome [9].

IP Blocking

The next layer is the complete blocking of IP addresses or IP address ranges so that the affected machines simply become unreachable from China. This is often done with overseas VPN endpoints in order to prevent connections to them from within China. On internet gateways, packet filter rules can be added with commands like these:

iptables -t filter -A FORWARD -d 186.192.80.0/20 -j DROP

iptables -t filter -A FORWARD -d 201.7.176.0/20 -j DROP

In this example, all IP traffic to the Brazilian TV operator Rede Globo would be dropped. From a censor’s viewpoint, the advantage is that this might accomplish complete blocking of all outgoing traffic from one country to the destination. However, there are serious drawbacks to this approach:

It might also affect web sites on a shared web hosting service where many domains share a single IP address.
The gateway becomes slower as the filter table grows. For gateways with a high data throughput, this is therefore not a good option.
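
The shared-hosting drawback can be illustrated with a small Python sketch, which is not part of the original article. It mimics the decision that the two DROP rules above make for a given destination address. The host names and addresses in hosted_sites are hypothetical and only stand for several unrelated sites that share one server.

```python
import ipaddress

# The two destination ranges from the iptables rules above.
DROP_LIST = [
    ipaddress.ip_network("186.192.80.0/20"),
    ipaddress.ip_network("201.7.176.0/20"),
]

def is_dropped(destination):
    """Linear scan over the drop list, one comparison per rule."""
    address = ipaddress.ip_address(destination)
    return any(address in network for network in DROP_LIST)

# Hypothetical sites: the first two share one IP address inside a dropped range.
hosted_sites = {
    "targeted-site.example": "186.192.81.10",
    "unrelated-shop.example": "186.192.81.10",
    "elsewhere.example": "198.51.100.7",
}

for name, address in hosted_sites.items():
    print(name, "blocked" if is_dropped(address) else "reachable")
```

The unrelated site is blocked together with the targeted one because the rule only sees the IP address. The linear scan in is_dropped also hints at the second drawback: every additional rule adds work for each packet.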

The Golden Shield therefore uses a different approach:

Traffic is allowed to pass, but during the connection setup, the related connection header data {source_IP, source_port, destination_IP, destination_port} are copied to an inspection server.
If the inspection server determines that this is an “unwanted” connection, it sends reset packets (RST) to both endpoints of the TCP connection, and the endpoints will assume that the TCP connection has been reset [10].

The following blog entry examines this approach:

Machine A (192.168.3.2 via a WLAN router) inside China is trying to connect to a VPN on machine B (178.7.249.240) outside of China, without success.
At first, it looks as if B is rejecting the packets from A, as the log file /var/log/openvpn.log indicates:

Mon Nov 12 09:42:16 2012 TCP: connect to 178.7.249.240:8080 failed, will try again in 5 seconds: Connection reset by peer
Mon Nov 12 09:42:22 2012 TCP: connect to 178.7.249.240:8080 failed, will try again in 5 seconds: Connection reset by peer
Mon Nov 12 09:42:28 2012 TCP: connect to 178.7.249.240:8080 failed, will try again in 5 seconds: Connection reset by peer
Mon Nov 12 09:42:34 2012 TCP: connect to 178.7.249.240:8080 failed, will try again in 5 seconds: Connection reset by peer

But is that really the case? Let us look at the packets in detail:

# tcpdump -v host 178.7.249.240
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
09:46:22.398769 IP (tos 0x0, ttl 64, id 18152, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [S], cksum 0x6fd1 (incorrect -> 0x8408), seq 2461736473, win 4380, options [mss 1460,sackOK,TS val 76503281 ecr 0,nop,wscale 7], length 0
09:46:22.943121 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], cksum 0x5180 (correct), seq 369907903, ack 2461736474, win 4344, options [mss 1452,sackOK,TS val 39174536 ecr 76503281,nop,wscale 1], length 0
09:46:22.943203 IP (tos 0x0, ttl 64, id 18153, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [.], cksum 0x6fc9 (incorrect -> 0x8ef2), ack 1, win 35, options [nop,nop,TS val 76503826 ecr 39174536], length 0
09:46:22.978993 IP (tos 0x0, ttl 115, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], cksum 0x122e (correct), seq 369907904, win 35423, length 0
09:46:28.399410 IP (tos 0x0, ttl 64, id 47187, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [S], cksum 0x6fd1 (incorrect -> 0xbce8), seq 2555496497, win 4380, options [mss 1460,sackOK,TS val 76509282 ecr 0,nop,wscale 7], length 0
09:46:28.884286 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], cksum 0x4db6 (correct), seq 462715053, ack 2555496498, win 4344, options [mss 1452,sackOK,TS val 39180476 ecr 76509282,nop,wscale 1], length 0
09:46:28.884352 IP (tos 0x0, ttl 64, id 47188, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [.], cksum 0x6fc9 (incorrect -> 0x8b64), ack 1, win 35, options [nop,nop,TS val 76509767 ecr 39180476], length 0
09:46:28.921316 IP (tos 0x0, ttl 102, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], cksum 0x659f (correct), seq 462715054, win 4472, length 0
09:46:34.400101 IP (tos 0x0, ttl 64, id 62097, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [S], cksum 0x6fd1 (incorrect -> 0xf2cf), seq 2649257282, win 4380, options [mss 1460,sackOK,TS val 76515283 ecr 0,nop,wscale 7], length 0
09:46:34.922904 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], cksum 0x2db2 (correct), seq 557101407, ack 2649257283, win 4344, options [mss 1452,sackOK,TS val 39186517 ecr 76515283,nop,wscale 1], length 0
09:46:34.922978 IP (tos 0x0, ttl 64, id 62098, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [.], cksum 0x6fc9 (incorrect -> 0x6b3b), ack 1, win 35, options [nop,nop,TS val 76515805 ecr 39186517], length 0
09:46:34.959184 IP (tos 0x0, ttl 61, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], cksum 0xe129 (correct), seq 557101408, win 22427, length 0
09:46:40.400746 IP (tos 0x0, ttl 64, id 32974, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [S], cksum 0x6fd1 (incorrect -> 0x2b7e), seq 2743017357, win 4380, options [mss 1460,sackOK,TS val 76521283 ecr 0,nop,wscale 7], length 0
09:46:40.966840 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], cksum 0xe3a0 (correct), seq 651499237, ack 2743017358, win 4344, options [mss 1452,sackOK,TS val 39192558 ecr 76521283,nop,wscale 1], length 0
09:46:40.966915 IP (tos 0x0, ttl 64, id 32975, offset 0, flags [DF], proto TCP (6), length 52)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [.], cksum 0x6fc9 (incorrect -> 0x20fe), ack 1, win 35, options [nop,nop,TS val 76521849 ecr 39192558], length 0
09:46:41.003368 IP (tos 0x0, ttl 103, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], cksum 0x8ad3 (correct), seq 651499238, win 17099, length 0
...

Now observe the TTL values of the packets which seem to originate from machine B. The genuine [S.] (SYN/ACK) packets all have a TTL of 49, but the [R] (RST) packets have seemingly random TTL values (115, 102, 61 and 103). This hints at the fact that the [R] (RST) packets do not come from machine B, but from various machines in a border firewall which disturbs the setup of the VPN by pretending that machine B resets the connection.
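
The TTL comparison can be automated. The following Python sketch, added here as an illustration and not taken from the original blog entry, groups the TTL values of all packets that claim to come from machine B by their TCP flags. The excerpt in CAPTURE is a shortened copy of the tcpdump output above (the detail lines are trimmed), and the function name ttl_by_flags is made up for this example.

```python
import re
from collections import defaultdict

# Shortened excerpt of the capture above: one header line and one detail line per packet.
CAPTURE = """\
09:46:22.398769 IP (tos 0x0, ttl 64, id 18152, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.3.2.5460 > dslb-178-007-249-240.pools.arcor-ip.net.http-alt: Flags [S], length 0
09:46:22.943121 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], length 0
09:46:22.978993 IP (tos 0x0, ttl 115, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], length 0
09:46:28.884286 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], length 0
09:46:28.921316 IP (tos 0x0, ttl 102, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], length 0
09:46:34.922904 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], length 0
09:46:34.959184 IP (tos 0x0, ttl 61, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], length 0
09:46:40.966840 IP (tos 0x0, ttl 49, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [S.], length 0
09:46:41.003368 IP (tos 0x0, ttl 103, id 24791, offset 0, flags [none], proto TCP (6), length 40)
    dslb-178-007-249-240.pools.arcor-ip.net.http-alt > 192.168.3.2.5460: Flags [R], length 0
"""

def ttl_by_flags(capture, source_prefix):
    """Collect TTL values per TCP flag string for packets from the given source."""
    ttls = defaultdict(list)
    current_ttl = None
    for line in capture.splitlines():
        header = re.search(r"\bttl (\d+)", line)
        if header:
            # Header line: remember the TTL for the detail line that follows.
            current_ttl = int(header.group(1))
            continue
        detail = re.match(r"\s+(\S+) > \S+: Flags \[([^\]]+)\]", line)
        if detail and current_ttl is not None and detail.group(1).startswith(source_prefix):
            ttls[detail.group(2)].append(current_ttl)
    return dict(ttls)

for flags, ttls in ttl_by_flags(CAPTURE, "dslb-178-007-249-240").items():
    print(f"[{flags}] TTLs seen: {sorted(set(ttls))}")
```

A single stable TTL for one flag type next to widely scattered TTLs for another, all under the same source address, is the pattern described above.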

From a censor’s viewpoint, the major advantage is that the gateways do not experience a performance loss as in the case of many “DROP” entries in their IP tables. Furthermore, the attempted connection can be logged on the inspection server and can be archived for “legal purposes”. If there is too much traffic on the gateway, the inspection server may not be able to cope with all inspections. Then, some traffic which otherwise might be blocked may pass the gateway uninterrupted. That situation, however, is more acceptable than a breakdown of the whole gateway which would stop all cross-border internet traffic. This approach is consequently more robust with respect to sudden peaks in internet traffic, especially if the filters on the inspection server can be scaled according to the traffic. The disadvantage is that this approach only works against TCP connections; UDP is connectionless, so there is no connection that could be reset.

Deep Packet Inspection

A system with Deep Packet Inspection (DPI) looks into the payload of the IP traffic and is thus an intrusive method. By reading the requested web pages and the content of the delivered web page, the system can scan for unwanted keywords or text fragments. The possibilities are basically endless, but the complexity of the filtering algorithms and the computational demands are very high as all traffic must pass through inspection servers.

A good example in China is Wikipedia itself: The page Golden Shield Project can be accessed via https because then even the request to the respective wiki page is encrypted and hence cannot be read by the DPI inspection server. However, requesting the page via plain http [11] will lead to a TCP reset attack, and the web page, although it might seem to open initially, will be reset. The reason is that with http, the connection is not encrypted, and the inspection server will encounter unwanted keywords in the article itself. If the user is on a domestic or 3G connection, the combination {User_IP_address, Wikipedia_IP_address_block} might be blocked for the subsequent 20 minutes, penalizing the user for having accessed the “wrong” content. Theoretically, DPI could also be used to replace “unwanted” content on non-encrypted connections with “wanted” content, so that the user receives something different from what the web server actually sent. So far, however, no such incident has been reported.
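
The interplay of keyword matching and the 20-minute penalty can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions and says nothing about how the real inspection servers are built: the class InspectionSketch, the keyword "golden_shield", both IP addresses and the byte string standing in for ciphertext are all invented for illustration.

```python
import ipaddress

PENALTY_SECONDS = 20 * 60  # the 20-minute penalty mentioned above

class InspectionSketch:
    """Toy model: keyword match on readable payload, then a timed block."""

    def __init__(self, keywords):
        self.keywords = [k.lower() for k in keywords]
        self.blocked_until = {}  # (user_ip, destination_network) -> expiry time

    def inspect(self, user_ip, destination_network, payload, now):
        """Return 'reset' if the connection would be reset, else 'pass'."""
        key = (user_ip, destination_network)
        if self.blocked_until.get(key, 0) > now:
            return "reset"  # still inside the penalty window
        text = payload.decode("latin-1").lower()
        if any(keyword in text for keyword in self.keywords):
            self.blocked_until[key] = now + PENALTY_SECONDS
            return "reset"
        return "pass"

dpi = InspectionSketch(keywords=["golden_shield"])    # hypothetical keyword
user = "203.0.113.5"                                  # placeholder user address
block = ipaddress.ip_network("198.51.100.0/24")       # placeholder, not Wikipedia's real range

plain = b"GET /wiki/Golden_Shield_Project HTTP/1.1\r\nHost: en.wikipedia.org\r\n\r\n"
# Made-up bytes standing in for an encrypted record; the keyword is not visible in them.
opaque = bytes.fromhex("1703030010a3f1c29e5b7d0486ee19c7305ad2f46b")

print(dpi.inspect(user, block, opaque, now=0))     # encrypted request, nothing to match
print(dpi.inspect(user, block, plain, now=10))     # plain request containing the keyword
print(dpi.inspect(user, block, opaque, now=600))   # any request during the penalty window
print(dpi.inspect(user, block, opaque, now=1300))  # after the penalty has expired
```

Time is passed in as a plain number of seconds so that the penalty window is easy to follow.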

DPI inspection servers usually cannot look into the content of a reasonably well encrypted connection. However, weak encryption, incorrect certificate verification, systems with backdoors and viruses may lead to serious vulnerabilities which may then be exploited in a man-in-the-middle attack so that ultimately, DPI systems might be able to read encrypted traffic.

As more and more web traffic is encrypted, fingerprinting is becoming more and more attractive to censors. This technology aims to determine what kind of traffic flows through a gateway. Chinese researchers, for example, aim to detect (encrypted) OpenVPN traffic [12] in order to be able to block OpenVPN altogether. Fingerprinting is based either on the recognition of specific patterns (signatures) or on a statistical analysis of the data flow.
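
As a minimal illustration of the statistical variant, the following Python sketch computes the byte entropy of a payload. It is not the method used in [12]; it merely shows one simple feature by which encrypted traffic tends to differ from plain text. The helper pseudo_ciphertext is a made-up stand-in that produces random-looking bytes, and a realistic classifier would presumably combine several features rather than rely on this one.

```python
import hashlib
import math
from collections import Counter

def byte_entropy(data):
    """Shannon entropy in bits per byte (0.0 for constant data, 8.0 at most)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def pseudo_ciphertext(length):
    """Deterministic random-looking bytes, used here in place of real encrypted traffic."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

plain = b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n" * 80
opaque = pseudo_ciphertext(4096)

print(f"plain text : {byte_entropy(plain):.2f} bits per byte")
print(f"encrypted? : {byte_entropy(opaque):.2f} bits per byte")
```

High entropy alone only suggests that a flow is encrypted or compressed; telling OpenVPN apart from other encrypted protocols needs additional signatures or statistics.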

The techniques described above are responsible for the fact that:

all connections from mainland China to overseas destinations are much slower than, for example, from Hong Kong or Macau
encrypted connections from China to overseas destinations are sometimes throttled and often even slower than unencrypted connections

From a censor’s viewpoint, the advantage of these techniques is that the internet in general remains “open”, but that “unwanted” traffic can be blocked. As the list of unwanted topics and words in China changes frequently and also depends on contemporary issues [13], this approach is best suited for such a dynamic censorship demand. Another big advantage is that it allows blocking of domains like WordPress or Xing that do not have dedicated IP address blocks but that are hosted by large content delivery networks (CDNs) like Akamai or Amazon Web Services. Blocking the IP ranges of these large CDNs would also block many other web sites in China and might have undesired side effects. The disadvantage of this approach is the penalty on internet access speed from China to overseas sites. It also requires substantial investment in DPI equipment and in the configuration of that equipment.

DPI and fingerprinting are also used by non-Chinese ISPs in order to “optimize” their traffic (that is, to slow down or disturb unwanted traffic in order to maximize their revenue stream). Example: Skype traffic is blocked by the German ISP Congstar, as evidenced in [14].

Links

Posted on: 2017-06-14 by Gabriel Rüeck