It is better to manage the army than to manage the people. And the enemy.

User profiling is so important! So what are the processes and methods for user profiling?

The concept of user personas was proposed by Alan Cooper, the father of interaction design. He described user personas as virtual representations of real users: models of target users built on a series of real data. Keep in mind that a persona stands in for most of our users through a virtual representation; an intelligence analyst, by contrast, wants to be more direct about this.

How should founders create user personas? One key point to remember is that you need to know what your key users and core users look like. Are they male or female? What do they like? Or can you describe your core user in one sentence? User personas are even considered the nuclear weapon of internet companies.

For instance, Tencent, Baidu, and Alibaba are referred to as BAT. I believe the core capability of BAT is their ability to create user personas based on big data. Let me share a joke: everyone knows Tencent is strong in product development. If you create a product that catches Tencent's attention, they can quickly surpass you with their own product. Why? Because Tencent has a very powerful user mining capability.

For example, Tencent's technical ranks run from T1 to T5. T5 is equivalent to a chief scientist, and there are basically only one or two of them. T4 is larger, with dozens of people at Tencent. What is T4? Tencent calls it the T4 expert group, and those who enter it are generally technical experts who have run products with hundreds of millions of users. When Tencent encounters a problem, it turns to the T4 expert group, which specializes in user personas...

User personas are so powerful, so strong, so much like a nuclear weapon. Here, I want to talk about the second core point: how to do it? A founder is not a product manager; how can they create effective user personas? They need to find seed users.

Many people ask, what are seed users? Users are stratified. What types of users are there? There are target users, and among target users, there are core users; and among core users, there are seed users. Seed users are like seeds; they are opinion leaders among users, those who have a voice among users, and even key figures among core users.

When creating user personas, it is essential to find seed users. In fact, finding seed users is the first step for almost all companies when developing products. For example, what are Xiaomi's seed users? Xiaomi is currently a major player in domestic smartphone sales, and its seed users are enthusiasts.

However, Huawei's sales are also among the top in the country. So what are Huawei's mainstream users? Are they the same as Xiaomi's? No, they are different. Huawei's seed users are business elites.

Looking at OPPO, OPPO's sales are also among the top in the country. Does OPPO's user persona match theirs? No, it does not. OPPO's user persona is young women. Therefore, finding seed users is very important; hence, seed users hold the key to success.

image

image

image

  1. What are user personas?

image

image

image

User personas are user models of target groups based on a series of real data, which abstract corresponding labels based on user attributes and behavioral characteristics, forming a virtual image. They mainly include basic attributes, social attributes, behavioral attributes, and psychological attributes.

It is important to note that user personas are derived from clustering analysis of a group of users with common characteristics, and therefore are not aimed at any specific individual.

image

image

User label set

image

image

image

  2. Steps to create user personas

image

image

image

(1) Clarify the purpose of the persona

Confirming the purpose of the persona is a fundamental and critical step. It is essential to understand what operational or marketing effects are expected from building user personas, so that planning can be made regarding the depth, breadth, and timeliness of data during the label system construction, ensuring that the underlying design is scientific and reasonable.

(2) Data collection

Only user personas built on objective, real data are effective. Data collection should cover several dimensions, such as industry data, overall user data, user attribute data, user behavior data, and user growth data. These can be obtained through industry research, user interviews, forms and questionnaires filled in by users, and data collected from the platform's front end and back end.

(3) Data cleaning

The collected data may contain non-target, invalid, or false records, so the raw data must be filtered.

(4) Feature engineering

Feature engineering transforms raw data into features through transformation and structuring work. This step involves three tasks: eliminating outliers (for example, in an e-commerce app a user may get a mobile phone for a few cents in a flash sale even though their usual order price is above a thousand yuan), unifying units (for example, consumers may pay in RMB or USD, and the amounts need to be converted to one currency), and standardizing the criteria used to assign labels.
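The two examples above (a flash-sale price far below the user's usual spend, and mixed currencies) can be sketched roughly as follows. This is a minimal illustration, not a production pipeline: the exchange rate `USD_TO_RMB`, the `ratio` threshold, and the function names are all assumptions made for the example.

```python
from statistics import median

# Hypothetical exchange rate, for illustration only; use a real rate source in practice.
USD_TO_RMB = 7.0


def to_rmb(amount, currency):
    """Convert an order amount to RMB so that all prices share one unit."""
    if currency == "RMB":
        return amount
    if currency == "USD":
        return amount * USD_TO_RMB
    raise ValueError(f"unsupported currency: {currency}")


def drop_price_outliers(prices, ratio=0.01):
    """Drop prices below `ratio` times the median price (assumed rule of thumb)."""
    if not prices:
        return []
    floor = median(prices) * ratio
    return [p for p in prices if p >= floor]


# One flash-sale order for a few cents among otherwise normal orders.
orders = [(0.05, "RMB"), (1200, "RMB"), (200, "USD"), (1500, "RMB")]
prices = [to_rmb(amount, currency) for amount, currency in orders]
clean = drop_price_outliers(prices)
```

Here the 0.05 yuan flash-sale order is dropped because it falls far below the user's median order price, while the USD order is kept after conversion.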

The technologies used in persona construction include data statistics, machine learning, and natural language processing (NLP), as shown in the figure. Specific methods for constructing personas will be detailed in later sections of this chapter.

image

Technologies for constructing user personas

(5) Data labeling

In this step, the obtained data is mapped to the constructed labels, and various user features are combined. The choice of labels directly affects the richness and accuracy of the final persona, so data labeling needs to be combined with the functions and characteristics of the app itself. For example, e-commerce apps need to refine labels related to price sensitivity, while information apps need to describe content features from as many perspectives as possible using labels.
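As a rough illustration of mapping data to a label, the sketch below derives a price-sensitivity label for an e-commerce user from the share of their orders that used a discount. The order schema (a boolean `discounted` field) and the 0.3 / 0.6 thresholds are assumptions chosen for the example, not values from any real system.

```python
def price_sensitivity_label(orders):
    """Map a user's orders to a price-sensitivity label.

    `orders` is a list of dicts with a boolean "discounted" field
    (hypothetical schema). The thresholds are illustrative only.
    """
    if not orders:
        return "unknown"
    share = sum(1 for o in orders if o["discounted"]) / len(orders)
    if share >= 0.6:
        return "high"
    if share >= 0.3:
        return "medium"
    return "low"


bargain_hunter = [{"discounted": True}, {"discounted": True}, {"discounted": False}]
label = price_sensitivity_label(bargain_hunter)
```

An information app would replace this rule with labels describing content features, but the shape is the same: raw behavior goes in, a value from a fixed label set comes out.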

image

Labels are prioritized mainly by how difficult they are to construct and by the dependencies among them, as shown in the figure.

image

Priority of constructing various labels

(6) Construct user personas

The labels are divided into three categories:

The first category is demographic attributes

Demographic attributes include age, gender, education level, life stage, income level, consumption level, and industry affiliation.

  • Gender: Male; Female; Unknown
  • Age: Under 12; 12-17; 18-19; 20-24; 25-29; 30-34; 35-39; 40-44; 45-49; 50-54; 55-59; 60-64; 65 and above; Unknown
  • Monthly Income: Below 3500 yuan; 3500-5000 yuan; 5000-8000 yuan; 8000-12500 yuan; 12500-25000 yuan; 25001-40000 yuan; Above 40000 yuan; Unknown
  • Marital Status: Single; Married; Divorced; Unknown
  • Industry: Advertising / Marketing / Public Relations; Aerospace; Agriculture / Forestry / Chemical; Automobile; Computer / Internet; Construction; Education / Student; Energy / Mining; Finance / Insurance / Real Estate; Government / Military / Real Estate; Service Industry; Media / Publishing / Entertainment; Medical / Insurance Services; Pharmaceutical; Retail; Telecommunications / Network; Travel / Transportation; Others
  • Education Level: Junior High School and Below; High School; Vocational School; Associate Degree; Bachelor's Degree; Master's Degree; Doctorate

Demographic labels
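Assigning a demographic label is mostly a matter of bucketing a raw value. The sketch below maps an age in years to the age labels listed above; the function name and the treatment of missing or negative ages as "Unknown" are assumptions for the example.

```python
from bisect import bisect_right

# Lower bounds of each bucket after "Under 12", matching the age labels above.
AGE_BOUNDS = [12, 18, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
AGE_LABELS = [
    "Under 12", "12-17", "18-19", "20-24", "25-29", "30-34", "35-39",
    "40-44", "45-49", "50-54", "55-59", "60-64", "65 and above",
]


def age_label(age):
    """Return the age bucket label for an age in years, or "Unknown"."""
    if age is None or age < 0:
        return "Unknown"
    # bisect_right counts how many bucket boundaries the age has reached.
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]
```

The same pattern works for the monthly income buckets once their boundaries are made non-overlapping.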

The second category is interest attributes

Before constructing user interest personas, it is necessary to model the content behind user behaviors. To keep interest personas both accurate and generalizable, we construct a hierarchical interest label system and match users against labels at several levels of granularity at once: fine-grained labels provide accuracy, and coarse-grained labels provide generalization.

How is a hierarchical interest label constructed? Simply put, look at the content and items users are interested in, then extract, label, and count them.
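One way to picture a hierarchical interest label is as a path from a coarse category down to a fine one, with every level counted each time the user views a piece of content. The sketch below assumes the content has already been tagged with such paths; the categories and the function name are made up for the example.

```python
from collections import Counter


def interest_profile(viewed_paths):
    """Count views at every level of each hierarchical label path."""
    counts = Counter()
    for path in viewed_paths:
        # A view of ("Sports", "Football") also counts toward ("Sports",).
        for depth in range(1, len(path) + 1):
            counts[path[:depth]] += 1
    return counts


# Hypothetical label paths attached to the content a user viewed.
views = [
    ("Sports", "Football", "Premier League"),
    ("Sports", "Football", "La Liga"),
    ("Sports", "Basketball"),
    ("Tech", "Phones"),
]
profile = interest_profile(views)
```

Coarse labels such as `("Sports",)` accumulate counts quickly and generalize well, while fine labels such as `("Sports", "Football", "La Liga")` are sparser but more precise.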

The third category is geographic attributes

The permanent residence is mined from the user's IP address information. Each IP address is mapped to its city, and counting the cities in which the user's IPs appear yields the permanent city label.
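The counting step can be sketched as below. The IP-to-city table is a stand-in for a real geolocation database (the addresses are from documentation-only ranges), and the `min_share` cutoff is an assumption added so that users with no dominant city are labeled "Unknown".

```python
from collections import Counter

# Hypothetical lookup table standing in for an IP geolocation database.
IP_TO_CITY = {
    "203.0.113.5": "Hangzhou",
    "203.0.113.9": "Hangzhou",
    "198.51.100.7": "Beijing",
}


def permanent_city(ip_log, ip_to_city=IP_TO_CITY, min_share=0.5):
    """Return the city that accounts for most of a user's IP appearances."""
    cities = [ip_to_city[ip] for ip in ip_log if ip in ip_to_city]
    if not cities:
        return "Unknown"
    city, hits = Counter(cities).most_common(1)[0]
    return city if hits / len(cities) >= min_share else "Unknown"


log = ["203.0.113.5", "203.0.113.9", "198.51.100.7", "203.0.113.5"]
label = permanent_city(log)
```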

The user's permanent city label can be used not only to count the distribution of users across regions but also to identify business travelers, tourists, and so on from the user's travel trajectory between cities. The figure shows an example of the travel trajectory of a crowd.

image

Travel trajectory of a crowd

GPS data is generally collected from mobile devices, but many mobile apps do not have permission to access user GPS information. Apps that can obtain user GPS information mainly include Baidu Maps, Didi Chuxing, and other navigation apps. Additionally, the collected user GPS data is relatively sparse.

Baidu Maps combines GPS data with time-of-day data to construct GPS labels for users' homes and workplaces. It also uses GPS information to measure traffic flow on roads and analyze traffic conditions, as shown in the real-time traffic map of Beijing, where red indicates congested routes.

image

Real-time traffic map of Beijing

(7) Generate the persona

After the data runs in the model, the final generated persona can be presented in visual forms such as the one shown below. User personas are not static; thus, the model needs to have a certain degree of flexibility to adjust and modify the persona based on the user's dynamic behavior.

image

Information Collection#

Privacy#

Packet capture information

Topics actively participated in (discussions about social events and experiences)

Favorite emojis and emoji expressions, groups and channels joined

Speech (identity, life, profession, lifestyle, unit, complaints, income, values, stance, etc.)

Writing style (expression style, sentence structure, punctuation, etc.)

Screenshot content (fonts, application pages, icons in the notification bar, etc.)

Shared links and images (references)

Photos (people, objects, locations, iconic objects, weather, lighting, identity information, etc.)

Social activity photos (name, event time, poster, slogan)

Regional characteristics (local specialties, cigarettes, totems, plants, terrain)

Voice (accent, dialect, age, environmental noise)

Shared files (metadata, invisible watermarks, original image exif information, file source, content)

Account information (avatar, username, signature/profile, password, same information used across different platforms)
(Platforms in China have begun displaying users' IP locations one after another. Is there a project that collects the domain names these products use to obtain the displayed location, without resorting to a global proxy, so that the domains can be copied in one click and added as rules to protect privacy?)
Solution👇

Bilibili IP Location API#

host, api.bilibili.com, Location IP

Zhihu IP Location API#

ip-cidr, 103.41.167.0/24, Location IP

Weibo IP Location API#

host-suffix, api.weibo.cn, Location IP

Tieba IP Location API#

host, www.baidu.com, Location IP

Toutiao IP Location API#

host-suffix, toutiaoapi.com, Location IP

Douyin IP Location API#

host-keyword, core-c-lq, Location IP
host-keyword, core-lq, Location IP
host-keyword, normal-c-lq, Location IP
host-keyword, normal-lq, Location IP
host-keyword, search-quic-lq, Location IP
host-keyword, search-lq, Location IP

How to Infer Specific Locations from a Photo | Introduction to Network Tracing#

Introduction#

Before starting the serious tutorial, a few points need to be clarified:

  1. This article will introduce a reasoning game called "Network Tracing," which infers the specific location of a photo based solely on a single image and limited hints. It can be considered a form of Open-Source Intelligence (OSINT) [1] that involves legally collecting data and information from publicly available resources.
  2. This article will not cover how to obtain and analyze "off-site information," such as "locals can tell at a glance," or how to gather identity and residence information from the questioner's historical content or social media. This article does not encourage the use of "doxxing" or other actions that may infringe on others' privacy in "Network Tracing."
  3. The author is merely an enthusiast of "Network Tracing" and has no financial relationship with the social platforms and tools mentioned. Additionally, the author is an amateur player, and the following content is a summary of personal experiences, serving as a quick-start guide rather than a rigorous professional tutorial. It is hoped that this article can help some interested readers get started with this game and also raise awareness of the privacy risks that may arise from posting photos on public channels.

Note: This article aims to popularize the process of "how ordinary people can infer real locations from a single photo" and hopes to raise awareness among readers. If readers conduct exploration and research based on this article, they should respect others' privacy and relevant laws.

In 2011, a post titled "How I Inferred Wang Luodan's Address" went viral. The author of the post used several of Wang Luodan's Weibo posts, his understanding of Beijing, and Google Earth to infer Wang Luodan's previous address in just over 40 minutes. (Wang Luodan was a popular actress at the time, starring in the hit workplace drama "Du Lala's Promotion," which reveals the author's age.) Many people exclaimed "amazing" while also worrying that they might be investigated, stating they would never dare to post anything online again.

image

Related reports. Image from Sohu Media

Ten years later, in 2021, with the introduction of many enthusiasts and creators, a detective game called "Network Tracing" [note 1] entered the public eye: under the condition of having only one image and a few hints, experts could find the location of the photo using only a connected computer without leaving their homes, and some could even determine the time of the shoot. Nowadays, netizens exclaim "wow, that's impressive," while also worrying that they might be investigated, stating they would never dare to post anything online again.

image

The Network Tracing section of the Chao Fan community. Image from Chao Fan Community

image

Bilibili UP master "I am EyeOpener" is one of the more influential introducers of Network Tracing. Image from Bilibili

The history of the internet keeps repeating itself, but the cycle spirals upward. In the past ten years, the number of global internet users has doubled, and the number of web pages has quadrupled. Although we ourselves have not changed much, this investigative technique has matured with the support of massive amounts of internet information. Its formal name is Open Source Investigations (OSI) or Open Source Intelligence (OSINT) [note 2], referring to the technique of conducting investigations using open-source information on the internet.

"Network Tracing" is one of the most impactful forms of open-source investigation because it appears highly dramatic: a single image can pinpoint a location accurately. However, this drama stems from people's underestimation of the amount of information contained in a single image and the scale and breadth of open-source information on the internet. Are you worried that your photos might expose your privacy? Are you curious about how detectives unravel the clues to determine the photographer's location? Today, after reading this article, you can also unveil the mystery of Network Tracing, become a network detective, and become your own expert in online content security.

How to Play Network Tracing#

The Chao Fan community is a social website similar to Tieba, where users gather based on interests. Its Network Tracing section is highly influential in the community. Every day, many users post their photos here, challenging "detectives." The moderator team regularly holds Network Tracing point competitions, and winners receive exquisite trophies. (This is not an advertisement; this is a statement. The author has not registered yet.)

image

Content from the Network Tracing section of the Chao Fan community. Image from Chao Fan Community

Not all images are suitable for becoming a riddle. In the Chao Fan community, riddle images are concentrated in categories such as urban buildings, transportation (especially airplanes and high-speed trains), roads, scenic spots, etc., and are primarily long-range shots. If you take a picture of an ornament on your desk or a small flower by the roadside, detectives will find it very difficult to extract effective information from the image content.

The riddles of Network Tracing can also take the form of panoramic images, videos, and other multimedia formats. The "Panoramic City Explorer" launched by Baidu Maps, as mentioned in the article, is based on panoramic images.

The basic idea of Network Tracing can be divided into the following three steps:

  • Extract: Carefully observe the image and extract all valid information from it. No matter how small or vague it is, do not overlook it;
  • Analyze: Use your knowledge and internet tools to analyze the information obtained and narrow down the search range;
  • Verify: Use internet tools to conduct the search until you have searched through the range obtained in the analysis phase. If unsuccessful, return to the first two steps and try again.

Extracting and analyzing information is the key to Network Tracing and is also where the fun lies. This relies on the detectives' broad knowledge base, strong internet information retrieval skills, and long-term experience accumulation.

Network Tracing detectives tend to arrive at answers through logical reasoning rather than brute-force cracking; the more challenging the reasoning process, the greater the sense of accomplishment when arriving at an answer. Considering the complexity of reality, this reasoning process is not strict; it is more based on probabilistic assumptions derived from life experiences.

What is Hidden in the Image?#

To become a qualified Network Tracing detective, the first step is to learn how to look at images and uncover hidden information within them. Generally speaking, a single image can contain the following types of information: textual information, infrastructure information, and natural geographic information.

Textual Information#

Textual information is the fastest and simplest way to infer geographic locations. Compared to other types of information, textual information has significant advantages:

  • May directly reveal the location: Textual information such as road signs, government buildings, station names, and house numbers are strongly associated with geographic locations and can easily become giveaway clues.
  • No professional threshold: You may need certain professional knowledge and comparative analysis processes to determine the species of plants or the model of an airplane, but interpreting textual information requires none of that; you just need to be able to read.
  • Easy to search: You can directly search for text in search engines. While many search engines support image searches, their accuracy cannot compare to that of text.

Therefore, Network Tracing detectives do not overlook any textual information in the image, even if it is blurry.

For example, given the following image and asked about the photographer's location:

image

This is a photo of a Sha County snack shop. However, directly searching for "Sha County snacks" is not a good idea, since there are tens of thousands of Sha County snack shops nationwide. By carefully observing the details in the image, several pieces of textual information can be found: the adjacent sign has "记," the reflection on the windows has "王府" and "旺基," the house number shows "香榭" and "23," and the advertisement on the electric vehicle's mudguard reads "星桥莫拉克专卖店."

image

Electric vehicles rarely cross cities, so the license plate and the advertisement on the mudguard can be used to infer the city where the photo was taken. The city name on the license plate is blurry, but it can be seen to have two characters, so we start with the advertisement on the mudguard.

Searching for "星桥" nationwide, excluding vague matches like "三星大桥," leaves us with 12 possible locations: Xingqiao Street in Hangzhou, Xingqiao Village in Huzhou, Xingqiao Village in Sanming, Xingqiao Village in Fuzhou, Xingqiao Village in Ziyang, Xingqiao Village in Guang'an, Xingqiao Village in Guangyuan, Xingqiao Town in Chongqing, Xingqiao Village in Lijiang, Xingqiao Village in Shaoyang, Xingqiao Village in Zhuzhou, and Xingqiao Village in Xianning. Based on the reflections on the windows, this area appears to have dense commercial activities, which does not resemble an ordinary rural area.

image

Nationwide "Xingqiao" (partial). Image from Baidu Maps

The advertisement also provides the phone number for "莫拉克专卖店." It is well known that in a Chinese mobile number the first three digits identify the carrier and the next four identify the region, so the first seven digits are enough to determine where the number was registered. That is not necessarily the photographer's location, but it is likely to be.

image

The phone number is somewhat blurry, but the visible digits in the first seven are "1508*64," with the fifth digit resembling 3, 5, or 8. A search reveals that 1508364 belongs to Xinyu, Jiangxi, 1508564 belongs to Zunyi, Guizhou, and 1508864 belongs to Hangzhou, Zhejiang. Comparing with the search results for Xingqiao, only Hangzhou overlaps. Thus, we can tentatively assume that the photographer is in Hangzhou and proceed to the next search.
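The elimination above amounts to expanding the blurry digit into its candidates, looking up each prefix, and intersecting the result with the "星桥" (Xingqiao) candidates. A minimal sketch follows, in which the `PREFIX_CITY` dict simply records the three lookup results quoted in the text and stands in for an online number-origin query; the function names are made up for the example.

```python
# Lookup results as quoted in the text; the dict stands in for an online query.
PREFIX_CITY = {"1508364": "Xinyu", "1508564": "Zunyi", "1508864": "Hangzhou"}

# Cities that have a place named Xingqiao, from the earlier map search.
XINGQIAO_CITIES = {
    "Hangzhou", "Huzhou", "Sanming", "Fuzhou", "Ziyang", "Guang'an",
    "Guangyuan", "Chongqing", "Lijiang", "Shaoyang", "Zhuzhou", "Xianning",
}


def candidate_prefixes(pattern, guesses):
    """Fill the unreadable digit (marked '*') with each plausible guess."""
    return [pattern.replace("*", g) for g in guesses]


def overlapping_cities(pattern, guesses):
    """Cities suggested by both the phone prefix and the place-name search."""
    cities = {
        PREFIX_CITY[p]
        for p in candidate_prefixes(pattern, guesses)
        if p in PREFIX_CITY
    }
    return cities & XINGQIAO_CITIES


result = overlapping_cities("1508*64", "358")
```

Neither clue is conclusive on its own, but their intersection leaves a single city to check next.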

Next, we notice the house number "香榭" and "23." The content of the house number may refer to a road name, community name, or village name. Considering the dense commercial activity nearby, it is more likely to be a road name. The content after "香榭" is obscured, but based on its proportional position, it should be a character like "路" or "街."

image

Searching for "香榭路" in Hangzhou, we indeed find a road named 香榭, which belongs to Xingqiao Street.

image

Hangzhou's Xiangxie Road. Image from Baidu Maps

In this area, searching for Sha County snacks leads us to a "suspected target":

image

Suspected Sha County snack shop. Image from Baidu Maps
image

Unfortunately, the street view is outdated, and we did not find a similar shop. However, the architectural style and road sign format match.

image

Panoramic view of Xiangxie Road. Image from Baidu Maps

On Meituan, we can find this shop, with the house number "香榭路 23-1," and the shop image matches the riddle image. Thus, we confirm that the photographer's location is near the entrance of the Sha County snack shop at No. 23-1 Xiangxie Road, Linping District, Hangzhou, Zhejiang Province.

image

Sha County Snack Tiandu City Store. Image from Meituan

The above is a giveaway question for Network Tracing, as it only requires analyzing textual information to arrive at the answer.

Infrastructure Information#

From large urban areas to small garbage bins, infrastructure encompasses municipal, transportation, and architectural fields. The basis for conducting Network Tracing based on infrastructure is twofold:

  • Identifiability: As products of industrial society, the appearance of infrastructure serving the same function is often similar, allowing us to discern "what this is." Identifying large facilities like ports, airports, and stadiums is crucial for determining locations.
  • Regional Variability: Influenced by national and regional policies, climatic conditions, and economic geography, infrastructure can differ from one place to another. This allows us to infer "where this is."

Here are some commonly used types of infrastructure information:

  • Landmark Buildings: Landmark buildings generally possess a certain uniqueness, allowing them to be located using image searches. If they are imitations, it is not difficult to find them using news reports.
  • Urban Areas: The skyline and bird's-eye view of central urban areas, urban villages, and urban-rural junctions differ, and the size of the city can also affect these urban landscapes.
  • Houses: Houses generally face south and can be used to determine direction. Rural houses in different regions have different styles, such as red-tiled roofs, white walls with black tiles, cave dwellings, and courtyard houses, which can help infer the region.
  • Roads: Different types of railways and highways have their own characteristic facilities, such as overhead catenary lines, embankment slopes, and protective fencing. Railway stations, highway toll booths, overpasses, and traffic signs are also important clues. Uniquely styled streetlights may also serve as breakthroughs in solving riddles.
  • Vehicles: License plates can help infer the country of origin, and some can even be further narrowed down to administrative divisions. If cars drive on the left, countries where cars drive on the right can be ruled out, and vice versa. City buses and taxis usually have uniform or series paint jobs.
  • Trains and Airplanes: The shape details of trains and airplanes can determine their models. Train and airplane schedules can be queried online. Special paint jobs can also reveal important information. Based on the angle of the photo taken on the airplane, it is possible to roughly judge whether the airplane is taking off or landing.
  • Special Facilities: Meteorological stations, radar stations, stadiums, ports, and docks often have special facilities, such as stadium-specific lighting and dock gantry cranes. Identifying these special facilities requires relevant background knowledge.

Infrastructure information is the most common and primary type of information in Network Tracing; this article cannot cover everything but will provide a brief overview. Here, we introduce a typical case of determining a location based on infrastructure information, which comes from the blog of open-source information expert NixIntel. This expert's blog provides rich material for domestic Network Tracing bloggers.

image

The second riddle image, from Swapfiets company

This is an advertisement photo released by Swapfiets, and we need to find the location of the photo. NixIntel extracted the following information from the image:

  • This is a city with tall buildings.
  • The tracks on the road indicate that the city operates trams.
  • Part of the license plate is visible, formatted as PJ-620-*.
  • The lamp post has black and white stripes.
  • The buildings on the left side of the road have prominent tall white columns.

image

NixIntel visited the company's official website and learned that the company was operating in the Netherlands, Germany, Denmark, and Belgium at that time. The license plate can be used to determine which country the photo was taken in. The WorldLicensePlates website shows the license plate styles of various countries, and the styles of the four countries are as follows:

image

Comparison of license plates from the four countries. Image from WorldLicensePlates

After comparison, the Dutch license plate style is the closest, so we will search in the Netherlands first. If it is not the Netherlands, it is not a big deal; we can go back and choose again.

Once the country is determined, is there a way to narrow it down to a province or city? Looking back at the clues, the tram seems promising, as not all cities have trams. Checking the Wikipedia page for tram systems in the Netherlands reveals that only five cities in the Netherlands currently operate trams: Delft, Utrecht, Rotterdam, Amsterdam, and The Hague.

image

Wikipedia entry for trams in the Netherlands, image from Wikipedia

This is where the building with tall white columns comes into play: it is very likely located in one of these five cities. The Phrio website catalogs large buildings from around the world, which can be filtered by city and come with images. The page for Delft looks like this:

image

Delft page on Phrio. Image from NixIntel blog

No obvious matching buildings were found in Delft, as its buildings are generally not as large as those in the advertisement photo. Utrecht has several larger commercial buildings, but still no matches. Rotterdam, Amsterdam, and The Hague are much larger cities, and the answer is likely among them. Large cities must have many tall buildings, and here are Rotterdam's buildings:

image

Overview of tall buildings in Rotterdam. Image source same as above

After browsing, a familiar building stands out, with prominent tall white columns. It is called the Unilever Building:

image

Unilever Building. Image source same as above

Entering street view, the familiar black and white lamp post, tram tracks, and road surface confirm that the photo was taken here.

image

Rotterdam street view. Image from Google Earth

This case illustrates the power of open-source information on the internet well. Without any specialized knowledge, we extracted a few pieces of information and explored diverse internet resources to arrive at an answer. This is the superpower that the internet era grants each of us.

Natural Geographic Information#

Common natural geographic information includes light and shadow, weather, terrain, and vegetation. Extracting and interpreting natural geographic information requires a broad and deep accumulation of natural geographic knowledge, as well as intuition based on that knowledge. In many famous Network Tracing cases, the key step is often just a statement from an expert saying, "I feel like this area," which is difficult to convey in words.

Common types of natural geographic information include:

  • Terrain: Water bodies (rivers, lakes, reservoirs, oceans), mountains (snow cover), soil color, etc.
  • Vegetation: Plants usually have specific distribution areas. When the target range is unclear, plant information can assist in exclusion. However, due to the widespread introduction of species, this exclusion is not very reliable.
  • Light and Shadow: Shadows can provide a rough direction, helping to determine the direction of travel or the road. The Suncalc website can help determine shadow length, position, or time. It is usually not difficult to tell whether it is day or night in the image, which helps to exclude some schedules that do not match the day-night state of the image.
  • Weather: Weather is a common auxiliary piece of information. Based on historical weather changes in the area, the date range of the photo can be inferred.
  • People: This can be considered geographic information. Based on the ethnicity of the people in the image, the location of the photo can be guessed.
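For the light and shadow clue, the underlying geometry is simple enough to sketch. On flat ground, an object of height h under a sun at elevation angle e casts a shadow of length h / tan(e), pointing directly away from the sun. The functions below are a minimal illustration of that relationship, not a replacement for a tool like Suncalc; the names are made up, and azimuths are assumed to be measured clockwise from north.

```python
import math


def shadow_length(height_m, sun_elevation_deg):
    """Length of the shadow cast on flat ground by an object of given height."""
    if not 0 < sun_elevation_deg <= 90:
        raise ValueError("sun elevation must be in (0, 90] degrees")
    return height_m / math.tan(math.radians(sun_elevation_deg))


def sun_elevation(height_m, shadow_m):
    """Sun elevation in degrees implied by an object's height and shadow length."""
    return math.degrees(math.atan2(height_m, shadow_m))


def shadow_bearing(sun_azimuth_deg):
    """Compass bearing of the shadow, which points away from the sun."""
    return (sun_azimuth_deg + 180) % 360
```

Reading it the other way round, a shadow about as long as the object is tall implies a sun elevation of about 45 degrees, which can then be matched against sun position data for a candidate place and date.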

This section uses a post from the Chao Fan community as an example. This question was solved collaboratively by two prominent users in the Chao Fan community, Anshan Wu Yanzu and Cat (hereinafter referred to as "Cat"). The image for the question is shown below, asking for the name of the mountain range below the airplane.

image

The third riddle image. Image from Chao Fan Community

Anshan Wu Yanzu's judgment of this image is:

Based on the weather and the shape and vegetation of the mountains, it can be inferred that it should be north of Beijing (including the three northeastern provinces and parts of Inner Mongolia).

Based on the red-tiled roofs of the distant houses and the presence of what appears to be corn crops in front of the door, it can be basically determined that it is in the northeastern region.

image

This judgment process is more based on experience, but the range of the northeastern region is still quite large. This is also a characteristic of inferring based on natural geographic information: it requires rich experiential knowledge but cannot narrow the range down to a very small area.

Cat further provided two points of judgment:

The railway on the left has streetlights and a station nameplate, suggesting that the photo was taken near a railway station.

The distant houses should be oriented north-south, and since shadows north of the Tropic of Cancer cannot fall on the south side, the inferred orientation is as follows:

image

The left railway runs approximately north-south, while the railway crossing runs approximately east-south. The intersection is within 500 meters of the station.
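
The Tropic of Cancer argument can be checked with a small sketch. It considers local solar noon only, treats the solar declination as bounded by an obliquity of about 23.44 degrees, and uses hypothetical function names; early-morning and late-evening summer sun is deliberately ignored.

```python
OBLIQUITY_DEG = 23.44  # approximate latitude of the Tropic of Cancer

def noon_shadow_side(latitude_deg, declination_deg):
    """Return the side on which a noon shadow falls: 'north', 'south' or 'none'."""
    if abs(declination_deg) > OBLIQUITY_DEG:
        raise ValueError("declination outside the possible range")
    if latitude_deg > declination_deg:
        return "north"   # sun is due south at noon
    if latitude_deg < declination_deg:
        return "south"   # sun is due north at noon
    return "none"        # sun is directly overhead

def noon_shadow_can_point_south(latitude_deg):
    """True if some day of the year gives a south-pointing noon shadow."""
    return latitude_deg < OBLIQUITY_DEG

if __name__ == "__main__":
    # 45 degrees north is an illustrative latitude for northeastern China.
    print(noon_shadow_side(45.0, OBLIQUITY_DEG), noon_shadow_can_point_south(45.0))
```

For any latitude north of the tropic the noon sun stays to the south all year, so the shadowed side of a building marks north.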

At this point, all the information in the image has been extracted. Manually checking every railway crossing in the northeastern region is feasible, but it takes too long and is prone to omissions. Is there a tool that can do this for us? Yes! Introducing a groundbreaking search tool in the field of open-source investigation: Overpass Turbo. In short, it is a map search engine that finds all locations meeting the specified conditions within a user-specified area. In China its point-of-interest coverage is sparse, but railway-related information is relatively complete.

Don't get too excited too soon; the following news may be daunting: using it requires learning to write code. Overpass Turbo runs queries written in Overpass QL, the query language of the Overpass API.

image

In this case, the core code we used is as follows, provided by Cat. I tried to introduce high-speed rail conditions to narrow the range but found that the maxspeed field was missing, so I used the original code here. Due to space limitations, only a brief annotation is provided; interested readers can search for tutorials to learn.

// Search for railway bridges longer than 1 kilometer within the area, stored in w1
way[railway = rail][bridge](if: length() > 1000)({{bbox}}) -> .w1;
// Search for non-bridge railways longer than 1 kilometer that intersect with w1 (distance = 0), stored in w2
way(around.w1: 0)[railway = rail][!bridge](if: length() > 1000) -> .w2;
// Return all railway stations within 500 meters of w1 and within 20 meters of w2
node(around.w1: 500)(around.w2: 20)[railway = station];
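
When the search area is too large for one run, the same query can be generated once per tile. The Python sketch below is only an illustration: it replaces the Overpass Turbo `{{bbox}}` macro with explicit `south,west,north,east` coordinates, appends an `out;` statement so that results are printed, and uses a rough, assumed bounding box for the northeastern region. The helper names are hypothetical.

```python
QUERY_TEMPLATE = """\
way[railway=rail][bridge](if: length() > 1000)({bbox})->.w1;
way(around.w1:0)[railway=rail][!bridge](if: length() > 1000)->.w2;
node(around.w1:500)(around.w2:20)[railway=station];
out;
"""

def split_bbox(south, west, north, east, rows, cols):
    """Yield (south, west, north, east) tiles that together cover the box."""
    dlat = (north - south) / rows
    dlon = (east - west) / cols
    for r in range(rows):
        for c in range(cols):
            yield (south + r * dlat, west + c * dlon,
                   south + (r + 1) * dlat, west + (c + 1) * dlon)

def build_queries(south, west, north, east, rows=1, cols=3):
    """Return one Overpass QL query string per tile."""
    queries = []
    for tile in split_bbox(south, west, north, east, rows, cols):
        bbox = ",".join(f"{value:.4f}" for value in tile)
        queries.append(QUERY_TEMPLATE.format(bbox=bbox))
    return queries

if __name__ == "__main__":
    # Rough, assumed bounds for northeastern China; adjust before real use.
    for query in build_queries(38.0, 118.0, 54.0, 136.0):
        print(query)
```

Each printed query can be pasted into Overpass Turbo. A bridge that straddles a tile boundary may be missed, so slightly overlapping tiles are a safer choice in practice.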

The northeastern region is large and can be searched in two or three rounds. The results are as follows, with circles indicating hits:

image

image

Overpass Turbo search results. Image from Chao Fan Community

Applying the railway directions worked out earlier to these hits leaves one station that meets the conditions: Tahuangqi Station.

image

image

Tahuangqi Station. Image from Chao Fan Community, Gaode Maps

This case does not rely solely on natural geographic information, but the judgment of the region significantly reduced the search workload. With Overpass Turbo, rapid large-scale screening becomes possible.
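
The direction filter applied to the candidate stations can also be sketched in code. The example below computes the initial great-circle bearing between two points of a railway segment and tests whether the line runs roughly north-south; the coordinates, the 30-degree tolerance, and the function names are illustrative assumptions.

```python
import math

def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial bearing in degrees (clockwise from north) from point 1 to point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0

def is_roughly_north_south(bearing_deg, tolerance_deg=30.0):
    """True if a line with this bearing lies within the tolerance of the north-south axis."""
    axis = bearing_deg % 180.0          # a railway has no preferred direction
    return min(axis, 180.0 - axis) <= tolerance_deg

if __name__ == "__main__":
    # Hypothetical segment endpoints, not taken from the actual case.
    bearing = initial_bearing(45.0, 125.0, 46.0, 125.0)
    print(bearing, is_roughly_north_south(bearing))
```

Running the same check on each candidate's two railways quickly discards stations whose track layout contradicts the orientation inferred from the shadows.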

Off-site Information#

When the information in the image is insufficient to determine the location, detectives must obtain off-site hints. If any of the following items involve privacy and legal issues, please ensure that they are used with the consent of the questioner or the parties involved, or with authorization from official departments.

  • Image EXIF Information: If the questioner has published the original image and the online platform has not removed the EXIF information, this information can be used to directly locate the shooting location.
  • Questioner's Historical Records: Check the content the questioner has posted on public social platforms, including personal homepages and comments. Some people use the same avatar or username across different public social platforms, posting similar content, making it easy to search across platforms.
  • Social Network Relationships: The questioner's friend network may also expose their identity. Friends with whom they frequently interact may share similar life experiences, interests, or belong to the same organization, and the content posted by friends is likely related to them.
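
As an illustration of why EXIF data locates a photo so directly: the GPS tags store latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference letter. The sketch below converts such values to decimal degrees; the sample values and the function name are hypothetical, and reading the tags out of a file is left to an EXIF library.

```python
from fractions import Fraction

def gps_to_decimal(dms, ref):
    """Convert EXIF-style ((num, den), (num, den), (num, den)) plus N/S/E/W to decimal degrees."""
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)

if __name__ == "__main__":
    # Hypothetical tag values, not taken from any real photo.
    latitude = gps_to_decimal(((43, 1), (30, 1), (0, 1)), "N")
    longitude = gps_to_decimal(((125, 1), (15, 1), (3600, 100)), "E")
    print(latitude, longitude)
```

The resulting pair can be entered into any map service, which is why platforms that keep EXIF data on uploads expose the shooting location at once.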

Will I Never Dare to Post Anything Online Again?#

Network Tracing often raises privacy concerns. To alleviate public doubts, the Chao Fan community and the Twitter account @Quiztime mainly focus on questioners posting their own photos. However, there are inevitably some ill-intentioned individuals who secretly investigate others. Therefore, everyone should be cautious when posting content, assuming that all their images could potentially expose their shooting locations.

  • Is the platform a public one? Before viewing the content I post there, does a viewer need to add me as a friend or get my consent? Information that is accessible to everyone requires great caution.
  • If the shooting location is known, will it involve core privacy? Showing places you have visited or public places generally has little impact; however, if the shooting location is related to your and your friends' residences or workplaces, you must ensure that the image does not contain information that can be investigated as mentioned above, and the text does not involve descriptions of commuting or transportation.
  • Avoid posting images related to national security, such as weapons, military, etc.

By paying attention to the above points, you will generally not end up like Wang Luodan, who had her home exposed.

If the image does not involve core privacy but you also do not want to be investigated for the shooting location, you should pay attention to:

  • Avoid posting multiple images of the same location, as this can provide ample information for open-source investigations.
  • Avoid posting images with a lot of textual information.
  • Avoid posting images containing special infrastructure information and natural geographic information.
  • Avoid posting original images.
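
One reason to avoid posting original images is the EXIF block they carry. As a minimal sketch of what stripping it involves, the function below walks the segments of a JPEG file and drops the APP1 segment that begins with the `Exif` identifier. It assumes a well-formed baseline file and a hypothetical function name; real files are better handled with a maintained image tool.

```python
def strip_exif(jpeg_bytes):
    """Return a copy of the JPEG data without its Exif APP1 segment(s)."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    out = bytearray(jpeg_bytes[:2])
    i = 2
    while i + 4 <= len(jpeg_bytes):
        if jpeg_bytes[i] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = jpeg_bytes[i + 1]
        if marker == 0xDA:
            # Start of scan: everything after this is image data, copy as-is.
            out += jpeg_bytes[i:]
            return bytes(out)
        # The length field counts itself (2 bytes) plus the payload.
        length = int.from_bytes(jpeg_bytes[i + 2:i + 4], "big")
        segment = jpeg_bytes[i:i + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        i += 2 + length
    out += jpeg_bytes[i:]
    return bytes(out)

if __name__ == "__main__":
    with open("photo.jpg", "rb") as src:          # hypothetical file name
        cleaned = strip_exif(src.read())
    with open("photo_clean.jpg", "wb") as dst:
        dst.write(cleaned)
```

Removing metadata only addresses the EXIF item; the visual clues discussed earlier remain in the image itself.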

I believe that after reading this article, readers now understand the basic gameplay of Network Tracing and can analyze which images contain important clues that may expose their location, thus becoming their own experts in online content security.

Coach, I Want to Learn#

Provided that privacy and security are respected, Network Tracing is a beneficial puzzle game. It expands players' knowledge, deepens their understanding of both the real world and the internet, and trains their reasoning and their ability to gather information independently.

This article focuses on extracting image information, while online resources are mentioned only in passing. In my view, knowing what information can be searched is more important than how to search, and this is the biggest obstacle for most people participating in Network Tracing—failing to realize that key information exists within the image. Once this hurdle is overcome, you can use image searches to obtain further information or filter through websites that specialize in this type of information. If you do not know what websites to use, you can search or ask questions in dedicated forums; these are all problems that can be gradually solved through experience.

What forums can I use for communication? Which experts' blogs can I visit? What resources can help me? These are the Network Tracing questions left for you: I have already provided many hints; now is the time to exercise your ability to gather information independently.

I wish you a smooth journey in your online exploration!

References#

During the creation of this article, the following articles were referenced, and I would like to thank the original authors:

  1. ^ Similar terms include open-source investigations (Open Source Investigations, OSI) and online open-source investigations (Online Open Source Investigations, OOSI).
  2. ^ The most feared are those with ulterior motives; a user on Renren used just two images and 40 minutes to infer the address of a Beijing celebrity https://page.om.qq.com/page/OzDezp5M825FCotpeuYPEl6w0
  3. ^ https://m.weibo.cn/status/3886914195127757
  4. ^ I am EyeOpener's personal space - Bilibili https://space.bilibili.com/43645887/channel/seriesdetail?sid=90709
  5. ^ The personal space of "Exploration Address" - Bilibili https://space.bilibili.com/1960160215
  6. ^ The personal space of "Universe Encyclopedia" - Bilibili https://space.bilibili.com/93569847
  7. ^ The personal space of "Night Point Short Video" - Bilibili https://space.bilibili.com/1078123935
  8. ^ Verif!cation Quiz Bot (@quiztime) - Twitter https://twitter.com/quiztime
  9. ^ GeoConfirmed - War Ukraine https://www.google.com/maps/d/viewer?mid=10YK14-QB25penu8jeS4hBVarzGKZsVgj&ll=48.104096492535504%2C31.957569662788224&z=6
  10. ^ There are also similar groups on Douban; interested readers can visit the "Let's Play Network Tracing Group": https://www.douban.com/group/725884/
  11. ^ Flying over the Yangtze River, bridge construction, architectural photography - HuiTu https://www.huitu.com/photo/show/20180218/204610197016.html
  12. ^ The first high-precision soil color map in China http://www.ssa.ac.cn/?p=7955
  13. ^ Based on the shadows in the image, find the specific location of this airplane (mountain range) https://www.bilibili.com/video/BV1LG4y1a79k
  14. ^ National Standard GB 17733-2008 https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A4BC390727C25D327CF14ADE1C0F27A3
  15. ^ National Standard GB 50180-2018 https://baigongbao.oss-cn-beijing.aliyuncs.com/2020/09/29/AGZeRrtGrN.pdf
  16. ^ Solving POI locations using open map data (domestic) https://invited-aquarius-173.notion.site/POI-f7b3c76127404e43ac4a462c40afcc1e
  17. ^ About - NixIntel https://nixintel.info/about/