
    This data scientist wants to build an archive about the history of internet measurement

    By 霏, February 3, 2024
    • Jim Cowie, co-founder and Chief Data Scientist at DeepMacro, calls for the creation of an online library of Internet measurement history.
    • He sets out three steps toward that goal: save, narrative, and explore.

    Jim Cowie, co-founder and Chief Data Scientist at DeepMacro, recently published an article titled “Thinking about Internet history” on the APNIC website. He has over 25 years of experience as a data storyteller in Internet measurement and recently launched the Internet History Initiative, which aims to build an Internet library for future historians by piecing together the recorded history of the Internet.

    Curate history to interpret it and make it accessible and meaningful for future scholars.

    Jim Cowie, co-founder and Chief Data Scientist at DeepMacro

    Cowie argues that if we want the story of the Internet preserved in a quantifiable way for future generations of scholars, with the data brought together and protected from irreversible loss, we have three collective tasks to accomplish before we all forget how it works:

    • Preserve history by collecting irreplaceable records of how the Internet evolved.
    • Collate history to explain it and make it accessible and meaningful to future scholars.
    • Explore history and create tools and visualizations that everyone can enjoy and celebrate.

    Step 1: Save

    So what should we keep?

    In addition to active measurements, we need to keep a record of registry data from ARIN, RIPE NCC, and APNIC, showing to whom each network resource was assigned on every day in history, along with any information we can find about the DNS name associated with each IP address on a given day. Together these are clues to what all these Internet hosts were doing, and to where on Earth they may have been located.

    Refactor the Internet into a point-in-time database

    Finally, all of this DNS and registry data is very ephemeral: it can change daily without warning. If we later want to build credible indicators, such as the density of Internet hosts in a given area, we have to record the time of each brief observation.
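    As a rough illustration of why the timestamp matters, the sketch below counts distinct hosts per area on a single day. It is not taken from Cowie's article: the `Observation` record, the `host_density` function, the region labels, and the sample data are all hypothetical, and the addresses come from ranges reserved for documentation.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One timestamped sighting of a host (hypothetical record layout)."""
    ip: str
    region: str
    seen_on: date


def host_density(observations, day):
    """Count distinct hosts observed in each region on the given day."""
    # A set removes repeated sightings of the same host on the same day.
    seen = {(o.region, o.ip) for o in observations if o.seen_on == day}
    return Counter(region for region, _ in seen)


# Made-up sample data using documentation-only address ranges.
observations = [
    Observation("192.0.2.10", "region-a", date(2012, 5, 1)),
    Observation("192.0.2.10", "region-a", date(2012, 5, 1)),  # repeat sighting
    Observation("192.0.2.11", "region-a", date(2012, 5, 1)),
    Observation("198.51.100.7", "region-b", date(2012, 5, 1)),
    Observation("192.0.2.10", "region-b", date(2015, 5, 1)),  # same IP, later, elsewhere
]

density_2012 = host_density(observations, date(2012, 5, 1))
density_2015 = host_density(observations, date(2015, 5, 1))
```

    Without the `seen_on` field, the same address would be counted in both regions at once, and the indicator would be wrong for both dates.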

    Recall that in the 2010s, the exhaustion of the available IPv4 pool triggered a wave of sales and international reallocation of network address blocks, so that (for example) a block of network addresses that once hosted DSL customers in Romania might disappear from the Internet for a while, only to reappear in a data center in Saudi Arabia serving web pages. The geography of the Internet changes quickly, so we need more than a geographic map of all IP addresses and the purpose of each one. We also need to know what this map looked like on each day over the past few decades, as the hosts and resources associated with each IP address moved and changed function.
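    One way to picture such a point-in-time database is a table of validity intervals that can be queried "as of" any day. The sketch below is an illustration only, not the Internet History Initiative's design: the `PrefixRecord` type, the `lookup` function, and every date are hypothetical, and the prefix is a documentation-only range standing in for a real reallocated block.

```python
import ipaddress
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PrefixRecord:
    """What one address block was used for during one interval (hypothetical)."""
    prefix: ipaddress.IPv4Network
    valid_from: date            # first day the record applies
    valid_to: Optional[date]    # last day it applies; None means still current
    country: str
    purpose: str


def lookup(history, ip, day):
    """Return the record covering `ip` on `day`, or None if the block was dark."""
    addr = ipaddress.ip_address(ip)
    for rec in history:
        started = rec.valid_from <= day
        not_ended = rec.valid_to is None or day <= rec.valid_to
        if addr in rec.prefix and started and not_ended:
            return rec
    return None


block = ipaddress.ip_network("192.0.2.0/24")  # documentation range, not a real transfer
history = [
    # All dates below are invented for illustration.
    PrefixRecord(block, date(2005, 1, 1), date(2013, 6, 30), "RO", "DSL customers"),
    PrefixRecord(block, date(2014, 3, 1), None, "SA", "data center web hosting"),
]

in_2010 = lookup(history, "192.0.2.25", date(2010, 1, 1))
in_gap = lookup(history, "192.0.2.25", date(2013, 12, 1))
in_2020 = lookup(history, "192.0.2.25", date(2020, 1, 1))
```

    The linear scan keeps the sketch short. An archive covering decades of daily data would presumably need an indexed store, but the query shape (address plus date in, one record out) would stay the same.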

    Step 2: Narrative

    Once we have successfully preserved all of our endangered digital datasets, we can begin to curate them and tell their stories. Most Internet measurement research has focused on operational issues in the here and now: monitoring slowdowns and shutdowns within and between providers, and working out how the Internet routes traffic around failures. The question of historical evolution is often secondary. Looking at the Internet through the lens of history offers new ways to get past this “operational trap.”

    Part of the reason we look at the Internet country by country is to encourage slower-growing, less diverse parts of it to grow faster, and it is true that the national regulatory environment (and the central role of state providers in many economies) can prompt some parts of the Internet to behave in economy-specific ways. But Jim Cowie hopes that, for the sake of future historians, we can find better ways to maintain geographic intuition, rather than falling into a cognitive trap that sees a national Internet footprint as just another sovereign border to defend.

    For those who want to understand Internet connectivity in the context of historical events, some of these “workload fragments” are very specific in time and place. For example, what was it like for academic users in China to use Google search in 2009? What was it like for a mobile user in Cairo to access Wikipedia in 2011? What did the financial sector's connections to Bloomberg and Reuters look like in South America in the 2000s? How diverse was the hosting of Ethereum nodes in 2020, or of Mastodon servers in 2023, relative to Internet consumers around the world? For some of these fragments, we might be able to map where the hosts are embedded in the Internet and visualize the connections between the providers that support a given part of the workload.

    Step 3: Explore

    The reason we strive to preserve and organize the history of the Internet as a technological product is to help the public understand how the Internet works its magic. Today’s Internet works incredibly well, in large part because of the specific conditions under which it grew and developed: under multi-stakeholder governance, which tends to value decentralized openness and innovation, rather than under a multilateral treaty system, where centralized authorities may be more inclined to prioritize security, predictability, and control.

    Once we have saved the history of the Internet and recruited thoughtful scientists who can help us quantify some of the social benefits of the Internet, net of its social costs, we will need tools to help tell those stories. These will mostly be visualizations, perhaps immersive walkthroughs, and certainly the kind of interactive exhibits that data journalists use to inform and entertain. “Our investment in providing these datasets will open the door to larger collaborations with artists, journalists and visual storytellers.”

    That’s what Jim Cowie wants to get started with. We can confidently predict that just as the Internet has changed society, society will certainly continue to change the Internet through some competing combination of top-down regulation with bottom-up innovation and popular demand.

    For those who care about the future of the Internet, the race is now on to become better librarians of Internet history, so that we can preserve and tell the Internet's great stories.

    Fei is a journalist at BTW Media, specialising in internet governance and IT infrastructure, with a focus on interviewing leaders in the technology industry. Holding a Master of Science degree from the University of Edinburgh, Fei is currently working in Europe. If you have the latest industry trends that you’d like to share with BTW Media, please feel free to reach out via email at f.wang@btw.media.
