The World Wide Web: Crash Course Computer Science #30

00:11:36
https://www.youtube.com/watch?v=guvsH5OFizE

Summary

TLDR: This CrashCourse episode, hosted by Carrie Anne, explores the World Wide Web, distinguishing it from the Internet. While both terms are often used interchangeably, the Internet serves as the infrastructure carrying data, whereas the World Wide Web is a massive application that runs on it. Pages on the web are interconnected through hyperlinks, an idea originally conceptualized by Vannevar Bush in 1945. Each hypertext page requires a unique URL, and communication is facilitated via the HTTP protocol. Hypertext Markup Language (HTML) is used to create web pages, with Tim Berners-Lee credited for inventing the web and its foundational standards in 1990. The episode also addresses the birth of web browsers and search engines, highlighting the importance of innovations like Google's backlink algorithm. Finally, it discusses the ongoing debate about Net Neutrality, which concerns the principle that all data on the internet should be treated equally, without preference given to specific providers or content types.

Takeaways

  • 🌐 The World Wide Web is distinct from the Internet but runs on it.
  • 🔗 Hypertext allows users to navigate seamlessly between linked pages.
  • 📜 URLs provide unique addresses for web pages.
  • 🔄 HTTP facilitates the request and transfer of web content.
  • 🔍 HTML is the language used to structure web content.
  • 🌱 Tim Berners-Lee developed the web as an open standard.
  • 🕸️ Early search engines led to innovations like Google's backlink ranking.
  • 🏛️ The web's open nature enabled widespread browser and platform development.
  • 📢 Net Neutrality debates emphasize equal treatment of internet data.
  • 📚 Understanding the history of the web aids in grasping its future challenges.
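
The URL and HTTP takeaways above can be made concrete with a small sketch. It splits a URL into the host to connect to and the page to ask for, then builds the single-line GET command that the transcript attributes to HTTP 0.9. Assumptions: the function name `build_get_request` is hypothetical, and an `http://` scheme is added to the transcript's example URL so that Python's `urllib.parse.urlsplit` can pick out the host. Nothing is sent over the network.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> tuple[str, int, bytes]:
    """Return (host, port, request bytes) for an HTTP 0.9-style GET."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL needs a scheme and host, e.g. http://host/page")
    host = parts.hostname
    port = parts.port or 80  # port 80 is the standard web server port
    path = parts.path or "/"  # an empty path means the site's root page
    # The command is sent as raw ASCII text, as the transcript describes.
    request = f"GET {path}\r\n".encode("ascii")
    return host, port, request


host, port, request = build_get_request("http://thecrashcourse.com/courses")
print(host, port, request)
```

To actually send the request, a program would open a TCP connection with `socket.create_connection((host, port))`, which also performs the DNS lookup. Servers today generally expect HTTP/1.1 or later with a `Host` header, so this bare request is mainly of historical interest.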

Timeline

  • 00:00:00 - 00:05:00

    Carrie Anne explains the distinction between the World Wide Web and the Internet. The web, often confused with the Internet, is a massive distributed application that runs on top of the Internet's infrastructure and is accessed via web browsers. Hyperlinks, an essential part of the web, allow easy navigation across pages, an idea conceptualized by Vannevar Bush in 1945. Each hypertext page requires a unique URL. To fetch a page, the browser first resolves the domain name to an IP address with a DNS lookup, then uses HTTP to request the page from the web server.

  • 00:05:00 - 00:11:36

    The video goes on to explain HTML and the evolution of browsers. HTML, a markup language created in 1990, lets authors mark up hyperlinks, headings, lists, and other page elements. Tim Berners-Lee wrote the first web browser and web server in 1990, and the World Wide Web was released to the public in 1991. Mosaic, released in 1993, was the first browser to embed graphics alongside text. As the web grew, search engines such as JumpStation appeared, built on a crawler and an index of web content. Google later changed how search results are ranked by counting "backlinks" from other sites. The video concludes with a discussion of Net Neutrality: whether all internet packets should be treated equally or whether ISPs may give some traffic preferential treatment.
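
The page that the video builds tag by tag can be sketched as follows. The tags (`h1`, `a` with an `href` attribute, `h2`, `ol`, `li`) are the ones named in the transcript. The heading text, list items, and `example.com` addresses are placeholders, since the on-screen text is not part of the transcript. The `TagCollector` class is a hypothetical helper that uses Python's standard `html.parser` module to show which tags and links a browser would encounter.

```python
from html.parser import HTMLParser

# Placeholder content; only the tag structure follows the video.
PAGE = """<h1>Klingon Basics</h1>
Learn about <a href="http://example.com/klingons">Klingons</a> here.
<h2>Things to bring</h2>
<ol>
<li>A <a href="http://example.com/batleth">bat'leth</a></li>
<li>A phrasebook</li>
</ol>
"""


class TagCollector(HTMLParser):
    """Record every opening tag and every hyperlink reference in a page."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self.links.extend(value for name, value in attrs if name == "href")


collector = TagCollector()
collector.feed(PAGE)
print(collector.opened)
print(collector.links)
```

Saving the contents of `PAGE` to a file such as `test.html` and opening it in a browser, as the transcript suggests, should show the two headings, the links, and the numbered list.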

Video Q&A

  • What is the World Wide Web?

    The World Wide Web is a huge distributed application running on millions of servers worldwide, accessed using a web browser. It operates on top of the internet.

  • How is the World Wide Web different from the Internet?

    The Internet is the infrastructure that carries data for many different applications. The World Wide Web is one of those applications, alongside others like Skype or Minecraft.

  • What is hypertext?

    Hypertext is text that contains hyperlinks, allowing you to easily navigate from one topic to another. Web pages today are a common form of hypertext documents.

  • What is a URL?

    A URL (Uniform Resource Locator) is the unique address of a hypertext page on the web.

  • What is HTTP and how does it work?

    HTTP (Hypertext Transfer Protocol) is the protocol used to request and transfer webpages. A basic command in HTTP is "GET," used to request web pages from a server.

  • What is HTML?

    HTML (Hypertext Markup Language) is a language used to create web pages by marking up text files with hypertext elements.

  • Who invented the World Wide Web?

    The World Wide Web was invented by Tim Berners-Lee in 1990 while working at CERN in Switzerland.

  • What is Net Neutrality?

    Net Neutrality is the principle that all internet packets should be treated equally, without preferential treatment based on the source or type of data.

  • Why was Google able to succeed as a search engine?

    Google succeeded thanks to its innovative algorithm, which ranked pages by the backlinks they received from other websites. Backlinks proved a more reliable indication of page quality than the page's own content.

  • What was the first web browser?

    The first web browser was written by Tim Berners-Lee in 1990 and was simply called the WorldWideWeb browser.
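
The backlink idea from the Google answer above can be illustrated with a minimal sketch. It simply counts how many other pages link to each page and sorts by that count. This is not Google's actual algorithm, which the transcript notes also weighs the reputation of the linking site. The link graph and the function names `backlink_counts` and `rank` are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "news.example": ["cats.example", "dogs.example"],
    "blog.example": ["cats.example"],
    "forum.example": ["cats.example", "dogs.example"],
    "spam.example": ["spam.example"],  # only links to itself
}


def backlink_counts(links):
    """Count, for each page, how many distinct other pages link to it."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # a page cannot vouch for itself
                counts[target] += 1
    return counts


def rank(pages, links):
    """Order pages by backlink count, highest first; ties break by name."""
    counts = backlink_counts(links)
    return sorted(pages, key=lambda page: (-counts[page], page))


print(rank(["spam.example", "dogs.example", "cats.example"], LINKS))
```

The spam page ends up last however often it repeats a keyword, because no other page links to it.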

Subtitles
  • 00:00:03
    Hi, I’m Carrie Anne, and welcome to CrashCourse Computer Science.
  • 00:00:05
    Over the past two episodes, we’ve delved into the wires, signals, switches, packets,
  • 00:00:10
    routers and protocols that make up the internet.
  • 00:00:12
    Today we’re going to move up yet another level of abstraction and talk about the World
  • 00:00:16
    Wide Web. This is not the same thing as the Internet, even though people often use the
  • 00:00:20
    two terms interchangeably in everyday language.
  • 00:00:21
    The World Wide Web runs on top of the internet, in the same way that Skype, Minecraft or Instagram do.
  • 00:00:27
    The Internet is the underlying plumbing that conveys the data for all these different applications.
  • 00:00:31
    And The World Wide Web is the biggest of them all – a huge distributed application running
  • 00:00:35
    on millions of servers worldwide, accessed using a special program called a web browser.
  • 00:00:40
    We’re going to learn about that, and much more, in today’s episode.
  • 00:00:43
    INTRO
  • 00:00:53
    The fundamental building block of the World Wide Web – or web for short – is a single
  • 00:00:57
    page.
  • 00:00:58
    This is a document, containing content, which can include links to other pages.
  • 00:01:01
    These are called hyperlinks.
  • 00:01:03
    You all know what these look like: text or images that you can click, and they jump you
  • 00:01:06
    to another page.
  • 00:01:08
    These hyperlinks form a huge web of interconnected information, which is where the whole thing
  • 00:01:12
    gets its name.
  • 00:01:13
    This seems like such an obvious idea.
  • 00:01:15
    But before hyperlinks were implemented, every time you wanted to switch to another piece
  • 00:01:18
    of information on a computer, you had to rummage through the file system to find it, or type
  • 00:01:22
    it into a search box.
  • 00:01:24
    With hyperlinks, you can easily flow from one related topic to another.
  • 00:01:28
    The value of hyperlinked information was conceptualized by Vannevar Bush way back in 1945.
  • 00:01:33
    He published an article describing a hypothetical machine called a Memex, which we discussed
  • 00:01:37
    in Episode 24.
  • 00:01:39
    Bush described it as "associative indexing ... whereby any item may be caused at will
  • 00:01:44
    to select another immediately and automatically."
  • 00:01:47
    He elaborated: "The process of tying two things together is the important thing...thereafter,
  • 00:01:52
    at any time, when one of those items is in view, the other [item] can be instantly recalled
  • 00:01:57
    merely by tapping a button."
  • 00:01:59
    In 1945, computers didn’t even have screens, so this idea was way ahead of its time!
  • 00:02:04
    Text containing hyperlinks is so powerful, it got an equally awesome name: hypertext!
  • 00:02:09
    Web pages are the most common type of hypertext document today.
  • 00:02:12
    They’re retrieved and rendered by web browsers which we'll get to in a few minutes.
  • 00:02:15
    In order for pages to link to one another, each hypertext page needs a unique address.
  • 00:02:20
    On the web, this is specified by a Uniform Resource Locator, or URL for short.
  • 00:02:25
    An example web page URL is thecrashcourse.com/courses.
  • 00:02:29
    Like we discussed last episode, when you request a site, the first thing your computer does
  • 00:02:33
    is a DNS lookup.
  • 00:02:34
    This takes a domain name as input – like “the crash course dot com” – and replies
  • 00:02:38
    back with the corresponding computer’s IP address.
  • 00:02:40
    Now, armed with the IP address of the computer you want, your web browser opens a TCP connection
  • 00:02:45
    to a computer that’s running a special piece of software called a web server.
  • 00:02:49
    The standard port number for web servers is port 80.
  • 00:02:52
    At this point, all your computer has done is connect to the web server at the address
  • 00:02:55
    thecrashcourse.com
  • 00:02:57
    The next step is to ask that web server for the “courses” hypertext page.
  • 00:03:01
    To do this, it uses the aptly named Hypertext Transfer Protocol, or HTTP.
  • 00:03:05
    The very first documented version of this spec, HTTP 0.9, created in 1991, only had
  • 00:03:11
    one command – “GET”.
  • 00:03:13
    Fortunately, that’s pretty much all you need.
  • 00:03:15
    Because we’re trying to get the “courses” page, we send the server the following command
  • 00:03:19
    – GET /courses.
  • 00:03:21
    This command is sent as raw ASCII text to the web server, which then replies back with
  • 00:03:25
    the web page hypertext we requested.
  • 00:03:27
    This is interpreted by your computer's web browser and rendered to your screen.
  • 00:03:31
    If the user follows a link to another page, the computer just issues another GET request.
  • 00:03:35
    And this goes on and on as you surf around the website.
  • 00:03:38
    In later versions, HTTP added status codes, which prefixed any hypertext that was sent
  • 00:03:43
    following a GET request.
  • 00:03:45
    For example, status code 200 means OK – I’ve got the page and here it is!
  • 00:03:49
    Status codes in the four hundreds are for client errors.
  • 00:03:51
    Like, if a user asks the web server for a page that doesn’t exist, that’s the dreaded
  • 00:03:56
    404 error!
  • 00:03:57
    Web page hypertext is stored and sent as plain old text, for example, encoded in ASCII or
  • 00:04:01
    UTF-16, which we talked about in Episodes 4 and 20.
  • 00:04:05
    Because plain text files don’t have a way to specify what’s a link and what’s not,
  • 00:04:09
    it was necessary to develop a way to “mark up” a text file with hypertext elements.
  • 00:04:13
    For this, the Hypertext Markup Language was developed.
  • 00:04:16
    The very first version of HTML, version 0.a, created in 1990, provided 18 HTML commands
  • 00:04:22
    to mark up pages.
  • 00:04:23
    That’s it!
  • 00:04:24
    Let’s build a webpage with these!
  • 00:04:25
    First, let’s give our web page a big heading.
  • 00:04:28
    To do this, we type in the letters “H 1”, which indicates the start of a first level
  • 00:04:32
    heading, and we surround that in angle brackets.
  • 00:04:35
    This is one example of an HTML tag.
  • 00:04:38
    Then, we enter whatever heading text we want.
  • 00:04:40
    We don’t want the whole page to be a heading.
  • 00:04:42
    So, we need to “close” the “h1” tag like so, with a little slash in the front.
  • 00:04:45
    Now let’s add some content.
  • 00:04:47
    Visitors may not know what Klingons are, so let’s make that word a hyperlink to the
  • 00:04:51
    Klingon Language Institute for more information.
  • 00:04:53
    We do this with an “A” tag, inside of which we include an attribute that specifies
  • 00:04:57
    a hyperlink reference.
  • 00:04:58
    That’s the page to jump to if the link is clicked.
  • 00:05:00
    And finally, we need to close the A tag.
  • 00:05:03
    Now let’s add a second level heading, which uses an “h2” tag.
  • 00:05:06
    HTML also provides tags to create lists.
  • 00:05:09
    We start this by adding the tag for an ordered list.
  • 00:05:12
    Then we can add as many items as we want, surrounded in “L i” tags, which stands
  • 00:05:16
    for list item.
  • 00:05:17
    People may not know what a bat'leth is, so let’s make that a hyperlink too.
  • 00:05:21
    Lastly, for good form, we need to close the ordered list tag.
  • 00:05:24
    And we’re done – that’s a very simple web page!
  • 00:05:27
    If you save this text in Notepad or TextEdit, and name it something like “test.html”,
  • 00:05:31
    you should be able to open it by dragging it into your computer’s web browser.
  • 00:05:35
    Of course, today’s web pages are a tad more sophisticated.
  • 00:05:38
    The newest version of HTML, version 5, has over a hundred different tags – for things
  • 00:05:42
    like images, tables, forms and buttons.
  • 00:05:44
    And there are other technologies we’re not going to discuss, like Cascading Style Sheets
  • 00:05:48
    or CSS and JavaScript, which can be embedded into HTML pages and do even fancier things.
  • 00:05:54
    That brings us back to web browsers.
  • 00:05:56
    This is the application on your computer that lets you talk with all these web servers.
  • 00:06:00
    Browsers not only request pages and media, but also render the content that’s being
  • 00:06:03
    returned.
  • 00:06:04
    The first web browser, and web server, was written by (now Sir) Tim Berners-Lee over
  • 00:06:09
    the course of two months in 1990.
  • 00:06:10
    At the time, he was working at CERN in Switzerland.
  • 00:06:13
    To pull this feat off, he simultaneously created several of the fundamental web standards we
  • 00:06:18
    discussed today: URLs, HTML and HTTP.
  • 00:06:21
    Not bad for two months’ work!
  • 00:06:23
    Although to be fair, he’d been researching hypertext systems for over a decade.
  • 00:06:27
    After initially circulating his software amongst colleagues at CERN, it was released to the
  • 00:06:30
    public in 1991.
  • 00:06:32
    The World Wide Web was born.
  • 00:06:34
    Importantly, the web was an open standard, making it possible for anyone to develop new
  • 00:06:38
    web servers and browsers.
  • 00:06:39
    This allowed a team at the University of Illinois at Urbana-Champaign to create the Mosaic web
  • 00:06:43
    browser in 1993.
  • 00:06:45
    It was the first browser that allowed graphics to be embedded alongside text; previous browsers
  • 00:06:50
    displayed graphics in separate windows.
  • 00:06:52
    It also introduced new features like bookmarks, and had a friendly GUI interface, which made
  • 00:06:56
    it popular.
  • 00:06:57
    Even though it looks pretty crusty, it’s recognizable as the web we know today!
  • 00:07:01
    By the end of the 1990s, there were many web browsers in use, like Netscape Navigator,
  • 00:07:05
    Internet Explorer, Opera, OmniWeb and Mozilla.
  • 00:07:08
    Many web servers were also developed, like Apache and Microsoft’s Internet Information
  • 00:07:11
    Services (IIS).
  • 00:07:13
    New websites popped up daily, and web mainstays like Amazon and eBay were founded in the mid-1990s.
  • 00:07:18
    A golden era!
  • 00:07:19
    The web was flourishing and people increasingly needed ways to find things.
  • 00:07:23
    If you knew the web address of where you wanted to go – like ebay.com – you could just
  • 00:07:27
    type it into the browser.
  • 00:07:28
    But what if you didn’t know where to go?
  • 00:07:30
    Like, you only knew that you wanted pictures of cute cats.
  • 00:07:33
    Right now!
  • 00:07:34
    Where do you go?
  • 00:07:35
    At first, people maintained web pages which served as directories hyperlinking to other
  • 00:07:39
    websites.
  • 00:07:40
    Most famous among these was "Jerry and David's guide to the World Wide Web", renamed Yahoo
  • 00:07:44
    in 1994.
  • 00:07:45
    As the web grew, these human-edited directories started to get unwieldy, and so search engines
  • 00:07:50
    were developed.
  • 00:07:51
    Let’s go to the thought bubble!
  • 00:07:52
    The earliest web search engine that operated like the ones we use today, was JumpStation,
  • 00:07:57
    created by Jonathon Fletcher in 1993 at the University of Stirling.
  • 00:08:01
    This consisted of three pieces of software that worked together.
  • 00:08:04
    The first was a web crawler, software that followed all the links it could find on the
  • 00:08:07
    web; anytime it followed a link to a page that had new links, it would add those to
  • 00:08:11
    its list.
  • 00:08:12
    The second component was an ever enlarging index, recording what text terms appeared
  • 00:08:16
    on what pages the crawler had visited.
  • 00:08:18
    The final piece was a search algorithm that consulted the index; for example, if I typed
  • 00:08:22
    the word “cat” into JumpStation, every webpage where the word “cat” appeared
  • 00:08:26
    would come up in a list.
  • 00:08:28
    Early search engines used very simple metrics to rank order their search results, most often
  • 00:08:32
    just the number of times a search term appeared on a page.
  • 00:08:35
    This worked okay, until people started gaming the system, like by writing “cat” hundreds
  • 00:08:40
    of times on their web pages just to steer traffic their way.
  • 00:08:43
    Google’s rise to fame was in large part due to a clever algorithm that sidestepped
  • 00:08:47
    this issue.
  • 00:08:48
    Instead of trusting the content on a web page, they looked at how other websites linked to
  • 00:08:52
    that page.
  • 00:08:53
    If it was a spam page with the word cat over and over again, no site would link to it.
  • 00:08:57
    But if the webpage was an authority on cats, then other sites would likely link to it.
  • 00:09:01
    So the number of what are called “backlinks”, especially from reputable sites, was often
  • 00:09:05
    a good sign of quality.
  • 00:09:07
    This started as a research project called BackRub at Stanford University in 1996, before
  • 00:09:12
    being spun out, two years later, into the Google we know today.
  • 00:09:15
    Thanks thought bubble!
  • 00:09:16
    Finally, I want to take a second to talk about a term you’ve probably heard a lot recently,
  • 00:09:20
    “Net Neutrality”.
  • 00:09:21
    Now that you’ve built an understanding of packets, internet routing, and the World Wide
  • 00:09:25
    Web, you know enough to understand the essence – at least the technical essence – of
  • 00:09:29
    this big debate.
  • 00:09:30
    In short, network neutrality is the principle that all packets on the internet should be
  • 00:09:34
    treated equally.
  • 00:09:35
    It doesn’t matter if the packets are my email or you streaming this video, they should
  • 00:09:38
    all chug along at the same speed and priority.
  • 00:09:41
    But many companies would prefer that their data arrive to you preferentially.
  • 00:09:45
    Take for example, Comcast, a large ISP that also owns many TV channels, like NBC and The
  • 00:09:50
    Weather Channel, which are streamed online.
  • 00:09:52
    Not to pick on Comcast, but in the absence of Net Neutrality rules, they could for example say that
  • 00:09:57
    they want their content to be delivered silky smooth, with high priority…
  • 00:10:01
    But other streaming videos are going to get throttled, that is, intentionally given less
  • 00:10:04
    bandwidth and lower priority. Again I just want to reiterate here this is just conjecture.
  • 00:10:09
    At a high level, Net Neutrality advocates argue that giving internet providers this
  • 00:10:13
    ability to essentially set up tolls on the internet – to provide premium packet delivery
  • 00:10:17
    – plants the seeds for an exploitative business model.
  • 00:10:20
    ISPs could be gatekeepers to content, with strong incentives to not play nice with competitors.
  • 00:10:25
    Also, if big companies like Netflix and Google can pay to get special treatment, small companies,
  • 00:10:30
    like start-ups, will be at a disadvantage, stifling innovation.
  • 00:10:34
    On the other hand, there are good technical reasons why you might want different types
  • 00:10:37
    of data to flow at different speeds.
  • 00:10:39
    That Skype call needs high priority, but it’s not a big deal if an email comes in a few
  • 00:10:43
    seconds late.
  • 00:10:44
    Net-neutrality opponents also argue that market forces and competition would discourage bad
  • 00:10:49
    behavior, because customers would leave ISPs that are throttling sites they like.
  • 00:10:53
    This debate will rage on for a while yet, and as we always encourage on Crash Course,
  • 00:10:57
    you should go out and learn more because the implications of Net Neutrality are complex
  • 00:11:01
    and wide-reaching.
  • 00:11:02
    I’ll see you next week.
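
The three components the transcript attributes to JumpStation (a crawler, an index, and a search step that consults the index) can be sketched in a few lines. This is an illustration under stated assumptions, not JumpStation's actual design: the "web" here is a small hypothetical in-memory dictionary, and the names `crawl`, `build_index`, and `search` are invented for the example.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example": ("cute cat pictures", ["b.example", "c.example"]),
    "b.example": ("dog training tips", ["a.example"]),
    "c.example": ("cat care and cat toys", []),
    "d.example": ("unlinked cat page", []),  # nothing links here
}


def crawl(start, web):
    """Follow links breadth-first from start; return pages in visit order."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in web[url][1]:
            if link in web and link not in seen:
                seen.add(link)  # new link found: add it to the to-visit list
                queue.append(link)
    return order


def build_index(urls, web):
    """Record which terms appear on which crawled pages."""
    index = defaultdict(set)
    for url in urls:
        for term in web[url][0].lower().split():
            index[term].add(url)
    return index


def search(index, term):
    """Return every indexed page containing the term, in sorted order."""
    return sorted(index.get(term.lower(), set()))


visited = crawl("a.example", WEB)
index = build_index(visited, WEB)
print(search(index, "cat"))
```

Note that `d.example` never shows up in results even though it mentions cats: the crawler only discovers pages by following links, and no page links to it.
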
Tags
  • World Wide Web
  • Internet
  • Hypertext
  • URLs
  • HTTP
  • HTML
  • Net Neutrality
  • Web Browsers
  • Search Engines
  • Tim Berners-Lee