Overview of Web Applications
CPSC 330 - Spring 2004

Web history

html tags

  <img src="...">
  <a href="...">name</a>
  ...

url schemes

  http        http://servername[:port][/pathname][?arguments][#html_anchor]
  ftp
  nntp
  mailto
  telnet

  arguments   argument1&argument2&...  spaces given by "%20"
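
The pieces of an http URL can be pulled apart with Python's standard
urllib.parse module. This is only a sketch: the URL below is a made-up
example, not one from the course.

```python
from urllib.parse import urlsplit, parse_qsl, quote

# Hypothetical URL used only for illustration.
url = "http://www.example.com:8080/docs/page.html?name=web%20apps&year=2004#intro"

parts = urlsplit(url)
print(parts.scheme)    # http
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /docs/page.html
print(parts.query)     # name=web%20apps&year=2004
print(parts.fragment)  # intro

# parse_qsl splits the arguments on "&" and decodes %20 back to a space.
args = parse_qsl(parts.query)
print(args)            # [('name', 'web apps'), ('year', '2004')]

# quote() goes the other way and encodes a space as %20.
print(quote("web apps"))  # web%20apps
```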

  defaults  ~user     -> /home/user/public_html
            directory -> look for index.html
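
The two defaults above can be sketched as a small path-mapping function.
The function name map_path and the document root /var/www/html are
assumptions made for illustration. A real server would check the file
system to see whether a path is a directory and would also reject ".."
components. Here a trailing slash stands in for "is a directory".

```python
import posixpath

def map_path(url_path, doc_root="/var/www/html"):
    """Map a URL pathname to a file path using the two defaults.

    doc_root is a hypothetical document root, not taken from the notes.
    """
    if url_path.startswith("/~"):
        # ~user -> /home/user/public_html
        user, _, rest = url_path[2:].partition("/")
        fs_path = posixpath.join("/home", user, "public_html", rest)
    else:
        fs_path = posixpath.join(doc_root, url_path.lstrip("/"))
    # directory -> look for index.html (simplification: trailing slash means directory)
    if fs_path.endswith("/"):
        fs_path += "index.html"
    return fs_path

print(map_path("/~mark/"))
print(map_path("/~mark/notes/a.html"))
print(map_path("/docs/"))
```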

typical protocol stack

    +-------------------------------------------------------+
    |                          user                         |
    +------+------+------+------+------+------+------+------+
    |      | PING | DHCP | DNS  | MIME | SMTP | POP3 | HTTP |
    |      +------+------+------+------+------+------+------+
    | ARP  | ICMP |     UDP     |            TCP            |
    |      +------+-------------+---------------------------+
    |      |                       IP                       |
    +------+------------------------------------------------+
    |                   local area network                  |
    +-------------------------------------------------------+

http

steps in accessing a web page

  web client - browser        name server     web server
  --------------------        -----------     ----------

  (1) DNS query using UDP --> (port 53)

                          <-- DNS response
                              message gives
                              IP address

  (2) HTTP get using TCP -------------------> (port 80)

      GET <pathname> <version>
      <header name> : <header data>
      <blank line>

                         <------------------- HTTP response message

                                              <response line>
                                              <response headers>
                                              <blank line>
                                              <response body>

                                              response line:
                                              <version> <status code> <message>
                                                        200           OK
                                                        404           Not Found
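
The two steps above can be sketched with Python's socket module. The
helper names (build_request, parse_response, fetch) and the use of
HTTP/1.0 are choices made for this sketch, not something the notes
prescribe. fetch() is defined but not called, since it needs a live
network. The sample response is made up to show the parsing.

```python
import socket

def build_request(host, path="/"):
    """Build a minimal GET request: request line, one header, blank line."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

def parse_response(raw):
    """Split a raw response into (version, status code, message, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    message = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, message, headers, body

def fetch(host, path="/", port=80):
    """Step (1): DNS lookup. Step (2): HTTP GET over a TCP connection."""
    ip = socket.gethostbyname(host)
    with socket.create_connection((ip, port), timeout=10) as s:
        s.sendall(build_request(host, path))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:  # an HTTP/1.0 server closes the connection when done
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))

# Made-up response used to exercise the parser without a network.
sample = b"HTTP/1.0 404 Not Found\r\nContent-Type: text/html\r\n\r\n<h1>missing</h1>"
print(parse_response(sample))
```

With a network connection, fetch("www.example.com") would return the
same five-part tuple for a real page.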

proxies and gateways

  proxy                                        gateway
  * client side                                * server side
  * can serve some requests but                * translate protocols
      rewrites and forwards most               * authenticates
  .-------------------.                        .-------------------.
  | client     proxy  |                        | gateway    server |
  | .----.     .----. |                        | .----.     .----. |
  | |    |---->|    |------>              ------>|    |---->|    | |
  | |    |<----|    |<------              <------|    |<----|    | |
  | `----'     `----' |                        | `----'     `----' |
  `-------------------'                        `-------------------'
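
The "can serve some requests but forwards most" behavior of a proxy can
be sketched as a toy cache in front of a forwarding function. The class
CachingProxy and the stand-in fake_server are hypothetical names for
this sketch. A real proxy would also honor expiry and cache-control
headers, which are left out here.

```python
class CachingProxy:
    """Toy client-side proxy: answers from its cache when it can, else forwards."""

    def __init__(self, forward):
        self.forward = forward  # callable(url) -> body, stands in for the web server
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.cache:
            self.hits += 1      # served locally, no request leaves the proxy
            return self.cache[url]
        self.misses += 1        # forwarded to the server, then remembered
        body = self.forward(url)
        self.cache[url] = body
        return body

def fake_server(url):
    # Stand-in for a real web server (assumption for the sketch).
    return f"<html>content of {url}</html>"

proxy = CachingProxy(fake_server)
proxy.get("http://www.example.com/a.html")
proxy.get("http://www.example.com/a.html")
proxy.get("http://www.example.com/b.html")
print(proxy.hits, proxy.misses)  # 1 2
```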

web caching


CNN 9/11 case study


[http://www.tcsa.org/lisa2001/cnn.txt]

    CNN.com: Facing A World Crisis
    William LeFebvre, CNN Internet Technologies

    Who are we?  CNN.com (Turner Broadcasting)
    - 50 web sites (CNN.com, cartoon network, etc)
    - 200 servers

    Network Bandwidth (on 9/11)
    - 2 OC-12 1,244 Mbps
    - 7 OC-3  1,085 Mbps
    - Total   2,329 Mbps
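
The bandwidth totals above can be reproduced with a few lines of
arithmetic, assuming nominal per-link rates of 622 Mbps for OC-12 and
155 Mbps for OC-3. Those rounded rates are an assumption. The exact
SONET rates are 622.08 and 155.52 Mbps.

```python
# Nominal per-link rates in Mbps (assumption: rounded SONET rates).
OC12_MBPS = 622
OC3_MBPS = 155

oc12_total = 2 * OC12_MBPS
oc3_total = 7 * OC3_MBPS
total = oc12_total + oc3_total

print(oc12_total, oc3_total, total)  # 1244 1085 2329
```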

    Hardware
    - Standard web server: Sun 420R 4x4 (4 CPU, 4GB RAM)
    - CNN.com normally used a 15 server pool
    - Load balancers front-end all web services

    Typical Loads
                Peak     Total        Total
       Date    Hits/min   Hits      Page Views
    - 9/11/00     220K     148M        11.8M    One year prior
    - 11/8/00   1,217K     722M       139.4M    Day after Election Day
    - 9/10/01     156K     104M        14.4M    Day before

      11/8/00 was the record for page views up to that point

    Managing Unexpected Loads
    - swing servers (move servers from one web service to another)
    - add additional servers
    - reduce page complexity (remove advertisements, pictures, text)

    Reducing Page Complexity
    - Three page styles (standard, split, ultra light)
      - Standard
      - Split (half page with link to more info)
      - Ultra light (minimal information, with links for more info)

    On the morning of Sept 11, there were 10 servers providing CNN.com.

    Loads on 9/11-12
                Peak     Total        Total
       Date    Hits/min   Hits      Page Views
    - 9/10/01     156K     104M        14.4M    Day before
    - 9/11/01   1,110K     411M       132.4M    Day of
    - 9/12/01     948K     797M       304.8M    Day after

    On 9/11, the peak demand was estimated at 1.8M hits per minute, or
    20 times normal.
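
A short calculation puts the table above in proportion by comparing each
day with 9/10/01. The numbers are copied from the table. Treating the
second "Total" column as total hits for the whole day is an assumption
of this sketch, as is the dictionary layout.

```python
# Figures copied from the "Loads on 9/11-12" table.
loads = {
    "9/10/01": {"peak_per_min": 156_000, "hits": 104_000_000, "page_views": 14_400_000},
    "9/11/01": {"peak_per_min": 1_110_000, "hits": 411_000_000, "page_views": 132_400_000},
    "9/12/01": {"peak_per_min": 948_000, "hits": 797_000_000, "page_views": 304_800_000},
}

base = loads["9/10/01"]

def compare(row):
    """Return (peak ratio, page-view ratio, hits per page view) against 9/10/01."""
    peak_ratio = row["peak_per_min"] / base["peak_per_min"]
    view_ratio = row["page_views"] / base["page_views"]
    hits_per_view = row["hits"] / row["page_views"]
    return peak_ratio, view_ratio, hits_per_view

for date, row in loads.items():
    peak_ratio, view_ratio, hits_per_view = compare(row)
    print(f"{date}  peak x{peak_ratio:.1f}  page views x{view_ratio:.1f}  "
          f"hits per page view {hits_per_view:.1f}")
```

The measured peak on 9/11 comes out at roughly 7 times the previous
day's peak, while the 1.8M hits per minute quoted above is an estimate
of demand rather than of hits actually served.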


[http://www.dvwebvideo.com/2000/0500/gordon0500.html]

CNN Interactive uses a custom-designed live-capture system that encodes
five different streaming media files in a single pass. This is
accomplished by taking an analog feed out of the Media 100, then
splitting it five ways to separate PCs: two for RealNetworks streams and
two for Windows Media at 28kbps and 80kbps data rates. The fifth machine
creates an AVI file that's converted separately to QuickTime. All render
at a frame size of 176x132 pixels.


Security (TBD)


[CPSC 330 homepage]

mark@cs.clemson.edu