This project has been partially
funded by a gift from IBM
Executive Overview
IBM and North Carolina
State University have developed a cloud computing platform
referred to as the Virtual Computing Lab (VCL). VCL
is now an open-source Apache project (http://vcl.apache.org/index.html).
The main goal of VCL is to make dedicated, custom
compute environments available to users, with a primary focus on
educational institutions. The compute environments can range from
something as simple as a virtual machine running a specific set of
productivity software to a cluster of powerful physical servers
running complex HPC simulations. NCSU and other schools in
North Carolina rely on VCL for managing compute resources for
faculty, staff, and students (https://vcl.ncsu.edu/).
The School of Computing relies on VM
technology to support research and pedagogy. However,
there is no standard method for supporting VMs at
Clemson; as a result, each faculty member tends to assemble a VM
solution tailored to his or her own research and course
needs. In Computer Science, courses (especially upper-level
undergraduate and graduate courses) typically require very specific
computing environments, ranging from VM images containing
course-specific software to HPC compute nodes to clustered systems
offering a networking testbed environment. It is therefore
not clear whether one VM technology can meet the needs of a CS
department. These
issues motivate this project. We evaluate VCL for use in
Clemson's School of Computing. Our starting point is to
build a VCL cloud and use it in one or more courses in the
2013/2014 academic year. To
focus this phase of the project, we concentrate on applying
VCL to our networking courses.
This web page is organized into the following sections:
Latest Project
Status and Results
Project
Description, Methodology, Results and Analysis
Summary of Results
Helpful Links
1.0 Latest Project Status and Results:
The project is currently
underway. We are tracking to the following schedule:
A working VCL
system will be available by 8/15/2013.
Material created
for the course will be posted on the course web page as it
becomes available (the course web page link will be added
here)
A paper describing
and documenting the trial will be developed and submitted to
an appropriate conference.
2.0 Project Description, Methodology, Results and Analysis
2.1
Introduction
IBM and North
Carolina State University have developed a cloud computing
platform referred to as the Virtual Computing Lab
(VCL). VCL is now an open-source Apache project (http://vcl.apache.org/index.html).
The main goal of VCL is to make dedicated, custom
compute environments available to users. The compute environments
can range from something as simple as a virtual machine
running productivity software to a cluster of powerful
physical servers running complex HPC simulations. NCSU
and other schools in North Carolina rely on VCL for managing
compute resources for faculty, staff, and students (https://vcl.ncsu.edu/).
From a student's perspective, VCL provides access to a large
number of images. For example, a student can access a
machine that runs a particular variant of Linux or Windows
with specific applications. From a faculty
perspective, an instructor can tailor images for use
in classes. This can address the issue of restricting
licensed software to a subset of the University
population. It also addresses the often time-consuming
task of preparing systems for classes that require
customized software and setup.
For Computer Science or Computer Engineering classes that
involve the study of networks, distributed systems, operating
systems, or computer security, VCL provides an intriguing
technology for supporting education. In this project, we
conduct a small scale trial of VCL in the School of Computing
at Clemson University. In this document, we describe in more
detail the trial. We summarize the specific project goals,
methods, milestones, as well as results.
2.2
Background
Cloud Computing:
Cloud computing is a highly overused term. The technology
and ideas surrounding cloud computing have evolved quickly
over the last several years, making it even more challenging to
define the concept. At the highest level, cloud
computing provides services to users. The nature of the service,
the target user community, and the economics that drive the use
and adoption of the system all contribute to a diverse range
of cloud computing systems.
Recent trends and advances in VM technology have sparked
significant interest in an Infrastructure as a Service (IaaS)
solution. Open-source IaaS platforms such as
OpenStack, CloudStack, and Eucalyptus are widely used.
Further, the intersection of software-defined networking (SDN) with
IaaS (along with other approaches) is now integrating aspects
of the network into IaaS models. For example,
OpenStack's Neutron project provides a "Networking as a Service"
between virtual interface devices that are managed by other
OpenStack components [1,2].
Academic research in this area is exploding. We focus on
two areas. First, we are interested in related research that
explores the resource allocation problem in a 'multi-tenant'
cloud environment. Works such as [3-6] (to name a few) are
developing and exploring architectures and issues that arise in
clouds that provide data center services. Second,
as our initial focus is on support for networking and systems
courses, we are interested in creating large scale testbeds that
can be used either for research or education.
The United States government has been tremendously successful in
funding exploration and development of infrastructure programs
to stimulate research and development of computer networks and
large scale systems. The Internet was originally motivated by
the need to connect military sites and equipment in a robust
manner. The Internet2 and the National Lambda Rail were
motivated to facilitate large scale research in high performance
computing. Large
scale Internet testbeds such as Emulab and PlanetLab have been
funded in part by the government and are widely used by the
academic community [7-11]. Recently,
software-defined networking has been proposed to enable
large-scale research on production (campus) networks [6]. The NSF has invested
significantly in SDN deployments through the Global Environment
for Network Innovations (GENI) project [12-14]. Most recently, the idea of container-based
emulation has been introduced in an effort to reproduce
network experiments running real code on an emulated network
using lightweight VM techniques [15-17].
In summary, the academic research community has developed a
wide array of modeling, simulation, and prototyping tools that are
available to the researcher including:
Simulators: commercial (OPNET, Matlab/Simulink), open
source (ns-2, ns-3, OMNeT++, J-Sim), and many home-grown
platforms
For a detailed discussion of VCL, please refer to the Apache
project site (http://vcl.apache.org/index.html). As
mentioned earlier in this document, the main goal of VCL is to
make available dedicated, custom compute environments to users. It
therefore provides a compute cloud. Figure 1 illustrates the VCL
web portal. With proper credentials and access rights, a user can
reserve one or more computing resources.
Figure 1. User Portal
VCL uses the following abstractions:
User level:
Images: An image is the collection of the software
along with the operating system. It effectively is the hard disk
image of a computer system.
Reservation: When a user wants to use resources (e.g.,
software, specific OS) managed by VCL, he/she needs to make a
reservation by using the VCL web interface.
Computer level:
Virtual Machine Hosts: VCL supports provisioning a bare-metal
computer or a hypervisor. Supported hypervisors include KVM,
VMware ESXi 4.x/5.x, and VirtualBox.
Computers: The concept of computers in VCL includes
physical computers and virtual machines. The physical computer
can be installed with VMware or KVM.
Management Nodes: a management node houses the user
interface and the database, and stores the images users request.
A user accesses the main VCL web page (ours is located at
https://vcl-mgmt1.clemson.edu/vcl/index.php). After logging in,
the user can create a reservation and
select a compute environment. This translates to specifying an
image that will be run on a Computer. If successful, the user will
be given the IP address and login instructions. Once the session is
complete, the student must back up their data to their own machine,
as the modified VM image is destroyed when the session ends.
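Besides the web portal, VCL also exposes an XML-RPC API, so the same workflow can be scripted. The sketch below is hedged: the endpoint path, header names, API version, and method names reflect our reading of the Apache VCL API documentation and should be verified against a particular installation before use.

```python
import xmlrpc.client

# Sketch of scripting a VCL reservation via its XML-RPC interface.
# Endpoint path, header names, and method names are assumptions taken from
# our reading of the Apache VCL docs; verify them against your installation.

def make_vcl_proxy(base_url: str, user: str, password: str) -> xmlrpc.client.ServerProxy:
    """Return a proxy that authenticates each call with VCL's HTTP headers."""
    return xmlrpc.client.ServerProxy(
        base_url + "/index.php?mode=xmlrpccall",
        # The headers= parameter requires Python 3.8+.
        headers=[("X-User", user), ("X-Pass", password), ("X-APIVERSION", "2")],
    )

vcl = make_vcl_proxy("https://vcl-mgmt1.clemson.edu/vcl", "student1", "secret")

# Against a live server one would then list images and request one, e.g.:
#   images = vcl.XMLRPCgetImages()
#   result = vcl.XMLRPCaddRequest(images[0]["id"], "now", 120)  # 120-minute reservation
```

Constructing the proxy does not contact the server; the commented calls show where a script would list images and place a reservation.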
Project
Objectives and Goals
The School of
Computing relies on VM technology to support research and
pedagogy. However, there is no standard method
for supporting VMs at Clemson; as a result, each
faculty member tends to assemble a VM solution tailored to
his or her own research and course needs. In Computer
Science, courses (especially upper-level
undergraduate and graduate courses) typically require very specific
computing environments, ranging from VM images containing
course-specific software to HPC compute nodes to clustered
systems offering a networking testbed environment. It is
therefore not clear whether one VM technology can meet the
needs of a CS department. These issues motivate this
project. We evaluate VCL for use in Clemson's School
of Computing. Our starting point is to build a VCL
cloud and use it in one or more courses in the 2013/2014
academic year. To
focus the project, we concentrate on applying VCL to our
networking courses. The
main objective is to trial VCL in the School of
Computing. A second goal is to share our results with
others at Clemson. A third goal is to understand
how VCL relates to other cloud management platforms such as
OpenStack and to enabling technologies such as OpenFlow,
and how these systems might be used together to support
pedagogy.
Project Description
We have targeted our department's entry level graduate course
in computer networking (referred to as CPSC851) for the VCL trial.
We will use VCL to provide simulation, emulation, and testbed
experiences allowing students to 1) gain hands-on experience
building and validating actual network systems, and 2) apply
knowledge learned in the class related to network performance
analysis using actual network systems. The class (roughly 50
students) will be divided into groups of 5. Each group will have
access to three environments (images). As illustrated in Figure 2,
the scenarios include: 1) NS-2 simulation nodes;
2) Network Experiment scenarios; 3) MiniNet scenario.
Figure 2. Scenarios of Interest
NS-2 Simulation Scenario: The open-source ns-2
simulator is the de facto simulation tool used by the Internet
research community. It is a discrete-event simulator
that faithfully reproduces a wide set of TCP/IP protocols and
applications. The simulator used in the class will
be patched with several modules for teaching purposes. This image
will be made available to students at the beginning of the semester.
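Since ns-2 itself is scripted in OTcl, we do not reproduce an ns-2 script here; the core discrete-event idea can, however, be illustrated in a few lines of Python. This is a toy model for intuition only, not ns-2:

```python
import heapq

# Toy discrete-event loop illustrating how a simulator like ns-2 works:
# events are (time, action) pairs processed in timestamp order, and the
# simulated clock jumps from event to event rather than advancing in real time.

events: list[tuple[float, str]] = []

def schedule(time: float, action: str) -> None:
    """Add an event to the pending-event priority queue."""
    heapq.heappush(events, (time, action))

# A packet sent at t=0 over a link with a 1.5 ms one-way delay,
# acknowledged 1.5 ms after it arrives.
schedule(0.0, "send pkt")
schedule(0.0015, "recv pkt")
schedule(0.003, "recv ack")

log = []
while events:
    now, action = heapq.heappop(events)  # always the earliest pending event
    log.append((now, action))

print(log)  # → [(0.0, 'send pkt'), (0.0015, 'recv pkt'), (0.003, 'recv ack')]
```

In a real simulator each popped event would execute protocol code that schedules further events (timeouts, retransmissions, and so on); the ordering discipline is the same.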
Network Experiment Scenario: Each group will be given
access to a simple three node system. This is illustrated in Figure
2b by the ellipse showing Node1 and Node2 interconnected by a
router. The students will learn basic network administration and
networking skills. The group's account on the three virtual nodes
will allow sudo access to basic administration commands that are
necessary to setup the network environment and to monitor system
configuration and performance. The students will develop or
utilize existing open source network performance and
security software on the testbed.
MiniNet Scenario: As described in [15], MiniNet-HiFi is a
container-based emulation tool introduced to reproduce
network experiments by running real code on an emulated network
using lightweight VM techniques. Building on the
knowledge obtained from the previous scenarios, we plan to use
MiniNet to introduce Software-Defined Networking (SDN) and OpenFlow.
System Details:
Figure 3 illustrates the VCL system that will be used.
The manager node hosts the user interface web site, the
database, and the images. We use two relatively powerful PCs
for running VMs. The Dell Optiplex 901C machines each have 16 GBytes
of memory and 4 cores. We will allow the system to run up to
16 (KVM) VMs at any given time. Rather than have the students
save and update their images, we assume that students will reserve
their systems for the entire semester. As required by VCL, all
hosts in the system have two NICs: one connected to the 'public
network' and the other connected to the 'private network'. The user
or administrator can log in to the system from the campus network.
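The sizing above implies a per-VM resource budget worth making explicit. A quick back-of-the-envelope check, using only the figures stated in this section:

```python
# Back-of-the-envelope capacity check for the two VM hosts described above.
hosts = 2
mem_per_host_gb = 16
cores_per_host = 4
max_vms_total = 16

vms_per_host = max_vms_total // hosts            # 8 VMs on each host
mem_per_vm_gb = mem_per_host_gb / vms_per_host   # 2 GB per VM (before host overhead)
vms_per_core = vms_per_host / cores_per_host     # 2 VMs share each physical core

print(vms_per_host, mem_per_vm_gb, vms_per_core)  # → 8 2.0 2.0
```

Roughly 2 GB of memory per VM, with two VMs sharing each core, is comfortable for the ns-2 and Mininet images planned here, though it would constrain memory-hungry workloads.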
References
C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, S. Lu, "DCell: A
Scalable and Fault-Tolerant Network Structure for Data
Centers", in Proceedings of ACM SIGCOMM, 2008.
A. Shieh, S. Kandula, A. Greenberg, C. Kim, "Seawall:
Performance Isolation in Cloud Datacenter Networks", in 2nd
USENIX Workshop on Hot Topics in Cloud Computing, USENIX,
June 2010.
H. Rodrigues, J. Santos, Y. Turner, P. Soares, D. Guedes,
"Gatekeeper: Supporting Bandwidth Guarantees for Multi-tenant
Datacenter Networks", USENIX WIOV, 2011.
D. Drutskoy, E. Keller, J. Rexford, "Scalable Network
Virtualization in Software-Defined Networks", IEEE
Internet Computing Magazine, March 2013.
L. Peterson, T. Anderson, D. Culler, T.
Roscoe, “A Blueprint for Introducing Disruptive Technology
into the Internet”, ACM SIGCOMM CCR, January 2003.
Chun, B., D. Culler, T. Roscoe, A.
Bavier, L. Peterson, M. Wawrzoniak, and M. Bowman. “PlanetLab:
An Overlay Testbed for Broad-Coverage Services,” ACM Computer
Communications Review, July 2003.
Bavier, A., N. Feamster, M. Huang, L.
Peterson, J. Rexford. “In VINI Veritas: Realistic and
Controlled Network Experimentation,” Proc. of ACM SIGCOMM,
2006.
E. Eide, L. Stoller, J. Lepreau, “An
Experimentation Workbench for Replayable Networking Research”,
Proceedings of the Fourth USENIX Symposium on Networked
Systems Design and Implementation (NSDI’07), April 2007.
M. Hibler, R. Ricci, L. Stoller, J.
Duerig, S. Guruprasad, T. Stack, K. Webb, J. Lepreau,
“Large-scale Virtualization in the Emulab Network Testbed”,
Proceedings of the USENIX Annual Conference, June 2008.
R. Sherwood, G. Gibb, K.-K. Yap, G.
Appenzeller, M. Casado, N. McKeown, and G. Parulkar. Can the
Production Network Be the Testbed? In Proceedings of the
Symposium on Operating System Design and Implementation
(OSDI), October 2010.
Global Environment for Network
Innovations, project web site: http://www.geni.net
I. Baldine, et al., “ExoGENI: A
Multi-Domain Infrastructure-as-a-Service Testbed”, Proceedings
of the 8th International ICST Conference on Testbeds and
Research Infrastructures for the Development of Networks and
Communities (TridentCom’12), June, 2012.
MiniNet, http://mininet.org/, 2013.
N. Handigol, B. Heller, V. Jeyakumar, B. Lantz, N.
McKeown, "Reproducible Network Experiments Using
Container-Based Emulation", Proceedings of ACM CoNEXT'12,
December 2012.
N. Handigol, B. Heller, V. Jeyakumar, B. Lantz, N.
McKeown, "Mininet Performance Fidelity Benchmarks",
Unpublished report, available at
http://hci.stanford.edu/cstr/reports/2012-02.pdf ,
October, 2012.