Linux Device Support
for the IBM - APE 25
Alice Flower
Robert Geist
Mike Westall
Department of Computer Science
Clemson University
Clemson SC 29634-1906
rmg@cs.clemson.edu
westall@cs.clemson.edu
Outline
Introduction
Developing a standalone ATM device driver
-
Development strategy
-
Driver initialization
-
Transmit buffer management
-
Receive buffer management
Native ATM networking in Linux
-
Data structures and bindings
-
Binding the standalone driver to the protocol.
IP over ATM
-
Classical IP over ATM
-
LAN Emulation
Performance Data
ATM
Is it........
A radically new and innovative networking technology that will
-
Permit fixed and variable bit rate real-time traffic
to co-exist painlessly with non-real-time bulk data transfer.
-
Provide virtually unlimited scalability, satisfying all future
networking requirements of humankind.
or is it...........
The answer to the question:
Name one thing that illustrates better than even the US
Internal Revenue Code the ill effects of design by large committees
of competing interests.
What can/will ATM provide?
-
A high performance LAN or Intranet solution.
-
High bandwidth virtual paths in the Internet infrastructure.
-
World-wide end-to-end services that obviate IP??
ATM services
-
Connection-oriented "best-effort" transport for 53 byte cells
-
Permanent or switched connections (PVCs and SVCs)
-
QOS guarantees for constant and variable bit rate streams.
-
Allocation of "leftover" bandwidth to variable bit rate streams
-
Transport layer services called ATM Adaptation Layers "AAL's"
AAL1 - Constant bit rate (PCM voice)
AAL5 - Everything else
AAL5 Frames are like IP datagrams in that:
size is variable and maximum size is 64KB.
AAL5 Frames are not like IP datagrams in that:
they are not transmitted as an entity within the subnet.
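A small sketch of why an AAL5 frame does not travel as one entity: it is cut into cells. The 8-byte trailer and the padding to a whole number of 48-byte cell payloads are standard AAL5 framing that the slide does not spell out; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
    ATM_CELL_BYTES     = 53,  /* 5-byte header + 48-byte payload */
    ATM_PAYLOAD_BYTES  = 48,
    AAL5_TRAILER_BYTES = 8    /* standard AAL5 trailer, appended after padding */
};

/* Number of cells needed to carry an AAL5 frame with sdu_len user bytes. */
static size_t aal5_cells(size_t sdu_len)
{
    assert(sdu_len <= 65535);
    return (sdu_len + AAL5_TRAILER_BYTES + ATM_PAYLOAD_BYTES - 1)
           / ATM_PAYLOAD_BYTES;
}

/* Bytes actually placed on the link for that frame. */
static size_t aal5_wire_bytes(size_t sdu_len)
{
    return aal5_cells(sdu_len) * ATM_CELL_BYTES;
}
```

A 40-byte frame fits in one cell, a 41-byte frame needs two, and a 1 KB frame needs 22 cells.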
The IBM APE 25
APE 25 Overview
Supports AAL1 (CBR) and AAL5
-
Adapter provided segmentation and reassembly
-
Scatter/Gather facility on both send and receive (not yet used by us)
-
Each connection endpoint (VPI, VCI) is called a Logical Channel (LC)
Receive traffic management
-
Eight received frame queues are provided
-
Eight bits in the SISR indicate which queues are non-empty
-
Each LC is bound to a specific receive queue
-
Driver could (but we don't) give priority based service.
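A minimal sketch of how a receive interrupt handler might walk the eight "queue non-empty" bits. It assumes bit q corresponds to receive queue q and that the bits have already been extracted into one byte; neither detail is given on the slide, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_RECV_QUEUES 8

typedef void (*queue_handler)(int queue, void *ctx);

/* Call fn once for every queue whose bit is set; return how many were served.
 * Queues are visited in index order, i.e. with no priority among them. */
static int service_ready_queues(uint8_t ready_bits, queue_handler fn, void *ctx)
{
    int serviced = 0;

    assert(fn != 0);
    for (int q = 0; q < NUM_RECV_QUEUES; q++) {
        if (ready_bits & (1u << q)) {
            fn(q, ctx);
            serviced++;
        }
    }
    return serviced;
}

/* Example handler: records the visited queues as a bitmask in *ctx. */
static void record_queue(int queue, void *ctx)
{
    *(uint8_t *)ctx |= (uint8_t)(1u << queue);
}
```

Priority-based service would only change the order in which the loop visits the bits.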
Transmit traffic management
The APE-25 provides eight transmission queues (TMQs).
Associated with each TMQ is
-
A peak transmission rate (bps)
-
A priority (0/1 = low/high) (not related to the CLP)
-
Zero or more Logical Channels (LCs)
Admission control is via a token bucket algorithm
-
Peak bit rate of an LC is the peak bit rate of its TMQ
-
Each transmission consumes one cell credit
-
Credits may be consumed at the peak rate
-
Credits are renewed at the sustainable rate
Specifying the sustained cell rate and the credit limit (CL)
peak to sustainable ratio (PSR) = peak rate / sustained rate
burst tolerance specifier (BTS) = PSR * CL / 8
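The token bucket rules above can be sketched as a tiny simulation that advances one peak-rate cell slot at a time. This is an illustration only: the integer PSR, the bucket starting full, and renewing one credit every PSR slots are assumptions, not the adapter's documented behavior.

```c
#include <assert.h>

struct token_bucket {
    unsigned credits;       /* cell credits currently available */
    unsigned credit_limit;  /* CL: the bucket never holds more than this */
    unsigned psr;           /* peak rate / sustained rate (integer here) */
    unsigned slot;          /* peak-rate slots since the last renewal */
};

static void tb_init(struct token_bucket *tb, unsigned credit_limit, unsigned psr)
{
    assert(psr >= 1 && credit_limit >= 1);
    tb->credits = credit_limit;     /* assumption: bucket starts full */
    tb->credit_limit = credit_limit;
    tb->psr = psr;
    tb->slot = 0;
}

/* Advance one peak-rate slot. Returns 1 if a waiting cell was sent. */
static int tb_tick(struct token_bucket *tb, int cell_waiting)
{
    int sent = 0;

    if (cell_waiting && tb->credits > 0) {
        tb->credits--;              /* each transmission costs one credit */
        sent = 1;
    }
    if (++tb->slot == tb->psr) {    /* one credit back per sustained-rate slot */
        tb->slot = 0;
        if (tb->credits < tb->credit_limit)
            tb->credits++;
    }
    return sent;
}

/* BTS = PSR * CL / 8, as given on the slide (integer arithmetic). */
static unsigned tb_bts(unsigned psr, unsigned credit_limit)
{
    return psr * credit_limit / 8;
}
```

With CL = 4 and PSR = 4, a saturated sender gets an initial burst at the peak rate and then settles to one cell every four slots.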
Oversubscription algorithm:
-
Service high priority TMQ's first
-
Among TMQ's of equal priority service higher rate first.
-
Your mileage may vary!
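The service order described above (high priority first, then higher rate first) can be expressed as a sort comparator. The struct layout and names are hypothetical; the adapter applies this ordering internally, not the driver.

```c
#include <assert.h>
#include <stdlib.h>

struct tmq {
    int id;
    int priority;            /* 0 = low, 1 = high */
    unsigned long peak_bps;  /* peak transmission rate */
};

/* Orders TMQs: higher priority first, then higher peak rate first. */
static int tmq_service_order(const void *a, const void *b)
{
    const struct tmq *x = a;
    const struct tmq *y = b;

    if (x->priority != y->priority)
        return (x->priority > y->priority) ? -1 : 1;
    if (x->peak_bps != y->peak_bps)
        return (x->peak_bps > y->peak_bps) ? -1 : 1;
    return 0;
}

static void tmq_sort_for_service(struct tmq *q, size_t n)
{
    assert(q != NULL || n == 0);
    qsort(q, n, sizeof *q, tmq_service_order);
}
```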
The APE 25 Device Driver
Development strategy
-
Use loadable modules to minimize:
-
kernel rebuilds
-
system reboots
-
Start with a standalone driver
-
character device
-
ioctl interface
TCLP0_RCELL 0108 - 0069
TCLP1_RCELL 010a - 0000
THB_ER_CELL 010c - 0011
-
Use conditionally included printk's extensively
-
Develop in a PVC (initially loopback) environment
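One common way to get conditionally included trace output is a macro that compiles to nothing unless a debug switch is set. The sketch below is a user-space stand-in: printf replaces printk, and APE25_DEBUG, DPRINTK, dprintk_calls and open_channel are invented names, not taken from the driver.

```c
#include <assert.h>
#include <stdio.h>

#ifndef APE25_DEBUG
#define APE25_DEBUG 1        /* set to 0 to compile the traces out */
#endif

static int dprintk_calls;    /* counts emitted traces, for demonstration only */

#if APE25_DEBUG
#define DPRINTK(...) do { dprintk_calls++; printf(__VA_ARGS__); } while (0)
#else
#define DPRINTK(...) do { } while (0)
#endif

/* Hypothetical driver entry point instrumented with a trace. */
static int open_channel(int lci)
{
    DPRINTK("ape25: opening LC %d\n", lci);
    return lci >= 0 ? 0 : -1;
}
```

The do/while wrapper keeps the macro safe inside unbraced if/else bodies.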
Device driver organization
-
Initialization
Generic device driver initialization
Acquire / Initialize PCI data
Register as a Linux character device
Register interrupt handler
Driver component initialization
Connection management structures
Transmit management structures
Receive management structures
NIC state initialization
Load microcode
Load miscellaneous state registers
Enable the NIC
Driver Initialization
-
Register as a character device for ioctl service
/* Register as a bogo character device for */
/* ioctl access for testing. */
printk("IBM APE25: Module initialization\n");
unregister_chrdev(APE25_MAJOR, "ape25"); /* result ignored */
rc = register_chrdev(APE25_MAJOR, "ape25", &atm_fops);
if (rc)
{
    printk("APE25: cannot register; rc %d\n", rc);
    return 1;
}
-
PCI device identification
for (pci_dev = pci_devices; pci_dev; pci_dev = pci_dev->next)
{
    /* Match on the adapter's vendor and device IDs. */
    if (pci_dev->vendor == PCI_VENDOR_ID_IBM &&
        pci_dev->device == PCI_DEVICE_ID_ATM_APE25)
    {
        ape->pci_bus = pci_dev->bus->number;
        ape->pci_devfn = pci_dev->devfn;
        ape->pci_vendor = pci_dev->vendor;
        ape->pci_device = pci_dev->device;
        found = 1;
        break;
    }
}
pcibios_write_config_word(bus, devfn, PCI_COMMAND, 0x117);
pcibios_write_config_word(bus, devfn, PCI_STATUS, 0xfd7f);
pcibios_write_config_byte(bus, devfn, PCI_LATENCY_TIMER, 0x40);
-
Hook the APE 25's Interrupt
rc = request_irq(ape->pci_irq, ape25_irq,
SA_INTERRUPT, "ape25", ape);
-
Reset the APE25
WT_CNTLREG(ape, MODE_REG, RESET_MODE);
i = 1;
while (i > 0)
{
if (!(RD_CNTLREG(ape, MODE_REG) & RESET_MODE))
break;
i += 1;
}
-
IPL the pico processors
-
Allocate and initialize the DMA buffer pools
-
Initialize the XRAM data structures (VP Table, LCI Table,
LC Table, Cell buffer pools, and Transmit Request Queues)
-
Initialize remaining APE 25 Regs
WT_CNTLREG(ape, IDLE_CELL_HDR_HI, 0);
WT_CNTLREG(ape, IDLE_CELL_HDR_LO, 1);
WT_CNTLREG(ape, ERROR_REG_MASK, 0x8cff);
WT_CNTLREG(ape, OAM_MASK, 0xecea);
WT_CNTLREG(ape, OAM_B1_MASK1, 0xff40);
WT_CNTLREG(ape, OAM_B1_MASK2, 0xff41);
WT_CNTLREG(ape, PCI_ERROR_MASK, 0xFFFF);
WT_CNTLREG(ape, CERR_CNT, 0x00FF);
WT_CNTLREG(ape, SERR_CNT, 0x00FF);
WT_CNTLREG(ape, FR_TIMEOUT_VAL, 0x0032);
WT_CNTLREG(ape, ACONFIG_REG, 0xffff);
-
Activate the APE 25
#define OS2_MODE 0x1149
WT_CNTLREG(ape, MODE_REG, OS2_MODE);
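The reset step above spins until the adapter clears RESET_MODE. A bounded variant of that poll might look like the following sketch, which reads a simulated register so it can run in user space. The bit value, the function names and the register model are assumptions, not the APE 25's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define RESET_MODE_BIT 0x0001u   /* hypothetical bit position */

typedef uint16_t (*reg_reader)(void *dev);

/* Poll until the reset bit clears. Returns the number of reads used,
 * or -1 if the bit is still set after 'limit' reads. */
static long wait_for_reset(reg_reader rd, void *dev, long limit)
{
    assert(rd != 0 && limit > 0);
    for (long polls = 1; polls <= limit; polls++) {
        if (!(rd(dev) & RESET_MODE_BIT))
            return polls;
    }
    return -1;
}

/* Simulated mode register: reports "in reset" for *dev more reads. */
static uint16_t fake_mode_reg(void *dev)
{
    int *remaining = dev;

    if (*remaining > 0) {
        (*remaining)--;
        return RESET_MODE_BIT;
    }
    return 0;
}
```

An explicit limit lets initialization fail cleanly when the adapter never leaves reset.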
Transmit Data Flow & Buffer Management
Producer/Consumer perspective on buffer management
| List | Producer | Consumer |
| Transmit Ready | Driver send function | APE-25 |
| Transmit Complete | APE-25 | Driver Xmit IRQ |
| Free | Driver Xmit IRQ | Driver send function |
Buffer List management
-
List elements are buffer headers (TFD's) architected by the APE 25
-
Single forward link structure
-
Link value of "1" indicates end-of-list
-
External head and tail pointers
-
Every list must be initialized with a single "placeholder"
Mutual exclusion mechanism
-
Single producer and single consumer per list
-
Producer adds to the tail and may alter
the tail pointer
the link field of the last element
-
Consumer removes from the head and may alter
the head pointer
-
Consuming the last (and only) element is illegal!
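The list discipline above can be sketched in a few lines. The producer touches only the tail pointer and the last element's link; the consumer touches only the head pointer and refuses to take the last element. Type and function names are hypothetical, and a real driver sharing these lists with the adapter would also need volatile accesses or memory barriers, which are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A link value of 1 marks end-of-list, as on the adapter. */
#define END_OF_LIST ((struct hdr *)(uintptr_t)1)

struct hdr {
    struct hdr *next;
    int buf_id;
};

struct list {
    struct hdr *head;   /* altered only by the consumer */
    struct hdr *tail;   /* altered only by the producer */
};

/* Every list starts with a single placeholder element. */
static void list_init(struct list *l, struct hdr *placeholder)
{
    placeholder->next = END_OF_LIST;
    l->head = l->tail = placeholder;
}

/* Producer: terminate the new element, then link it after the old tail. */
static void list_produce(struct list *l, struct hdr *h)
{
    h->next = END_OF_LIST;
    l->tail->next = h;
    l->tail = h;
}

/* Consumer: take the head unless it is the last (and only) element. */
static struct hdr *list_consume(struct list *l)
{
    struct hdr *h = l->head;

    if (h->next == END_OF_LIST)
        return NULL;
    l->head = h->next;
    return h;
}
```

Because the two sides never write the same field, no lock is needed as long as the list never drops below one element.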
Synchronization mechanism
Must prevent:
Consuming from an empty list
Producing into a "full" one (a linked list is never full)
| | APE-25 | Send Routine | Xmit IRQ |
| Producing | Not a problem | Must check APE-25 TRQ state register | Not a problem |
| Consuming | Not our problem | Must check for available frame buffer | Shouldn't occur but can be avoided by checking the list |
Possible options for the send routine
-
Busy-wait - both conditions are "self-correcting"
-
Sleep - and be awakened by the Xmit IRQ
-
Drop the frame
Transmit details
| 31..24 | 23..16 | 15..8 | 7..0 |
| Next TFD (used internally by APE 25) |
| TCL Next TFD |
| LCI | Parm/Status | Buf Count |
| Control Field | SDU Length |
| Data buffer 0 Address |
| Buf 0 Length | |
| Data buffer 1 Address |
| Buf 1 Length | |
| Additional Data Buffers |
The Transmit Frame Descriptor (TFD)
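A C view of the TFD might look like the sketch below. The field widths are guesses read off the column positions in the diagram (16/8/8 bits for LCI, Parm/Status and Buf Count, 16/16 for Control and SDU Length, and the buffer length in the upper half of its word); the adapter specification is the authority, and all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TFD_END_OF_LIST 1u     /* link value 1 = end of list */

struct tfd_buf {
    uint32_t addr;             /* bus address of the data buffer */
    uint32_t length_word;      /* assumed: length in bits 31..16 */
};

struct tfd {
    uint32_t next_tfd;         /* used internally by the APE 25 */
    uint32_t tcl_next_tfd;     /* link on the transmit complete list */
    uint32_t lci_parm_count;   /* LCI | Parm/Status | Buf Count */
    uint32_t ctl_sdu;          /* Control Field | SDU Length */
    struct tfd_buf buf[2];     /* further buffers would follow */
};

static uint32_t tfd_pack_word2(uint16_t lci, uint8_t parm, uint8_t buf_count)
{
    return ((uint32_t)lci << 16) | ((uint32_t)parm << 8) | buf_count;
}

static uint32_t tfd_pack_word3(uint16_t control, uint16_t sdu_len)
{
    return ((uint32_t)control << 16) | sdu_len;
}

/* Describe a frame held in one buffer; no chaining is used. */
static void tfd_fill_single(struct tfd *t, uint16_t lci,
                            uint32_t bus_addr, uint16_t len)
{
    assert(t != NULL);
    t->next_tfd = 0;
    t->tcl_next_tfd = TFD_END_OF_LIST;
    t->lci_parm_count = tfd_pack_word2(lci, 0, 1);
    t->ctl_sdu = tfd_pack_word3(0, len);
    t->buf[0].addr = bus_addr;
    t->buf[0].length_word = (uint32_t)len << 16;
    t->buf[1].addr = 0;
    t->buf[1].length_word = 0;
}
```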
Standalone driver buffer management
-
TFD's are permanently associated with a single buffer
-
Buffers of size 1 KB are allocated by default.
-
No LC can own more than 8 buffers at any time.
-
The chaining capabilities of the APE 25 are not used.
The FDL (Free Descriptor List)
Transmit buffers available for use when an application performs an ioctl senddata.
-
All but one buffer is on this list after initialization
-
Buffers consumed from front of list by atm_send5u
-
Buffers produced to end of list by TC Int Handler
-
The last and only buffer is never consumed.
-
No mutex required between driver and interrupt handler
The TCL (Transmit Complete List)
Buffers that have been transmitted by the APE 25 but
not yet reclaimed by the driver.
-
Buffers consumed from front of list by TC Int Handler
-
Transmitted buffers produced to end of list by APE 25
-
Last buffer is never removed => NO MUTEX needed
Complete transmit buffer lifecycle
Receive Data Flow & Buffer Management
Receive Buffer Lists and Lifecycle
1 - Free list: Buffers available for use by APE 25.
2 - APE-25 Reassembly Queue: Buffers presently owned by the APE 25 for the reassembly of frames.
3 - Recv Ready Lists: Buffers containing assembled frames. The APE 25 defines 8 logical receive queues.
4 - Per LC Ready Lists: Buffers containing assembled frames for a particular LC.
| List | Producer | Consumer |
| Reasm-Queue | APE-25 | APE-25 |
| RRLn | APE-25 | Driver Recv IRQ |
| LC RBLm | Driver Recv IRQ | Driver Recv function |
| Free | Driver Recv function | APE-25 |
Receive mutual exclusion mechanism
Analogous to the transmission mechanism
Receive synchronization mechanism
| | Recv Routine | Recv IRQ |
| Producing to | Free List: no problem | LC RBLm: no problem |
| Consuming from | LC RBLm: sleep(), wakeup by Recv IRQ | RRLn: should not occur, but must be checked |
The Receive Buffer Header (RBH)
| 31..24 | 23..16 | 15..8 | 7..0 |
| Data Buffer Address |
| Next RBH Address |
| Buffer Length | Status |
| LCI | Control |
| SDU Length | OAM / CPL0 / TUC |
The Receive Free List (RFL)
Buffers available for the APE 25 to use for frame assembly.
-
Fixed size data buffers are used.
-
The multibuffer frame capability of APE 25 is not used.
-
Data buffers are bound to RBH's and assigned to the RFL
during system initialization.
-
Buffers are consumed from front of list by APE 25 each time
a frame assembly starts.
-
Buffers are produced onto the end of the list by atm_recv5u
after a frame is copied back to user space.
The Receive Ready Lists (RRL's)
Buffers containing assembled frames not yet processed
by the driver.
The APE 25 defines 8 logical receive queues.
-
During initialization a free RBH is assigned to each RRL
-
None of these RBH's is attached to a data buffer
-
Frames are produced onto RRL ends by APE 25
-
Frames are consumed from the RRL front by the receive interrupt handler
and then produced onto the end of the target LC queue
Mutex problems in consuming buffers:
Suppose three frames have been received for RRL_3 and all are destined for LC 32.
Problem: APE 25 owns (the next RBH field of) RBH 7, but
the driver needs to consume DB-16.
Solution: The driver also needs to consume the dummy RBH-4,
so a rebinding is performed.
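One way to read the rebinding step: the driver removes the bufferless dummy at the head, moves the next header's data buffer onto it, and leaves that next header behind as the new dummy. The last header's link field, which the adapter still owns, is never written. The sketch below follows that reading; the names and the buffer numbers used with it are illustrative, not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define END_OF_LIST ((struct rbh *)(uintptr_t)1)
#define NO_BUFFER   (-1)

struct rbh {
    struct rbh *next;
    int data_buf;       /* index of the attached data buffer, or NO_BUFFER */
};

/* Consume one frame from a list whose head is a bufferless dummy.
 * Returns a header bound to the frame's data buffer, or NULL if the
 * list holds only the dummy. */
static struct rbh *consume_with_rebind(struct rbh **head)
{
    struct rbh *dummy = *head;
    struct rbh *next = dummy->next;

    if (next == END_OF_LIST)
        return NULL;
    assert(dummy->data_buf == NO_BUFFER);
    dummy->data_buf = next->data_buf;   /* rebind: dummy takes the frame */
    next->data_buf = NO_BUFFER;         /* next header is the new dummy */
    *head = next;
    dummy->next = END_OF_LIST;
    return dummy;
}
```

With a dummy followed by three filled headers, three calls return all three data buffers and the fourth returns NULL, leaving the former last header in place as the dummy.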
The per LC received buffer lists
Buffers demultiplexed by the driver by target VC but
not yet consumed by the application:
-
APE 25 has no interaction with these lists
-
The APE 25's buffer management strategy is carried over
-
A free "dummy" RBH is assigned to each LC list at initialization
-
A data buffer is not attached to the RBH
Per LC Received buffer list management
After 3 frames have been received for LC 32
-
As frames arrive they are appended to the destination LC's
received buffer list by the driver's receive interrupt handler.
-
Frames are removed from the front of the list via ioctl receive
data requests.
-
After data is copied to application the RBH and the data
buffer are returned to the free list.
-
The rebinding technique described earlier is used here.
Reviewing the buffer lifecycle
1 - APE 25 consumes a buffer via the SRFL register
2 - APE 25 produces on to RRL_n via the RRLn_LFDA register
3 - Driver (interrupt service routine) consumes from RRL_n
via ape->srrl[n] and produces to LC_RBL_k via ape->slcrbl[k]
4 - Driver (atm_recv5u) consumes from LC_RBL_k via ape->elcrbl[k]
and produces to the free list via ape->erfl.
The Free RBH List
The driver manages a pool of 256 RBH's
-
RBH's used in the receive processing above are allocated
from this pool during system initialization
-
Its principal use is in conjunction with Linux / ATM support
The Linux ATM Protocol
Overview:
-
Werner Almesberger of the LRC coordinates development
-
Current version is 0.36 ( < production release )
-
Services
Native ATM (Almesberger)
Socket based API
Connection oriented
Best effort delivery of AAL5 frames
PVC or SVC
Signaling (atmsigd: Almesberger)
UNI 3.0, 3.1, 4.0?
ILMI (ilmid: S. Schumate)
Classical IP over ATM (atmarpd: Almesberger)
LAN Emulation (zeppelin: M. Kiiskila)
The Device Driver Interface:
From init_module() the driver must call
struct atm_dev *atm_dev_register(
char *type,
struct atmdev_ops *ops,
unsigned long flags)
The first parameter is the device name: "ape25"
The second is a pointer to the device driver operations
vector table:
static struct atmdev_ops atm_ops =
{
ape25_open, /* open */
ape25_close, /* close */
ape25_ioctl, /* ioctl */
ape25_getsockopt, /* getsockopt */
ape25_setsockopt, /* setsockopt */
ape25_send, /* send */
:
:
-
For receive operations, the address of a *push() function
is provided by the protocol in the VCC structure.
-
The most significant change to the driver is moving
to the use of standard skbuffs
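The receive hand-off can be pictured as a callback stored per connection. The sketch below is a user-space mock of that idea only: mock_vcc and mock_skb are stand-ins invented for illustration and do not reproduce the kernel's VCC or skbuff structures.

```c
#include <assert.h>
#include <stddef.h>

struct mock_skb {
    const void *data;
    size_t len;
};

struct mock_vcc {
    int vpi, vci;
    /* Installed by the protocol; called by the driver for each frame. */
    void (*push)(struct mock_vcc *vcc, struct mock_skb *skb);
    size_t rx_bytes;
};

/* Protocol side: here it only accounts for the received bytes. */
static void proto_push(struct mock_vcc *vcc, struct mock_skb *skb)
{
    assert(skb != NULL);
    vcc->rx_bytes += skb->len;
}

/* Driver side: what a receive interrupt handler would do per frame. */
static int driver_deliver(struct mock_vcc *vcc, struct mock_skb *skb)
{
    if (vcc == NULL || vcc->push == NULL)
        return -1;          /* no protocol bound to this connection */
    vcc->push(vcc, skb);
    return 0;
}
```

The driver never needs to know which protocol sits above it; it only follows the pointer found in the connection structure.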
Receive Buffer Management:
1 - APE 25 consumes a RBH-SKBUFF via the SRFL register
2 - APE 25 produces on to RRL_n via the RRLn_LFDA register
3 - Driver (interrupt service routine) consumes from RRL_n
via ape->srrl[n]. RBH is produced to free RBH
list. Skbuff is "pushed" to protocol.
4 - Protocol returns skbuff to driver. Driver consumes
RBH from free RBH list and produces matched pair to the free list via ape->erfl.
TCP/IP over ATM
-
Classical IP over PVC's (no signaling required, but won't
scale)
-
Classical IP over SVC's (IETF)
-
LAN Emulation (ATM Forum)
Theoretically, no change to the device driver should be required.
Our problems included:
-
Sleeping on lack of transmit resources
-
Failure of ilmid to convey valid (vpi, vci) range to the
switch.
Performance
| Protocol | % of available bit rate delivered |
| Standalone driver & Native ATM | > 97% for large frames |
| TCP/IP (mtu = 2KB) | > 96% across the board |
In conclusion
Scope of the undertaking
| ; count | Module | Function |
| 298 | atmape25.c | Protocol glue |
| 833 | atmbase.c | Initialization, FLIH |
| 254 | atmdma.c | Standalone buffer allocation |
| 229 | atmioctl.c | Standalone ioctl interface |
| 789 | atmrecv.c | Receive manager |
| 600 | atmxmit.c | Transmission manager |
| 361 | atmxram.c | Connection manager |
| 3364 | Total | |
Time: One to two person-months, given
-
A designer/implementer with very gray hair/beard
-
But no previous experience with Linux drivers or ATM drivers
Watch out for:
-
Missing / erroneous information in the NIC spec.
-
Assumptions relating to atomicity
WT_CNTLREG(x, y)
-
Assumptions relating to non-preemptability
if (xx == yy)
sleep();
To be done:
-
Export QoS support to the protocol
-
Linux 2.1x compliant buffer management