# **Performance Evaluation of EPICS on PowerPC**

J.Odagiri, A.Akiyama, N.Yamamoto and T.Katoh KEK, High Energy Accelerator Research Organization Oho 1 - 1, Tsukuba, Ibaraki, JAPAN 305

## Abstract

A portion of the software components of the Experimental Physics and Industrial Control System (EPICS)[1] has been ported to a VME single board computer based on a PowerPC microprocessor. We have studied 1) the database scanning performance, 2) the Channel Access performance and 3) the interrupt latency on the PowerPC-based CPU board.

The CPU load on the VME single board computer arising from database scanning was measured using the standard benchmark database developed by APS/ANL. The transaction time required for putting and getting a value through Channel Access was studied as an indicator of Channel Access performance. Interrupt latency was measured to study the overall responsiveness of EPICS to an external event on the PowerPC-based board. The results were compared with those obtained on a 68040-based CPU board.

## 1 Introduction

In 1995, EPICS was adopted as the software platform for the KEKB accelerator control system[2]. In a control system based on EPICS, high-level applications do not access physical devices directly; instead, each device is controlled or monitored through a run time database on an intelligent Input/Output Controller (IOC)[3] distributed around the system. Therefore, the performance of the IOC affects the overall performance of the system.

Nowadays, VME single board computers based on high-performance CPUs are available. The VxWorks real-time kernel[4, 5], which EPICS requires as a platform, is supported on many of these boards. For better performance and continuity into the future, it is necessary to examine the possibility of adopting such high-performance boards.

We chose a PowerPC-based model as one of the high-performance boards and studied the performance of EPICS on that architecture. EPICS version 3.12 was used for this evaluation. For comparison, the performance of EPICS was also measured on a 68040-based board, which is currently used for application development in our system.

Table I summarizes the basic specifications of the boards and the versions of VxWorks used for the evaluation.

In the following section, we will discuss the performances of the database scanning and Channel Access from a host computer. Interrupt latency is discussed in section 3. 

Table I Summary of tested boards and software

|              | PowerCore-6604 | SYS68K/CPU-40 |
|--------------|----------------|---------------|
| Manuf.       | Force          | Force         |
| CPU Type     | PowerPC 604e   | MC68040       |
| Clock Rate   | 200 MHz        | 25 MHz        |
| RAM          | 16 MB          | 16 MB         |
| L2 Cache     | 512 KB         | -             |
| VxWorks Ver. | 5.3(Beta)      | 5.2           |

## 2 Performance of EPICS software

### 2.1 EPICS software organization

The EPICS software components running on an IOC[3] consist mainly of:

- Run time database and associated routines to support dynamic activity of the database such as Scanners, Monitors, etc.
- Channel Access Server through which the IOCs communicate with each other and with a high-level workstation over the network in a media independent manner
- Three layers of routines called record support, device support and driver support which allow the IOC to access physical devices

Figure 1 shows an illustration of the EPICS software components running on the IOC.



Fig. 1 Architecture of software running on the IOC in EPICS

In this paper, attention is paid mainly to the performance of the software components commonly required for running EPICS, independent of the type of hardware interface. The performance of physical device access is not discussed. To evaluate the performance of the software components running on the IOC, the following were measured:

- The CPU load arising from scanning the standard benchmark database
- The transaction time required for putting and getting a value to/from the IOC through Channel Access

### 2.2 Database scanning performance

A standard benchmark database was developed by ANL/APS [6]. The structure of the database is shown in Figure 2.



Fig. 2 Structure of the benchmark database (The arrows indicate process links of EPICS database records)

A main fan-out record is processed by the periodic scanning mechanism. The main fan-out record processes two sub-fanout records three times per scan. Each sub-fanout record then processes five calculation records, and each calculation record processes a chain of nine linked analog input records. In total, about three hundred record processes take place per scan of the database. The I/O fields of all records are software links, so record processing never accesses a physical device.
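The count of about three hundred can be checked with a short calculation. The sketch below is illustrative only: it assumes that the three repetitions apply to the whole tree below the main fan-out record, which is one reading of the description above, and the constant and function names are not taken from the benchmark database itself.

```python
# Record counts per scan of the benchmark database, as read from the text.
# Assumption: the "three times" multiplies everything below the main fan-out.
SUB_FANOUTS = 2            # sub-fanout records driven by the main fan-out
REPEATS_PER_SCAN = 3       # times the main fan-out processes them per scan
CALCS_PER_SUB_FANOUT = 5   # calculation records per sub-fanout record
AIS_PER_CALC = 9           # chained analog input records per calculation record


def records_processed_per_scan() -> int:
    """Count the record processings triggered by one periodic scan."""
    calcs = SUB_FANOUTS * CALCS_PER_SUB_FANOUT
    analog_inputs = calcs * AIS_PER_CALC
    per_pass = SUB_FANOUTS + calcs + analog_inputs
    # One processing of the main fan-out plus the repeated passes below it.
    return 1 + REPEATS_PER_SCAN * per_pass


print(records_processed_per_scan())
```

Under this assumption the total is 307, consistent with the figure of about three hundred.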

During the measurement, all of the calculation records and analog input records were monitored through Channel Access by MEDM[1], a GUI tool running on a workstation. This monitoring increased the CPU load arising from the record processing and added further CPU load from the communication with MEDM.

The CPU load was measured for scanning periods of 1.0, 0.5, 0.2 and 0.1 second using the VxWorks spy function[4, 5]. The results are shown in Table II. Taking the inverse of the CPU load as an index of performance, the PowerPC-based board delivers about ten times the CPU performance of the 68040-based board.
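The factor of about ten follows directly from the Table II values. The following is a minimal sketch of that comparison; the dictionary layout and the function name are illustrative choices, and only the numbers come from the measurements.

```python
# CPU load in percent from Table II, keyed by scanning period in seconds.
cpu_load_percent = {
    1.0: {"PowerCore-6604": 0.60, "SYS68K/CPU-40": 5.8},
    0.5: {"PowerCore-6604": 1.2, "SYS68K/CPU-40": 12.0},
    0.2: {"PowerCore-6604": 3.0, "SYS68K/CPU-40": 27.0},
    0.1: {"PowerCore-6604": 6.1, "SYS68K/CPU-40": 56.0},
}


def scan_performance_ratio(period_s: float) -> float:
    """Ratio of the performance indices (inverse CPU load), PowerPC over 68040."""
    row = cpu_load_percent[period_s]
    # (1 / load_ppc) / (1 / load_68k) reduces to load_68k / load_ppc.
    return row["SYS68K/CPU-40"] / row["PowerCore-6604"]


for period in cpu_load_percent:
    print(f"{period:.1f} s: {scan_performance_ratio(period):.1f}")
```

The ratio lies between 9 and 10 for every scanning period measured.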

### 2.3 Channel Access performance

The main routines of Channel Access[7] are:

- ca\_search used to establish and maintain a virtual circuit between a client and a server
- ca\_put used to write a value to a channel
- ca\_get used to read a value from a channel

Table II CPU load arising from the scanning of the benchmark database

|            | PowerCore-6604 | SYS68K/CPU-40 |
|------------|----------------|---------------|
| 1.0 second | 0.60 %         | 5.8 %         |
| 0.5 second | 1.2 %          | 12 %          |
| 0.2 second | 3.0 %          | 27 %          |
| 0.1 second | 6.1 %          | 56 %          |

The transaction time was measured for ca\_search, ca\_put, ca\_get, and for alternating execution of ca\_put and ca\_get. Table III summarizes the measured transaction times. The CPU load of the IOC and the network traffic during the transactions were also measured (Table IV). If we take 1/(transaction time × CPU load) as an index of performance, the PowerPC-based board performs about ten times better than the 68040-based board. This result is consistent with the database scanning result above.

Table III Transaction time required for Channel Access

|               | PowerCore-6604 | SYS68K/CPU-40   |
|---------------|----------------|-----------------|
| ca_search     | 953 micro-sec. | 1189 micro-sec. |
| ca_put        | 23 micro-sec.  | 109 micro-sec.  |
| ca_get        | 58 micro-sec.  | 118 micro-sec.  |
| ca_put&ca_get | 76 micro-sec.  | 246 micro-sec.  |

Table IV CPU load and network traffic (as a fraction of 10 Mbit/s) during the transaction

|               | PowerCore-6604 CPU | PowerCore-6604 Network | SYS68K/CPU-40 CPU | SYS68K/CPU-40 Network |
|---------------|--------------------|------------------------|-------------------|-----------------------|
| ca_search     | 98 %               | 17 %                   | 68 %              | 20 %                  |
| ca_put        | 52 %               | 75 %                   | 100 %             | 18 %                  |
| ca_get        | 18 %               | 50 %                   | 82 %              | 25 %                  |
| ca_put&ca_get | 30 %               | 60 %                   | 88 %              | 20 %                  |
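The index 1/(transaction time × CPU load) can be evaluated row by row from Tables III and IV. This is a sketch under stated assumptions: the data layout and function names are illustrative, and the ca_search CPU entry of 98 for the PowerCore-6604 is read as a percentage like the other entries.

```python
# (transaction time in microseconds from Table III, CPU load in percent from Table IV)
# Assumption: the PowerCore-6604 ca_search CPU entry "98" is a percentage.
channel_access = {
    "ca_search":     {"PowerCore-6604": (953, 98), "SYS68K/CPU-40": (1189, 68)},
    "ca_put":        {"PowerCore-6604": (23, 52),  "SYS68K/CPU-40": (109, 100)},
    "ca_get":        {"PowerCore-6604": (58, 18),  "SYS68K/CPU-40": (118, 82)},
    "ca_put&ca_get": {"PowerCore-6604": (76, 30),  "SYS68K/CPU-40": (246, 88)},
}


def performance_index(time_us: float, cpu_percent: float) -> float:
    """Index used in the text: 1 / (transaction time * CPU load)."""
    return 1.0 / (time_us * cpu_percent / 100.0)


def ca_index_ratio(routine: str) -> float:
    """Performance index of the PowerPC board divided by that of the 68040 board."""
    row = channel_access[routine]
    return (performance_index(*row["PowerCore-6604"])
            / performance_index(*row["SYS68K/CPU-40"]))


for routine in channel_access:
    print(f"{routine:14s} {ca_index_ratio(routine):5.1f}")
```

With the values as tabulated, the ca_put, ca_get and alternating put/get rows give ratios between 9 and 10; the ca_search row does not follow this pattern.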

## 3 Interrupt latency

Responsiveness to an interrupt from an external source is another important issue for real-time systems. We measured interrupt latency at two levels: the hardware/OS (VxWorks) level and the EPICS application level (post event)[3].

### 3.1 Latency at hardware/OS level

An Interrupt Request (IRQ) signal on the VME backplane was driven through an I/O board. An instruction which causes access to a data register on the I/O board was executed at the entrance of the Interrupt Service Routine (ISR) for the I/O board. The Data Acknowledge (DTACK) signals associated with:

- reading out the interrupt vector from the interrupt vector register on the I/O board
- access to the data register on the I/O board caused by the instruction in the ISR

were observed using an oscilloscope. Fig. 3 shows the observed signals for the PowerPC-based board. The first DTACK signal corresponds to reading out the interrupt vector and the second corresponds to the access to the data register.

The time between the rising edge of the IRQ signal and that of each DTACK signal was measured as the latency. The result is shown in Table V.



Fig. 3 The IRQ and DTACK signals on the VME backplane (Upper: IRQ signal, Lower: DTACK signals, 1 microsec./div. time scale)

Table V Interrupt latency at hardware/OS levels

|        | PowerCore-6604 | SYS68K/CPU-40  |
|--------|----------------|----------------|
| Vector | 1.0 micro-sec. | 2.1 micro-sec. |
| ISR    | 6.0 micro-sec. | 5.4 micro-sec. |

It should be noted that the CPU does not yet know about the interrupt when the first DTACK signal is returned: the VMEbus interface chip on the CPU board returns the first DTACK signal before the CPU accepts the interrupt. Therefore, the time between the first and second DTACK signals includes some communication latency within the CPU board in addition to the time required to switch context to the ISR.
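The interval between the two DTACK signals can be read off Table V by subtraction. The sketch below only restates that arithmetic; the dictionary keys and the function name are illustrative.

```python
# Latencies in microseconds from Table V, both measured from the IRQ edge.
table_v_us = {
    "PowerCore-6604": {"vector": 1.0, "isr": 6.0},
    "SYS68K/CPU-40":  {"vector": 2.1, "isr": 5.4},
}


def vector_to_isr_interval_us(board: str) -> float:
    """Time between the first (vector) and second (ISR) DTACK signals."""
    row = table_v_us[board]
    # Rounded to the one-decimal resolution of the tabulated values.
    return round(row["isr"] - row["vector"], 1)


for board in table_v_us:
    print(board, vector_to_isr_interval_us(board))
```

This gives 5.0 microseconds for the PowerCore-6604 and 3.3 microseconds for the SYS68K/CPU-40, which is the interval that contains both the on-board communication latency and the context switch to the ISR.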

### 3.2 Latency at EPICS post event processing

Two records were loaded into the IOC in advance. Each record causes access to a data register on the I/O board when it is processed. An EPICS post event call which processes those two records was then issued at the entrance of the ISR, and the DTACK signal associated with each record processing was observed. Table VI shows the latency of the two record processes. If we subtract the time required to start the processing of the ISR from the values listed in the upper row of Table VI, the result also indicates that the CPU performance of the PowerPC-based board is about ten times as high as that of the 68040-based board.

Table VI Latency at EPICS post event processing

|        | PowerCore-6604  | SYS68K/CPU-40  |
|--------|-----------------|----------------|
| first  | 12.5 micro-sec. | 73 micro-sec.  |
| second | 19.5 micro-sec. | 122 micro-sec. |
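The subtraction described above can be written out using the ISR row of Table V as the time required to start the ISR. This is a sketch of that arithmetic only; the variable and function names are illustrative.

```python
# Time to reach the ISR entrance (Table V, ISR row), in microseconds.
isr_entry_us = {"PowerCore-6604": 6.0, "SYS68K/CPU-40": 5.4}

# Latency of the first and second record processing (Table VI), in microseconds.
post_event_us = {
    "PowerCore-6604": {"first": 12.5, "second": 19.5},
    "SYS68K/CPU-40":  {"first": 73.0, "second": 122.0},
}


def net_post_event_latency_us(board: str, record: str) -> float:
    """Post event latency with the time to start the ISR subtracted."""
    return post_event_us[board][record] - isr_entry_us[board]


def net_latency_ratio(record: str) -> float:
    """Net latency of the 68040 board divided by that of the PowerPC board."""
    return (net_post_event_latency_us("SYS68K/CPU-40", record)
            / net_post_event_latency_us("PowerCore-6604", record))


print(f"first record: {net_latency_ratio('first'):.1f}")
```

For the upper row this yields 6.5 microseconds against 67.6 microseconds, a ratio of about 10.4.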

## 4 Conclusion

The EPICS software components running on the IOC were ported to the PowerPC-based VME single board computer without any significant problem. The measurements of database scanning and Channel Access performance showed about a tenfold improvement in CPU performance over the 68040-based board.

As for the interrupt latency, the PowerPC-based board required slightly longer to switch context to the ISR than the 68040-based board. However, as expected, it performed better in the EPICS post event processing, where the software overhead is dominant. These results encouraged us to adopt PowerPC-based CPU boards for the KEKB accelerator control system.

## Acknowledgements

We would like to thank Mr. M.Hakuta (LOGIC HOUSE corporation) for providing the PowerPC-based CPU boards for the evaluation.

## References

- [1] L.R.Dalesio et al., "EPICS Architecture", in Proceedings of ICALEPCS'91, KEK, Tsukuba, Japan, 1991, pp. 278-282.
- [2] T.Katoh et al., "Present Status of the KEKB Control System", in these proceedings.
- [3] J.B.Anderson and M.R.Kraimer, "EPICS Input/Output Controller (IOC) Application Developer's Guide".
- [4] "VxWorks Programmer's Guide 5.3", WindRiver Systems, Inc.
- [5] "VxWorks Reference Manual 5.3", WindRiver Systems, Inc.
- [6] M.Kraimer, private communication.
- [7] J.O.Hill, "EPICS R3.12 Channel Access Reference Manual".