**2.1 For computing-experiment**

### **2.1.1 e-Science research environment**

We define a computing-experiment tool as an e-Science research environment. To study particle physics, we can access the environment anytime and anywhere, even when we are not on-site at an accelerator laboratory. A virtual laboratory enables us to perform research as if we were on-site (Cho, 2008). We apply e-Science components to the CDF experiment.

#### **2.1.1.1 Data production**

The purpose of data production is to enable both on-line and off-line shifts to be taken anywhere. On-line shifts have been conducted through a remote control room at KISTI, and off-line shifts have been conducted via the Sequential Access through Metadata (SAM) data handling (DH) system at KISTI. The remote control room was built to help non-US CDF members fulfill their shift duties as the Consumer Operator (CO) member of the CDF data-taking shift crew. The remote control room hosts the various monitoring applications that the CO has to watch during a given eight-hour shift. We have been operating the CDF remote control room at KISTI since July 22, 2008, and real data acquisition (DAQ) was recorded from the remote control room at KISTI between August 1 and August 8, 2008.

The CDF detector is an experimental apparatus for recording electrical events produced by the accelerator at an enormous rate. The apparatus comprises several components that perform different functions, including a detector with millions of data channels whose signals are transmitted to a corresponding number of electronic readout devices. The operation of an apparatus of this complexity needs to be controlled collaboratively by researchers. In general, each shift crew takes an eight-hour shift, so that three shift crews cover 24 hours. In the CDF experiment, the shift crew consists of three people with different missions. First, the Science Coordinator (SciCo) is responsible for the entire shift session and must be highly experienced. The second person is the Ace shifter, an expert on the control of all detector components and electronic readout devices. The third person is the CO, who has been trained to interpret the meaning of the data being monitored. UNIX processes (the consumers) intercept the on-line data transmitted from the front-end readout electronics and generate various plots that represent the quality of the data taken by the detector. These plots help the CO determine whether or not data collection is proceeding as expected; accordingly, the CO advises the Ace shifter to interrupt the detector operation in order to correct any problems.
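The CO's go/no-go judgment on a consumer plot can be illustrated with a minimal sketch. This is an illustration only, not CDF's actual consumer software: the function name, the normalized-histogram comparison, and the threshold value are all assumptions made for the example.

```python
def quality_check(current, reference, threshold=0.1):
    """Decide whether a monitoring histogram still matches a reference shape.

    current, reference: lists of bin counts for the same histogram binning.
    Each histogram is normalized to unit area, so only the shape is compared.
    Returns True when the largest per-bin deviation stays within `threshold`
    (data taking looks as expected); False signals the CO to investigate.
    """
    total_cur = sum(current)
    total_ref = sum(reference)
    if total_cur == 0 or total_ref == 0:
        return False  # empty histogram: treat as "needs attention"
    deviation = max(
        abs(c / total_cur - r / total_ref)
        for c, r in zip(current, reference)
    )
    return deviation <= threshold
```

In this sketch, the CO's monitoring loop would run `quality_check(latest_bins, reference_bins)` over each consumer plot and, on a False result, advise the Ace shifter to interrupt data taking.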

Although the CO's monitoring task involves on-line data collection, it can be performed from a remote location because of its mostly monitoring-related nature. Such remote control rooms are located at the University of Pisa in Italy, the University of Tsukuba in Japan, and KISTI in Korea. In Korea, there are about 30 collaborators from six institutions, most of whom have to fulfill their CDF duties by taking detector operation shifts. All the plots that the consumers generate are accessible via web browsers, through which all the monitoring can be done. The CO has to monitor not only the plots generated by the consumers but also the consumers themselves.

However, the policy imposed by the Department of Energy (DOE) in the United States prohibits any remote researcher outside of Fermilab from executing any control-related UNIX command; instead, control-related execution must be initiated by a person on-site. At the same time, all transmissions of control commands have to be encrypted using Kerberos. We solve this problem by having an on-site crew send a graphical user interface (GUI) named "consumer controller" to the remote monitor via the Kerberized secure shell port. The CDF II experiment took data from June 30, 2001 to September 30, 2011. Fig. 2 shows the CDF main operation center and the remote control room at KISTI. As shown in Fig. 3, we have taken remote shifts (24 days per year on average) successfully.

Fig. 2. The CDF main operation center and remote control room at KISTI.

Fig. 3. The CDF remote control used at KISTI.

We perform another type of remote data handling shift at KISTI. Whereas the remote control room implements an on-line form of remote data handling, a second shift, the SAM DH shift, implements an off-line form. This shift also runs eight hours per day, seven days per week; because it is off-line, it does not need to cover the entire twenty-four hours with three shifts per day. Furthermore, a participant outside the USA can take the shift during the daytime of his or her own time zone. The CDF SAM DH is called off-line because the data handled includes data inbound to tape from SAM stations in the reconstruction farms and vice versa. The off-line data transfers in CDF are between SAM stations and the mass storage system (MSS). At Fermilab, the MSS consists of a Storage Resource Manager (SRM), dCache, and the Enstore system. The dCache software was the result of a joint project between Fermilab in Batavia, USA and DESY (Deutsches Elektronen-Synchrotron) in Hamburg, Germany. dCache is a front-end for disk caching: it provides end-users with the ability to read cached files and to write files to and from Enstore indirectly via dCache. The Enstore system is the direct interface to files on tape for end-users. In the present context, the SAM stations in the CDF Analysis Farm (CAF) and the farm clusters use an Application Programming Interface (API) provided by dCache to read files from and write files to the tapes via dCache and the Enstore system. Thus, the mission of the CDF SAM shift includes monitoring the Enstore system, the dCache system, and the SAM stations of the CDF analysis farm (CAF) and the CDF experiment farm.

#### **2.1.1.2 Data processing**

Data processing is accomplished using a High-Energy Physics (HEP) data grid. The objective of the HEP data grid is to construct a system to manage and process high-energy physics data and to support the high-energy physics community (Cho, 2007).

For data processing, Taiwan has hosted the only WLCG Tier-1 center and Regional Operation Center in Asia since 2005. ASGC has also been serving as the Asia Pacific Regional Operation Center to maximize grid service availability and to facilitate the extension of e-Science (Lin & Yen, 2009). In Japan, a Tier-2 computing center supporting the A Toroidal LHC Apparatus (ATLAS) experiment has been running at the University of Tokyo, and another Tier-2 center at Hiroshima University supports A Large Ion Collider Experiment (ALICE) (Matsunaga, 2009). At KEK, collaborating institutes operate a grid site as members of the WLCG and aim to use their grid resources for the Belle and Belle II experiments. The Belle II experiment, which will start in 2015, will use distributed computing resources.

We now describe the history of data processing for the CDF experiment. CDF is an experiment on the Tevatron at Fermilab; its Run II phase ran between 2001 and 2011. CDF computing needs include raw data reconstruction, data reduction, event simulation, and user analysis. Although these tasks differ greatly in the resources they require, they are all naturally parallel activities. The CDF computing model is based on the concept of a Central Analysis Farm. The increasing luminosity of the Tevatron collider caused the computing requirements for data analysis and Monte Carlo production to grow beyond the available dedicated CPU resources. To meet this demand, CDF has examined the possibility of using shared computing resources. CDF uses several computing processing systems, such as the CAF, the Decentralized CDF Analysis Farm (DCAF), and grid systems. The
