**2.1 Blackfin microcomputer**

The Blackfin microcomputer is a 16-bit fixed-point processor based on the micro-signal architecture (MSA) core, developed in cooperation by Analog Devices and Intel. Low-cost and high performance features make Blackfin suitable for computationally applications including video equipment and third-generation cell phones (Sorin Zoican, 2008). The Blackfin microcomputer may have embedded Ethernet and controller area network. The Figure 1 illustrated the Blackfin general architecture (Analog Devices, 2006).

The MSA is designed to attain high-speed digital signal performance and top power efficiency. This architecture combines the best capabilities of microcontroller and DSP processor into a single programming model. A dynamic power management circuit continually monitors the running software and dynamically adjusts the voltage and the frequency at which the core runs. As a result, power consumption and performance for realtime applications, such node in sensor networks, are optimized.. The Blackfin core combines dual multiply-accumulate (MAC) units, an orthogonal reduced instruction-set computer (RISC) instruction set, single instruction, multiple data (SIMD) programming capabilities, and multimedia processing features into a unified architecture. As shown in Figure 1, the Blackfin BF5xx processor includes system peripherals such as parallel peripheral interface (PPI), serial peripheral interface (SPI), serial ports (SPORTs), general-purpose timers, universal asynchronous receiver transmitter (UART), real-time clock (RTC), watchdog timer, and general-purpose input/output (I/O) ports. The Blackfin processor has a direct memory access (DMA) controller that efficiently transfers data between external devices/memories and the internal memories without processor involvement. Blackfin processors offer L1 cache memory for fast accessing of both data and instructions.

Blackfin processors have well-to-do peripheral supports, memory management unit (MMU), and RISC-like instructions, usually found in many high-end microcontrollers. These processors have high-speed buses and highly developed computational units that support variable-length arithmetic operations in hardware. These features make the Blackfin processors appropriate to replace other high-end DSPs and MCUs. The Blackfin processor 4 Real-Time Systems, Architecture, Scheduling, and Application

Making an embedded system for networking applications require detailed knowledge of hardware and software resources needed to develop it. Architectural elements such as computational units, address units, control units and memory management are presented for a correct evaluation of the computational power of Blackfin family processors used in complex networking applications that include digital signal processing like as voice-over IP system. The LWIP protocol stack benefits of many operating system primitives. Well understanding of the operating system kernel functioning, such as scheduling, semaphore, memory management, is necessary for writing the software needed in networking applications. The LWIP implementation for Blackfin family must be detailed while the networking application uses the protocol stack resources (functions, memory allocation).

This section shows the elements that help achieve an embedded system with LWIP capabilities: the Blackfin microcomputer, as hardware support, the Visual DSP Kernel (VDK) real-time operating system kernel, as software support and LWIP protocol stack.

The Blackfin microcomputer is a 16-bit fixed-point processor based on the micro-signal architecture (MSA) core, developed in cooperation by Analog Devices and Intel. Low-cost and high performance features make Blackfin suitable for computationally applications including video equipment and third-generation cell phones (Sorin Zoican, 2008). The Blackfin microcomputer may have embedded Ethernet and controller area network. The

The MSA is designed to attain high-speed digital signal performance and top power efficiency. This architecture combines the best capabilities of microcontroller and DSP processor into a single programming model. A dynamic power management circuit continually monitors the running software and dynamically adjusts the voltage and the frequency at which the core runs. As a result, power consumption and performance for realtime applications, such node in sensor networks, are optimized.. The Blackfin core combines dual multiply-accumulate (MAC) units, an orthogonal reduced instruction-set computer (RISC) instruction set, single instruction, multiple data (SIMD) programming capabilities, and multimedia processing features into a unified architecture. As shown in Figure 1, the Blackfin BF5xx processor includes system peripherals such as parallel peripheral interface (PPI), serial peripheral interface (SPI), serial ports (SPORTs), general-purpose timers, universal asynchronous receiver transmitter (UART), real-time clock (RTC), watchdog timer, and general-purpose input/output (I/O) ports. The Blackfin processor has a direct memory access (DMA) controller that efficiently transfers data between external devices/memories and the internal memories without processor involvement. Blackfin

Figure 1 illustrated the Blackfin general architecture (Analog Devices, 2006).

processors offer L1 cache memory for fast accessing of both data and instructions.

Blackfin processors have well-to-do peripheral supports, memory management unit (MMU), and RISC-like instructions, usually found in many high-end microcontrollers. These processors have high-speed buses and highly developed computational units that support variable-length arithmetic operations in hardware. These features make the Blackfin processors appropriate to replace other high-end DSPs and MCUs. The Blackfin processor

**2. Background** 

**2.1 Blackfin microcomputer** 

uses a modified Harvard architecture, which allows multiple memory accesses per clock cycle. The Blackfin processor instruction set is optimized so 16-bit operation codes are the frequently used instructions. Complex DSP instructions are encoded into 32-bit operation codes like multifunction instructions. Blackfin microcomputers bear a limited multi-issue facility, where a 32-bit instruction can be issued in parallel with two 16-bit instructions. The programmer can use several core resources in a single instruction cycle. The Blackfin architecture supports instructions that control vector operations. We take advantage of these instructions to carry out concurrent operations on multiple 16-bit values (add, subtract, multiply, shift, negate, pack, and search instructions).

Fig. 1. a) Overall Blackfin architecture; b) The Blackfin peripherals

Networking Applications for Embedded Systems 7

A program sequencer controls the instruction execution flow, which includes instruction alignment and instruction decoding. The Blackfin processor supports two loops with two sets of loop counters, loop top and loop bottom registers to handle looping. Hardware

Blackfin sequencer manages events that include: interrupts (hardware and software), exceptions (error situation or service related). Despite the Blackfin processor has a Harvard architecture, it has a single memory map shared between data and instruction memory. Instead of using a single large memory, the Blackfin processor supports a hierarchical memory model as shown in Figure 4. The L1 data and instruction memory are placed on the chip and are generally smaller but faster than the L2 external memory, which has a superior capacity. As a result, transmission data from memory to registers in the Blackfin processor is

set in a hierarchy from the slowest memory (L2) to the fastest memory (L1).

Fig. 3. Blackfin arithmetic instructions

counters calculate the loop condition.

Fig. 4. Blackfin memory model

The Blackfin microcomputers family includes dual-core processors, such as the ADSP-BF561 microcomputer. Besides to other features, dual-core processors append a new facet to application development. Each dual-core Blackfin processor has two Blackfin cores, A and B, each with internal L1 memory. The two cores have a common internal memory shared between them. The cores share access to external memory. Each core functions autonomously: they have their reset address, Event Vector Table, instruction and data caches. On reset, core A starts running from its reset address, whereas core B is disabled. Core B starts running when it is enabled by core A. When core B starts running, it starts running its application, from its reset address. Having one application per core the complete potential of the dual-core Blackfin processor is exploiting. Effectively, two single core applications are building autonomously, and run in parallel on the processor. The common memory areas, both internal and external, are each subdivided into three areas: a section dedicated to core A, a section dedicated to core B, and a shared section.

Figure 2 shows that the core architecture consists of three main units: the address arithmetic unit, the data arithmetic unit, and the control unit.

Fig. 2. The Blackfin core

The arithmetic unit supports SIMD operation and it has Load/Store architecture. The assembler instruction syntax is algebraic; it is insightful and makes it simple to understand what the instruction does. Figure 3 illustrates several of arithmetic instruction efficiently executed in a single cycle. The video ALUs offer parallel computational power for video operations — quad 8-bit add/subtract, quad 8-bit average, SAA (Subtract-Absolute-Accumulate). Quad 8-bit ALU instruction takes one cycle to complete.

6 Real-Time Systems, Architecture, Scheduling, and Application

The Blackfin microcomputers family includes dual-core processors, such as the ADSP-BF561 microcomputer. Besides to other features, dual-core processors append a new facet to application development. Each dual-core Blackfin processor has two Blackfin cores, A and B, each with internal L1 memory. The two cores have a common internal memory shared between them. The cores share access to external memory. Each core functions autonomously: they have their reset address, Event Vector Table, instruction and data caches. On reset, core A starts running from its reset address, whereas core B is disabled. Core B starts running when it is enabled by core A. When core B starts running, it starts running its application, from its reset address. Having one application per core the complete potential of the dual-core Blackfin processor is exploiting. Effectively, two single core applications are building autonomously, and run in parallel on the processor. The common memory areas, both internal and external, are each subdivided into three areas: a section

Figure 2 shows that the core architecture consists of three main units: the address arithmetic

The arithmetic unit supports SIMD operation and it has Load/Store architecture. The assembler instruction syntax is algebraic; it is insightful and makes it simple to understand what the instruction does. Figure 3 illustrates several of arithmetic instruction efficiently executed in a single cycle. The video ALUs offer parallel computational power for video operations — quad 8-bit add/subtract, quad 8-bit average, SAA (Subtract-Absolute-

Accumulate). Quad 8-bit ALU instruction takes one cycle to complete.

dedicated to core A, a section dedicated to core B, and a shared section.

unit, the data arithmetic unit, and the control unit.

Fig. 2. The Blackfin core

Fig. 3. Blackfin arithmetic instructions

A program sequencer controls the instruction execution flow, which includes instruction alignment and instruction decoding. The Blackfin processor supports two loops with two sets of loop counters, loop top and loop bottom registers to handle looping. Hardware counters calculate the loop condition.

Blackfin sequencer manages events that include: interrupts (hardware and software), exceptions (error situation or service related). Despite the Blackfin processor has a Harvard architecture, it has a single memory map shared between data and instruction memory. Instead of using a single large memory, the Blackfin processor supports a hierarchical memory model as shown in Figure 4. The L1 data and instruction memory are placed on the chip and are generally smaller but faster than the L2 external memory, which has a superior capacity. As a result, transmission data from memory to registers in the Blackfin processor is set in a hierarchy from the slowest memory (L2) to the fastest memory (L1).

Fig. 4. Blackfin memory model

Networking Applications for Embedded Systems 9

Execution of the processes will be based on the priority (dynamically allocated) associated with each process. In VDK, three methods of scheduling can be defined: cooperative scheduling, uniform time division (round robin) and preemptive scheduling. Cooperative scheduling is involved for processes with the same priority. Each process takes the DSP processor and passes it, on own initiative, to the next process. The running process will yield the processor by calling a primitive system; the suspended process will be placed in the ready queue in the last position. The round robin scheduling assigns to processes with equal

Semaphores, events and I/O flags are used to achieve communication and synchronization between processes. A process waiting a signal may continue execution (whether the signal is available) or can enter the blocking state whether the signal is not available. The process will unlock when the signal becomes available or when a timer expires. Semaphores are global variables in the system; therefore, they are available to any process. A semaphore is a data

The semaphore may be posted or it may be pending. Pending the semaphore means testing its value with zero. If the semaphore is zero then the process increases the semaphore value and access the shared resource, otherwise the process waits. When the process has finished using the shared resource it posts the semaphore. Posting a semaphore means that its value is decremented and the resource is make available. Processes interact with semaphores by

The VDK kernel has all the necessary features to support the LWIP protocol stack

Fig. 5. The real-time kernel state diagram

priority equal length time quantum for execution.


real-time kernel routines for both semaphores pending and post.

development (memory management, semaphores and process scheduling).

structure that can be used to:

The justification following hierarchy memory is based on three principles: (1) the principle of building the common case fast, where code and data that have to be accessed repeatedly are stored in the fastest memory; (2) the principle of locality, where the program reuse instructions and data that have been used in recent times; and (3) the principle of smaller is faster, where smaller memory has faster access time.

All the shown features, make the Blackfin microcomputers family an ideal candidate to develop the embedded real-time system, which can run complex applications including networking applications with LWIP support.
