Overview
Digital Radio Mondiale (DRM) is a set of digital audio broadcasting technologies designed to work over the bands currently used for AM broadcasting, particularly shortwave. Digital Radio Mondiale is also the name of the international non-profit consortium designing and implementing the platform. DRM can fit more channels than AM, at higher quality, into a given amount of bandwidth, using various MPEG-4 codecs. The principle of DRM is that bandwidth is the limited resource while computer processing power is cheap: modern CPU-intensive audio compression techniques make more efficient use of the available bandwidth, at the expense of processing resources. DRM, as specified in ETSI ES 201 980 V3.1.1 (2009-08), can deliver FM-comparable sound quality on frequencies below 30 MHz (long wave, medium wave and short wave), which allow for very-long-distance signal propagation. DRM has been designed specifically to reuse parts of older AM transmitter facilities, such as antennas, avoiding major new investment. DRM is robust against the fading and interference which often plague conventional broadcasting in these frequency ranges. As a digital medium, DRM can transmit other data besides the audio channels (datacasting), as well as RDS-type metadata or program-associated data, as Digital Audio Broadcasting (DAB) does. Unlike most other digital audio broadcasting systems, DRM uses in-band on-channel (IBOC) technology and can operate in a hybrid mode called Single Channel Simulcast, which simulcasts both the analog and the digital signal.
Design & Implementation
Back in 2005, when this project started, there was no low-cost single-chip solution providing enough processing power to implement an affordable consumer DRM radio receiver. At that time the existing prototypes combined high-end DSPs, FPGAs and microcontrollers to execute the DRM demodulation and decoding. Two open source projects implementing a software-defined DRM radio receiver were available: Dream (C++ implementation) and Diorama (mixed MATLAB and C implementation). We took the latter as the reference model for the DRM baseband receiver implementation on the Diopsis platform, which integrates two cores in a single silicon chip: a RISC microcontroller (ARM) and a VLIW 40-bit floating point DSP (mAgic).
After breaking down and profiling the DRM receiver, excluding the AAC audio decoder, we identified the multi-stage decoder and the Viterbi decoder as the most demanding components. Together they account for approximately 90% of the computational load. The first Diopsis generation, the D740, offered insufficient computational power because of its 50 MHz ARM7 and its limited on-chip data bandwidth. The DRM receiver development work therefore targeted the following Diopsis generation, the D940HF, even though it was still in the design phase. The D940HF features a faster 200 MHz ARM9, an enhanced mAgic DSP architecture, improved peripherals, better memory management, and larger data bandwidth towards the internal cache and the external dynamic memory.
We partitioned the DRM receiver by mapping onto the DSP the various synchronizations (time, frequency, frame), the channel equalization, the baseband OFDM demodulation, the channel decoding (multi-stage decoder and Viterbi decoder), and the handling of the ADC/DAC routines. The rest, including the AAC decoder, the multimedia content decoder, the user interface and the system synchronization, is mapped onto the ARM side of the Diopsis. This minimizes the dependencies and synchronization constraints between the routines running on the two cores. Porting the original Diorama code required significant optimization in order to generate a DSP executable satisfying the worst-case real-time constraints. However, thanks to the efficient DSP C compiler, no manual editing of parallel assembly routines was required. The optimizations were applied at the C code level (parallelization, loop unrolling) and compiled into 128-bit VLIW instructions that fully exploit the available hardware resources. The code snippet below shows the original Diorama version and the mAgic DSP optimized version of the Viterbi decoder, which is the most critical part of the DRM decoder.
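The weight of the 90% figure can be illustrated with Amdahl's law, which is brought in here as an assumption and is not part of the original profiling work. The sketch below estimates the overall speedup obtained when only the two channel decoders are accelerated; the function name `overall_speedup` and its parameters are hypothetical.

```c
#include <assert.h>

/* Overall speedup (Amdahl's law) when a fraction `hot_fraction` of the
 * run time is accelerated by a factor `hot_speedup` and the remaining
 * part is left unchanged. Hypothetical helper, for illustration only. */
double overall_speedup(double hot_fraction, double hot_speedup)
{
    assert(hot_fraction >= 0.0 && hot_fraction <= 1.0);
    assert(hot_speedup > 0.0);
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup);
}
```

With `hot_fraction` set to 0.9, a tenfold acceleration of the decoders gives an overall speedup of about 5.3, and the result can never exceed 10 however fast the decoders become. This is consistent with concentrating the optimization effort on the channel decoding.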
//**********************************************
// Diorama original viterbi decoder source code
//**********************************************
void viterbi_decoder_diorama (float *new_metrics,float *old_metrics,float *symbols_acc,int *path_reg_local, int *CODER_OUTPUT){
  float metric_path2, metric_s1, metric_s2, metric_inc;
  int butterfly;
  for (butterfly = 0; butterfly < NOOFBF; butterfly++) {
    metric_s1 = old_metrics[butterfly];
    metric_s2 = old_metrics[butterfly + NOOFBF];
    metric_inc = symbols_acc[CODER_OUTPUT[butterfly]];
    new_metrics[2 * butterfly] = metric_s1 + metric_inc;
    path_reg_local[2 * butterfly] = 0;
    metric_path2 = metric_s2 - metric_inc; /* Add */
    if (metric_path2 > new_metrics[2 * butterfly]) { /* Compare */
      new_metrics[2 * butterfly] = metric_path2;
      path_reg_local[2 * butterfly] = 1; /* Select */
    }
    new_metrics[2 * butterfly + 1] = metric_s1 - metric_inc;
    path_reg_local[2 * butterfly + 1] = 0;
    metric_path2 = metric_s2 + metric_inc; /* Add */
    if (metric_path2 > new_metrics[2 * butterfly + 1]) { /* Compare */
      new_metrics[2 * butterfly + 1] = metric_path2;
      path_reg_local[2 * butterfly + 1] = 1; /* Select */
    }
  }
  return;
}

//************************************************************************************
// mAgic DSP optimized viterbi decoder source code
//************************************************************************************
long chess_storage(DATA%2) _mvone[2]={1,1};
long chess_storage(DATA%2) _mvzero[2]={0,0};
void viterbi_decoder_mAgic (float *new__,float *old__,float *sym__,int *path__, int *coder__){
  int path_reg_index, butterfly, idx1, idx2, idx3, idx4, idx5;
  float metric_path2;
  float metric_inc_i,metric_inc_i_1,metric_inc_i_2,metric_inc_i_3;
  float metric_inc_q;
  float temp1_i, temp1_i_1,temp1_i_2, temp1_i_3;
  float temp1_q;
  float temp2_i;
  float temp2_q;
  float temp3_i, temp3_i_1,temp3_i_2, temp3_i_3;
  float temp3_q;
  float temp4_i;
  float temp4_q;
  float temp5_i;
  float temp5_q;
  float temp6_i;
  float temp6_q;
  int nuovpath_reg_i;
  int nuovpath_reg_q;
  int * restrict c_out = coder__;
  float * restrict oldmetrics = old__;
  float * restrict oldmetrics_1 = old__+32;
  float * restrict newmetrics = new__;
  float * restrict s_acc = sym__;
  path_reg_index=0;
  idx1 = 0; idx2 = 0; idx3 = 0; idx4 = 0; idx5 = 32;
  long one=1;
  long zero=0;
  _v_long vone=_v_long(1,1);
  _v_long vzero=_v_long(0,0);
  _v_float vfzero=_v_float(0,0);
  _v_float v_temp5=vfzero, v_temp6=vfzero;
  _v_float v_temp5_1=vfzero, v_temp6_1=vfzero;
  _v_float v_temp5_2=vfzero, v_temp6_2=vfzero;
  _v_float v_temp5_3=vfzero, v_temp6_3=vfzero;
  _v_float * restrict v_newmetrics=(_v_float * restrict)newmetrics;
  _v_long * restrict v_nuovpath=(_v_long * restrict)path__;
  static _v_long chess_storage(VRF112) swap1;
  static _v_long chess_storage(VRF113) swap1_1;
  static _v_long chess_storage(VRF114) swap1_2;
  static _v_long chess_storage(VRF115) swap1_3;
  static _v_long chess_storage(VRF116) swap2;
  static _v_long chess_storage(VRF117) swap2_1;
  static _v_long chess_storage(VRF118) swap2_2;
  static _v_long chess_storage(VRF119) swap2_3;
  float t1_p,t2_p,t1_m,t2_m;
  float t1_p_1,t2_p_1,t1_m_1,t2_m_1;
  float t1_p_2,t2_p_2,t1_m_2,t2_m_2;
  float t1_p_3,t2_p_3,t1_m_3,t2_m_3;
  int chess_storage(chess_register) cout;
  int chess_storage(chess_register) cout_1;
  int chess_storage(chess_register) cout_2;
  int chess_storage(chess_register) cout_3;
  static bool chess_storage(FLACKCOND0) cond;
  for (butterfly = 0; butterfly < 8; butterfly++) {
    swap1=vzero*vone; swap2=vone*vone;
    swap1_1=vzero*vone; swap2_1=vone*vone;
    swap1_2=vzero*vone; swap2_2=vone*vone;
    swap1_3=vzero*vone; swap2_3=vone*vone;
    cout=*c_out++; cout_1=*c_out++; cout_2=*c_out++; cout_3=*c_out++;
    metric_inc_i = s_acc[cout]; temp1_i = *oldmetrics++; temp3_i = *oldmetrics_1++;
    metric_inc_i_1 = s_acc[cout_1]; temp1_i_1 = *oldmetrics++; temp3_i_1 = *oldmetrics_1++;
    metric_inc_i_2 = s_acc[cout_2]; temp1_i_2 = *oldmetrics++; temp3_i_2 = *oldmetrics_1++;
    metric_inc_i_3 = s_acc[cout_3]; temp1_i_3 = *oldmetrics++; temp3_i_3 = *oldmetrics_1++;
    t1_p=_ADDSUB(temp1_i,metric_inc_i,t1_m);
    t2_p=_ADDSUB(temp3_i,metric_inc_i,t2_m);
    _SETLEFT(v_temp5,t1_p); _SETRIGHT(v_temp5,t1_m);
    _SETLEFT(v_temp6,t2_m); _SETRIGHT(v_temp6,t2_p);
    t1_p_1=_ADDSUB(temp1_i_1,metric_inc_i_1,t1_m_1);
    t2_p_1=_ADDSUB(temp3_i_1,metric_inc_i_1,t2_m_1);
    _SETLEFT(v_temp5_1,t1_p_1); _SETRIGHT(v_temp5_1,t1_m_1);
    _SETLEFT(v_temp6_1,t2_m_1); _SETRIGHT(v_temp6_1,t2_p_1);
    t1_p_2=_ADDSUB(temp1_i_2,metric_inc_i_2,t1_m_2);
    t2_p_2=_ADDSUB(temp3_i_2,metric_inc_i_2,t2_m_2);
    _SETLEFT(v_temp5_2,t1_p_2); _SETRIGHT(v_temp5_2,t1_m_2);
    _SETLEFT(v_temp6_2,t2_m_2); _SETRIGHT(v_temp6_2,t2_p_2);
    t1_p_3=_ADDSUB(temp1_i_3,metric_inc_i_3,t1_m_3);
    t2_p_3=_ADDSUB(temp3_i_3,metric_inc_i_3,t2_m_3);
    _SETLEFT(v_temp5_3,t1_p_3); _SETRIGHT(v_temp5_3,t1_m_3);
    _SETLEFT(v_temp6_3,t2_m_3); _SETRIGHT(v_temp6_3,t2_p_3);
    cond=_GT(v_temp6,v_temp5); swap1 = _m_vswap(cond,swap1,swap2,swap2);
    cond=_GT(v_temp6_1,v_temp5_1); swap1_1 = _m_vswap(cond,swap1_1,swap2_1,swap2_1);
    cond=_GT(v_temp6_2,v_temp5_2); swap1_2 = _m_vswap(cond,swap1_2,swap2_2,swap2_2);
    cond=_GT(v_temp6_3,v_temp5_3); swap1_3 = _m_vswap(cond,swap1_3,swap2_3,swap2_3);
    *v_newmetrics++ = _MAX(v_temp5,v_temp6); *v_nuovpath++ = swap1*vone;
    *v_newmetrics++ = _MAX(v_temp5_1,v_temp6_1); *v_nuovpath++ = swap1_1*vone;
    *v_newmetrics++ = _MAX(v_temp5_2,v_temp6_2); *v_nuovpath++ = swap1_2*vone;
    *v_newmetrics++ = _MAX(v_temp5_3,v_temp6_3); *v_nuovpath++ = swap1_3*vone;
  }
  return;
}
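For readers without the mAgic toolchain, the idea behind the optimized version can be approximated in portable C. The sketch below is not the code that ran on the DSP. It is a plain C stand-in, with the hypothetical name `acs_pass`, that restates each butterfly as two candidate pairs followed by a compare and a select, which is the role played by the `_ADDSUB`, `_GT` and `_MAX` operations in the listing above. It is meant to produce the same metrics and path bits as the Diorama loop.

```c
#include <assert.h>
#include <stddef.h>

/* One add-compare-select pass over n_bf butterflies (hypothetical,
 * portable stand-in for the vectorized DSP routine).
 * old_metrics must hold 2*n_bf values; new_metrics and path receive
 * 2*n_bf values; coder_output holds n_bf indices into symbols_acc. */
void acs_pass(float *new_metrics, int *path,
              const float *old_metrics, const float *symbols_acc,
              const int *coder_output, size_t n_bf)
{
    assert(n_bf > 0);
    for (size_t b = 0; b < n_bf; b++) {
        const float s1  = old_metrics[b];
        const float s2  = old_metrics[b + n_bf];
        const float inc = symbols_acc[coder_output[b]];

        /* Add: the two candidates for each of the two output states. */
        const float a0 = s1 + inc, a1 = s1 - inc; /* from state b        */
        const float b0 = s2 - inc, b1 = s2 + inc; /* from state b + n_bf */

        /* Compare: 1 when the second predecessor strictly wins. */
        const int sel0 = b0 > a0;
        const int sel1 = b1 > a1;

        /* Select: keep the larger metric and record the decision. */
        new_metrics[2 * b]     = sel0 ? b0 : a0;
        new_metrics[2 * b + 1] = sel1 ? b1 : a1;
        path[2 * b]     = sel0;
        path[2 * b + 1] = sel1;
    }
}
```

Calling `acs_pass` with `n_bf` set to 32 corresponds to the 8 iterations of 4 butterflies in the DSP loop. Writing the body as compare-and-select rather than as nested `if` blocks gives a compiler the chance to map it to max and select instructions, although whether it does so depends on the target.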
Prototype
The DRM receiver prototype is based on the D940HF evaluation board, which samples the down-converted DRM signal with one of its on-board ADCs. The DRM signal is received with the L101E active wide-band loop antenna (100 kHz 0.05–30 MHz) and down-converted to a 12 kHz center frequency by the CiaoRadio H101 RF front end. The largest DRM signal bandwidth is 20 kHz, so the down-converted signal stays below 24 kHz and can be sampled at 48 kHz with any audio ADC. The video below shows a reception test, performed in Frascati, of the Radio RAI DRM signal broadcast from Santa Palomba on 846 kHz. The laptop visible in the video runs only the control software for tuning the H101 front end and the SSH shell used to access the embedded Linux running on the Diopsis evaluation board, for the sole purpose of starting the decoder and displaying the receiver information. The H101 down-converted DRM signal is sent via USB to the laptop and reproduced at the laptop audio output (it sounds like this). The DRM signal is then fed to one of the D940HF evaluation board audio inputs through a standard audio cable.
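The sampling-rate argument above can be checked with a few lines of C. This is a minimal sketch that assumes the signal spectrum is symmetric around the 12 kHz center frequency; the function name `fits_below_nyquist` is hypothetical.

```c
#include <assert.h>

/* Returns 1 when a real signal centered on center_hz, with total width
 * bandwidth_hz, lies entirely between 0 Hz and half the sample rate
 * (the Nyquist frequency); returns 0 otherwise.
 * Assumes a spectrum symmetric around center_hz. */
int fits_below_nyquist(double center_hz, double bandwidth_hz,
                       double sample_rate_hz)
{
    assert(bandwidth_hz > 0.0 && sample_rate_hz > 0.0);
    const double low_edge  = center_hz - bandwidth_hz / 2.0;
    const double high_edge = center_hz + bandwidth_hz / 2.0;
    return low_edge >= 0.0 && high_edge <= sample_rate_hz / 2.0;
}
```

For the values in the text, `fits_below_nyquist(12000.0, 20000.0, 48000.0)` returns 1, since the band spans 2 kHz to 22 kHz and the Nyquist frequency is 24 kHz. A 32 kHz sample rate would fail the same check.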
The development of the DRM receiver, besides demonstrating the processing power of the Diopsis platform, played a key role in defining the architecture of the second-generation chip, as well as in debugging the D940HF hardware and software tools. Moreover, the DRM receiver served as a demonstrator for the DIAM project when the D940HF existed only on the FPGA emulator, and as a test application for the HARTES toolchain, which successfully builds the ARM+DSP DRM application from a single project in a single development environment. My interview about the Diopsis DRM receiver on Radiopassioni is available here (in Italian).
This work began in the framework of my M.Sc. Thesis in Electronic Engineering and then continued at the Atmel Roma Advanced DSP R&D Center, also within the context of the research projects DIAM (Digital AM Hardware and Software Platform-Based Set from C to Silicon) and HARTES (Holistic Approach to Reconfigurable Real-Time Embedded Systems). Credit for this work also goes to my outstanding colleagues at the Atmel Roma Advanced DSP R&D Center.
S. Fasciani, “Analysis and Design of a Digital Radio Mondiale Decoder based on a VLIW DSP,” M.Sc. Thesis, University of Rome Tor Vergata, Italy, 2006 [PDF in Italian].