YUV Denoise
Algorithm
Our YUV denoise module operates in YUV space. We chose non-local means (NLM) as the denoising algorithm for our HLS implementation because it is hardware-friendly: it can be realized with a few line buffers that support the window search required for each pixel.
The NLM algorithm<sup>1</sup> is a simple but effective denoising method. Given a discrete noisy image $v = \{v(i) \mid i \in I\}$, the estimated value $NL[v](i)$ for a pixel $i$ is computed as a weighted average of all the pixels in the image,
$$ NL[v](i)=\sum_{j \in I}\omega(i,j)v(j) $$
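where the weights $\omega(i,j)$ are defined as:
$$\omega(i,j)=\frac{1}{Z(i)}e^{-\frac{\Vert v(N_i)-v(N_j)\Vert^2_{2,a}}{h^2}}$$
Here $v(N_i)$ is a vector formed by the grey values of the pixels in a square neighborhood of fixed size centered at pixel $i$, $\Vert\cdot\Vert_{2,a}$ is the Euclidean norm weighted by a Gaussian kernel of standard deviation $a$, $Z(i)$ is the normalizing constant that makes the weights sum to 1, and $h$ controls the degree of filtering.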
As shown in Fig. 1, pixels with similar neighborhoods receive large weights, such as $w(p,q1)$ and $w(p,q2)$, while pixels with very different neighborhoods receive small weights, such as $w(p,q3)$.
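To make the formulas concrete, here is a minimal floating-point sketch of the weighted average for a single pixel. It is a reference model only, not the HLS code: the `Image` type, the function names, the 5x5 image size, and the 3x3 search region are all assumptions chosen for illustration, and a uniform patch kernel is used in place of the Gaussian kernel of the paper.

```cpp
#include <array>
#include <cmath>

// Hypothetical 5x5 test image type (not part of the HLS source).
using Image = std::array<std::array<double, 5>, 5>;

// Mean squared difference between the 3x3 patches centered at (r0, c0) and (r1, c1).
// Assumption: uniform kernel instead of the Gaussian-weighted norm.
double patch_distance(const Image& img, int r0, int c0, int r1, int c1) {
    double sum = 0.0;
    for (int dr = -1; dr <= 1; ++dr) {
        for (int dc = -1; dc <= 1; ++dc) {
            double d = img[r0 + dr][c0 + dc] - img[r1 + dr][c1 + dc];
            sum += d * d;
        }
    }
    return sum / 9.0;
}

// Unnormalized weight exp(-dist / h^2); dividing by Z(i) happens in nlm_center.
double nlm_weight(double dist, double h) {
    return std::exp(-dist / (h * h));
}

// Denoise the center pixel (2, 2) using the 3x3 search region around it.
double nlm_center(const Image& img, double h) {
    double z = 0.0;    // normalizing constant Z(i)
    double acc = 0.0;  // weighted sum of candidate pixel values
    for (int r = 1; r <= 3; ++r) {
        for (int c = 1; c <= 3; ++c) {
            double w = nlm_weight(patch_distance(img, 2, 2, r, c), h);
            z += w;
            acc += w * img[r][c];
        }
    }
    return acc / z;
}
```

On a flat image every patch distance is zero, so all weights are equal and the result is the input value; a larger `h` flattens the weight curve and smooths more strongly.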
HLS Implementation
register parameters
typedef struct{
    uint1  eb;           // 1 bit
    uint14 ysigma2;      // sigma = [5,127]
    uint10 yinvsigma2;   // invsigma = (1 / sigma) << 7
    uint14 uvsigma2;     // sigma = [5,127]
    uint10 uvinvsigma2;  // invsigma = (1 / sigma) << 7
    uint4  yfilt;        // [0.1~0.8] << 4
    uint4  uvfilt;       // [0.1~0.8] << 4
    uint5  yinvfilt;     // (1 / filt) << 1
    uint5  uvinvfilt;    // (1 / filt) << 1
    uint14 yH2;
    uint18 yinvH2;
    uint14 uvH2;
    uint18 uvinvH2;
}yuvdns_register;
The parameters in yuvdns_register are explained in the following table.
parameters | explanation |
---|---|
eb | enables or disables the yuvdns module |
ysigma2 | squared deviation of the Gaussian kernel used for the distance measure in the Y plane |
uvsigma2 | squared deviation of the Gaussian kernel used for the distance measure in the UV plane |
yH2 | controls the degree of filtering in the Y plane |
uvH2 | controls the degree of filtering in the UV plane |
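The comments in the struct hint at how the fixed-point fields are encoded. The sketch below follows those comments literally; it is an illustration under assumptions, not the project's configuration code. The `YuvdnsFixedPoint` type and both function names are hypothetical, rounding to nearest is assumed for the fractional fields, and the Q14 reciprocal for `invH2` is inferred from the `>> 14` shifts in `yuvdns_nlm`. How `H2` itself is derived from sigma and filt is not stated here, so it is taken as an input.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for the encoded values of one plane (Y or UV).
struct YuvdnsFixedPoint {
    uint16_t sigma2;    // sigma^2, fits in 14 bits for sigma in [5,127]
    uint16_t invsigma;  // (1 / sigma) << 7
    uint8_t  filt;      // filt << 4, with filt in [0.1, 0.8]
    uint8_t  invfilt;   // (1 / filt) << 1
};

// Encode sigma and filt as described by the struct comments.
// The caller is expected to keep sigma in [5,127] and filt in [0.1, 0.8].
YuvdnsFixedPoint encode_params(unsigned sigma, double filt) {
    YuvdnsFixedPoint p;
    p.sigma2   = static_cast<uint16_t>(sigma * sigma);
    p.invsigma = static_cast<uint16_t>((1u << 7) / sigma);
    p.filt     = static_cast<uint8_t>(std::lround(filt * 16.0));  // assumed rounding
    p.invfilt  = static_cast<uint8_t>(std::lround(2.0 / filt));   // assumed rounding
    return p;
}

// Assumed Q14 reciprocal of H2; returns 0 when H2 is 0 to avoid dividing by zero.
uint32_t encode_inv_h2(uint32_t h2) {
    return h2 == 0 ? 0u : (1u << 14) / h2;
}
```

For example, under these assumptions sigma = 5 and filt = 0.5 encode to sigma2 = 25, invsigma = 25, filt = 8, and invfilt = 4, all of which fit the declared bit widths.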
functions
yuv444dns
void yuv444dns(top_register top_reg, yuvdns_register yuvdns_reg, stream_u10 &src_y, stream_u10 &src_u, stream_u10 &src_v, stream_u10 &dst_y, stream_u10 &dst_u, stream_u10 &dst_v)
input/output description
Params | description |
---|---|
top_reg | global configuration register |
yuvdns_reg | configuration register used only by the yuvdns module |
src_y | Y component of the source image stream |
src_u | U component of the source image stream |
src_v | V component of the source image stream |
dst_y | Y component of the denoised image stream |
dst_u | U component of the denoised image stream |
dst_v | V component of the denoised image stream |
Return value
None
function description
yuv444dns is the top-level denoising function. It performs NLM denoising on the incoming src_y, src_u, and src_v pixel streams and outputs the denoised pixel streams dst_y, dst_u, and dst_v separately and simultaneously.
In the original non-local means algorithm, the denoised value of a pixel is computed as a weighted average of all the pixels in the image, which would require a large number of line buffers to store neighboring pixels. We therefore reduce the search area from the whole image to a 9x9 neighbor window around the pixel.
In the yuv444dns function, 8 line buffers are defined to store the data of the previous lines, and a 9x9 2D array stores the search window around the center pixel.
uint10 yWindow[9][9];
uint10 uWindow[9][9];
uint10 vWindow[9][9];
uint10 ylineBuf[8][4096];
uint10 ulineBuf[8][4096];
uint10 vlineBuf[8][4096];
For each pixel to be denoised, we build the 9x9 window formed by its surrounding pixels.
The window preparation can be divided into the following 3 steps.
Step 1
The data in the 9x9 window is shifted left by one pixel.
for(uint4 i = 0; i < 9; i++){
    for(uint4 j = 0; j < 8; j++){
        yWindow[i][j] = yWindow[i][j+1];
        uWindow[i][j] = uWindow[i][j+1];
        vWindow[i][j] = vWindow[i][j+1];
    }
}
Step 2
The 9x9 window reads 8x1 pixels from the 8 line buffers plus the input stream pixels y_t, u_t, and v_t. The resulting 9x1 new pixels are stored in the rightmost column of the window.
for(uint4 i = 0; i < 8; i++){
    yWindow[i][8] = ylineBuf[i][col];
    uWindow[i][8] = ulineBuf[i][col];
    vWindow[i][8] = vlineBuf[i][col];
}
yWindow[8][8] = y_t;
uWindow[8][8] = u_t;
vWindow[8][8] = v_t;
Step 3
Update the line buffer
for(uint4 i = 0; i < 7; i++){
    ylineBuf[i][col] = ylineBuf[i+1][col];
    ulineBuf[i][col] = ulineBuf[i+1][col];
    vlineBuf[i][col] = vlineBuf[i+1][col];
}
ylineBuf[7][col] = y_t;
ulineBuf[7][col] = u_t;
vlineBuf[7][col] = v_t;
When the above 3 steps are finished, the yuvdns_nlm function (discussed later) is called to perform the block search operation.
Additionally, note that border pixels that do not have enough neighboring pixels to form a 9x9 search window are output with their original values. As a result, every pixel of a frame is denoised except for the 4-pixel-wide border.
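The three window-preparation steps can be checked with a small software model. The sketch below is a single-plane illustration in plain C++, not the HLS code: `WindowModel`, `push`, and `window_valid` are hypothetical names, `uint16_t` stands in for `uint10`, and the line buffer width is a constructor argument rather than the fixed 4096.

```cpp
#include <cstdint>
#include <vector>

constexpr int K = 9;  // window size, as in the 9x9 search window

// Hypothetical single-plane model of the window and its K-1 line buffers.
struct WindowModel {
    std::vector<std::vector<uint16_t>> lineBuf;  // K-1 previous rows
    uint16_t window[K][K] = {};

    explicit WindowModel(int width)
        : lineBuf(K - 1, std::vector<uint16_t>(width, 0)) {}

    // Feed one pixel at column `col`, in raster order.
    void push(int col, uint16_t pix) {
        // Step 1: shift the window left by one pixel.
        for (int i = 0; i < K; ++i)
            for (int j = 0; j < K - 1; ++j)
                window[i][j] = window[i][j + 1];
        // Step 2: fill the rightmost column from the line buffers and the new pixel.
        for (int i = 0; i < K - 1; ++i)
            window[i][K - 1] = lineBuf[i][col];
        window[K - 1][K - 1] = pix;
        // Step 3: move this column of the line buffers up by one row.
        for (int i = 0; i < K - 2; ++i)
            lineBuf[i][col] = lineBuf[i + 1][col];
        lineBuf[K - 2][col] = pix;
    }
};

// The window holds only real image pixels once K-1 full rows and K-1 columns
// of the current row have been consumed.
bool window_valid(int row, int col) {
    return row >= K - 1 && col >= K - 1;
}
```

In this model, after pushing pixel (row, col) the center `window[4][4]` holds pixel (row - 4, col - 4), so the first pixel that gets a full window is (4, 4), which matches the 4-pixel border described above.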
yuvdns_nlm
function declaration
uint10 yuvdns_nlm(uint10 Window[9][9],uint14 sigma2, uint14 H2,uint18 invH2){
    // Weight lookup tables: weight_1 for diff <= H2, weight_2 for diff > H2
    uint8 weight_1[8]={255,226,200,176,156,138,122,108};
    uint8 weight_2[16]={88,72,59,48,39,32,26,21,17,14,11,9,7,5,2,0};
    uint21 diff;
    uint22 diff_1;
    uint22 diff_2;
    uint22 diff_3;
    uint28 diff_tmp;
    uint8 weight = 0, maxweight = 0;
    uint14 totalweight =0;
    uint25 totalvalue = 0;
    // Visit every candidate pixel in the inner 7x7 region except the center
    nlm_row_loop:for (uint4 j = 1; j < 8; j++){
        nlm_col_loop:for (uint4 i = 1; i < 8; i++) {
            if (i != 4 || j != 4) {
                // Squared differences between the 3x3 patch around (j,i)
                // and the 3x3 patch around the center (4,4)
                uint10 dis_1 = 0;
                uint20 dis_11 = 0;
                dis_1 = yuvdns_abs(Window[j-1][i-1],Window[3][3]);
                dis_11 = dis_1 * dis_1;
                uint10 dis_2 = 0;
                uint20 dis_22 = 0;
                dis_2 = yuvdns_abs(Window[j-1][i],Window[3][4]);
                dis_22 = dis_2 * dis_2;
                uint10 dis_3 = 0;
                uint20 dis_33 = 0;
                dis_3 = yuvdns_abs(Window[j-1][i+1],Window[3][5]);
                dis_33 = dis_3 * dis_3;
                uint10 dis_4 = 0;
                uint20 dis_44 = 0;
                dis_4 = yuvdns_abs(Window[j][i-1],Window[4][3]);
                dis_44 = dis_4 * dis_4;
                uint10 dis_5 = 0;
                uint20 dis_55 = 0;
                dis_5 = yuvdns_abs(Window[j][i],Window[4][4]);
                dis_55 = dis_5 * dis_5;
                uint10 dis_6 = 0;
                uint20 dis_66 = 0;
                dis_6 = yuvdns_abs(Window[j][i+1],Window[4][5]);
                dis_66 = dis_6 * dis_6;
                uint10 dis_7 = 0;
                uint20 dis_77 = 0;
                dis_7 = yuvdns_abs(Window[j+1][i-1],Window[5][3]);
                dis_77 = dis_7 * dis_7;
                uint10 dis_8 = 0;
                uint20 dis_88 = 0;
                dis_8 = yuvdns_abs(Window[j+1][i],Window[5][4]);
                dis_88 = dis_8 * dis_8;
                uint10 dis_9 = 0;
                uint20 dis_99 = 0;
                dis_9 = yuvdns_abs(Window[j+1][i+1],Window[5][5]);
                dis_99 = dis_9 * dis_9;
                diff_1 = dis_11 + dis_22 + dis_33;
                diff_2 = dis_44 + dis_55 + dis_66;
                diff_3 = dis_77 + dis_88 + dis_99;
                // Sum of the 9 squared differences, divided by 8 with a shift
                diff = (diff_1 + diff_2 + diff_3) >> 3;
                // Subtract 2 * sigma2 from the patch distance, clamping at 0
                if(diff < 2 * sigma2){
                    diff = 0;
                }else{
                    diff = diff - 2 * sigma2;
                }
                // Map the distance to a weight through the lookup tables
                uint32 count = 0;
                if(H2 == 0){
                    weight = 0;
                }
                else if(diff <= H2){
                    diff_tmp = 7 * diff;
                    count = (diff_tmp * invH2)>>14;
                    weight = weight_1[count];
                }
                else{
                    diff_tmp = 5 * diff;
                    count = (diff_tmp * invH2)>>14;
                    count = yuvdns_weight2_clip((count -5));
                    weight = weight_2[count];
                }
                if(weight > maxweight){
                    maxweight = weight;
                }
                totalweight = totalweight + weight;
                totalvalue = totalvalue + Window[j][i] * weight;
            }
        }
    }
    // The center pixel gets the largest weight found among its neighbors
    totalweight = totalweight + maxweight;
    totalvalue = totalvalue + Window[4][4] * maxweight;
    if(totalweight==0)
        return Window[4][4];
    else
        return yuvdns_clip(totalvalue / totalweight, YUVDNS_Max_Value, 0);
}
parameters description
Params | description |
---|---|
Window[9][9] | search area of the pixel to be denoised |
sigma2 | squared deviation of the Gaussian kernel for the distance measure |
H2 | $h^2$ |
invH2 | $\frac{1}{h^2}$ |
Return
the denoised value of the center pixel in Window[9][9]
function description
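The yuvdns_nlm function performs the NLM search in the 9x9 window. For each neighbor pixel in the inner 7x7 region, it compares the 3x3 patch around that neighbor with the 3x3 patch around the center pixel and converts the patch distance into a weight. Finally, yuvdns_nlm returns the weighted average as the denoised pixel value.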
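The distance-to-weight mapping is the least obvious part of the function, so here it is isolated as a plain C++ sketch. This is an illustration under assumptions rather than the HLS code: `nlm_weight_lut` is a hypothetical name, standard integer types replace the arbitrary-precision ones, `invH2` is assumed to be the Q14 reciprocal of `H2`, and `yuvdns_weight2_clip` is assumed to clamp its argument to the valid index range [0, 15]. The first table index is also clamped to 7 here as a safety measure that the original does not have.

```cpp
#include <algorithm>
#include <cstdint>

// Same table contents as weight_1 and weight_2 in yuvdns_nlm.
static const uint8_t kWeight1[8]  = {255, 226, 200, 176, 156, 138, 122, 108};
static const uint8_t kWeight2[16] = {88, 72, 59, 48, 39, 32, 26, 21,
                                     17, 14, 11, 9, 7, 5, 2, 0};

// Map a patch distance to a weight, following the branch structure of yuvdns_nlm.
uint8_t nlm_weight_lut(uint32_t diff, uint32_t sigma2, uint32_t H2, uint32_t invH2) {
    // Subtract 2 * sigma2 from the distance, clamping at 0.
    diff = (diff < 2 * sigma2) ? 0u : diff - 2 * sigma2;
    if (H2 == 0) {
        return 0;
    }
    if (diff <= H2) {
        // Index is roughly 7 * diff / H2, so it stays in 0..7.
        uint32_t idx = static_cast<uint32_t>((7ull * diff * invH2) >> 14);
        return kWeight1[std::min<uint32_t>(idx, 7u)];  // extra clamp, not in the original
    }
    // Index is roughly 5 * diff / H2 - 5, clamped to 0..15 (assumed clip behavior).
    int64_t idx = static_cast<int64_t>((5ull * diff * invH2) >> 14) - 5;
    idx = std::max<int64_t>(0, std::min<int64_t>(idx, 15));
    return kWeight2[idx];
}
```

With `H2 = 16` and `invH2 = 1024`, a distance of 0 maps to the maximum weight 255, a distance equal to `H2` maps to 108, and distances above roughly `4 * H2` map to 0, so the two tables together act as a monotonically decreasing stand-in for the exponential in the weight formula.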
Reference
[1] Buades, A., B. Coll, and J. M. Morel. "A non-local algorithm for image denoising." IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2005.