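Replacing the Full-Search ME Algorithm in the H.264 JM Encoder and Comparing Coding Performance
I. Objective:
Replace the full-search motion estimation (ME) algorithm in JM with any fast ME algorithm and compare the coding performance.
II. Procedure: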
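1. Reading the JM 7.3 source code shows that mv-search.c provides two functions for full-search ME, FullPelBlockMotionSearch and FastFullPelBlockMotionSearch; both implement a full search. Defining the macro _FAST_FULL_ME_ selects FastFullPelBlockMotionSearch instead of FullPelBlockMotionSearch. For ease of modification we use FullPelBlockMotionSearch and rewrite it, so the line #define _FAST_FULL_ME_ in the header defines.h must be deleted or commented out; the encoder then uses FullPelBlockMotionSearch.
2. Rewriting FullPelBlockMotionSearch. The function originally performs a full search; we rewrite it as a three-step search (TSS). The step size starts at (search_range+1)>>1 and is halved after every step, so with the search range of 16 used in this experiment it takes the values 8, 4, 2, 1 and the loop actually runs four steps. The rewritten function is as follows: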
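The macro-based selection described in step 1 can be pictured with the small sketch below. It is only an illustration of compile-time selection, not the JM call site: the two `_stub` functions and `run_integer_pel_search` are hypothetical names introduced here, and the real JM code passes the full argument list shown in step 2.

```c
/* Leave this commented out to build the plain full search, as the report does. */
/* #define _FAST_FULL_ME_ */

enum { SEARCH_FULL = 0, SEARCH_FAST_FULL = 1 };

/* Hypothetical stand-ins for the two JM routines; each only reports its identity. */
int FullPelBlockMotionSearch_stub(void)     { return SEARCH_FULL; }
int FastFullPelBlockMotionSearch_stub(void) { return SEARCH_FAST_FULL; }

/* Calls whichever integer-pel search the macro selected at compile time. */
int run_integer_pel_search(void)
{
#ifdef _FAST_FULL_ME_
    return FastFullPelBlockMotionSearch_stub();
#else
    return FullPelBlockMotionSearch_stub();
#endif
}
```

With the `#define` commented out, `run_integer_pel_search()` returns `SEARCH_FULL`; uncommenting it switches the call to the fast variant without touching the caller.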
int                                        // ==> minimum motion cost after search
FullPelBlockMotionSearch (pel_t** orig_pic,
                          int     ref,
                          int     pic_pix_x,
                          int     pic_pix_y,
                          int     blocktype,
                          int     pred_mv_x,
                          int     pred_mv_y,
                          int*    mv_x,
                          int*    mv_y,
                          int     search_range,
                          int     min_mcost,
                          double  lambda)
{
    int   pos, cand_x, cand_y, y, x4, mcost;
    pel_t *orig_line, *ref_line;
    pel_t *(*get_ref_line)(int, pel_t*, int, int);   // function pointer
    pel_t* ref_pic      = ref<0 ? listX[LIST_1][0]->imgY_11 : listX[LIST_0][ref]->imgY_11;
    int   best_pos      = 0;
    int   max_pos       = (2*search_range+1)*(2*search_range+1);
    int   lambda_factor = LAMBDA_FACTOR (lambda);
    int   blocksize_y   = input->blc_size[blocktype][1];
    int   blocksize_x   = input->blc_size[blocktype][0];
    int   blocksize_x4  = blocksize_x >> 2;
    int   pred_x        = (pic_pix_x << 2) + pred_mv_x;
    int   pred_y        = (pic_pix_y << 2) + pred_mv_y;
    int   center_x      = pic_pix_x + *mv_x;
    int   center_y      = pic_pix_y + *mv_y;
    int   check_for_00  = (blocktype==1 && !input->rdopt && img->type!=B_SLICE && ref==0);
    //-----------------------------------------
    int   first_search_range = (search_range+1)>>1;  // search radius of the first step
    int   v_search_range     = first_search_range;   // current (shrinking) search radius
    int   i, j;
    int   offset_i = 0;       // i, j of the lowest-cost candidate in the current step
    int   offset_j = 0;
    int   usr_offset_x = 0;   // accumulated offset of the search centre
    int   usr_offset_y = 0;

    //===== set function for getting reference picture lines =====
    if ((center_x > search_range) && (center_x < img->width -1-search_range-blocksize_x) &&
        (center_y > search_range) && (center_y < img->height-1-search_range-blocksize_y)   )
    {
        get_ref_line = FastLineX;
    }
    else
    {
        get_ref_line = UMVLineX;
    }
    //--------------------------------------------
    while (v_search_range != 0)
    {
        // reset offset_i and offset_j at the start of every step
        offset_i = 0;
        offset_j = 0;
        for (j=-1; j<=1; j++)   // 9-point scan
            for (i=-1; i<=1; i++)
            {
                // after the first step, skip the centre and scan only the 8 surrounding points
                if (i==0 && j==0 && v_search_range!=first_search_range)
                    continue;
                cand_x = center_x + i*v_search_range + usr_offset_x;   // candidate position
                cand_y = center_y + j*v_search_range + usr_offset_y;
                // compute mcost: motion vector cost first, then the SAD
                mcost = MV_COST (lambda_factor, 2, cand_x, cand_y, pred_x, pred_y);
                if (check_for_00 && cand_x==pic_pix_x && cand_y==pic_pix_y)
                {
                    mcost -= WEIGHTED_COST (lambda_factor, 16);
                }
                if (mcost >= min_mcost) continue;
                for (y=0; y<blocksize_y; y++)
                {
                    ref_line  = get_ref_line (blocksize_x, ref_pic, cand_y+y, cand_x);
                    orig_line = orig_pic [y];
                    for (x4=0; x4<blocksize_x4; x4++)
                    {
                        mcost += byte_abs[ *orig_line++ - *ref_line++ ];
                        mcost += byte_abs[ *orig_line++ - *ref_line++ ];
                        mcost += byte_abs[ *orig_line++ - *ref_line++ ];
                        mcost += byte_abs[ *orig_line++ - *ref_line++ ];
                    }
                    if (mcost >= min_mcost)
                    {
                        break;
                    }
                }
                if (mcost < min_mcost)   // remember i, j of the lowest-cost candidate
                {
                    offset_i  = i;
                    offset_j  = j;
                    min_mcost = mcost;
                }
            }
        // update the offset of the search centre
        usr_offset_x += (offset_i*v_search_range);
        usr_offset_y += (offset_j*v_search_range);
        v_search_range >>= 1;   // halve the search radius
    }
    // add the accumulated offset to the motion vector
    *mv_x += usr_offset_x;
    *mv_y += usr_offset_y;
    return min_mcost;
}
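To see the search pattern in isolation from the JM data structures, the following is a minimal standalone sketch that runs the same step schedule next to an exhaustive search. Everything in it is an assumption made for illustration: the 64x64 frame, the 16x16 block, the synthetic texture, the edge clamping in `pixel`, and the names `make_frames`, `block_sad`, `full_search`, `step_search` and `SearchResult`. It uses plain SAD as the cost and omits the motion vector cost and the early-termination checks of the listing above.

```c
#include <limits.h>

enum { W = 64, H = 64, B = 16 };   /* assumed frame and block sizes */

typedef struct { int dx, dy, sad, evals; } SearchResult;

static int clampi(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }

/* Pixel fetch with edge clamping, so displaced blocks may leave the frame. */
static int pixel(const unsigned char *f, int x, int y)
{
    return f[clampi(y, 0, H - 1) * W + clampi(x, 0, W - 1)];
}

/* Fill ref with a smooth synthetic texture and cur with ref shifted by (mx, my). */
void make_frames(unsigned char *ref, unsigned char *cur, int mx, int my)
{
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            ref[y * W + x] = (unsigned char)((x * x + 2 * y * y) / 64);
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            cur[y * W + x] = (unsigned char)pixel(ref, x + mx, y + my);
}

/* SAD between the block at (bx, by) in cur and the block displaced by (dx, dy) in ref. */
int block_sad(const unsigned char *cur, const unsigned char *ref,
              int bx, int by, int dx, int dy)
{
    int sad = 0;
    for (int y = 0; y < B; y++)
        for (int x = 0; x < B; x++) {
            int d = pixel(cur, bx + x, by + y) - pixel(ref, bx + x + dx, by + y + dy);
            sad += d < 0 ? -d : d;
        }
    return sad;
}

/* Exhaustive search over the (2*range+1)^2 window. */
SearchResult full_search(const unsigned char *cur, const unsigned char *ref,
                         int bx, int by, int range)
{
    SearchResult r = { 0, 0, INT_MAX, 0 };
    for (int dy = -range; dy <= range; dy++)
        for (int dx = -range; dx <= range; dx++) {
            int sad = block_sad(cur, ref, bx, by, dx, dy);
            r.evals++;
            if (sad < r.sad) { r.sad = sad; r.dx = dx; r.dy = dy; }
        }
    return r;
}

/* Step search: 9 points in the first step, 8 in each later one, step halved each time. */
SearchResult step_search(const unsigned char *cur, const unsigned char *ref,
                         int bx, int by, int range)
{
    SearchResult r = { 0, 0, INT_MAX, 0 };
    int first = (range + 1) >> 1;
    for (int step = first; step != 0; step >>= 1) {
        int cx = r.dx, cy = r.dy;               /* centre of this step */
        for (int j = -1; j <= 1; j++)
            for (int i = -1; i <= 1; i++) {
                if (i == 0 && j == 0 && step != first) continue;  /* centre already tested */
                int dx = cx + i * step, dy = cy + j * step;
                int sad = block_sad(cur, ref, bx, by, dx, dy);
                r.evals++;
                if (sad < r.sad) { r.sad = sad; r.dx = dx; r.dy = dy; }
            }
    }
    return r;
}
```

For a range of 16 the step search tests 9 + 8 + 8 + 8 = 33 candidates against 33 x 33 = 1089 for the exhaustive search. Because every step-search candidate lies inside the full-search window, its best SAD can never be lower than the full-search result; on non-smooth content it may stop at a local minimum, which is the usual price of this kind of fast search.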
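3. Coding performance comparison
(1). In both the experimental group and the control group, comment out the line #define _FAST_FULL_ME_ in defines.h so that FullPelBlockMotionSearch is used.
(2). In the experimental group, rewrite FullPelBlockMotionSearch as shown in step 2; leave the control group unchanged.
(3). Build the experimental group and the control group to produce the lencod.exe executable for each.
(4). In the encoder.cfg file in the bin folder of both groups, change the following lines to: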
InputFile = "carphone.qcif"
FramesToBeEncoded = 10
OutputFile = "carphone.264"
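The input file is carphone.qcif, the output file is carphone.264, and FramesToBeEncoded is set to 10.
(5). Open a command prompt window, change to the bin directory of the experimental group and of the control group in turn, and enter the following command:
lencod -f encoder.cfg >>result.txt
This runs the encoder lencod and stores its console output in result.txt.
(6). Comparison of the encoding results
(1) Contents of result.txt for the experimental group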
--------------------------------------------------------------------------
Input YUV file : carphone.qcif
Output H.26L bitstream : carphone.264
Output YUV file : test_rec.yuv
Output log file : log.dat
Output statistics file : stat.dat
--------------------------------------------------------------------------
Frame Bit/pic QP SnrY SnrU SnrV Time(ms) Frm/Fld IntraMBs
0(I) 21936 28 37.8180 40.8088 41.9402 1312 FRM
2(P) 4784 28 37.3496 40.8073 41.6637 1687 FRM 2
1(B) 1696 30 35.8968 40.7067 41.6921 2218 FRM
4(P) 4640 28 37.3205 40.6092 41.4210 1922 FRM 4
3(B) 1248 30 36.5347 40.7708 41.4978 2438 FRM
6(P) 4616 28 37.4653 40.3927 40.9947 2157 FRM 5
5(B) 1144 30 36.7100 40.6593 41.2797 2735 FRM
8(P) 4688 28 37.7075 40.6800 41.1975 2422 FRM 2
7(B) 1480 30 36.5810 40.5311 40.8929 2969 FRM
10(P) 3576 28 37.5449 40.5996 41.1195 2688 FRM 1
9(B) 1504 30 36.4399 40.3870 40.8980 2969 FRM
12(P) 3224 28 37.5867 40.6390 41.2422 2641 FRM 1
11(B) 1032 30 36.7461 40.5284 40.9614 2937 FRM
14(P) 3576 28 37.5550 40.5308 40.9929 2672 FRM 0
13(B) 904 30 36.6648 40.4775 41.0538 2907 FRM
16(P) 3760 28 37.6256 40.4624 40.8054 2656 FRM 2
15(B) 1072 30 37.2065 40.8150 41.0300 2891 FRM
18(P) 3792 28 37.6711 40.9188 41.1401 2656 FRM 3
17(B) 888 30 36.7916 40.5002 40.7988 2953 FRM
--------------------------------------------------------------------------
Total Frames: 19 (10)
Leaky BucketRateFile does not have valid entries;
using rate calculated from avg. rate
Number Leaky Buckets: 8
Rmin Bmin Fmin
54915 23059 23059
68640 22144 22144
82365 21936 21936
96090 21936 21936
109815 21936 21936
123540 21936 21936
137265 21936 21936
150990 21936 21936
--------------------------------------------------------------------------
Freq. for encoded bitstream : 15
Hadamard transform : Used
Image format : 176x144
Error robustness : Off
Search range : 16
No of ref. frames used in P pred : 5
No of ref. frames used in B pred : 5
Total encoding time for the seq. : 47.830 sec
Sequence type : IBPBP (QP: I 28, P 28, B 30)
Entropy coding method : CABAC
Search range restrictions : none
RD-optimized mode decision : used
Data Partitioning Mode : 1 partition
Output File Format : H.26L Bit Stream File Format
------------------ Average data all frames ------------------------------
SNR Y(dB) : 37.12
SNR U(dB) : 40.62
SNR V(dB) : 41.19
Total bits : 69560 (I 21936, P 36656, B 10968)
Bit rate (kbit/s) @ 30.00 Hz : 109.83
Bits to avoid Startcode Emulation : 0
Bits for parameter sets : 168
--------------------------------------------------------------------------
Exit JM 7 encoder ver 7.3
(2) Contents of result.txt for the control group
--------------------------------------------------------------------------
Input YUV file : carphone.qcif
Output H.26L bitstream : carphone.264
Output YUV file : test_rec.yuv
Output log file : log.dat
Output statistics file : stat.dat
--------------------------------------------------------------------------
Frame Bit/pic QP SnrY SnrU SnrV Time(ms) Frm/Fld IntraMBs
0(I) 21936 28 37.8180 40.8088 41.9402 1328 FRM
2(P) 4768 28 37.4518 40.7602 41.6957 2062 FRM 1
1(B) 1768 30 36.0831 40.6879 41.6153 2984 FRM
4(P) 4712 28 37.4707 40.6501 41.5580 2813 FRM 1
3(B) 1184 30 36.4756 40.7163 41.4512 3391 FRM
6(P) 4752 28 37.5929 40.4176 41.0701 3531 FRM 5
5(B) 1216 30 36.7950 40.7214 41.4023 3891 FRM
8(P) 4880 28 37.9403 40.6092 41.3792 4218 FRM 1
7(B) 1400 30 36.6516 40.1799 40.8776 4593 FRM
10(P) 3464 28 37.6014 40.5464 41.4059 5016 FRM 1
9(B) 1384 30 36.6828 40.3234 40.9549 4547 FRM
12(P) 2920 28 37.6527 40.5534 41.2942 4828 FRM 0
11(B) 944 30 36.6652 40.2466 41.0684 4391 FRM
14(P) 3552 28 37.6500 40.4163 41.0015 4875 FRM 1
13(B) 1048 30 37.0687 40.1718 40.8037 4312 FRM
16(P) 3440 28 37.7184 40.5513 40.8449 4797 FRM 2
15(B) 848 30 37.1261 40.6332 41.0123 4125 FRM
18(P) 3656 28 37.6867 40.8786 41.1308 4750 FRM 2
17(B) 888 30 36.7941 40.5576 40.8907 4375 FRM
--------------------------------------------------------------------------
Total Frames: 19 (10)
Leaky BucketRateFile does not have valid entries;
using rate calculated from avg. rate
Number Leaky Buckets: 8
Rmin Bmin Fmin
54270 23086 23086
67830 22182 22182
81390 21936 21936
94950 21936 21936
108510 21936 21936
122070 21936 21936
135630 21936 21936
149190 21936 21936
--------------------------------------------------------------------------
Freq. for encoded bitstream : 15
Hadamard transform : Used
Image format : 176x144
Error robustness : Off
Search range : 16
No of ref. frames used in P pred : 5
No of ref. frames used in B pred : 5
Total encoding time for the seq. : 74.827 sec
Sequence type : IBPBP (QP: I 28, P 28, B 30)
Entropy coding method : CABAC
Search range restrictions : none
RD-optimized mode decision : used
Data Partitioning Mode : 1 partition
Output File Format : H.26L Bit Stream File Format
------------------ Average data all frames ------------------------------
SNR Y(dB) : 37.21
SNR U(dB) : 40.55
SNR V(dB) : 41.23
Total bits : 68760 (I 21936, P 36144, B 10680)
Bit rate (kbit/s) @ 30.00 Hz : 108.57
Bits to avoid Startcode Emulation : 0
Bits for parameter sets : 168
--------------------------------------------------------------------------
Exit JM 7 encoder ver 7.3
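The output above shows that the experimental group, which uses the three-step search, takes about 2700 ms per frame in the later frames of the run, while the control group, which uses the full search, takes about 4500 ms per frame there. Over the whole run the total encoding time is 47.830 s against 74.827 s, that is, roughly 2.5 s against 3.9 s per frame averaged over the 19 coded frames. The total amount of data for all frames is 69560 bits in the experimental group and 68760 bits in the control group, and the average luma SNR is 37.12 dB against 37.21 dB.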
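The averages and ratios behind this comparison can be recomputed from the two logs with a few lines of C. The sketch below uses only the totals printed in the logs (47.830 s and 74.827 s, 69560 and 68760 bits, 19 coded frames); the helper names `mean_frame_time_ms`, `percent_change` and `print_comparison` are introduced here for illustration.

```c
#include <stdio.h>

/* Mean encoding time per frame in milliseconds. */
double mean_frame_time_ms(double total_seconds, int frames)
{
    return total_seconds * 1000.0 / frames;
}

/* Change of value relative to reference, in percent. */
double percent_change(double reference, double value)
{
    return (value - reference) / reference * 100.0;
}

/* Prints the comparison using the totals reported in the two result.txt logs. */
void print_comparison(void)
{
    const double tss_s = 47.830, full_s = 74.827;        /* total encoding time */
    const double tss_bits = 69560.0, full_bits = 68760.0; /* total bits */
    const int frames = 19;                                /* coded frames */

    printf("TSS  mean time per frame: %.0f ms\n", mean_frame_time_ms(tss_s, frames));
    printf("Full mean time per frame: %.0f ms\n", mean_frame_time_ms(full_s, frames));
    printf("Encoding time change:     %.1f %%\n", percent_change(full_s, tss_s));
    printf("Total bits change:        %.1f %%\n", percent_change(full_bits, tss_bits));
}
```

Calling `print_comparison()` gives roughly 2517 ms and 3938 ms per frame, an encoding time reduction of about 36%, and a bit count increase of about 1.2%.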
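4. Conclusion:
The three-step search needs considerably less encoding time per frame than the full search (total encoding time drops from 74.827 s to 47.830 s, about 36% less), but its compression efficiency is slightly worse: at the same QP it produces about 1.2% more bits (69560 against 68760) with a luma SNR that is 0.09 dB lower.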