BASM for Beginners, Lesson 1C
http://www.cnpack.org
QQ Group: 130970
Translation: SkyJacker
Version: draft
Status: not proofread
Date: 2007
This is the third lesson in the BASM for Beginners series. The first two lessons
introduced a number of integer instructions, and this lesson continues with more of them.
The example function is this:
function StrLen1(const Str: PChar) : Cardinal;
begin
  Result := 0;
  while (Str[Result] <> #0) do
    Inc(Result);
end;
It does the same job as the RTL function StrLen. It searches a PChar string for the zero
terminator and returns the length of the string. As usual, we copy the assembler code
generated by the compiler from the CPU view.
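Before reading the generated code, it can help to confirm what the Pascal version returns. The console program below is a minimal sketch and not part of the original lesson: the program name and the test strings are made up for illustration, and PAnsiChar is used instead of PChar on the assumption that each character occupies one byte, as it did in the Delphi versions this lesson was written for.

```pascal
program StrLenDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Same loop as StrLen1, with PAnsiChar so that one character is one byte.
function StrLen1(const Str: PAnsiChar): Cardinal;
begin
  Result := 0;
  while (Str[Result] <> #0) do
    Inc(Result);
end;

begin
  WriteLn(StrLen1('BASM'));          // 4 characters before the terminator
  WriteLn(StrLen1('Hello, world'));  // 12 characters before the terminator
end.
```

The compiler's listing for the same function follows.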
function StrLen2(const Str: PChar) : Cardinal;
begin
  Result := 0;
  {
  xor edx,edx
  jmp +$01
  }
  while (Str[Result] <> #0) do
  {
  inc edx
  cmp byte ptr [eax+edx],$00
  jnz -$07
  }
    Inc(Result);
  {
  }
  {
  mov eax,edx
  ret
  }
end;
At first this code listing looks confusing. This is because the optimizer relocated code.
An example is the inc edx instruction that increments the Result variable: it is not
located under the Pascal line Inc(Result), where we would expect it.
Let us go through the code line by line and see what it does. The first line looks like this:
Result := 0;
{
xor edx,edx
It clears the edx register, setting all of its bits to zero. Edx is allocated for the Result
variable. A function returns its result in the eax register if it is an integer value, so we
would expect eax to be allocated for Result. But eax is in use for the input parameter Str.
The compiler therefore temporarily allocated edx, which can be used freely, for Result, and
just before the function exits it is copied into eax by the line
{
mov eax,edx
ret
}
end;
The second line of assembler is a jump instruction that jumps 1 byte forward.
jmp +$01
}
This way the inc edx instruction, which increments Result by one, is bypassed on entry to
the loop.
while (Str[Result] <> #0) do
{
inc edx
The inc edx instruction comes from the Pascal code Inc(Result); and would have been placed
like this if the compiler had not relocated it:
Inc(Result);
{
inc edx
}
The while loop is compiled into three lines of assembler code: the inc edx line is the loop
body, and the two remaining lines are the loop control code.
while (Str[Result] <> #0) do
{
inc edx
cmp byte ptr [eax+edx],$00
jnz -$07
}
The line
cmp byte ptr [eax+edx],$00
compares a byte of the PChar string with zero. The Pascal code Str[Result] generates the
operand byte ptr [eax+edx]. Eax is a pointer to the beginning of the PChar; it is what the
function received as the Str parameter. Edx is the Result variable. In the first loop
iteration it is zero, so the first character of the string is compared to the immediate
value $00, which is just the hexadecimal notation for 0. Because we only want to compare one
character to zero at a time, we must state that the address [eax+edx] is to be understood as
the address of a byte. The byte ptr code does this. A compare instruction sets the flags in
the EFLAGS register according to the result of the compare. The jump instruction
jnz -$07
}
tests the zero flag and jumps 7 bytes back if the flag is clear, that is, if the compared
byte was not zero. Jnz stands for Jump if Not Zero. If [eax+edx] does not point at a zero
terminator, the loop is iterated once more.
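The displacements +$01 and -$07 can be checked by hand, because a short jump's displacement is counted from the address of the instruction that follows the jump. The sketch below assumes the usual 32-bit encodings of the instructions in the listing (2, 2, 1, 4 and 2 bytes for xor, jmp, inc, cmp and jnz). These sizes are not stated in the lesson, the offsets are relative to the first instruction rather than real addresses, and the names are made up for the example.

```pascal
program JumpTargets;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  // Assumed instruction sizes in bytes (typical 32-bit encodings).
  SizeXor = 2;  // xor edx,edx
  SizeJmp = 2;  // jmp +$01 (short jump)
  SizeInc = 1;  // inc edx
  SizeCmp = 4;  // cmp byte ptr [eax+edx],$00
  SizeJnz = 2;  // jnz -$07 (short jump)

  // Offset of each instruction from the start of the function.
  OfsJmp = SizeXor;
  OfsInc = OfsJmp + SizeJmp;
  OfsCmp = OfsInc + SizeInc;
  OfsJnz = OfsCmp + SizeCmp;

// A relative jump lands at (offset of the next instruction) + displacement.
function JumpTarget(JumpOfs, JumpSize, Displacement: Integer): Integer;
begin
  Result := JumpOfs + JumpSize + Displacement;
end;

begin
  // jmp +$01 skips the one-byte inc edx and lands on cmp.
  WriteLn(JumpTarget(OfsJmp, SizeJmp, 1) = OfsCmp);
  // jnz -$07 goes back over jnz, cmp and inc and lands on inc edx.
  WriteLn(JumpTarget(OfsJnz, SizeJnz, -7) = OfsInc);
end.
```

Under these assumptions both comparisons hold: the forward jump lands on the cmp instruction and the backward jump lands on inc edx.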
If we want to translate the function into a pure BASM function, we have to find out where
the two jumps go. This can be done by tracing through the code with the CPU view open. We
also saw earlier that the first jump bypassed the one-byte instruction inc edx, so we need a
label right after that instruction. My imagination was asleep that day, so I simply named it
L1 for Label 1 ;-) We can also use our understanding of the code to see that the last jump
goes to the start of the loop, and that the start of the loop is just before the single loop
body instruction inc edx.
The function then looks like this:
function StrLen3(const Str: PChar) : Cardinal;
asm
  //Result := 0;
  xor edx,edx
  jmp @L1
  //while (Str[Result] <> #0) do
  //Inc(Result);
@LoopStart :
  inc edx
@L1 :
  cmp byte ptr [eax+edx],$00
  jnz @LoopStart
  mov eax,edx
  //ret
end;
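The label structure of StrLen3 can be cross-checked without any assembler by writing the same control flow in Pascal with goto. This only illustrates how the compiler arranged the loop and is not meant for real code. The function name StrLenGoto is hypothetical, and PAnsiChar is assumed so that one character is one byte.

```pascal
program LoopShape;
{$IFDEF FPC}{$MODE DELPHI}{$GOTO ON}{$ENDIF}
{$APPTYPE CONSOLE}

// Mirrors the jumps of StrLen3: the test sits below the loop body,
// and the first pass jumps straight to the test.
function StrLenGoto(const Str: PAnsiChar): Cardinal;
label
  LoopStart, L1;
begin
  Result := 0;                  // xor edx,edx
  goto L1;                      // jmp @L1
LoopStart:
  Inc(Result);                  // inc edx
L1:
  if Str[Result] <> #0 then     // cmp byte ptr [eax+edx],$00
    goto LoopStart;             // jnz @LoopStart
end;                            // Result is handed back in eax

begin
  WriteLn(StrLenGoto('BASM'));  // 4 characters before the terminator
end.
```

Placing the test below the body means each iteration executes only one jump, the conditional one at the bottom.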
We can make it look a little nicer by writing the zero the simple way and by removing the
commented-out ret instruction.
function StrLen4(const Str: PChar) : Cardinal;
asm
  //Result := 0;
  xor edx,edx
  jmp @L1
  //while (Str[Result] <> #0) do
  //Inc(Result);
@LoopStart :
  inc edx
@L1 :
  cmp byte ptr [eax+edx], 0
  jnz @LoopStart
  mov eax,edx
end;
That the compare instruction works on one byte at a time can be verified by replacing this line
cmp byte ptr [eax+edx], 0
with these two lines
mov cl, byte ptr [eax+edx]
cmp cl, 0
Step through the function with the CPU view open and watch how the lowest byte of the ecx
register holds the ASCII value of the character under inspection.
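What the debugger shows can be described with plain integer arithmetic: writing to cl replaces only the lowest 8 bits of ecx and leaves the upper 24 bits alone. The following sketch models that. The function name LoadLowByte and the starting value are made up for the example.

```pascal
program LowByte;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Models "mov cl, <byte>": replace the lowest byte of a 32-bit value.
function LoadLowByte(Ecx: Cardinal; Value: Byte): Cardinal;
begin
  Result := (Ecx and $FFFFFF00) or Value;
end;

var
  Ecx: Cardinal;
begin
  Ecx := $12345678;                    // arbitrary previous register contents
  Ecx := LoadLowByte(Ecx, Ord('A'));   // 'A' has the ASCII value $41
  WriteLn(Ecx = $12345641);            // the upper three bytes are unchanged
  WriteLn(Ecx and $FF);                // the part visible as cl
end.
```

This is why the CPU view may show leftover values in the upper part of ecx while only the lowest byte changes from character to character.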
The line
cmp cl, 0
can be coded as
test cl, cl
This is the simplest form of peephole optimization: replacing one instruction with another
that performs the same logic.
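That the two instructions agree on the zero flag can be checked exhaustively for a byte. cmp cl, 0 sets the zero flag when cl - 0 is zero, and test cl, cl sets it when cl and cl is zero. The sketch below models only the zero flag and ignores the other flags. The function names are hypothetical.

```pascal
program CmpVersusTest;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Zero flag after "cmp cl, 0": set when the subtraction result is zero.
function ZeroFlagAfterCmp(Cl: Byte): Boolean;
begin
  Result := (Cl - 0) = 0;
end;

// Zero flag after "test cl, cl": set when the bitwise AND is zero.
function ZeroFlagAfterTest(Cl: Byte): Boolean;
begin
  Result := (Cl and Cl) = 0;
end;

var
  Value: Integer;
  Same: Boolean;
begin
  Same := True;
  for Value := 0 to 255 do
    if ZeroFlagAfterCmp(Byte(Value)) <> ZeroFlagAfterTest(Byte(Value)) then
      Same := False;
  WriteLn(Same);   // the two forms agree for every byte value
end.
```

Since jnz looks only at the zero flag, either form can drive the loop.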
Another example is this
xor edx,edx
"optimized" into
mov edx, 0
The preferred way of zeroing a register on the P4 is the first one, xor edx,edx, as described
on page 103 of the Intel Pentium 4 and Intel Xeon Processor Optimization Reference Manual.
This is also true for other processors.
What new instructions did we learn? Xor, jmp, inc, cmp, test and jnz. We also learned how to
implement a loop and how to work with one byte of data at a time. The peephole optimization
technique was also introduced.