FLD pushes the source operand onto the FPU stack. If the source is a register, the register number used is that before the stack-top pointer is decremented. In particular, coding FLD ST(0) duplicates the stack top.