Addressing Addressing Modes
How does the CPU know the difference between a register and a memory address?
These are different *addressing modes* (that's what you should gogulate upon). In sane instruction sets, these modes map to bits in the instruction. As someone already said, it's the assembler's job to figure out what you mean, pick the right instruction, set its addressing mode bits, and transcribe your operands into the instruction stream.
Let's say any instruction starting with 0001 means "move." The next nybble might specify the addressing mode. In a register-to-register mode, the next two nybbles might be the registers. Boom, it's all there in one 16-bit instruction, with no need to fetch operands.
Register-to-memory modes will fetch an extra operand word from the instruction stream to get the address. And so on.
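Here's a rough Python sketch of that made-up encoding, in case it helps to see the bits. Everything in it is an assumption for illustration: the field layout, the mode numbers, and the names `encode` and `decode` belong to no real ISA. The only thing taken from above is that 0001 means "move."

```python
# Toy 16-bit encoding, invented for illustration (not any real ISA):
#   bits 15-12: opcode (0b0001 = "move")
#   bits 11-8 : addressing mode
#   bits  7-4 : destination register
#   bits  3-0 : source register (unused in register-to-memory mode)
OP_MOVE = 0b0001
MODE_REG_REG = 0x0  # both operands are registers; nothing more to fetch
MODE_REG_MEM = 0x1  # the address is the next 16-bit word in the stream


def encode(opcode, mode, dst, src):
    """Pack four nybbles into one 16-bit instruction word."""
    return (opcode << 12) | (mode << 8) | (dst << 4) | src


def decode(words, pc):
    """Decode the instruction at words[pc]; return (text, next_pc)."""
    word = words[pc]
    opcode = (word >> 12) & 0xF
    mode = (word >> 8) & 0xF
    dst = (word >> 4) & 0xF
    src = word & 0xF
    if opcode != OP_MOVE:
        raise ValueError(f"unknown opcode {opcode:#x}")
    if mode == MODE_REG_REG:
        # Everything needed is already in this one word.
        return f"move r{dst}, r{src}", pc + 1
    if mode == MODE_REG_MEM:
        # The mode bits say one more word has to be fetched for the address.
        address = words[pc + 1]
        return f"move r{dst}, [{address:#06x}]", pc + 2
    raise ValueError(f"unknown addressing mode {mode:#x}")


program = [
    encode(OP_MOVE, MODE_REG_REG, 2, 3),
    encode(OP_MOVE, MODE_REG_MEM, 1, 0),
    0x1234,  # operand word consumed by the second instruction
]
```

Calling `decode(program, 0)` gives the register-to-register move and advances by one word. `decode(program, 1)` advances by two, because the mode bits told the decoder to pull the address out of the instruction stream.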
This got quite ridiculous after a while, with some architectures having 18 modes, so you'd have stuff like a base address register to which was added the result of multiplying another register by an operand, plus another offset operand. All in one instruction.
It turned out that using the complicated modes was often *slower* than explicitly calculating the address with individual discrete instructions. Thus the RISC revolution. ISAs went from 25 addressing modes to six, or four. (Isn't that a song?)
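To make the trade concrete, here's a small Python sketch of the same address computed both ways: once as a single "base + index * scale + offset" mode, and once as separate simple steps. The register names, the scratch register `t0`, and both function names are hypothetical. It shows that the two give the same address and says nothing about which is faster on any real chip.

```python
def effective_address(regs, base, index, scale, offset):
    """One-shot complex mode: base register + index register * scale + offset."""
    return regs[base] + regs[index] * scale + offset


def explicit_steps(regs, base, index, scale, offset, tmp="t0"):
    """The same address built from discrete simple operations."""
    regs = dict(regs)                    # work on a copy of the register file
    regs[tmp] = regs[index] * scale      # mul  t0, index, scale
    regs[tmp] = regs[tmp] + regs[base]   # add  t0, t0, base
    return regs[tmp] + offset            # access via plain register+offset mode


regs = {"r1": 0x1000, "r2": 5}
one_shot = effective_address(regs, "r1", "r2", 4, 8)
stepwise = explicit_steps(regs, "r1", "r2", 4, 8)
```

Both come out to 0x101C here. The second version costs more instructions and a scratch register, which is one reason "fewer modes" tends to go with "more registers."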
Now, I mentioned "sane" above. x86 is insane. I am long past caring exactly how it does what it does. It is not "orthogonal" and some registers have special purposes and they're intimately related to addressing modes - segment registers, a sort of dedicated base address register. (This actually does get kind of cool with x86's execution level "rings" but was just a pain before the 80386.)
But these days - and if I'm wrong somebody correct me - on the inside x86 chips crack their complicated instructions into streams of simpler micro-ops, which are executed by an internal engine that looks a lot more like RISC than like the x86 the programmer sees. (I had it filed in my head as a kind of VLIW, Very Long Instruction Word, but as far as I know that better describes Transmeta's approach.) The crazy addressing modes are thus broken down into simple address-calculation and memory ops. The details differ between the Athlon and Intel's chips.
Whatever. The trend is toward fewer addressing modes and doing address calculation explicitly in honest code. Fewer modes but more registers.