How to create your own virtual machine! Part I

Presented by: Alan L. Bryan, a.k.a. Icemanind

Questions? Comments? Email me at icemanind@yahoo.com. Please leave feedback if you enjoyed this tutorial. The more feedback I get, the more it’ll make me want to write Part II.

Introduction

Welcome to my tutorial on virtual machines. This tutorial will introduce you to the concept of a virtual machine and then we will, step by step, create our own simple virtual machine in C#. Keep in mind that a virtual machine is a very complicated thing, and even the simplest virtual machine can take years for a team of programmers to create. With that said, don’t expect to be able to create your own language or virtual machine that will take over .NET or Java overnight.

In this tutorial, we will first lay out the plan for our virtual machine. Then we will create a very simple intermediate language. An intermediate language is the lowest level language still readable by humans. It is comparable to assembly language, which is the lowest level human-readable language on most computers. The first program we will create will be a very simple intermediate compiler that will convert our intermediate language to bytecode. Bytecode is a set of binary instructions that our virtual machine will be able to directly execute. It is comparable to machine language, which is the set of binary instructions that a CPU understands directly.

The virtual machine itself will be our second project. It will be created from scratch in C# and will execute our bytecode. It will be very simple at first, but then we will expand it by adding threading support and dual screen outputs (you’ll find out what I’m talking about later).

All of the code in this tutorial is created using Visual Studio 2008 Professional, targeting the .NET Framework 2.0. Since I’m targeting the 2.0 framework, you should be able to use Visual Studio 2005 as well.
Since creating a virtual machine really does dive down into the nuts and bolts of how computers work, I am assuming the reader has at least a basic knowledge of programming, the hexadecimal and binary number systems, and threading. It would also really help to know something about assembly language, although I will try to help you understand things on a need-to-know basis. If I haven’t scared you off and you’re still interested in how to make a virtual machine, then let’s begin!

How to create your own virtual machine in a step-by-step tutorial 2009 Brought to you by icemanind

Planning it out

As described in the introduction, the first thing we will want to do is draw out a rough blueprint of what our machine will be able to do. I have decided to call our machine B32 (Binary 32), although, for simplicity’s sake, it will not be a 32-bit machine. It will be a 16-bit machine. B32 will have 64K of memory, addressable anywhere from $0000 to $FFFF. A B32 executable program can access any part of that memory.

Along with a 64K memory space, we will introduce 5 registers into our virtual machine. All CPUs and all virtual machines have what are called registers. A register is similar to a variable. Registers hold numbers, and the size of a register determines how large a number it can hold. Unlike variables, however, registers do not take up memory space. Registers are “built into” CPUs. This will make more sense once you see an example, which is coming up real soon.

To keep things simple, we will only implement 5 registers in our virtual machine. These registers will be called A, B, D, X and Y. The A and B registers are only 8 bits in length, which means each register can hold any number between 0 and 255 unsigned or between -128 and 127 signed. For now, we are going to worry only about unsigned integers. We will get into signed numbers later, and we will also briefly touch on floating point numbers.
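To make the 8-bit ranges above a little more concrete, here is a small C# sketch. It is only an illustration: the class name RegisterRangeDemo and the helper AsSigned are names I made up for this example, and the only thing it relies on is that C#'s byte type is 8 bits unsigned and sbyte is 8 bits signed.

```csharp
using System;

// Illustrative only: RegisterRangeDemo and AsSigned are hypothetical names,
// not part of the B32 virtual machine.
public static class RegisterRangeDemo
{
    // Reinterprets the same 8 bits as a signed (two's complement) number.
    public static sbyte AsSigned(byte value)
    {
        return unchecked((sbyte)value);
    }

    public static void Main()
    {
        // An 8-bit register read as unsigned: 0..255
        Console.WriteLine("{0}..{1}", byte.MinValue, byte.MaxValue);

        // The same 8 bits read as signed: -128..127
        Console.WriteLine("{0}..{1}", sbyte.MinValue, sbyte.MaxValue);

        // One bit pattern, two readings: $FF is 255 unsigned and -1 signed.
        byte raw = 0xFF;
        Console.WriteLine("{0} {1}", raw, AsSigned(raw));
    }
}
```

The point is that a register just holds bits. Whether $FF means 255 or -1 depends on how the program chooses to read it, which is why we can safely stick to unsigned values for now.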
The X, Y and D registers will be 16 bits in length, capable of storing any number between 0 and 65,535 unsigned or between -32,768 and 32,767 signed. The D register will be something of a unique register. It will hold the concatenated values of the A and B registers. In other words, if register A has $3C and register B has $10, then register D will contain $3C10. Anytime a value in the A or B register is changed, the value in the D register changes as well. The same is true in the other direction: if the value in the D register is changed, the A and B registers will change accordingly. You will see later why this is handy to have. This has been a lot of dry talk, so here is a picture to represent our B32 registers:

B32 Registers

  A: 8 bits  \
              }  D: 16 bits (A and B combined)
  B: 8 bits  /
  X: 16 bits
  Y: 16 bits

Hopefully this makes sense to you. If not, you will catch on as we progress through the tutorial.

Earlier, when I told you that our virtual machine had 64K of free memory for an executable to use, that was not entirely true. Really it’s only about 60K, because 4000 bytes must be reserved for screen output. I’ve chosen to use $A000 - $AFA0 (4000 is $0FA0 in hexadecimal, so the region ends just before $AFA0). This area of memory will map to our screen. In most CPUs and most virtual machines, this memory is mapped inside the video card memory. For simplicity, however, I am going to share our 64K of memory with our video output. This memory will give us an 80x25 screen (80 columns, 25 rows).

You may be thinking right now, “I think your math is off, dude. 80 times 25 is only 2000.” This is true; however, the extra 2000 bytes will be for attributes, one attribute byte per character. Those of us old enough to remember programming in assembly language back in the old DOS days will already be familiar with the attribute byte. An attribute byte defines the foreground and background color of our text. How it works is that the lowest 3 bits of the byte (bits 0-2) make up the RGB, or Red, Green, Blue, values of our foreground color. The 4th bit (bit 3) is an intensity flag. If this bit is 1, then the color is brighter. The next 3 bits (bits 4-6) make up the RGB values of our background color. The highest bit (bit 7) is not used (back in the DOS days, this bit was used to make text blink, but in B32 it is ignored). You will see later how colors are created using this method.

The final part of this section will define the mnemonics and the bytecode that make up a B32 executable. Mnemonics are the building blocks of our assembly language code, which will be assembled to bytecode. For now, I am only going to introduce enough for us to get started, and we will expand on our list throughout this tutorial.

The first mnemonic we will introduce is called “LDA”. “LDA” is short for “Load A Register” and what it will do is assign a value to the A register. Now, in most CPUs and virtual machines, you have what are called addressing modes. Addressing modes determine how a register gets its value. For example, is the value specified directly in the operand (an operand is the data that follows the mnemonic), is it pulled from somewhere in memory, or is it loaded from a value assigned to another register? There can be dozens of addressing modes, depending on how complex a virtual machine you want to create. For now, our virtual machine will only pull data directly specified in the operand. We will assign this mnemonic a bytecode value of $01. Since we decided earlier that the A register can only hold an 8-bit value, we know that the entire length of an “LDA” instruction that pulls direct data from the operand will be 2 bytes (1 byte for the mnemonic and 1 byte for the data).

The next mnemonic we will discuss will be called “LDX”. “LDX” is short for “Load X Register” and, just like “LDA”, it will load a value directly from the operand, this time into the X register. One difference between “LDX” and “LDA” is the length.
Since our X register can hold 16 bits of data, the total length of the bytecode will be 3 bytes instead of 2 (1 byte for the mnemonic and 2 bytes for the data). We will assign this mnemonic a bytecode of $02. If I lost you guys, keep reading and I promise this will make sense when we look at some examples.

The next mnemonic we will discuss will be called “STA”. “STA” is short for “Store A Register” and its function will be to store the value contained in the A register into a location somewhere in our 64K memory. Unlike our load mnemonics, which pull their value directly from the operand, our store mnemonic will get the memory location from the value held in one of the 16-bit registers. We will assign this mnemonic a bytecode of $03.

The final mnemonic we will discuss is called “END”. “END” will do exactly that: it will terminate the application. All B32 programs must have an END mnemonic as the last line of the program. The operand for the END mnemonic will be a label that points to where execution of our B32 program will begin.
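To tie this section together, here is a rough C# sketch of the A, B and D register pairing and of how the two load instructions could look as bytes. Treat it as an illustration under assumptions rather than the actual B32 code: the names B32Sketch, EncodeLda, EncodeLdx and Run are made up for this example, and I am assuming that the 16-bit operand of LDX is stored high byte first, which has not been decided in the text above. STA and END are left out because their operand encoding has not been defined yet.

```csharp
using System;

// Hypothetical sketch: all names here are illustrative, not the tutorial's code.
public class B32Sketch
{
    // Bytecode values assigned in the text above.
    public const byte OpLda = 0x01; // LDA: 1 opcode byte + 1 data byte
    public const byte OpLdx = 0x02; // LDX: 1 opcode byte + 2 data bytes

    public byte A;
    public byte B;
    public ushort X;
    public ushort Y;

    // D is the concatenation of A (high byte) and B (low byte),
    // so changing A or B changes D, and setting D changes A and B.
    public ushort D
    {
        get { return (ushort)((A << 8) | B); }
        set { A = (byte)(value >> 8); B = (byte)(value & 0xFF); }
    }

    public static byte[] EncodeLda(byte value)
    {
        return new byte[] { OpLda, value };
    }

    // Assumption: the 16-bit operand is written high byte first.
    public static byte[] EncodeLdx(ushort value)
    {
        return new byte[] { OpLdx, (byte)(value >> 8), (byte)(value & 0xFF) };
    }

    // Executes LDA and LDX instructions until the bytes run out.
    public void Run(byte[] code)
    {
        int pc = 0;
        while (pc < code.Length)
        {
            byte opcode = code[pc];
            if (opcode == OpLda)
            {
                A = code[pc + 1];
                pc += 2;
            }
            else if (opcode == OpLdx)
            {
                X = (ushort)((code[pc + 1] << 8) | code[pc + 2]);
                pc += 3;
            }
            else
            {
                throw new InvalidOperationException("Unknown opcode");
            }
        }
    }

    public static void Main()
    {
        B32Sketch vm = new B32Sketch();
        // LDA with $3C followed by LDX with $A000, as raw bytes.
        vm.Run(new byte[] { 0x01, 0x3C, 0x02, 0xA0, 0x00 });
        // Prints: A=$3C X=$A000 D=$3C00
        Console.WriteLine("A=${0:X2} X=${1:X4} D=${2:X4}", vm.A, vm.X, vm.D);
    }
}
```

Notice that D is not stored anywhere on its own in this sketch. It is computed from A and B every time it is read, which is one simple way to keep the three registers in sync. Setting B to $10 after the run above would make D read $3C10, matching the example given earlier.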