Design of a 16bit Synchronous Processor
Author : K V Sarma Jonnavithula
When my colleague Mr. Balamurugan Selvaraj asked me to write something as a guest writer for this website, I was pleasantly surprised. I didn't expect it, given that I have been in the industry for less than a year. So, before we start, I would like to thank Mr. Balamurugan Selvaraj for being so kind and encouraging. This is my first post on a website. I hope it will be useful to you, the all-important reader.
My first thought was to write about something very generic, but then I realized that I might not be able to do it justice because I am not very knowledgeable in that regard myself. So, after a lot of thought, I decided to write about my own experience: a project I did as part of one of my postgraduate courses at BITS, Pilani. The project was "Design of 16 bit Synchronous Processor".
The project was done under Dr. Anu Gupta, the instructor for the course Advanced VLSI Design, and it was part of the course structure. Initially, there were teams of two. My project mate Nidhi Choudary and I were to do the Control Path of a 16-bit asynchronous processor, while two of my other batch-mates, Sandeep and Ashutosh Tiwari, were to design the Data Path of a 16-bit asynchronous processor. We requested Anu ji to club the two parts into a single project to be done by the four of us. We renamed the project "Design of a 16 bit Asynchronous Processor".
We started with a simple literature survey. We found that 16-bit asynchronous processors do exist, but only in the form of ASICs. As part of our project, we had to demonstrate our work on an Altera Cyclone II FPGA. A quick search showed that this FPGA can't be used to implement such asynchronous designs. We reported this to Anu ji and explained our difficulty. She agreed to our request to rename the project "Design of 16 bit Synchronous Processor". Well, that's a bit of the history of the project.
Our first task was to define the objectives of the project, and they were clear: design a 16-bit synchronous processor core (just the control path and the data path), implement the circuit on an FPGA, and execute a sample program. Although we wanted to make a board and run a set of programs on it, we decided against it because of the time involved.
As part of another course, "VLSI Architectures", we had learnt how to design processors. We used Nick Tredennick's book "Microprocessor Logic Design: The Flowchart Method" during that course, and we used the techniques described in it for this project. Since the processor itself was not very complex, it made sense to use this simple technique, though it might not be used anywhere now. For simplicity, we call the processor "SynchPro".
Fig 1 : Block Diagram of SynchPro
1) Instruction Decoder
2) Control
3) Data Path
4) Counter
5) IRF, IRE latches
The Instruction Decoder (ID) decodes the instruction and selects the state that has to be executed next. It also loads count_val into the counter so that the counter can assert the count_done signal after counting the required number of cycles. This is mainly useful for distinguishing normal states from states with external bus activity: if there is bus activity, count_val is 0010 (2 cycles); otherwise it is 0001 (1 cycle). The ID also extracts rx, ry and rz from the instruction for use by the control unit.
The Control unit generates the 36-bit control word. It determines the bus controls and register controls for the state under execution, and it also generates the register enables. These controls depend on the state under execution because the register numbers depend on the instruction; this is where the rx, ry and rz extracted by the instruction decoder are used.
The Counter, as described above, is what maintains order in the house. It decides when to stop the current state and when to start the next one, and is thus crucial. Its output is used by the datapath as well as the instruction decoder, for the purposes already explained.
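The interplay between count_val and count_done can be sketched in a few lines of Python. This is only a behavioural model, not the project's VHDL: the class name `StateCounter`, the helper `cycles_for_state` and the choice to count down to zero are assumptions made for illustration.

```python
class StateCounter:
    """Behavioural sketch of the state counter (names are hypothetical)."""

    def __init__(self):
        self.remaining = 0

    def load(self, bus_activity: bool) -> None:
        # Per the text: 0010 (2 cycles) for states with external bus
        # activity, 0001 (1 cycle) for normal states.
        self.remaining = 0b0010 if bus_activity else 0b0001

    def tick(self) -> bool:
        """Advance one clock cycle and return the count_done signal."""
        if self.remaining > 0:
            self.remaining -= 1
        return self.remaining == 0


def cycles_for_state(bus_activity: bool) -> int:
    """Number of clock cycles a state occupies before count_done goes high."""
    counter = StateCounter()
    counter.load(bus_activity)
    cycles = 1
    while not counter.tick():
        cycles += 1
    return cycles
```

With this model, a normal state lasts one cycle and a state with external bus activity lasts two, which is all the instruction decoder and datapath need to know from the counter.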
The Datapath is the core execution unit. Its operation is simple: it is the unit that holds all the registers (the general-purpose registers R0 to R7 as well as PC, Do, Di and Ao) and the operation units (ALU, Multiplier, Shifter).
Data Path
Fig 2 : Data Path of SynchPro
The different resources of the processor are:
Address out register (Ao) of 16-bit
Data out register (Do) of 16-bit
Data in register (Di) of 16-bit
Program Counter register (PC) of 16-bit
8 General Purpose registers (R0-R7) of 16-bit each
Two ALU output registers (A & B) of 16-bit each. Two registers are used because of the multiplication unit, since the product of two 16-bit numbers can be up to 32 bits wide
Arithmetic Control Unit to perform all arithmetic and logical operations
Shifter to perform all shifting operations
Multiplier to perform multiplication of two 16-bit numbers
Two buses (a and b) connect all these elements together
Control Path
Fig 3 shows the Control Path of SynchPro. IRF and IRE are shown only for convenience.
Fig 3 : Control Path of SynchPro
The Instruction Decoder sends the address of the control word sequence to the control store.
The Control Store contains the control word sequences for all the instructions.
The State Sequencer steps the Control Store through each control word in the sequence for the instruction.
The Control Word Decoder transforms each of the Control Words into specific control signals for the execution unit, register select unit, bus select unit.
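The way the decoder, control store and state sequencer cooperate can be sketched as follows. Everything concrete here is hypothetical: the table contents, the `START_ADDRESS` mapping and the "last state" marker are placeholders chosen for illustration, and only the 36-bit control word width comes from the design.

```python
from typing import Iterator, List, Tuple

CONTROL_WORD_BITS = 36

# (control_word, is_last_state) pairs. The values are placeholders,
# not the real SynchPro control words.
CONTROL_STORE: List[Tuple[int, bool]] = [
    (0x000000001, True),   # address 0: a one-state instruction
    (0x000000010, False),  # address 1: first state of a two-state instruction
    (0x000000020, True),   # address 2: second (final) state
]

# Hypothetical table held by the instruction decoder:
# opcode -> address of the first control word of that instruction.
START_ADDRESS = {0: 0, 1: 1}


def control_words(opcode: int) -> Iterator[int]:
    """Step the control store through one instruction's control words."""
    addr = START_ADDRESS[opcode]
    while True:
        word, is_last = CONTROL_STORE[addr]
        yield word & ((1 << CONTROL_WORD_BITS) - 1)
        if is_last:
            return
        addr += 1  # the state sequencer moves on to the next state
```

For example, `list(control_words(1))` walks through both states of the two-state placeholder instruction, which mirrors how instructions with different numbers of states share one control store.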
Design of Control Path
The Control Path was designed using hardware flow chart techniques. The control word was generated for all the special-purpose registers (PC, Do, Di, Ao) using optimized flow charts. More on that later.
Instruction Set Summary
Instruction Format
Each instruction is 16 bits long. The five most significant bits hold the opcode, the next three the destination register and the next three the source register. The remaining five least significant bits are reserved for future use by more complex instructions.
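As a rough sketch of this layout, the fields can be packed and unpacked with shifts and masks. The function names and the `Fields` type are made up for this example, and the opcode value used below is arbitrary because the actual opcode assignments are not listed here.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    opcode: int    # bits 15..11 (5 bits)
    rx: int        # bits 10..8, destination register
    ry: int        # bits 7..5, source register
    reserved: int  # bits 4..0, kept for future use


def decode(word: int) -> Fields:
    """Split a 16-bit instruction word into its fields."""
    word &= 0xFFFF
    return Fields(
        opcode=(word >> 11) & 0x1F,
        rx=(word >> 8) & 0x7,
        ry=(word >> 5) & 0x7,
        reserved=word & 0x1F,
    )


def encode(opcode: int, rx: int, ry: int, reserved: int = 0) -> int:
    """Pack the fields back into a 16-bit instruction word."""
    return ((opcode & 0x1F) << 11) | ((rx & 0x7) << 8) | ((ry & 0x7) << 5) | (reserved & 0x1F)
```

For instance, `encode(3, 2, 5)` gives `0x1AA0`, and `decode(0x1AA0)` recovers opcode 3 with Rx = R2 and Ry = R5.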
Addressing Modes
Two types of addressing modes have been used
Register Direct
Register Indirect
Instruction Set
All the instructions are categorized into 4 parts:
Data Movement Instructions
Arithmetic and Logical Instructions
Shift Instructions
Jump Instructions
The instructions in each of these categories are described below.
Data Movement Instructions
1) MOV Rx, Ry
Move the contents of Ry to Rx
Flags are not affected.
2) MOV @Rx, Ry
Move the contents of Ry to the location denoted by the contents of Rx
Flags are not affected.
3) MOV Rx, @Ry
Move the contents of the location denoted by Ry to Rx
Flags are not affected.
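The difference between the register direct and register indirect forms of MOV can be illustrated with a tiny model. The `Machine` class, the sparse dictionary used for memory and the function names are assumptions for this sketch only; reading an unwritten location returns 0 here purely for convenience.

```python
class Machine:
    """Minimal model: eight 16-bit registers and a sparse memory."""

    def __init__(self):
        self.regs = [0] * 8
        self.mem = {}  # address -> 16-bit word


def mov_reg(m: Machine, rx: int, ry: int) -> None:
    """MOV Rx, Ry (register direct)."""
    m.regs[rx] = m.regs[ry]


def mov_store(m: Machine, rx: int, ry: int) -> None:
    """MOV @Rx, Ry: Rx holds the destination address."""
    m.mem[m.regs[rx]] = m.regs[ry]


def mov_load(m: Machine, rx: int, ry: int) -> None:
    """MOV Rx, @Ry: Ry holds the source address."""
    m.regs[rx] = m.mem.get(m.regs[ry], 0)
```

In the indirect forms the register marked with @ supplies an address rather than a value, which is why those states involve external bus activity.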
Arithmetic & Logic Instructions
1) AND Rx, Ry
Bitwise AND the contents of Rx with the contents of Ry
Flags are not affected.
2) OR Rx, Ry
Bitwise OR the contents of Rx with the contents of Ry
Flags are not affected.
3) XOR Rx, Ry
Bitwise XOR the contents of Rx with the contents of Ry
Flags are not affected.
4) XNOR Rx, Ry
Bitwise XNOR the contents of Rx with the contents of Ry
Flags are not affected.
5) ADD Rx, Ry
Add the contents of Rx to the contents of Ry and save the result in Rx
Flags are affected.
6) ADC Rx, Ry
Add the contents of Rx to the contents of Ry along with the previous carry and save the result in Rx
Flags are affected.
7) DEC Rx
Decrement the contents of Rx by one and save the result in Rx
Flags are affected.
8) INC Rx
Increment the contents of Rx by one and save the result in Rx
Flags are affected.
9) CMP Rx
Complement the contents of Rx and save the result in Rx
Flags are affected.
10) TCMP Rx
Take the two's complement of Rx and save the result in Rx
Flags are affected.
11) SUB Rx, Ry
Subtract the contents of Ry from the contents of Rx and save the result in Rx
Flags are affected.
12) MUL @Rx
Multiply the contents of R0 and R1 and save the result to the location denoted by the contents of Rx
Flags are not affected.
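To make the 16-bit arithmetic concrete, here is a small sketch of ADD, ADC and TCMP. It models only a carry flag and a zero flag, which are the two flags mentioned in this article; the complete flag set and the function names are assumptions.

```python
MASK = 0xFFFF  # 16-bit data width


def add16(a: int, b: int, carry_in: int = 0):
    """ADD (carry_in = 0) or ADC (carry_in = previous carry)."""
    total = (a & MASK) + (b & MASK) + (carry_in & 1)
    result = total & MASK
    flags = {"carry": total > MASK, "zero": result == 0}
    return result, flags


def twos_complement16(a: int) -> int:
    """TCMP: two's complement within 16 bits."""
    return (-a) & MASK


def add32(a: int, b: int) -> int:
    """Add two 32-bit values as an ADD on the low halves, then ADC on the high halves."""
    lo, flags = add16(a & MASK, b & MASK)
    hi, _ = add16(a >> 16, b >> 16, int(flags["carry"]))
    return (hi << 16) | lo
```

The `add32` helper shows why ADC is useful: the carry out of the low 16-bit addition feeds the high 16-bit addition, so wider numbers can be handled in pieces.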
Shifting Instructions
1) SHL Rx, Ry
Shift the contents of Rx to the left by the value denoted by the lower 4 bits of Ry
Flags are not affected.
2) SHR Rx, Ry
Shift the contents of Rx to the right by the value denoted by the lower 4 bits of Ry
Flags are not affected.
3) ROL Rx, Ry
Rotate the contents of Rx to the left by the value denoted by the lower 4 bits of Ry
Flags are affected.
4) ROR Rx, Ry
Rotate the contents of Rx to the right by the value denoted by the lower 4 bits of Ry
Flags are not affected.
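A possible software model of the shifter is shown below. The shift amount comes from the lower 4 bits of Ry, as described above; treating SHR as a logical shift (zeros shifted in) is an assumption, and the function names are only for this sketch.

```python
MASK = 0xFFFF  # 16-bit data width


def amount(ry: int) -> int:
    """Only the lower 4 bits of Ry select the shift or rotate amount (0..15)."""
    return ry & 0xF


def shl(rx: int, ry: int) -> int:
    return (rx << amount(ry)) & MASK


def shr(rx: int, ry: int) -> int:
    # Logical shift right assumed: zeros enter from the left.
    return (rx & MASK) >> amount(ry)


def rol(rx: int, ry: int) -> int:
    n = amount(ry)
    rx &= MASK
    return ((rx << n) | (rx >> (16 - n))) & MASK


def ror(rx: int, ry: int) -> int:
    n = amount(ry)
    rx &= MASK
    return ((rx >> n) | (rx << (16 - n))) & MASK
```

Note that a value such as 0x0011 in Ry shifts by one position only, because the upper 12 bits of Ry are ignored.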
Jump Instructions
There are two types of branch instructions: conditional and unconditional. SynchPro supports both types.
1) JMP @Rx (This is Unconditional Jump)
Jump unconditionally to the address given by the contents of Rx
Flags are not affected and not used in this branching.
2) CJNE Rx, Ry, @Rz
Compare Rx and Ry and, if they are not equal, jump to the address given by the contents of Rz
Flags are affected by this instruction and the zero flag is used in its execution.
3) DJNZ Rx, @Ry
Decrement Rx by one and, if the result is not equal to zero, jump to the address given by the contents of Ry
Flags are affected by this instruction and the zero flag is used in its execution.
In addition to these, SynchPro provides a no-operation instruction, which is useful for creating delays in a program.
NOP
No operation
Flags are not affected. This instruction simply increments PC and does not affect any other register.
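The next-PC behaviour of the three jump instructions can be sketched like this. The sketch assumes that an instruction which does not jump moves PC forward by one location; the function names and the delay-loop example are illustrative only.

```python
MASK = 0xFFFF  # 16-bit addresses


def jmp(regs, rx):
    """JMP @Rx: the next PC is the contents of Rx."""
    return regs[rx] & MASK


def cjne(pc, regs, rx, ry, rz):
    """CJNE Rx, Ry, @Rz: jump to the address in Rz when Rx != Ry."""
    if regs[rx] != regs[ry]:
        return regs[rz] & MASK
    return (pc + 1) & MASK  # assumed fall-through to the next location


def djnz(pc, regs, rx, ry):
    """DJNZ Rx, @Ry: decrement Rx, jump to the address in Ry while Rx != 0."""
    regs[rx] = (regs[rx] - 1) & MASK
    if regs[rx] != 0:
        return regs[ry] & MASK
    return (pc + 1) & MASK


def run_delay_loop(count):
    """Count how often a DJNZ placed at address 0x0010 executes."""
    regs = [0] * 8
    regs[0] = count   # loop counter in R0
    regs[1] = 0x0010  # R1 points back at the DJNZ itself
    pc, executed = 0x0010, 0
    while pc == 0x0010:
        pc = djnz(pc, regs, 0, 1)
        executed += 1
    return executed, pc
```

A DJNZ that jumps back to itself is the simplest delay loop: with R0 = 3 it executes three times and then falls through to the next location.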
With these specifications, we went on with the procedure specified in Nick Tredennick's book. Our next step was to make the hardware flowcharts. Making hardware flowcharts is a very old approach, and faster methods may well be in use today. However, as students we found an inherent sense of satisfaction in going through the process of making the flow charts the way we had learnt it. You can read the book for more on how to do it. I will summarize it as follows:
- Make a list of processor instructions.
- Divide each of them into two parts: the operation jobs and the housekeeping jobs. An instruction completes its execution cycle only when it has done both the operation part and the housekeeping part.
- List the necessary resources for each of the parts in two different charts.
- Merge the two charts for each instruction so that there is no conflict in using the resources.
- Now that we have the list of resources required for each instruction, arrive at the control word for each state of the instruction. Note that different instructions may have different numbers of states, because the processor is of the CISC type rather than the RISC type.
Rules of operation of SynchPro:
- rst has to be low at power-on and high during the operation of the processor. During operation, if rst is pulled low and then driven high again, the processor restarts execution from location 0000h, which holds the first instruction in the code segment. The processor powers on into "nop", i.e., once we start the processor, it executes a "nop" instruction.
- There are two external input buses, one for EDBDatain and the other for EDBCodeSeg. EAB and EDBDataOut are outputs, and we assume no buses for them during verification. Instead, when data is to be written out, the Write pin goes high.
- The test board we are implementing for the demo of SynchPro is specified as TestBoard.vhd in the source code. The address allocation is as follows:
0000h to 0FFFh is Code Segment Address Space
1FFFh to 7FFFh is Data Segment Address Space (read only)
8FFFh to FFFFh is Data Segment Address Space (write only) i.e., this goes as EDBDataOut
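The address allocation above can be expressed as a simple lookup. This sketch uses the ranges exactly as listed, so addresses that fall between them are reported as unmapped; the function name and the region labels are made up for the example and are not taken from TestBoard.vhd.

```python
def classify_address(addr: int) -> str:
    """Map a 16-bit address to the region it falls in on the test board."""
    addr &= 0xFFFF
    if addr <= 0x0FFF:
        return "code"        # 0000h to 0FFFh
    if 0x1FFF <= addr <= 0x7FFF:
        return "data_read"   # 1FFFh to 7FFFh, read only
    if addr >= 0x8FFF:
        return "data_write"  # 8FFFh to FFFFh, goes out as EDBDataOut
    return "unmapped"        # not covered by the listed ranges
```

A write to an address in the last range is the case where the Write pin goes high and the data appears on EDBDataOut.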
Synthesis was done on Altera Cyclone II FPGA EP1C6Q240C8. As expected, the multiplier was on the critical path, and the maximum clock frequency was 19.51 MHz. Replacing the Wallace Tree Multiplier with a Booth's Algorithm implementation might result in better performance. Area didn't matter as a parameter because the synthesis target was an FPGA. To summarize, thanks to some very innovative and interesting ideas at different points during the project, we could make SynchPro, which is a processor core that has:
- a rich instruction set
- a multiplier
- a rich arithmetic and logical instruction set
- three jump statements: one unconditional and two conditional