Advanced Digital Architectures VHDL - Semester 7

Academic Year: 2023-2024
Semester: 7
Credits: 2.5 ECTS
Specialization: Digital Design and FPGA


PART A - General Module Presentation

Overview

This course teaches the design of complex digital systems using the VHDL language (VHSIC Hardware Description Language) and their implementation on FPGA. It covers the design of advanced digital architectures: arithmetic and logic unit (ALU), register files, memories, and complete processors.

Learning objectives:

  • Master the VHDL language for hardware description
  • Design synthesizable digital circuits
  • Implement systems on FPGA (Xilinx)
  • Simulate and verify designs with testbenches
  • Understand finite state machines (FSM)
  • Design a complete simple processor

Position in the curriculum

This module builds upon:

  • Digital Electronics Fundamentals (S5): logic gates, flip-flops
  • Sequential Logic (S5): state machines, counters
  • Hardware Computer Architecture (S5): processor structure

It prepares for:

  • Embedded systems: hardware accelerators
  • Signal processing on FPGA: digital filter implementation
  • ASIC design: specialized integrated circuits
  • Real-time systems: high-performance hardwired logic

PART B - Personal Experience and Learning Context

Organization and resources

The module was organized into lectures and practical design studies:

Lectures (16h):

  • VHDL syntax and structure
  • Data types and operators
  • Combinational and sequential logic
  • Finite state machines (FSM)
  • Processor architectures

Design studies (20h):

Progressive design projects with Xilinx Vivado:

  • ALU project: arithmetic and logic unit
  • Register project: register file
  • Buffer project: buffer memory
  • Instruction memory project: ROM for program code
  • Data memory project: RAM for data
  • Processor project: complete integration

Homework assignments:

  • HW1: ALU and testbench
  • HW2: Complex digital circuit

Tools used:

  • Xilinx Vivado Design Suite (synthesis, simulation, implementation)
  • Target FPGA: likely Artix-7 or Zynq
  • VHDL testbenches for simulation
  • Waveform viewer (.wcfg files)

Project structure

Each Vivado project contained:

  • Sources: VHDL files (sources_1/new/)
  • Simulations: testbenches (sim_1/)
  • Constraints: .xdc files for pin mapping (constrs_1/)
  • Waveforms: .wcfg files for signal visualization
  • Synthesis and implementation: runs/synth_1/, runs/impl_1/

Pedagogical progression:

Figure: Architecture of a simple processor in VHDL - ALU, registers and memory

  1. Buffer: Simple circuit (FIFO or shift register)
  2. ALU: Arithmetic and logic operations
  3. Register: Register file with read/write
  4. Instruction memory: ROM for storing the program
  5. Data memory: RAM for storing data
  6. Processor: Final assembly (datapath + control unit)

Working method

Phase 1: Specification:

Define the interface (input/output ports) and the expected behavior of the circuit.

Phase 2: VHDL Code:

Write the code in Vivado, complying with synthesizability rules.

Phase 3: Testbench:

Create a testbench to functionally verify the circuit through simulation.

Phase 4: Simulation:

Run the behavioral simulation, observe the waveforms, correct errors.

Phase 5: Synthesis:

Verify that the code synthesizes without errors, analyze resource utilization (LUT, FF, BRAM).

Phase 6: Implementation (optional):

Place and route the design, verify timing, program the FPGA.

Challenges encountered

Parallel thinking:

VHDL describes hardware where everything executes in parallel, unlike sequential programming. Understanding that processes execute simultaneously requires a paradigm shift.

Strict syntax:

VHDL is verbose and strongly typed. Syntax and type errors are frequent at first.

Timing and synchronization:

Correctly managing clocks and resets, and avoiding hazards (glitches), requires rigor.

Debugging:

Without printf or a debugger, debugging is done through waveforms. Knowing which signals to observe is crucial.


PART C - Detailed Technical Aspects

1. Introduction to VHDL

VHDL = VHSIC Hardware Description Language

VHSIC = Very High Speed Integrated Circuit

Hardware description language:

  • Describes the structure and behavior of digital circuits
  • Used for simulation and synthesis (generation of real circuits)
  • IEEE Standard (IEEE 1076)

Difference from software programming:

Aspect | Software | Hardware (VHDL)
Execution | Sequential | Parallel
Variables | Modified sequentially | Signals that evolve over time
Loops | Iterations | Hardware replication
Resources | Abstract (virtual memory) | Limited physical resources (LUT, FF)

2. Structure of a VHDL file

Entity:

Describes the component interface (input/output ports).

Example:

entity compteur is
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;
    enable : in  std_logic;
    count  : out std_logic_vector(7 downto 0)
  );
end entity compteur;

Architecture:

Describes the behavior or internal structure of the component.

Example:

architecture behavioral of compteur is
  signal count_internal : unsigned(7 downto 0);
begin
  process(clk, reset)
  begin
    if reset = '1' then
      count_internal <= (others => '0');
    elsif rising_edge(clk) then
      if enable = '1' then
        count_internal <= count_internal + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_internal);
end architecture behavioral;

Architecture types:

Type | Description | Usage
Behavioral | Algorithmic description of behavior | High level, automatic synthesis
Dataflow | Concurrent assignments, equations | Combinational logic
Structural | Component instantiation | Hierarchy, block connection

3. VHDL data types

Standard types:

Type | Description | Example
std_logic | Logic bit (9 values) | '0', '1', 'Z', 'X', etc.
std_logic_vector | Bit vector | "10110101"
integer | Integer | at least -2147483647 to 2147483647 (32 bits in most tools)
unsigned | Unsigned integer | used for arithmetic calculations
signed | Signed integer | two's complement

std_logic values:

  • 'U': uninitialized (default value in simulation)
  • 'X': unknown (for example, two drivers in conflict)
  • '0': low level
  • '1': high level
  • 'Z': high impedance (tri-state)
  • 'W', 'L', 'H': weak unknown, weak low, weak high
  • '-': don't care (synthesis)

Required libraries:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;      -- std_logic, std_logic_vector
use IEEE.NUMERIC_STD.ALL;         -- unsigned, signed, conversions
-- use IEEE.STD_LOGIC_UNSIGNED.ALL;  -- legacy, non-standard Synopsys package; avoid mixing it with NUMERIC_STD
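
A minimal sketch of how the NUMERIC_STD types are typically combined with std_logic_vector ports. The entity name conv_demo and its ports are hypothetical and serve only to illustrate the casts and the to_integer conversion.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical entity, used only to illustrate NUMERIC_STD conversions.
entity conv_demo is
  port (
    a_slv   : in  std_logic_vector(7 downto 0);
    sum_slv : out std_logic_vector(7 downto 0);
    a_int   : out integer range 0 to 255
  );
end entity conv_demo;

architecture rtl of conv_demo is
  signal a_u : unsigned(7 downto 0);
begin
  -- Type cast: same bits, now interpreted as an unsigned number
  a_u     <= unsigned(a_slv);
  -- Arithmetic on unsigned (wraps modulo 256), cast back for the port
  sum_slv <= std_logic_vector(a_u + 3);
  -- Conversion function: unsigned -> integer
  a_int   <= to_integer(a_u);
end architecture rtl;
```

Keeping ports as std_logic_vector and doing the arithmetic on an internal unsigned signal is one common convention; it is a design choice, not a requirement of the language.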

4. Combinational logic

Concurrent assignment:

Assignments outside processes execute in parallel.

Examples:

-- AND gate
y <= a and b;

-- Multiplexer
y <= a when sel = '0' else b;

-- 4-to-1 multiplexer
with sel select
  y <= a when "00",
       b when "01",
       c when "10",
       d when others;

Combinational process:

To describe combinational logic within a process:

process(a, b, c)  -- Sensitivity list: all signals read
begin
  if a = '1' then
    y <= b;
  else
    y <= c;
  end if;
end process;

Warning: if a signal that is read is missing from the sensitivity list, the simulator does not re-evaluate the process when that signal changes, whereas synthesis tools generally ignore the list. Simulation and hardware then disagree.
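
One way to avoid an incomplete sensitivity list is the VHDL-2008 keyword all. The sketch below assumes the file is compiled as VHDL-2008; the entity name mux_demo is hypothetical. It also assigns a default value first, which guarantees that y is driven on every path and so avoids inferring a latch.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical entity wrapping the combinational process shown above.
entity mux_demo is
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity mux_demo;

architecture rtl of mux_demo is
begin
  process(all)  -- VHDL-2008: sensitive to every signal read in the process
  begin
    y <= c;          -- default assignment: y is always driven
    if a = '1' then
      y <= b;        -- overrides the default when a = '1'
    end if;
  end process;
end architecture rtl;
```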

5. Sequential logic

D Flip-Flop:

Basic element of sequential logic. Stores a value on a clock edge.

process(clk)
begin
  if rising_edge(clk) then  -- Rising edge
    q <= d;
  end if;
end process;

With asynchronous reset:

process(clk, reset)
begin
  if reset = '1' then       -- Reset has priority
    q <= '0';
  elsif rising_edge(clk) then
    q <= d;
  end if;
end process;

With synchronous reset:

process(clk)
begin
  if rising_edge(clk) then
    if reset = '1' then
      q <= '0';
    else
      q <= d;
    end if;
  end if;
end process;

Golden rule:

  • Asynchronous reset: in the process sensitivity list and tested first
  • Synchronous reset: inside the rising_edge test

6. Finite State Machines (FSM)

Definition:

An FSM (Finite State Machine) is a sequential circuit with a finite number of states. It changes state based on its inputs and its current state.

Types:

  • Moore: outputs depend only on the state
  • Mealy: outputs depend on the state and the inputs

2-process structure:

Process 1: State register (sequential)

process(clk, reset)
begin
  if reset = '1' then
    etat_courant <= IDLE;
  elsif rising_edge(clk) then
    etat_courant <= etat_suivant;
  end if;
end process;

Process 2: Transition logic (combinational)

process(etat_courant, entree, compteur_fini)  -- every signal read, including compteur_fini
begin
  case etat_courant is
    when IDLE =>
      if entree = '1' then
        etat_suivant <= TRAITEMENT;
      else
        etat_suivant <= IDLE;
      end if;
      sortie <= '0';

    when TRAITEMENT =>
      if compteur_fini = '1' then
        etat_suivant <= FIN;
      else
        etat_suivant <= TRAITEMENT;
      end if;
      sortie <= '1';

    when FIN =>
      etat_suivant <= IDLE;
      sortie <= '0';

    when others =>
      etat_suivant <= IDLE;
      sortie <= '0';
  end case;
end process;

State declaration (in the declarative part of the architecture):

type etat_type is (IDLE, TRAITEMENT, FIN);
signal etat_courant, etat_suivant : etat_type;
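
The fragments above can be assembled into one compilable unit. The sketch below is one possible arrangement, not the course's reference solution: the entity name fsm_demo and its port list are assumptions, and the combinational process uses default assignments so that each branch only states what differs from "stay in the current state, output low".

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical wrapper entity; port names follow the fragments above.
entity fsm_demo is
  port (
    clk           : in  std_logic;
    reset         : in  std_logic;
    entree        : in  std_logic;
    compteur_fini : in  std_logic;
    sortie        : out std_logic
  );
end entity fsm_demo;

architecture rtl of fsm_demo is
  type etat_type is (IDLE, TRAITEMENT, FIN);
  signal etat_courant, etat_suivant : etat_type;
begin
  -- Process 1: state register with asynchronous reset
  process(clk, reset)
  begin
    if reset = '1' then
      etat_courant <= IDLE;
    elsif rising_edge(clk) then
      etat_courant <= etat_suivant;
    end if;
  end process;

  -- Process 2: next-state and Moore output logic
  process(etat_courant, entree, compteur_fini)
  begin
    etat_suivant <= etat_courant;  -- default: stay in the current state
    sortie       <= '0';           -- default output
    case etat_courant is
      when IDLE =>
        if entree = '1' then
          etat_suivant <= TRAITEMENT;
        end if;
      when TRAITEMENT =>
        sortie <= '1';
        if compteur_fini = '1' then
          etat_suivant <= FIN;
        end if;
      when FIN =>
        etat_suivant <= IDLE;
    end case;
  end process;
end architecture rtl;
```

Because sortie depends only on etat_courant, this is a Moore machine. The three choices cover the whole enumerated type, so no others branch is needed here.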

7. Arithmetic and Logic Unit (ALU)

Principle:

The ALU performs the arithmetic operations (addition, subtraction) and logic operations (AND, OR, XOR) of a processor.

Typical interface:

entity ALU is
  port (
    A      : in  std_logic_vector(31 downto 0);  -- Operand A
    B      : in  std_logic_vector(31 downto 0);  -- Operand B
    OP     : in  std_logic_vector(3 downto 0);   -- Operation code
    Result : out std_logic_vector(31 downto 0);  -- Result
    Zero   : out std_logic;                      -- Zero flag
    Carry  : out std_logic                       -- Carry flag
  );
end entity ALU;

Common operations:

OP code | Operation | Description
0000 | AND | Bitwise logical AND
0001 | OR | Bitwise logical OR
0010 | ADD | Addition
0110 | SUB | Subtraction (A - B)
0111 | SLT | Set if Less Than (A < B)
1100 | NOR | Logical NOR

Implementation (SLT and NOR from the table are not implemented in this excerpt):

process(A, B, OP)
  variable temp : unsigned(32 downto 0);          -- 33 bits to capture the carry
  variable res  : std_logic_vector(31 downto 0);  -- result, computed before driving the ports
begin
  Carry <= '0';  -- default, overridden by ADD and SUB

  case OP is
    when "0000" =>  -- AND
      res := A and B;

    when "0001" =>  -- OR
      res := A or B;

    when "0010" =>  -- ADD
      temp := ('0' & unsigned(A)) + ('0' & unsigned(B));
      res := std_logic_vector(temp(31 downto 0));
      Carry <= temp(32);

    when "0110" =>  -- SUB
      temp := ('0' & unsigned(A)) - ('0' & unsigned(B));
      res := std_logic_vector(temp(31 downto 0));
      Carry <= temp(32);  -- acts as a borrow: '1' when A < B (unsigned)

    when others =>
      res := (others => '0');
  end case;

  Result <= res;

  -- Zero flag, computed from the variable: the Result port is an output
  -- and would only hold the value of the previous evaluation
  if res = x"00000000" then
    Zero <= '1';
  else
    Zero <= '0';
  end if;
end process;
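
A short self-checking testbench can exercise the ALU described above. This is a sketch under assumptions: the testbench name tb_alu is hypothetical, it expects an architecture of work.ALU containing the process shown above, and the Zero checks assume the flag reflects the result of the current operation.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity tb_alu is
  -- No ports
end entity tb_alu;

architecture test of tb_alu is
  signal A, B, Result : std_logic_vector(31 downto 0) := (others => '0');
  signal OP           : std_logic_vector(3 downto 0)  := "0000";
  signal Zero, Carry  : std_logic;
begin
  UUT: entity work.ALU
    port map (A => A, B => B, OP => OP,
              Result => Result, Zero => Zero, Carry => Carry);

  stim: process
  begin
    -- ADD that overflows 32 bits: result wraps to 0 and the carry is set
    A <= x"FFFFFFFF"; B <= x"00000001"; OP <= "0010";
    wait for 10 ns;
    assert Result = x"00000000" and Carry = '1' and Zero = '1'
      report "ADD overflow case failed" severity error;

    -- SUB without borrow: 5 - 3 = 2
    A <= x"00000005"; B <= x"00000003"; OP <= "0110";
    wait for 10 ns;
    assert Result = x"00000002" and Carry = '0' and Zero = '0'
      report "SUB case failed" severity error;

    -- Bitwise AND
    A <= x"0F0F0F0F"; B <= x"00FF00FF"; OP <= "0000";
    wait for 10 ns;
    assert Result = x"000F000F" and Carry = '0'
      report "AND case failed" severity error;

    report "ALU tests finished" severity note;
    wait;  -- suspend this process forever
  end process;
end architecture test;
```

The ALU is purely combinational, so no clock is needed; the 10 ns waits only give the simulator time to propagate the new inputs before each check.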

8. Register file

Principle:

A set of registers (typically 16 or 32) used to store data temporarily in a processor.

Interface:

entity RegisterFile is
  port (
    clk       : in  std_logic;
    reset     : in  std_logic;
    -- Read
    ReadAddr1 : in  std_logic_vector(4 downto 0);   -- Address of register 1
    ReadAddr2 : in  std_logic_vector(4 downto 0);   -- Address of register 2
    ReadData1 : out std_logic_vector(31 downto 0);  -- Data from register 1
    ReadData2 : out std_logic_vector(31 downto 0);  -- Data from register 2
    -- Write
    WriteEn   : in  std_logic;                      -- Write enable
    WriteAddr : in  std_logic_vector(4 downto 0);   -- Write address
    WriteData : in  std_logic_vector(31 downto 0)   -- Data to write
  );
end entity RegisterFile;

Implementation:

architecture behavioral of RegisterFile is
  type reg_array is array (0 to 31) of std_logic_vector(31 downto 0);
  signal registers : reg_array;
begin
  -- Synchronous write, asynchronous reset
  process(clk, reset)
  begin
    if reset = '1' then
      registers <= (others => (others => '0'));
    elsif rising_edge(clk) then
      if WriteEn = '1' and WriteAddr /= "00000" then  -- R0 always stays at 0
        registers(to_integer(unsigned(WriteAddr))) <= WriteData;
      end if;
    end if;
  end process;

  -- Asynchronous read
  ReadData1 <= registers(to_integer(unsigned(ReadAddr1)));
  ReadData2 <= registers(to_integer(unsigned(ReadAddr2)));
end architecture behavioral;

Special case:

Register R0 is often hardwired to zero (MIPS/RISC-V convention). Here it is cleared by the reset and never written afterwards.

9. Memories

ROM (Read-Only Memory):

Used to store the program (instructions).

architecture behavioral of ROM is
  type rom_array is array (0 to 255) of std_logic_vector(31 downto 0);
  constant rom_data : rom_array := (
    x"00000000",  -- NOP
    x"20010005",  -- ADDI R1, R0, 5
    x"20020003",  -- ADDI R2, R0, 3
    x"00221820",  -- ADD R3, R1, R2
    -- ... other instructions
    others => x"00000000"
  );
begin
  process(clk)
  begin
    if rising_edge(clk) then
      data_out <= rom_data(to_integer(unsigned(address)));  -- synchronous read
    end if;
  end process;
end architecture behavioral;

RAM (Random Access Memory):

Used to store data.

architecture behavioral of RAM is
  type ram_array is array (0 to 1023) of std_logic_vector(31 downto 0);
  signal ram_data : ram_array := (others => (others => '0'));
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if write_enable = '1' then
        ram_data(to_integer(unsigned(address))) <= data_in;
      end if;
      -- synchronous read (returns the old value during a write)
      data_out <= ram_data(to_integer(unsigned(address)));
    end if;
  end process;
end architecture behavioral;

Using BRAM (Block RAM) on FPGA:

FPGAs contain dedicated memory blocks (BRAM). Vivado automatically infers BRAM when the code follows certain templates, such as the ones above. The synchronous read is the key point: a memory read asynchronously is mapped to LUTs (distributed RAM) instead.
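
The two architectures above use ports (clk, address, data_in, data_out, write_enable) whose declarations are not shown. A possible pair of entity declarations is sketched below; the address widths are assumptions derived from the array sizes (256 and 1024 words), not taken from the course material.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;  -- to_integer/unsigned, used by the architecture

-- Assumed interface for the ROM architecture: 256 words of 32 bits
entity ROM is
  port (
    clk      : in  std_logic;
    address  : in  std_logic_vector(7 downto 0);    -- 2**8 = 256 words
    data_out : out std_logic_vector(31 downto 0)
  );
end entity ROM;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Assumed interface for the RAM architecture: 1024 words of 32 bits
entity RAM is
  port (
    clk          : in  std_logic;
    write_enable : in  std_logic;
    address      : in  std_logic_vector(9 downto 0);   -- 2**10 = 1024 words
    data_in      : in  std_logic_vector(31 downto 0);
    data_out     : out std_logic_vector(31 downto 0)
  );
end entity RAM;
```

The context clause is repeated because library and use clauses apply only to the design unit that follows them; each architecture then inherits the clauses of its entity.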

10. Simple processor

Architecture:

A minimal processor contains:

  • Datapath: ALU, registers, memories, multiplexers
  • Control unit: FSM that generates the control signals

Execution cycle:

  1. Fetch: Read the instruction from memory (PC → Mem_Instr → IR)
  2. Decode: Decode the instruction (opcode, register and immediate fields)
  3. Execute: Perform the operation (ALU)
  4. Memory: Access data memory if needed (LOAD/STORE)
  5. Write Back: Write the result to the destination register

Control signals:

The control unit generates signals that drive the datapath:

  • RegWrite: enables writing to the register file
  • ALUOp: selects the ALU operation
  • MemRead, MemWrite: data memory read/write
  • MemToReg: selects the source of the data written to the register
  • Branch: enables a conditional branch

Simplified example:

R-type instruction (register-register):

  • The opcode identifies the instruction as R-type
  • Two source registers are read (rs, rt)
  • The ALU performs the operation (funct selects which one)
  • The result is written to the destination register (rd)
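
The Decode step can be sketched as plain bit slicing. The field positions below follow the classic MIPS 32-bit layout, which is an assumption: it is consistent with the encodings in the ROM example (x"00221820" for ADD R3, R1, R2) but the course's actual instruction format is not stated. The entity name decode_demo and the reduced RegWrite logic are hypothetical.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical decoder, assuming a MIPS-like 32-bit instruction format.
entity decode_demo is
  port (
    instr    : in  std_logic_vector(31 downto 0);
    opcode   : out std_logic_vector(5 downto 0);
    rs       : out std_logic_vector(4 downto 0);   -- first source register
    rt       : out std_logic_vector(4 downto 0);   -- second source (or I-type destination)
    rd       : out std_logic_vector(4 downto 0);   -- R-type destination
    funct    : out std_logic_vector(5 downto 0);   -- R-type operation selector
    imm      : out std_logic_vector(15 downto 0);  -- I-type immediate
    RegWrite : out std_logic
  );
end entity decode_demo;

architecture rtl of decode_demo is
begin
  -- Field extraction: pure wiring, no logic is generated
  opcode <= instr(31 downto 26);
  rs     <= instr(25 downto 21);
  rt     <= instr(20 downto 16);
  rd     <= instr(15 downto 11);
  funct  <= instr(5 downto 0);
  imm    <= instr(15 downto 0);

  -- Minimal control: R-type (opcode 000000) and ADDI (opcode 001000)
  -- both write a register; every other opcode is treated as non-writing
  RegWrite <= '1' when instr(31 downto 26) = "000000"
                    or instr(31 downto 26) = "001000"
              else '0';
end architecture rtl;
```

With instr = x"00221820" this layout gives rs = 1, rt = 2, rd = 3 and funct = "100000", matching the ADD R3, R1, R2 comment in the ROM listing.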

11. Testbench and simulation

Testbench:

A VHDL file without ports (empty entity) that instantiates the circuit under test (UUT - Unit Under Test) and generates stimuli.

Structure:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity testbench is
  -- No ports
end entity testbench;

architecture test of testbench is
  -- Test signal declarations
  signal clk : std_logic := '0';
  signal reset : std_logic := '1';
  signal input : std_logic_vector(7 downto 0);
  signal output : std_logic_vector(7 downto 0);

  constant CLK_PERIOD : time := 10 ns;
begin
  -- Instantiation of the component under test
  UUT: entity work.mon_circuit
    port map (
      clk => clk,
      reset => reset,
      input => input,
      output => output
    );

  -- Clock generation
  clk_process: process
  begin
    clk <= '0';
    wait for CLK_PERIOD/2;
    clk <= '1';
    wait for CLK_PERIOD/2;
  end process;

  -- Stimulus generation
  stim_process: process
  begin
    reset <= '1';
    input <= x"00";
    wait for 50 ns;

    reset <= '0';
    wait for 10 ns;

    input <= x"12";
    wait for 20 ns;

    input <= x"34";
    wait for 20 ns;

    wait;  -- Suspends this process forever (the clock process keeps running)
  end process;
end architecture test;

Checks:

Use assertions to verify results automatically:

assert (output = x"46") report "Error: incorrect result" severity error;
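
As a concrete case, the sketch below applies this structure to the compteur entity from section 2. The testbench name tb_compteur and the done flag are assumptions added for illustration; the flag stops the clock so that the simulation ends on its own once the check has run.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity tb_compteur is
  -- No ports
end entity tb_compteur;

architecture test of tb_compteur is
  constant CLK_PERIOD : time := 10 ns;
  signal clk    : std_logic := '0';
  signal reset  : std_logic := '1';
  signal enable : std_logic := '0';
  signal count  : std_logic_vector(7 downto 0);
  signal done   : boolean := false;  -- set by the stimulus process to stop the clock
begin
  UUT: entity work.compteur
    port map (clk => clk, reset => reset, enable => enable, count => count);

  -- Clock generation, stopped once the test is finished
  clk_process: process
  begin
    while not done loop
      clk <= '0';
      wait for CLK_PERIOD/2;
      clk <= '1';
      wait for CLK_PERIOD/2;
    end loop;
    wait;
  end process;

  stim_process: process
  begin
    -- Hold reset, then release it away from a clock edge
    wait for 22 ns;
    reset  <= '0';
    enable <= '1';

    -- Let the counter see five rising edges with enable high
    for i in 1 to 5 loop
      wait until rising_edge(clk);
    end loop;
    wait for 1 ns;  -- let the last update propagate to the port
    enable <= '0';

    assert count = x"05"
      report "Error: counter did not reach 5" severity error;

    done <= true;
    wait;
  end process;
end architecture test;
```

Synchronizing the stimuli on rising_edge(clk) rather than on fixed delays keeps the check valid if CLK_PERIOD is changed later.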

Waveform:

.wcfg files in Vivado configure which signals are displayed and how. They are essential for visual debugging.

12. Synthesis and implementation on FPGA

Steps:

1. Synthesis:

  • Conversion of the VHDL code into a netlist (logic gates)
  • Logic optimization
  • Check that the code is synthesizable

2. Implementation:

  • Placement: assign the logic to the physical elements of the FPGA (LUT, FF)
  • Routing: connect the elements through the interconnect
  • Timing verification: make sure the timing constraints are met

3. Bitstream generation:

  • .bit file to program into the FPGA

FPGA resources:

Resource | Description
LUT (Look-Up Table) | Implements combinational logic (any function of N inputs)
FF (Flip-Flop) | D flip-flop for sequential logic
BRAM (Block RAM) | Dedicated memory blocks (18 Kb or 36 Kb)
DSP | Hardware multiplier/accumulator blocks
IO | Configurable input/output pins

Constraints:

.xdc file (Xilinx Design Constraints):

  • Physical pin assignment
  • Clock definition (period, frequency)
  • Timing constraints

Example:

# 100 MHz clock
create_clock -period 10.000 -name clk [get_ports clk]

# Pins
set_property PACKAGE_PIN W5 [get_ports clk]
set_property IOSTANDARD LVCMOS33 [get_ports clk]

# Braces keep Tcl from interpreting [0] as a command substitution
set_property PACKAGE_PIN U16 [get_ports {led[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {led[0]}]

PART D - Reflective Analysis and Perspectives

Skills acquired

Hardware design:

Mastery of VHDL for describing synthesizable digital circuits. Ability to think in terms of hardware parallelism rather than software sequence.

Processor architecture:

In-depth understanding of the internal operation of a processor: ALU, registers, memories, control unit, datapath.

Professional tools:

Use of Xilinx Vivado, an industry-standard FPGA tool. Simulation, synthesis, implementation, timing analysis.

Key points to remember

1. VHDL = hardware description, not programming:

VHDL code describes hardware that physically exists. Everything executes in parallel. This is a fundamental paradigm shift.

2. Synthesizability:

Not all VHDL code can be synthesized into hardware. Avoid wait for (except in testbenches), infinite loops, division by non-constant values, etc.

3. Clock and synchronism:

Synchronous circuits (driven by a clock) are more robust and predictable. Always use rising_edge(clk) to detect edges.

4. Testbench = essential:

Simulation with a testbench validates the behavior before synthesis. Debugging is much easier at that stage than after implementation.

5. Limited resources:

FPGAs have a finite number of LUTs, FFs and BRAMs. Optimization and resource sharing matter.

Practical applications

Signal processing:

Implementation of high-performance digital filters (FIR, IIR), FFT, real-time image processing.

Hardware accelerators:

Acceleration of compute-intensive tasks (cryptography, compression, AI) by offloading the processor.

Critical embedded systems:

Aeronautics, automotive and space applications require deterministic and reliable circuits (hardwired logic on FPGA or ASIC).

ASIC prototyping:

FPGAs are used to validate designs before ASIC fabrication (high cost, cannot be modified afterwards).

Feedback

Learning curve:

VHDL is harder than conventional programming at first. It requires an understanding of the underlying hardware.

Progressive projects:

The progression Buffer → ALU → Registers → Memories → Processor works very well pedagogically. Each step adds complexity.

Vivado: powerful but complex:

The tool is comprehensive, but its interface can be intimidating. Synthesis and implementation times are sometimes long.

Visual debugging:

Waveforms are indispensable. Knowing how to identify the critical signals to observe makes debugging much easier.

Limitations and outlook

Limitations of the module:

  • No physical implementation on an FPGA board (simulation only)
  • Highly simplified processor (no pipeline, cache, etc.)
  • Timing and optimization aspects covered only briefly

Outlook towards:

  • Advanced processors: pipeline, branch prediction, cache
  • Systems on chip (SoC): processor + integrated peripherals
  • High-Level Synthesis (HLS): generating VHDL from C/C++
  • Formal verification: assertions, model checking
  • ASIC design: application-specific integrated circuits

Technological evolution

Current trends:

HLS (High-Level Synthesis):

Tools such as Vivado HLS or Vitis HLS accept C/C++ and automatically generate VHDL/Verilog. This speeds up development.

FPGA + AI:

Recent devices target AI workloads: Versal integrates hardened AI Engines, while on Zynq UltraScale+ an accelerator such as the DPU is implemented as an IP core in the programmable logic. They are used for real-time neural network inference.

Adaptive FPGAs (Versal):

Hybrid architecture: FPGA + CPU + DSP + AI. Dynamic reconfiguration.

Cloud FPGA:

AWS, Azure and Alibaba offer FPGA-equipped instances for acceleration in the cloud.

Tips for success

1. Think hardware:

Mentally picture the gates, flip-flops and multiplexers generated by your VHDL code.

2. Simulate early and often:

Do not wait until all the code is written. Test each module independently.

3. Comment well:

VHDL is verbose. Clear comments help when rereading the code later.

4. Follow conventions:

Explicit signal names (clk, reset, enable), consistent indentation.

5. Use the appropriate types:

unsigned/signed for arithmetic, std_logic_vector for generic data buses.

Conclusion

This module provides an excellent introduction to digital design with VHDL and FPGA. Building a complete processor, even a simple one, gives an in-depth understanding of computer architecture.

Transferable skills:

  • Parallel thinking and hardware design
  • Rigor in specification and verification
  • Mastery of professional tools (Vivado)
  • Understanding of processor architecture

Professional relevance:

FPGAs are used in many fields (telecommunications, defense, medical, automotive, finance). Demand for FPGA engineers remains strong.

Main message:

VHDL and FPGAs give direct control over the hardware, enabling performance and energy efficiency that software alone cannot reach. Mastering these tools opens opportunities in high-performance systems.

Recommendations:

  • Practice on real FPGA boards (Basys, Arty, DE10-Nano)
  • Explore more complex projects (controllers, interfaces, DSP)
  • Learn Verilog as a complement (the competing HDL)
  • Train in HLS to speed up development
  • Study the architecture of real processors (open-source RISC-V)


Illustrations from the Pipeline Design Study Project

Pipeline processor datapath

Figure: Datapath of the pipeline processor

ALU diagram

Figure: Diagram of the Arithmetic and Logic Unit (ALU)

Register file

Figure: Register file

Instruction memory

Figure: Instruction memory

Data memory

Figure: Data memory

Pipeline hazard handling

Figure: Hazard handling in the pipeline


Reports and Projects

VHDL Project Report

Design study report: pipeline processor design in VHDL, including ALU, register file, memories and hazard management.


Course Documents

Complete VHDL Course

Complete VHDL course: syntax, processes, state machines, simulation and synthesis for FPGA.

RISC Processor Project

Design study project brief: RISC microprocessor design in VHDL with ALU, register file and memory.


Course taken in 2023-2024 at INSA Toulouse, Electrical and Computer Engineering Department (Genie Electrique et Informatique).

Advanced Digital Architectures VHDL - Semester 7

Academic Year: 2023-2024
Semester: 7
Credits: 2.5 ECTS
Specialization: Digital Design and FPGA


PART A - General Module Presentation

Overview

This course teaches the design of complex digital systems using the VHDL language (VHSIC Hardware Description Language) and their implementation on FPGA. It covers the design of advanced digital architectures: arithmetic and logic unit (ALU), register files, memories, and complete processors.

Learning objectives:

  • Master the VHDL language for hardware description
  • Design synthesizable digital circuits
  • Implement systems on FPGA (Xilinx)
  • Simulate and verify designs with testbenches
  • Understand finite state machines (FSM)
  • Design a complete simple processor

Position in the curriculum

This module builds upon:

  • Digital Electronics Fundamentals (S5): logic gates, flip-flops
  • Sequential Logic (S5): state machines, counters
  • Hardware Computer Architecture (S5): processor structure

It prepares for:

  • Embedded systems: hardware accelerators
  • Signal processing on FPGA: digital filter implementation
  • ASIC design: specialized integrated circuits
  • Real-time systems: high-performance hardwired logic

PART B - Personal Experience and Learning Context

Organization and resources

The module was organized into lectures and practical design studies:

Lectures (16h):

  • VHDL syntax and structure
  • Data types and operators
  • Combinational and sequential logic
  • Finite state machines (FSM)
  • Processor architectures

Design studies (20h):

Progressive design projects with Xilinx Vivado:

  • ALU project: arithmetic and logic unit
  • Register project: register file
  • Buffer project: buffer memory
  • Instruction memory project: ROM for program code
  • Data memory project: RAM for data
  • Processor project: complete integration

Homework assignments:

  • HW1: ALU and testbench
  • HW2: Complex digital circuit

Tools used:

  • Xilinx Vivado Design Suite (synthesis, simulation, implementation)
  • Target FPGA: likely Artix-7 or Zynq
  • VHDL testbenches for simulation
  • Waveform viewer (.wcfg files)

Project structure

Each Vivado project contained:

  • Sources: VHDL files (sources_1/new/)
  • Simulations: testbenches (sim_1/)
  • Constraints: .xdc files for pin mapping (constrs_1/)
  • Waveforms: .wcfg files for signal visualization
  • Synthesis and implementation: runs/synth_1/, runs/impl_1/

Pedagogical progression:

Figure: Architecture of a simple VHDL processor - ALU, registers and memory

  1. Buffer: Simple circuit (FIFO or shift register)
  2. ALU: Arithmetic and logic operations
  3. Register: Register file with read/write
  4. Instruction memory: ROM for storing the program
  5. Data memory: RAM for storing data
  6. Processor: Final assembly (datapath + control unit)

Working method

Phase 1: Specification:

Define the interface (input/output ports) and the expected behavior of the circuit.

Phase 2: VHDL Code:

Write the code in Vivado, comply with synthesizability rules.

Phase 3: Testbench:

Create a testbench to functionally verify the circuit through simulation.

Phase 4: Simulation:

Run behavioral simulation, observe waveforms, correct errors.

Phase 5: Synthesis:

Verify that the code synthesizes without errors, analyze resource utilization (LUT, FF, BRAM).

Phase 6: Implementation (optional):

Place and route the design, verify timing, program the FPGA.

Challenges encountered

Parallel thinking:

VHDL describes hardware where everything executes in parallel, unlike sequential programming. Understanding that processes execute simultaneously requires a paradigm shift.

Strict syntax:

VHDL is verbose and strongly typed. Syntax and type errors are frequent at first.

Timing and synchronization:

Correctly managing clocks, resets, and avoiding hazards (glitches) requires rigor.

Debugging:

Without printf or debugger, debugging is done through waveforms. Knowing which signals to observe is crucial.


PART C - Detailed Technical Aspects

1. Introduction to VHDL

VHDL = VHSIC Hardware Description Language

VHSIC = Very High Speed Integrated Circuit

Hardware description language:

  • Describes the structure and behavior of digital circuits
  • Used for simulation and synthesis (real circuit generation)
  • IEEE Standard (IEEE 1076)

Difference from software programming:

Aspect | Software | Hardware (VHDL)
Execution | Sequential | Parallel
Variables | Modified sequentially | Signals that evolve over time
Loops | Iterations | Hardware replication
Resources | Abstract (virtual memory) | Limited physical resources (LUT, FF)

2. Structure of a VHDL file

Entity:

Describes the component interface (input/output ports).

Example:

entity compteur is
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;
    enable : in  std_logic;
    count  : out std_logic_vector(7 downto 0)
  );
end entity compteur;

Architecture:

Describes the behavior or internal structure of the component.

Example:

architecture behavioral of compteur is
  signal count_internal : unsigned(7 downto 0);
begin
  process(clk, reset)
  begin
    if reset = '1' then
      count_internal <= (others => '0');
    elsif rising_edge(clk) then
      if enable = '1' then
        count_internal <= count_internal + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_internal);
end architecture behavioral;

Architecture types:

Type | Description | Usage
Behavioral | Algorithmic description of behavior | High level, automatic synthesis
Dataflow | Concurrent assignments, equations | Combinational logic
Structural | Component instantiation | Hierarchy, block connection

3. VHDL data types

Standard types:

Type | Description | Example
std_logic | Logic bit (9 values) | '0', '1', 'Z', 'X', etc.
std_logic_vector | Bit vector | "10110101"
integer | Integer | at least -2147483647 to 2147483647 (32 bits in most tools)
unsigned | Unsigned integer | used for arithmetic calculations
signed | Signed integer | two's complement

std_logic values:

  • 'U': uninitialized (default value in simulation)
  • 'X': unknown (for example, two drivers in conflict)
  • '0': low level
  • '1': high level
  • 'Z': high impedance (tri-state)
  • 'W', 'L', 'H': weak unknown, weak low, weak high
  • '-': don't care (synthesis)

Required libraries:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;      -- std_logic, std_logic_vector
use IEEE.NUMERIC_STD.ALL;         -- unsigned, signed, conversion
-- use IEEE.STD_LOGIC_UNSIGNED.ALL;  -- legacy, non-standard Synopsys package; avoid mixing it with NUMERIC_STD

4. Combinational logic

Concurrent assignment:

Assignments outside processes execute in parallel.

Examples:

-- AND gate
y <= a and b;

-- Multiplexer
y <= a when sel = '0' else b;

-- 4-to-1 multiplexer
with sel select
  y <= a when "00",
       b when "01",
       c when "10",
       d when others;

Combinational process:

To describe combinational logic within a process:

process(a, b, c)  -- Sensitivity list: all signals read
begin
  if a = '1' then
    y <= b;
  else
    y <= c;
  end if;
end process;

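Warning: if a signal that is read is missing from the sensitivity list, the simulator does not re-evaluate the process when that signal changes, whereas synthesis tools generally ignore the list. Simulation and hardware then disagree.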

5. Sequential logic

D Flip-Flop:

Basic element of sequential logic. Stores a value on a clock edge.

process(clk)
begin
  if rising_edge(clk) then  -- Rising edge
    q <= d;
  end if;
end process;

With asynchronous reset:

process(clk, reset)
begin
  if reset = '1' then       -- Reset has priority
    q <= '0';
  elsif rising_edge(clk) then
    q <= d;
  end if;
end process;

With synchronous reset:

process(clk)
begin
  if rising_edge(clk) then
    if reset = '1' then
      q <= '0';
    else
      q <= d;
    end if;
  end if;
end process;

Golden rule:

  • Asynchronous reset: in the process sensitivity list and tested first
  • Synchronous reset: inside the rising_edge test

6. Finite State Machines (FSM)

Definition:

An FSM (Finite State Machine) is a sequential circuit with a finite number of states. It changes state based on inputs and the current state.

Types:

  • Moore: outputs depend only on the state
  • Mealy: outputs depend on the state and the inputs
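
The difference can be seen on a rising-edge detector. The sketch below is one possible illustration, with a hypothetical entity edge_detect and state names; it uses a single clocked process for brevity and derives a Moore output and a Mealy output from the same state register.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity edge_detect is
  port (
    clk, reset  : in  std_logic;
    din         : in  std_logic;
    pulse_moore : out std_logic;  -- depends on the state only
    pulse_mealy : out std_logic   -- depends on the state and on din
  );
end entity edge_detect;

architecture rtl of edge_detect is
  type state_t is (S_LOW, S_EDGE, S_HIGH);
  signal state : state_t;
begin
  process(clk, reset)
  begin
    if reset = '1' then
      state <= S_LOW;
    elsif rising_edge(clk) then
      case state is
        when S_LOW =>
          if din = '1' then
            state <= S_EDGE;
          end if;
        when S_EDGE | S_HIGH =>
          if din = '1' then
            state <= S_HIGH;
          else
            state <= S_LOW;
          end if;
      end case;
    end if;
  end process;

  -- Moore: asserted during the one clock cycle spent in S_EDGE
  pulse_moore <= '1' when state = S_EDGE else '0';

  -- Mealy: asserted as soon as din rises, before the state has changed
  pulse_mealy <= '1' when (state = S_LOW and din = '1') else '0';
end architecture rtl;

The Mealy output reacts within the same cycle but passes input glitches straight through; the Moore output arrives one clock edge later and is glitch-free with respect to din.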

2-process structure:

Process 1: State register (sequential)

process(clk, reset)
begin
  if reset = '1' then
    etat_courant <= IDLE;
  elsif rising_edge(clk) then
    etat_courant <= etat_suivant;
  end if;
end process;

Process 2: Transition logic (combinational)

process(etat_courant, entree, compteur_fini)  -- all signals read by the process
begin
  case etat_courant is
    when IDLE =>
      if entree = '1' then
        etat_suivant <= TRAITEMENT;
      else
        etat_suivant <= IDLE;
      end if;
      sortie <= '0';

    when TRAITEMENT =>
      if compteur_fini = '1' then
        etat_suivant <= FIN;
      else
        etat_suivant <= TRAITEMENT;
      end if;
      sortie <= '1';

    when FIN =>
      etat_suivant <= IDLE;
      sortie <= '0';

    when others =>
      etat_suivant <= IDLE;
      sortie <= '0';
  end case;
end process;

State declaration:

type etat_type is (IDLE, TRAITEMENT, FIN);
signal etat_courant, etat_suivant : etat_type;

7. Arithmetic and Logic Unit (ALU)

Principle:

The ALU performs the arithmetic operations (addition, subtraction) and logic operations (AND, OR, XOR) of a processor.

Typical interface:

entity ALU is
  port (
    A      : in  std_logic_vector(31 downto 0);  -- Operand A
    B      : in  std_logic_vector(31 downto 0);  -- Operand B
    OP     : in  std_logic_vector(3 downto 0);   -- Operation code
    Result : out std_logic_vector(31 downto 0);  -- Result
    Zero   : out std_logic;                      -- Zero flag
    Carry  : out std_logic                       -- Carry flag
  );
end entity ALU;

Common operations:

OP Code | Operation | Description
0000    | AND       | Bitwise logical AND
0001    | OR        | Bitwise logical OR
0010    | ADD       | Addition
0110    | SUB       | Subtraction (A - B)
0111    | SLT       | Set if Less Than (A < B)
1100    | NOR       | Logical NOR

Implementation:

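process(A, B, OP)
  variable temp : unsigned(32 downto 0);          -- 33 bits for carry
  variable res  : std_logic_vector(31 downto 0);  -- result of the current operation
begin
  Carry <= '0';  -- default, overridden by ADD and SUB

  case OP is
    when "0000" =>  -- AND
      res := A and B;

    when "0001" =>  -- OR
      res := A or B;

    when "0010" =>  -- ADD
      temp := ('0' & unsigned(A)) + ('0' & unsigned(B));
      res := std_logic_vector(temp(31 downto 0));
      Carry <= temp(32);

    when "0110" =>  -- SUB
      temp := ('0' & unsigned(A)) - ('0' & unsigned(B));
      res := std_logic_vector(temp(31 downto 0));
      Carry <= temp(32);  -- acts as a borrow: '1' when A < B (unsigned)

    when others =>  -- SLT and NOR are not implemented in this excerpt
      res := (others => '0');
  end case;

  Result <= res;

  -- Zero flag, computed from the variable: the signal Result only takes
  -- its new value after the process suspends, and an out port cannot be
  -- read before VHDL-2008
  if res = x"00000000" then
    Zero <= '1';
  else
    Zero <= '0';
  end if;
end process;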
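
A combinational block like this is easy to check with a small self-checking testbench. The sketch below assumes the ALU entity declared above, with the process placed in some architecture of it, and assumes that Zero reflects the result of the current operation. The testbench name tb_alu and the chosen test vectors are illustrative.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity tb_alu is
end entity tb_alu;

architecture test of tb_alu is
  signal A, B, Result : std_logic_vector(31 downto 0);
  signal OP           : std_logic_vector(3 downto 0);
  signal Zero, Carry  : std_logic;
begin
  UUT: entity work.ALU
    port map (A => A, B => B, OP => OP,
              Result => Result, Zero => Zero, Carry => Carry);

  process
  begin
    -- ADD with carry out: 0xFFFFFFFF + 1 wraps to 0 and sets the carry
    A <= x"FFFFFFFF"; B <= x"00000001"; OP <= "0010";
    wait for 10 ns;
    assert Result = x"00000000" and Carry = '1' and Zero = '1'
      report "ADD overflow case failed" severity error;

    -- SUB: 5 - 3 = 2, no borrow, result not zero
    A <= x"00000005"; B <= x"00000003"; OP <= "0110";
    wait for 10 ns;
    assert Result = x"00000002" and Carry = '0' and Zero = '0'
      report "SUB case failed" severity error;

    -- AND: only bits set in both operands survive
    A <= x"F0F0F0F0"; B <= x"0FF00FF0"; OP <= "0000";
    wait for 10 ns;
    assert Result = x"00F000F0" and Carry = '0'
      report "AND case failed" severity error;

    report "ALU testbench finished";
    wait;  -- no clock here, so the simulation has nothing left to do
  end process;
end architecture test;

The SUB case is the interesting one: it follows an operation whose result was zero, so it fails if the Zero flag is derived from the previous result instead of the current one.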

8. Register file

Principle:

A set of registers (typically 16 or 32) for temporarily storing data in a processor.

Interface:

entity RegisterFile is
  port (
    clk       : in  std_logic;
    reset     : in  std_logic;
    -- Read
    ReadAddr1 : in  std_logic_vector(4 downto 0);   -- Register 1 address
    ReadAddr2 : in  std_logic_vector(4 downto 0);   -- Register 2 address
    ReadData1 : out std_logic_vector(31 downto 0);  -- Register 1 data
    ReadData2 : out std_logic_vector(31 downto 0);  -- Register 2 data
    -- Write
    WriteEn   : in  std_logic;                      -- Write enable
    WriteAddr : in  std_logic_vector(4 downto 0);   -- Write address
    WriteData : in  std_logic_vector(31 downto 0)   -- Data to write
  );
end entity RegisterFile;

Implementation:

architecture behavioral of RegisterFile is
  type reg_array is array (0 to 31) of std_logic_vector(31 downto 0);
  signal registers : reg_array;
begin
  -- Synchronous write
  process(clk, reset)
  begin
    if reset = '1' then
      registers <= (others => (others => '0'));
    elsif rising_edge(clk) then
      if WriteEn = '1' and WriteAddr /= "00000" then  -- R0 always 0
        registers(to_integer(unsigned(WriteAddr))) <= WriteData;
      end if;
    end if;
  end process;

  -- Asynchronous read
  ReadData1 <= registers(to_integer(unsigned(ReadAddr1)));
  ReadData2 <= registers(to_integer(unsigned(ReadAddr2)));
end architecture behavioral;

Special note:

Register R0 is often hardwired to zero (MIPS/RISC-V convention).
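
The implementation above keeps R0 at zero by clearing it at reset and ignoring writes to address 0. An alternative, sketched here as a second architecture of the same RegisterFile entity (the name r0_read_mux is hypothetical), forces zero on the read side instead, so R0 reads as zero whatever the array contains. The reset port is left unused in this variant.

-- Relies on the RegisterFile entity and its library clauses declared above
architecture r0_read_mux of RegisterFile is
  type reg_array is array (0 to 31) of std_logic_vector(31 downto 0);
  signal registers : reg_array := (others => (others => '0'));
begin
  -- Synchronous write, no special case for address 0
  process(clk)
  begin
    if rising_edge(clk) then
      if WriteEn = '1' then
        registers(to_integer(unsigned(WriteAddr))) <= WriteData;
      end if;
    end if;
  end process;

  -- Asynchronous read: address 0 always returns zero
  ReadData1 <= x"00000000" when ReadAddr1 = "00000"
               else registers(to_integer(unsigned(ReadAddr1)));
  ReadData2 <= x"00000000" when ReadAddr2 = "00000"
               else registers(to_integer(unsigned(ReadAddr2)));
end architecture r0_read_mux;

Dropping the global reset of the array is a design choice: it costs two small multiplexers on the read ports but leaves the storage free of reset logic.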

9. Memories

ROM (Read-Only Memory):

Used for storing the program (instructions).

architecture behavioral of ROM is
  type rom_array is array (0 to 255) of std_logic_vector(31 downto 0);
  constant rom_data : rom_array := (
    x"00000000",  -- NOP
    x"20010005",  -- ADDI R1, R0, 5
    x"20020003",  -- ADDI R2, R0, 3
    x"00221820",  -- ADD R3, R1, R2
    -- ... other instructions
    others => x"00000000"
  );
begin
  process(clk)
  begin
    if rising_edge(clk) then
      data_out <= rom_data(to_integer(unsigned(address)));
    end if;
  end process;
end architecture behavioral;

RAM (Random Access Memory):

Used for storing data.

architecture behavioral of RAM is
  type ram_array is array (0 to 1023) of std_logic_vector(31 downto 0);
  signal ram_data : ram_array := (others => (others => '0'));
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if write_enable = '1' then
        ram_data(to_integer(unsigned(address))) <= data_in;
      end if;
      data_out <= ram_data(to_integer(unsigned(address)));
    end if;
  end process;
end architecture behavioral;

Using BRAM (Block RAM) on FPGA:

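FPGAs have dedicated memory blocks (BRAM). Vivado infers BRAM automatically when the code follows certain patterns, in particular a synchronous (clocked) read as in the ROM and RAM above. A memory with an asynchronous read, such as the register file, is generally mapped to LUT-based distributed RAM or flip-flops instead.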
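
As a sketch of such a pattern, the RAM architecture above can be wrapped in a complete, parameterized entity. The entity name ram_sync and its generics are hypothetical; ram_style is a Vivado synthesis attribute that requests a specific memory resource, and other tools use different attribute names.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity ram_sync is
  generic (
    ADDR_WIDTH : positive := 10;   -- 2**10 = 1024 words
    DATA_WIDTH : positive := 32
  );
  port (
    clk          : in  std_logic;
    write_enable : in  std_logic;
    address      : in  std_logic_vector(ADDR_WIDTH-1 downto 0);
    data_in      : in  std_logic_vector(DATA_WIDTH-1 downto 0);
    data_out     : out std_logic_vector(DATA_WIDTH-1 downto 0)
  );
end entity ram_sync;

architecture rtl of ram_sync is
  type ram_array is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH-1 downto 0);
  signal ram_data : ram_array := (others => (others => '0'));

  -- Ask Vivado for block RAM rather than distributed RAM
  attribute ram_style : string;
  attribute ram_style of ram_data : signal is "block";
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if write_enable = '1' then
        ram_data(to_integer(unsigned(address))) <= data_in;
      end if;
      -- Registered read: during a write, the old content is returned
      data_out <= ram_data(to_integer(unsigned(address)));
    end if;
  end process;
end architecture rtl;

The synthesis report lists which memories were mapped to BRAM, which is the simplest way to confirm that the inference actually happened.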

10. Simple processor

Architecture:

A minimal processor contains:

  • Datapath: ALU, registers, memories, multiplexers
  • Control unit: FSM that generates control signals

Execution cycle:

  1. Fetch: Read the instruction from memory (PC → Instr_Mem → IR)
  2. Decode: Decode the instruction (opcode, register, immediate fields)
  3. Execute: Execute the operation (ALU)
  4. Memory: Data memory access if needed (LOAD/STORE)
  5. Write Back: Write the result to the destination register

Control signals:

The control unit generates signals to drive the datapath:

  • RegWrite: enable writing to the register file
  • ALUOp: ALU operation selection
  • MemRead, MemWrite: data memory read/write
  • MemToReg: data source selection for register write
  • Branch: enable conditional branching

Simplified example:

R-type instruction (register-register):

  • Opcode determines it is an R-type instruction
  • Read two source registers (rs, rt)
  • ALU performs the operation (funct determines which one)
  • Write the result to the destination register (rd)
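
The control signals listed above can be sketched as a combinational opcode decoder. This is an illustration under stated assumptions, not the course's control unit: it assumes MIPS-style 6-bit opcodes (consistent with the ROM example), a single-cycle organization rather than an FSM, and a 2-bit ALUOp encoding ("00" add, "01" subtract, "10" use the funct field). The entity name main_decoder is hypothetical.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity main_decoder is
  port (
    opcode   : in  std_logic_vector(5 downto 0);
    RegWrite : out std_logic;
    ALUOp    : out std_logic_vector(1 downto 0);
    MemRead  : out std_logic;
    MemWrite : out std_logic;
    MemToReg : out std_logic;
    Branch   : out std_logic
  );
end entity main_decoder;

architecture rtl of main_decoder is
begin
  process(opcode)
  begin
    -- Safe defaults: nothing is written, no branch is taken
    RegWrite <= '0';
    ALUOp    <= "00";
    MemRead  <= '0';
    MemWrite <= '0';
    MemToReg <= '0';
    Branch   <= '0';

    case opcode is
      when "000000" =>     -- R-type: ALU result written to rd
        RegWrite <= '1';
        ALUOp    <= "10";  -- funct field selects the operation
      when "100011" =>     -- LW: memory data written to the register file
        RegWrite <= '1';
        MemRead  <= '1';
        MemToReg <= '1';
      when "101011" =>     -- SW: register data written to memory
        MemWrite <= '1';
      when "000100" =>     -- BEQ: compare by subtraction, branch if zero
        Branch <= '1';
        ALUOp  <= "01";
      when others =>
        null;              -- unknown opcode: keep the safe defaults
    end case;
  end process;
end architecture rtl;

Immediate instructions such as ADDI would need an additional signal selecting the second ALU operand, which is not part of the list above and is therefore left out.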

11. Testbench and simulation

Testbench:

A VHDL file with no ports (empty entity) that instantiates the circuit under test (UUT - Unit Under Test) and generates stimuli.

Structure:

entity testbench is
  -- No ports
end entity testbench;

architecture test of testbench is
  -- Test signal declarations
  signal clk : std_logic := '0';
  signal reset : std_logic := '1';
  signal input : std_logic_vector(7 downto 0);
  signal output : std_logic_vector(7 downto 0);

  constant CLK_PERIOD : time := 10 ns;
begin
  -- Instantiation of the component under test
  UUT: entity work.mon_circuit
    port map (
      clk => clk,
      reset => reset,
      input => input,
      output => output
    );

  -- Clock generation
  clk_process: process
  begin
    clk <= '0';
    wait for CLK_PERIOD/2;
    clk <= '1';
    wait for CLK_PERIOD/2;
  end process;

  -- Stimulus generation
  stim_process: process
  begin
    reset <= '1';
    input <= x"00";
    wait for 50 ns;

    reset <= '0';
    wait for 10 ns;

    input <= x"12";
    wait for 20 ns;

    input <= x"34";
    wait for 20 ns;

    wait;  -- Suspend this process forever (the clock process keeps running)
  end process;
end architecture test;

Verification:

Use assertions for automatic checking:

assert (output = x"46") report "Error: incorrect result" severity error;
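
For a clocked design, the assertion is best placed after a known number of clock edges. The sketch below checks the 8-bit counter from the architecture section; it assumes an entity compteur with ports clk, reset, enable and an 8-bit count, and uses a hypothetical done flag to stop the clock so that the simulation ends by itself.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity tb_compteur is
end entity tb_compteur;

architecture test of tb_compteur is
  signal clk    : std_logic := '0';
  signal reset  : std_logic := '1';
  signal enable : std_logic := '0';
  signal count  : std_logic_vector(7 downto 0);
  signal done   : boolean := false;

  constant CLK_PERIOD : time := 10 ns;
begin
  UUT: entity work.compteur
    port map (clk => clk, reset => reset, enable => enable, count => count);

  -- The clock stops once done is true, so no events remain to simulate
  clk_process: process
  begin
    while not done loop
      clk <= '0';
      wait for CLK_PERIOD/2;
      clk <= '1';
      wait for CLK_PERIOD/2;
    end loop;
    wait;
  end process;

  stim_process: process
  begin
    wait for 2 * CLK_PERIOD;
    reset  <= '0';
    enable <= '1';

    -- Let the counter see exactly 5 rising edges with enable high
    for i in 1 to 5 loop
      wait until rising_edge(clk);
    end loop;
    enable <= '0';
    wait for CLK_PERIOD/4;   -- sample away from the clock edge

    assert unsigned(count) = 5
      report "Expected 5, got " & integer'image(to_integer(unsigned(count)))
      severity error;

    done <= true;
    wait;
  end process;
end architecture test;

Including the observed value in the report message saves a trip to the waveform viewer when the check fails.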

Waveform:

.wcfg files in Vivado allow configuring which signals to display and how. Essential for visual debugging.

12. Synthesis and implementation on FPGA

Steps:

1. Synthesis:

  • Conversion of VHDL code into a netlist (logic gates)
  • Logic optimization
  • Verification that the code is synthesizable

2. Implementation:

  • Placement: assign logic resources to FPGA elements (LUT, FF)
  • Routing: connect elements through interconnections
  • Timing verification: ensure timing constraints are met

3. Bitstream generation:

  • .bit file to be programmed into the FPGA

FPGA resources:

Resource            | Description
LUT (Look-Up Table) | Implements combinational logic (arbitrary function of N inputs)
FF (Flip-Flop)      | D flip-flop for sequential logic
BRAM (Block RAM)    | Dedicated memory blocks (18 Kb or 36 Kb)
DSP                 | Hardware multiplier/accumulator blocks
IO                  | Configurable input/output pins

Constraints:

.xdc file (Xilinx Design Constraints):

  • Physical pin assignment
  • Clock definition (period, frequency)
  • Timing constraints

Example:

# 100 MHz clock
create_clock -period 10.000 -name clk [get_ports clk]

# Pins
set_property PACKAGE_PIN W5 [get_ports clk]
set_property IOSTANDARD LVCMOS33 [get_ports clk]

# Braces keep Tcl from interpreting [0] as a command substitution
set_property PACKAGE_PIN U16 [get_ports {led[0]}]
set_property IOSTANDARD LVCMOS33 [get_ports {led[0]}]

PART D - Reflective Analysis and Perspectives

Skills acquired

Hardware design:

Mastery of VHDL for describing synthesizable digital circuits. Ability to think in terms of hardware parallelism rather than software sequences.

Processor architecture:

Deep understanding of a processor's internal workings: ALU, registers, memories, control unit, datapath.

Professional tools:

Use of Xilinx Vivado, the standard toolchain for Xilinx (AMD) FPGAs: simulation, synthesis, implementation, timing analysis.

Key takeaways

1. VHDL = hardware description, not programming:

VHDL code describes hardware that physically exists. Concurrent statements and processes all operate in parallel; only the statements inside a process are evaluated in sequence. A fundamental paradigm shift.

2. Synthesizability:

Not all VHDL code can be synthesized into hardware. Avoid wait for delays (except in testbenches), loops without static bounds, division by a non-constant value, etc.
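
For example, a delay that a testbench would write as wait for 1 us has to be rebuilt in synthesizable code as a counter of clock cycles. A minimal sketch, assuming a 100 MHz clock and a hypothetical entity tick_gen:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity tick_gen is
  generic (
    DIVIDER : positive := 100   -- clock cycles between ticks (1 us at 100 MHz)
  );
  port (
    clk, reset : in  std_logic;
    tick       : out std_logic  -- high for one clock cycle every DIVIDER cycles
  );
end entity tick_gen;

architecture rtl of tick_gen is
  signal cnt : integer range 0 to DIVIDER - 1 := 0;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then        -- synchronous reset
        cnt  <= 0;
        tick <= '0';
      elsif cnt = DIVIDER - 1 then
        cnt  <= 0;
        tick <= '1';
      else
        cnt  <= cnt + 1;
        tick <= '0';
      end if;
    end if;
  end process;
end architecture rtl;

The range constraint on cnt tells the synthesizer how many bits the counter needs; an unconstrained integer would be implemented on 32 bits.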

3. Clock and synchronism:

Synchronous circuits (clocked) are more robust and predictable. Always use rising_edge(clk) to detect edges.

4. Testbench = essential:

Simulation with testbench allows validating functionality before synthesis. Debugging is much easier than after implementation.

5. Limited resources:

FPGAs have a finite number of LUTs, FFs, BRAMs. Optimization and resource sharing are important.

Practical applications

Signal processing:

Implementation of high-performance digital filters (FIR, IIR), FFT, real-time image processing.

Hardware accelerators:

Acceleration of intensive computations (cryptography, compression, AI) by offloading the processor.

Critical embedded systems:

The aeronautics, automotive, and space industries require deterministic and reliable circuits (hardwired logic on FPGA or ASIC).

ASIC prototyping:

FPGAs are used to validate designs before ASIC fabrication (high cost, non-modifiable).

Experience feedback

Learning curve:

VHDL is more difficult than traditional programming at first. It requires understanding the underlying hardware.

Progressive projects:

The progression Buffer → ALU → Registers → Memories → Processor is pedagogically excellent. Each step adds complexity.

Vivado: powerful but complex:

The tool is comprehensive but the interface can be intimidating. Synthesis/implementation times can be long.

Visual debugging:

Waveforms are indispensable. Knowing how to identify critical signals to observe greatly facilitates debugging.

Limitations and future directions

Module limitations:

  • No physical implementation on an FPGA board (simulation only)
  • Very simplified processor (no pipeline, cache, etc.)
  • Timing and optimization aspects not thoroughly covered

Future directions:

  • Advanced processors: pipeline, branch prediction, cache
  • Systems on chip (SoC): processor + integrated peripherals
  • High-Level Synthesis (HLS): generate VHDL from C/C++
  • Formal verification: assertions, model checking
  • ASIC design: application-specific integrated circuit design

Technological evolution

Current trends:

HLS (High-Level Synthesis):

Tools like Vivado HLS or Vitis HLS allow writing in C/C++ and automatically generate VHDL/Verilog. This accelerates development.

FPGA + AI:

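Modern devices target AI workloads: Versal integrates hardened AI Engines, while on Zynq UltraScale+ the DPU (Deep Learning Processing Unit) is deployed as IP in the programmable logic. Used for real-time neural network inference.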

Adaptive FPGAs (Versal):

Hybrid architecture: FPGA + CPU + DSP + AI. Dynamic reconfiguration.

Cloud FPGA:

AWS, Azure, Alibaba offer instances with FPGA for cloud acceleration.

Tips for success

1. Think hardware:

Mentally visualize the gates, flip-flops, and multiplexers generated by your VHDL code.

2. Simulate early and often:

Don't wait until all the code is written. Test each module independently.

3. Comment well:

VHDL is verbose. Clear comments help when re-reading the code later.

4. Follow conventions:

Explicit signal names (clk, reset, enable), consistent indentation.

5. Use appropriate types:

unsigned/signed for arithmetic, std_logic_vector for generic data buses.

Conclusion

This module provides an excellent introduction to digital design with VHDL and FPGA. Building a complete processor, even a simple one, enables a deep understanding of computer architecture.

Transferable skills:

  • Parallel thinking and hardware design
  • Rigor in specification and verification
  • Mastery of professional tools (Vivado)
  • Understanding of processor architecture

Professional relevance:

FPGAs are used in many domains (telecommunications, defense, medical, automotive, finance). Demand for FPGA engineers remains strong.

Key message:

VHDL and FPGAs offer total control over hardware, enabling performance and energy efficiency unattainable with software alone. Mastering these tools opens opportunities in high-performance systems.

Recommendations:

  • Practice on real FPGA boards (Basys, Arty, DE10-Nano)
  • Explore more complex projects (controllers, interfaces, DSP)
  • Learn Verilog as a complement (the competing hardware description language)
  • Train in HLS to accelerate development
  • Study real processor architectures (open-source RISC-V)


BE Pipeline Project Illustrations

Figure: Pipeline processor datapath

Figure: Arithmetic and Logic Unit (ALU) schematic

Figure: Register file

Figure: Instruction memory

Figure: Data memory

Figure: Pipeline hazard management


Reports and Projects

VHDL Project Report

Design study report: pipeline processor design in VHDL, including ALU, register file, memories and hazard management.


Course Documents

Complete VHDL Course

Complete VHDL course: syntax, processes, state machines, simulation and synthesis for FPGA.

RISC Processor Project

Design study project brief: RISC microprocessor design in VHDL with ALU, register file and memory.


Course taken in 2023-2024 at INSA Toulouse, Department of Electrical and Computer Engineering.