Speech recognition system by neural network based on FPGA

 

Huazhen Yu, Hu Fan, Bo Liu

Southeast University, Nanjing, China

 

 

Overview

This design performs keyword recognition on speech; the keyword is “Hello DongDong”.

At the front end, speech features are extracted with MFCC (Mel-frequency cepstral coefficients): each 1.5 s speech clip is split into 148 frames. The procedure of the MFCC algorithm is shown below.

img1

 

MFCC
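The usual MFCC steps (pre-emphasis, framing, windowing, FFT power spectrum, mel filterbank, logarithm, DCT) can be sketched in plain Python as follows. This is a minimal illustration, not the front end used in the demo: the sample rate, window and hop lengths, FFT size, filter count and coefficient count (SAMPLE_RATE, FRAME_LEN, HOP_LEN, N_FFT, N_MELS, N_MFCC) are assumptions, since the text gives only the 1.5 s clip length and the 148 frames.

```python
import cmath
import math

# Assumed front-end parameters; none of these values are given in the text.
SAMPLE_RATE = 16000   # Hz
FRAME_LEN = 400       # 25 ms window
HOP_LEN = 160         # 10 ms hop
N_FFT = 512
N_MELS = 26
N_MFCC = 13


def num_frames(n_samples, frame_len=FRAME_LEN, hop_len=HOP_LEN):
    """Number of full frames that fit in the signal (no padding)."""
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop_len


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels=N_MELS, n_fft=N_FFT, sr=SAMPLE_RATE):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sr / 2)
    pts = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bins = [int((n_fft + 1) * f / sr) for f in pts]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(signal):
    """Return one list of N_MFCC coefficients per frame."""
    # Pre-emphasis boosts the high-frequency part of the spectrum.
    emph = [signal[0]] + [signal[i] - 0.97 * signal[i - 1]
                          for i in range(1, len(signal))]
    # Hamming window applied to every frame.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (FRAME_LEN - 1))
              for n in range(FRAME_LEN)]
    bank = mel_filterbank()
    feats = []
    for f in range(num_frames(len(emph))):
        start = f * HOP_LEN
        frame = [emph[start + n] * window[n] for n in range(FRAME_LEN)]
        frame += [0.0] * (N_FFT - FRAME_LEN)      # zero-pad to the FFT size
        spec = fft(frame)[: N_FFT // 2 + 1]
        power = [abs(c) ** 2 / N_FFT for c in spec]
        # Log mel energies, floored to avoid log(0).
        log_mel = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                   for row in bank]
        # DCT-II decorrelates the log mel energies.
        feats.append([sum(e * math.cos(math.pi * k * (j + 0.5) / N_MELS)
                          for j, e in enumerate(log_mel))
                      for k in range(N_MFCC)])
    return feats
```

With these assumed settings, a 1.5 s clip at 16 kHz has 24000 samples and yields 1 + (24000 - 400) // 160 = 148 frames, which is consistent with the frame count stated above. The settings actually used in the design may still differ.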

An LSTM neural network recognizes the keyword in this design. The network consists of one LSTM layer with 30 LSTM cells and one fully connected layer with two units.
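A forward pass through a network of this shape can be sketched in plain Python. The sketch uses the standard LSTM gate equations and random weights; the input width N_FEAT, the choice to classify from the final hidden state, and all helper names are assumptions for illustration, and the trained weights and fixed-point details of the actual design are not reproduced here.

```python
import math
import random

N_FRAMES = 148   # frames per 1.5 s clip (from the text)
N_HIDDEN = 30    # LSTM cells (from the text)
N_OUT = 2        # fully connected units (from the text)
N_FEAT = 13      # MFCC coefficients per frame (assumption)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in w]


def lstm_step(x, h, c, params):
    """One time step of a standard LSTM layer."""
    xh = x + h  # concatenate the input and the previous hidden state
    gates = {name: [a + b for a, b in zip(matvec(w, xh), bias)]
             for name, (w, bias) in params.items()}
    i = [sigmoid(v) for v in gates["i"]]     # input gate
    f = [sigmoid(v) for v in gates["f"]]     # forget gate
    o = [sigmoid(v) for v in gates["o"]]     # output gate
    g = [math.tanh(v) for v in gates["g"]]   # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def classify(frames, params, fc_w, fc_b):
    """Run all frames through the LSTM, then the fully connected layer."""
    h = [0.0] * N_HIDDEN
    c = [0.0] * N_HIDDEN
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    logits = [a + b for a, b in zip(matvec(fc_w, h), fc_b)]
    return logits.index(max(logits)), logits


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Random placeholder weights and inputs, only to exercise the data flow.
rng = random.Random(0)
params = {name: (random_matrix(N_HIDDEN, N_FEAT + N_HIDDEN, rng),
                 [0.0] * N_HIDDEN)
          for name in "ifog"}
fc_w = random_matrix(N_OUT, N_HIDDEN, rng)
fc_b = [0.0] * N_OUT
frames = [[rng.uniform(-1.0, 1.0) for _ in range(N_FEAT)]
          for _ in range(N_FRAMES)]
label, logits = classify(frames, params, fc_w, fc_b)
```

Here `label` is the index of the larger of the two output units. Which index stands for the keyword depends on how the network was trained and is not stated in the text.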

The demo records speech with MATLAB on a PC and then sends the MFCC features to the FPGA over UART. The architecture is as follows.

img2

 

Top architecture of RTL
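The PC-to-FPGA transfer requires the floating-point MFCC features to be turned into a byte stream. The text does not describe the number format or framing used on the UART link, so the following is only a sketch under assumed conventions: signed 16-bit fixed point with 8 fractional bits (FRAC_BITS), little-endian byte order, and frames sent one after another. The function names are hypothetical.

```python
import struct

FRAC_BITS = 8  # assumed Q8.8 fixed-point format; not stated in the text


def quantize(value, frac_bits=FRAC_BITS):
    """Round to signed 16-bit fixed point, saturating at the limits."""
    scaled = int(round(value * (1 << frac_bits)))
    return max(-32768, min(32767, scaled))


def pack_features(frames):
    """Flatten the frames row by row into little-endian int16 bytes."""
    flat = [quantize(v) for frame in frames for v in frame]
    return struct.pack("<%dh" % len(flat), *flat)


def unpack_features(payload, n_feat, frac_bits=FRAC_BITS):
    """Inverse of pack_features, e.g. for checking a loopback test."""
    count = len(payload) // 2
    flat = struct.unpack("<%dh" % count, payload)
    scale = float(1 << frac_bits)
    return [[v / scale for v in flat[i:i + n_feat]]
            for i in range(0, count, n_feat)]


# Two frames of two features each; 200.0 exceeds the Q8.8 range and saturates.
payload = pack_features([[1.5, -2.25], [0.0, 200.0]])
decoded = unpack_features(payload, n_feat=2)
```

The resulting `payload` bytes would then be written to the serial port by whatever serial interface the host uses. Saturating instead of wrapping keeps an out-of-range feature from flipping sign on the receiving side.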

The RTL architecture of the LSTM layer is shown in the following figure.

img3

 

LSTM layer top architecture

The LSTM cell is implemented with the following architecture.

img4

 

 

LSTM cell RTL architecture

The operation parallelism is adjustable.
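One common way to make parallelism adjustable is to time-multiplex a configurable number of multiply-accumulate (MAC) units over each dot product. The text does not say how the parameter is realized in this RTL, so the following is only a behavioural model of that general idea; the function name and the vector length of 43 (13 assumed inputs plus 30 hidden states) are hypothetical.

```python
def mac_dot(weights, inputs, parallelism):
    """Dot product computed in chunks of `parallelism` products per cycle.

    Returns (result, cycles). Only the time-multiplexing of the MAC units
    is modelled; pipeline latency and memory bandwidth are ignored.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    acc = 0.0
    cycles = 0
    for start in range(0, len(weights), parallelism):
        chunk_w = weights[start:start + parallelism]
        chunk_x = inputs[start:start + parallelism]
        acc += sum(w * x for w, x in zip(chunk_w, chunk_x))
        cycles += 1
    return acc, cycles


# Same dot product at three assumed parallelism settings.
weights = [1.0] * 43
inputs = [2.0] * 43
serial = mac_dot(weights, inputs, parallelism=1)
partial = mac_dot(weights, inputs, parallelism=8)
full = mac_dot(weights, inputs, parallelism=43)
```

In this model the result is the same at every setting, while the cycle count falls from 43 to 6 to 1 as more multipliers work in parallel. That is the usual trade of FPGA resources against latency.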

Source Code GitHub Link

https://github.com/Compiler-sim/RTL-Testbench/tree/master/LSTM%2BFC

Published: 2019-08-21 23:04