Speech recognition system by neural network based on FPGA
Huazhen Yu, Hu Fan, Bo Liu
Southeast University, Nanjing, China
This design implements speech recognition for keyword spotting; the keyword is “Hello DongDong”.
At the front end, speech features are extracted with MFCC (Mel-frequency cepstral coefficients) from each 1.5 s speech segment, which is divided into 148 frames. The procedure of the MFCC algorithm is shown in the following figure.
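The front-end steps can be sketched in plain Python as a minimal illustration of a standard MFCC pipeline (pre-emphasis, framing, Hamming window, FFT, mel filterbank, log, DCT). Apart from the 1.5 s segment length and the 148 frames, every parameter here is an assumption and is not taken from the design: a 16 kHz sample rate, a 25 ms window with a 10 ms hop (which happens to yield exactly 148 frames for 1.5 s), a 512-point FFT, 26 mel filters, 13 coefficients, and a pre-emphasis factor of 0.97.

```python
import cmath
import math

# All constants below are assumptions for illustration, not values from the design.
SAMPLE_RATE = 16000   # Hz
FRAME_LEN = 400       # 25 ms window
FRAME_HOP = 160       # 10 ms hop
N_FFT = 512
N_MEL = 26
N_CEPS = 13

def num_frames(n_samples, frame_len=FRAME_LEN, hop=FRAME_HOP):
    """Number of full frames that fit into n_samples."""
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mel=N_MEL, n_fft=N_FFT, sr=SAMPLE_RATE):
    """Triangular filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(0.0), hz_to_mel(sr / 2)
    mels = [lo + (hi - lo) * i / (n_mel + 1) for i in range(n_mel + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sr) for m in mels]
    bank = [[0.0] * (n_fft // 2 + 1) for _ in range(n_mel)]
    for j in range(n_mel):
        left, center, right = bins[j], bins[j + 1], bins[j + 2]
        for k in range(left, center):
            bank[j][k] = (k - left) / (center - left)
        for k in range(center, right):
            bank[j][k] = (right - k) / (right - center)
    return bank

def mfcc(signal):
    """Return one list of N_CEPS coefficients per frame."""
    # Pre-emphasis boosts high frequencies.
    emph = [signal[0]] + [signal[i] - 0.97 * signal[i - 1]
                          for i in range(1, len(signal))]
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (FRAME_LEN - 1))
              for i in range(FRAME_LEN)]
    bank = mel_filterbank()
    feats = []
    for f in range(num_frames(len(emph))):
        start = f * FRAME_HOP
        frame = [emph[start + i] * window[i] for i in range(FRAME_LEN)]
        frame += [0.0] * (N_FFT - FRAME_LEN)          # zero-pad to FFT size
        spec = fft(frame)[:N_FFT // 2 + 1]
        power = [abs(c) ** 2 / N_FFT for c in spec]
        log_mel = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                   for row in bank]
        # DCT-II of the log mel energies gives the cepstral coefficients.
        feats.append([sum(log_mel[m] * math.cos(math.pi * c * (m + 0.5) / N_MEL)
                          for m in range(N_MEL))
                      for c in range(N_CEPS)])
    return feats
```

Under these assumed settings, a 1.5 s segment holds 24000 samples, and `num_frames(24000)` gives 148, matching the frame count stated above. The actual framing parameters and coefficient count of the design may differ.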
An LSTM neural network is used to recognize the keyword. The network consists of one LSTM layer with 30 LSTM cells and one fully connected layer with two units.
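The network structure described above can be sketched as a forward pass in plain Python. This is an illustrative model only: the 30 LSTM cells and the two-unit fully connected layer come from the description, while the input size of 13 features per frame, the standard LSTM gate equations, the use of the final hidden state as input to the fully connected layer, the softmax output, and the random untrained weights are all assumptions.

```python
import math
import random

N_IN = 13    # features per frame (assumed, not stated in the design)
N_HID = 30   # LSTM cells, as described
N_OUT = 2    # fully connected units, as described

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def matvec(w, v):
    return [sum(a * b for a, b in zip(row, v)) for row in w]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def init_params(seed=0):
    """Random placeholder weights; a real system would load trained values."""
    rng = random.Random(seed)
    p = {}
    for gate in "ifgo":  # input, forget, candidate, output
        p["W" + gate] = rand_matrix(N_HID, N_IN + N_HID, rng)
        p["b" + gate] = [0.0] * N_HID
    p["Wfc"] = rand_matrix(N_OUT, N_HID, rng)
    p["bfc"] = [0.0] * N_OUT
    return p

def gate(p, name, z, act):
    return [act(a + b) for a, b in zip(matvec(p["W" + name], z), p["b" + name])]

def lstm_step(p, x, h, c):
    """One time step of the LSTM layer for input x and state (h, c)."""
    z = list(x) + list(h)                  # concatenate input and hidden state
    i = gate(p, "i", z, sigmoid)
    f = gate(p, "f", z, sigmoid)
    g = gate(p, "g", z, math.tanh)
    o = gate(p, "o", z, sigmoid)
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def classify(p, frames):
    """Run all frames through the LSTM, then the fully connected layer."""
    h = [0.0] * N_HID
    c = [0.0] * N_HID
    for x in frames:
        h, c = lstm_step(p, x, h, c)
    logits = [a + b for a, b in zip(matvec(p["Wfc"], h), p["bfc"])]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]       # two class probabilities
```

Calling `classify(init_params(), frames)` on 148 feature frames returns two probabilities, which could be read as "keyword" versus "not keyword"; that interpretation of the two output units is also an assumption.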
Top-level RTL architecture
The RTL architecture of the LSTM layer is shown in the following figure.
LSTM layer top-level architecture
The LSTM cell is implemented with the following architecture.
LSTM cell RTL architecture
The operation parallelism is adjustable.
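One way to picture adjustable parallelism is a dot product folded over a configurable number of multiply-accumulate lanes: more lanes use more hardware but need fewer cycles, and the result stays the same. The sketch below is a hypothetical software model of that trade-off; the function name `folded_dot`, the `lanes` parameter, and the vector length of 43 (13 assumed inputs plus 30 hidden values) are illustrative and do not describe how the RTL is actually parameterized.

```python
def folded_dot(weights, inputs, lanes):
    """Dot product that processes `lanes` multiply-accumulates per cycle.

    Returns (result, cycles). `lanes` models a hypothetical parallelism
    parameter; it is not a name taken from the design.
    """
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    acc = 0
    cycles = 0
    for start in range(0, len(weights), lanes):
        w_chunk = weights[start:start + lanes]
        x_chunk = inputs[start:start + lanes]
        # All products in a chunk are computed in the same modeled cycle.
        acc += sum(w * x for w, x in zip(w_chunk, x_chunk))
        cycles += 1
    return acc, cycles

# Example: one gate of one cell, assuming 13 inputs + 30 hidden values.
weights = list(range(43))
inputs = [1] * 43
for lanes in (1, 8, 43):
    result, cycles = folded_dot(weights, inputs, lanes)
    print(lanes, result, cycles)
```

In this model the cycle count is the vector length divided by the lane count, rounded up, so the choice of lanes trades FPGA resources against latency.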
Source code: GitHub link