language model and long short-term memory (LSTM) recurrent neural network speech model.

3 Decoder

The ultimate goal of speech recognition is to find the optimal word sequence in a search space composed of all possible word sequences. This is essentially a search, or decoding, problem, and it is the task the decoder performs.

Search space

Speech recognition searches for the optimal word
sequence. All candidate word sequences together constitute the search space of the decoding process. There are many ways to construct the decoding search space, and they fall into two categories: dynamically compiled decoding spaces and statically compiled decoding spaces. Dynamic compilation precompiles only the pronunciation dictionary into a state network to form the search space; the other knowledge sources are integrated dynamically during decoding, according to the history information carried on the active path. With static compilation, the idea is to compile all knowledge sources into a
state network; probability information is then obtained from the transition weights between nodes during decoding.

Dynamic search space decoding algorithm

The problem of finding the optimal word sequence in speech recognition can be transformed into the problem of finding the optimal state sequence in a search space built from a tree-structured dictionary. This problem is generally solved using the Viterbi