Speaker Recognition using RBF Neural Network Trained LPC and MFCC Features

This code, written in MATLAB R2017a, performs speaker recognition using LPC and MFCC features. The recognition accuracy of the two feature sets is compared, and the analysis indicates that MFCC features perform well for speaker recognition. A radial basis function (RBF) neural network is used to classify the features.

For training feature extraction, 5 different speakers, both male and female, were selected. Each speaker spoke five words (“next, pause, play, return, stop”), repeating each word 12 times. This yields a data set of 300 feature vectors (5 speakers × 5 words × 12 repetitions), so the training feature matrix has 300 rows and as many columns as the LPC order.

Description

%%%%%%%%% To check RBFNN training progress, see the command window
%%%%%%%%% during execution


clear all;
clc;
close all

========== Extract LPC coefficients from the training speech signals ========

For training feature extraction, 5 different speakers, both male and female, were selected, and each speaker spoke five words (“next, pause, play, return, stop”), repeating each word 12 times. This gives a data set of 300 feature vectors, one per utterance. The training feature matrix therefore has 300 rows and as many columns as the LPC order.
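The data set size follows from simple counting. A minimal sketch, assuming the files are read speaker by speaker and that the label is the speaker index 1 to 5 (the listing confirms neither; nSpeakers, nWords, nReps and target_sketch are illustrative names, not variables from the original code):

```matlab
nSpeakers = 5;    % speakers in the data set
nWords    = 5;    % next, pause, play, return, stop
nReps     = 12;   % training repetitions per word

nTrain = nSpeakers * nWords * nReps;   % number of training utterances

% Assumed layout: all utterances of speaker 1 first, then speaker 2, and so on.
% kron repeats each speaker index nWords*nReps times.
target_sketch = kron((1:nSpeakers)', ones(nWords * nReps, 1));   % nTrain-by-1 labels
```

With this layout each speaker contributes 60 consecutive rows, which matches the shape that newrb expects after the transpose in the script.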

%--------------------------------------------------------------------------
% read the training files folder
trainingdir=fullfile(pwd,'Training Speech Signal'); % portable path to the training folder
order=12; % LPC order

[trainingfeatures , target]= LPC_features(trainingdir,order); % call function to extract LPC features
[MFCC_trainingfeatures, MFCC_target]= MFCC_features(trainingdir,order);% call function to extract MFCC features
save LPC_trainingdata trainingfeatures target
save MFCC_trainingdata MFCC_trainingfeatures MFCC_target
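LPC_features is not listed on this page, so its internals are unknown. As a rough idea of what computing order-12 LPC coefficients for one frame can look like, here is a sketch of the autocorrelation method with the Levinson-Durbin recursion in base MATLAB. The synthetic frame, the frame length, the window and the name lpc_sketch are all assumptions for illustration and are not taken from the author's code:

```matlab
% Synthetic frame (assumption: 30 ms at 8 kHz, not the author's recordings)
fs = 8000;
n  = (0:239)';
x  = sin(2*pi*440*n/fs) + 0.3*sin(1.7*n.^2);          % tone plus a broadband term
x  = x .* (0.54 - 0.46*cos(2*pi*n/(numel(n)-1)));     % Hamming window

p = 12;   % LPC order, as in the script

% Autocorrelation at lags 0..p (r(1) is lag 0)
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(x(1:end-k) .* x(1+k:end));
end

% Levinson-Durbin recursion for A(z) = 1 + a_1 z^-1 + ... + a_p z^-p
a = 1;       % polynomial coefficients, a(1) is always 1
E = r(1);    % prediction error energy
for m = 1:p
    kref = -(a' * r(m+1:-1:2)) / E;          % reflection coefficient
    a    = [a; 0] + kref * [0; flipud(a)];   % order update
    E    = E * (1 - kref^2);                 % error energy shrinks each step
end

lpc_sketch = a(2:end)';   % 1-by-p feature row for this frame
```

Stacking one such row per utterance (or per frame, depending on how the features are pooled) would give a matrix with the LPC order as its column count.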

======== Extract the LPC and MFCC features for the testing speech signals ========

Each speaker repeated each word 6 times, using the same words as in the training dataset.

testingdir=fullfile(pwd,'testing Speech Signal'); % portable path to the testing folder
order=12; % LPC order

[testingfeatures, ref]= LPC_features(testingdir,order); % call function to extract LPC features
[MFCC_testingfeatures, MFCC_ref]= MFCC_features(testingdir,order);% call function to extract MFCC features
save LPC_testingdata testingfeatures
save MFCC_testingdata MFCC_testingfeatures
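MFCC_features is likewise not shown here. The sketch below illustrates one common way to compute MFCCs for a single frame in base MATLAB: power spectrum, triangular mel filterbank, logarithm, then a DCT-II. The frame, the 20-filter bank, the mel formula constants and the name mfcc_sketch are assumptions chosen for illustration, and the author's function may differ in every one of them:

```matlab
fs = 8000;  N = 256;                                   % assumed sample rate and frame length
n  = (0:N-1)';
frame = sin(2*pi*440*n/fs) .* (0.54 - 0.46*cos(2*pi*n/(N-1)));   % windowed test tone

nfilt = 20;   % number of mel filters (assumption)
ncep  = 12;   % number of cepstral coefficients, matching order in the script

% One-sided power spectrum and the frequency of each bin
Pw = abs(fft(frame)).^2;
Pw = Pw(1:N/2+1);
f  = (0:N/2)' * fs / N;

% Mel scale conversions (a widely used formula)
hz2mel = @(hz) 2595 * log10(1 + hz/700);
mel2hz = @(ml) 700 * (10.^(ml/2595) - 1);
edges  = mel2hz(linspace(hz2mel(0), hz2mel(fs/2), nfilt+2));   % filter edges in Hz

% Triangular filters: rise from edges(m) to edges(m+1), fall to edges(m+2)
H = zeros(nfilt, N/2+1);
for m = 1:nfilt
    up   = (f - edges(m))   / (edges(m+1) - edges(m));
    down = (edges(m+2) - f) / (edges(m+2) - edges(m+1));
    H(m, :) = max(0, min(up, down))';
end

logE = log(H * Pw + eps);   % log filterbank energies (eps avoids log(0))

% DCT-II of the log energies, keeping coefficients 1..ncep
kk = (1:ncep)';
jj = 1:nfilt;
Dct = cos(pi * kk * (jj - 0.5) / nfilt);   % ncep-by-nfilt basis
mfcc_sketch = (Dct * logE)';               % 1-by-ncep feature row
```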

========Radial basis Neural Network training for LPC=====================

input_layer_size  = order;  % input layer size
hidden_layer_size = 25;   % nominal hidden-layer size (not passed to newrb below)
num_labels = 5;          % 5 labels, from 1 to 5
eg = 0.02; % sum-squared error goal
sc = 1;    % spread constant
a=radbas(trainingfeatures);
figure(1)
plot(trainingfeatures,a)
title('Radial basis transfer function for LPC features')
xlabel('Input Training Features');
ylabel('Output a');
net=newrb(trainingfeatures',target',eg,sc); % create radial basis network
Y = sim(net,testingfeatures');
output=round(Y);% output label
figure(2)
plot(trainingfeatures(:,1),target,'*')
hold on
plot(testingfeatures(:,1),output,'r*')
legend('Target','output')
title('Labels for testing & trained LPC features for 1st order')
xlabel('Input');
ylabel('Labels');
NEWRB, neurons = 0, MSE = 2
NEWRB, neurons = 50, MSE = 0.687671
NEWRB, neurons = 100, MSE = 0.329234
NEWRB, neurons = 150, MSE = 0.169144
NEWRB, neurons = 200, MSE = 0.0809509
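To make the RBF classification step more concrete without the toolbox, the sketch below builds a small exact-design RBF network by hand: one Gaussian unit per training sample and a linear output layer solved by least squares. This is not the incremental neuron-adding procedure that newrb performs (visible in the log above), only a simplified stand-in. The toy data and the names P, T, A1, Wb and labels are hypothetical:

```matlab
% Toy data (hypothetical): columns are samples, two well-separated classes
P = [0 0 1 1 3 3 4 4;
     0 1 0 1 3 4 3 4];
T = [1 1 1 1 2 2 2 2];

spread = 1;                      % plays the same role as sc in the script
b = sqrt(-log(0.5)) / spread;    % a unit's response falls to 0.5 at distance = spread

% Pairwise distances between every centre (row) and every sample (column)
Q = size(P, 2);
D = zeros(Q, Q);
for i = 1:Q
    for j = 1:Q
        D(i, j) = norm(P(:, i) - P(:, j));
    end
end

A1 = exp(-(b * D).^2);           % hidden-layer outputs, exp(-n^2) as in radbas

% Linear output layer: weights and bias solved in one least-squares step
Wb = T / [A1; ones(1, Q)];
Y  = Wb * [A1; ones(1, Q)];      % network output on the training samples

% Round to the nearest label and clamp to the valid range
labels = min(max(round(Y), 1), 2);
```

The clamp in the last line is a design choice worth considering for the script as well, since round(Y) alone can produce values outside 1 to 5 when the network extrapolates.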

Evaluation for LPC

for ii=1:5
    [accuracy(ii),precision(ii), recall(ii) ]= Evaluate(ref',output,ii);
end
  LPC_acc=mean(accuracy)*100;   % accuracy
  LPC_preci=mean(precision)*100;% precision
  LPC_re=mean(recall)*100;% recall
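The Evaluate function is not included in this listing. A plausible reading of the loop above is a one-vs-rest evaluation per class, which can be sketched as follows. The example label vectors, the variable names and the choice to report 0 when a denominator is empty are all assumptions, not the author's implementation:

```matlab
ref_sketch = [1 1 2 2 3 3];   % hypothetical true labels
out_sketch = [1 2 2 2 3 1];   % hypothetical predicted labels
nClasses   = 3;

acc  = zeros(1, nClasses);
prec = zeros(1, nClasses);
rec  = zeros(1, nClasses);

for c = 1:nClasses
    tp = sum(out_sketch == c & ref_sketch == c);   % true positives
    fp = sum(out_sketch == c & ref_sketch ~= c);   % false positives
    fn = sum(out_sketch ~= c & ref_sketch == c);   % false negatives
    tn = sum(out_sketch ~= c & ref_sketch ~= c);   % true negatives

    acc(c)  = (tp + tn) / numel(ref_sketch);
    prec(c) = tp / max(tp + fp, 1);   % assumption: report 0 if class never predicted
    rec(c)  = tp / max(tp + fn, 1);   % assumption: report 0 if class never occurs
end

mean_acc  = mean(acc)  * 100;   % averaged over classes, as in the script
mean_prec = mean(prec) * 100;
mean_rec  = mean(rec)  * 100;
```

Note that per-class one-vs-rest accuracy counts true negatives, so its mean is usually higher than the plain fraction of correctly labelled utterances.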

========Radial basis Neural Network training for MFCC=====================

input_layer_size  = order;  % input layer size
hidden_layer_size = 25;   % nominal hidden-layer size (not passed to newrb below)
num_labels = 5;          % 5 labels, from 1 to 5
a=radbas(MFCC_trainingfeatures);
figure(3)
plot(MFCC_trainingfeatures,a)
title('Radial basis Transfer function for MFCC features ')
xlabel('Input Training Features');
ylabel('Output a');
net=newrb(MFCC_trainingfeatures',MFCC_target',eg,sc); % create radial basis network
Y = sim(net,MFCC_testingfeatures');
MFCC_output=round(Y);% output label
figure(4)
plot(MFCC_trainingfeatures(:,1),MFCC_target,'*')
hold on
plot(MFCC_testingfeatures(:,1),MFCC_output,'r*')
legend('Target','output')
title('Labels for testing & trained MFCC features for 1st order')
xlabel('Input');
ylabel('Labels');
NEWRB, neurons = 0, MSE = 2
NEWRB, neurons = 50, MSE = 0.558296
NEWRB, neurons = 100, MSE = 0.266055
NEWRB, neurons = 150, MSE = 0.124313
NEWRB, neurons = 200, MSE = 0.0321756

Evaluation for MFCC

for ii=1:5
    [accuracy(ii),precision(ii), recall(ii) ]= Evaluate(MFCC_ref',MFCC_output,ii);
end
  MFCC_acc=mean(accuracy)*100;   % accuracy
  MFCC_preci=mean(precision)*100;% precision
  MFCC_re=mean(recall)*100;% recall

Plotting of comparison

figure(5)
bar([LPC_acc,MFCC_acc;LPC_preci,MFCC_preci;LPC_re,MFCC_re])
set(gca,'XTick',0)
text(0.8,-5,'Accuracy')
text(1.8,-5,'Precision')
text(2.8,-5,'Recall')
legend('LPC','MFCC')
title('Comparison Plot for evaluation parameters')

Queries

1. Is this for speech recognition or speaker recognition?
Ans: Speaker recognition.
2. Plot the original signal with both the LPC and MFCC estimated signals (feature extraction stage), as well as the prediction error for both methods.
Ans: Done.
3. Plot the error spectrum and the signal spectrum for both.
Ans: Done.
4. Plot the original signal and the estimated signal after RBFN classification for both.
Ans: See Figure 4.
5. Calculate the learning rate, recognition rate and MC rate.
Ans: The RBFNN does not need a learning rate or recognition rate as input. For more information, see the RBFNN functions in the MATLAB toolbox.
6. Cite the reference for your speech files.
Ans: Self-recorded with six different objects.
7. Explain how you converted the speech signals into the dataset.
Ans: Used MATLAB's audioread function.
8. What are the percentages of training and testing data?
Ans: 70% and 30%, respectively.
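
The split in the last answer can be cross-checked from the repetition counts given in the description. A minimal sketch, assuming every speaker and word contributes 12 training and 6 testing recordings (repsTrain, repsTest, pctTrain and pctTest are illustrative names):

```matlab
repsTrain = 12;   % training repetitions per word and speaker
repsTest  = 6;    % testing repetitions per word and speaker

pctTrain = 100 * repsTrain / (repsTrain + repsTest);   % share used for training
pctTest  = 100 - pctTrain;                             % share used for testing
```

Under that assumption the split works out to roughly two thirds and one third, which is close to the rounded 70% and 30% quoted above.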
