ProtoTracer  1.0
Real-time 3D rendering and animation engine
FFTVoiceDetection< peakCount > Class Template Reference

Detects visemes based on FFT voice analysis.

#include <FFTVoiceDetection.h>

Inheritance diagram for FFTVoiceDetection< peakCount >:
Collaboration diagram for FFTVoiceDetection< peakCount >:

Public Member Functions

 FFTVoiceDetection ()
 Constructs a new FFTVoiceDetection instance.
 
void SetThreshold (float threshold)
 Sets the threshold for formant calculations.
 
float GetViseme (MouthShape viseme)
 Retrieves the probability of a specific viseme.
 
void PrintVisemes ()
 Prints the probabilities of all visemes to the serial console.
 
void ResetVisemes ()
 Resets all viseme probabilities to zero.
 
void Update (float *peaks, float maxFrequency)
 Updates the viseme probabilities based on new FFT data.
 

Private Member Functions

void CalculateFormants (float *peaks, uint8_t bandwidth)
 Calculates formant frequencies (F1 and F2) from FFT peaks.
 
void CalculateVisemeGroup ()
 Calculates the viseme group probabilities based on formants.
 

Private Attributes

Vector2D visEE = Vector2D(350.0f, 3200.0f)
 Coordinates for "EE".
 
Vector2D visAE = Vector2D(500.0f, 2700.0f)
 Coordinates for "AE".
 
Vector2D visUH = Vector2D(1100.0f, 2700.0f)
 Coordinates for "UH".
 
Vector2D visAR = Vector2D(850.0f, 850.0f)
 Coordinates for "AR".
 
Vector2D visER = Vector2D(1000.0f, 1000.0f)
 Coordinates for "ER".
 
Vector2D visAH = Vector2D(900.0f, 2400.0f)
 Coordinates for "AH".
 
Vector2D visOO = Vector2D(600.0f, 600.0f)
 Coordinates for "OO".
 
Vector2D * coordinates [visemeCount] = { &visEE, &visAE, &visUH, &visAR, &visER, &visAH, &visOO }
 Array of viseme coordinates.
 
float visRatioEE = 0.0f
 Probability for "EE".
 
float visRatioAE = 0.0f
 Probability for "AE".
 
float visRatioUH = 0.0f
 Probability for "UH".
 
float visRatioAR = 0.0f
 Probability for "AR".
 
float visRatioER = 0.0f
 Probability for "ER".
 
float visRatioAH = 0.0f
 Probability for "AH".
 
float visRatioOO = 0.0f
 Probability for "OO".
 
float * visRatios [visemeCount] = { &visRatioEE, &visRatioAE, &visRatioUH, &visRatioAR, &visRatioER, &visRatioAH, &visRatioOO }
 Array of viseme probabilities.
 
PeakDetection< peakCount > peakDetection = PeakDetection<peakCount>(8, 2.0f, 0.5f)
 Peak detection instance.
 
RunningAverageFilter< 10 > peakSmoothing = RunningAverageFilter<10>(0.1f)
 Smoothing filter for peak data.
 
bool peaksBinary [peakCount]
 Binary array indicating peak presence.
 
float peakDensity [peakCount]
 Array of peak densities.
 
float f1
 Formant frequency F1.
 
float f2
 Formant frequency F2.
 
float threshold = 400.0f
 Threshold for formant calculations.
 

Static Private Attributes

static const uint8_t visemeCount = 7
 Number of supported visemes.
 

Additional Inherited Members

- Public Types inherited from Viseme
enum  MouthShape {
  EE , AE , UH , AR ,
  ER , AH , OO , SS
}
 Enumerates the possible mouth shapes for viseme detection.
 

Detailed Description

template<size_t peakCount>
class FFTVoiceDetection< peakCount >

Detects visemes based on FFT voice analysis.

The FFTVoiceDetection class uses formant frequencies (F1 and F2) derived from FFT peaks to detect and assign probabilities to various mouth shapes (visemes). It employs peak detection, smoothing filters, and threshold-based calculations to determine the most probable viseme.

Template Parameters
peakCount  The number of peaks to analyze in the FFT data.

Definition at line 53 of file FFTVoiceDetection.h.
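
The idea of matching a formant pair against reference points can be sketched as a nearest-neighbour lookup. This is a minimal illustration, not the class's actual implementation: the `Point2D` struct, the `NearestViseme` function, and the reading of each coordinate as (F1, F2) in Hz are assumptions made for this example. Only the coordinate values are taken from the private attributes listed on this page.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical stand-in for the engine's Vector2D; only two floats are needed.
struct Point2D {
    float x;  // assumed to be F1 in Hz
    float y;  // assumed to be F2 in Hz
};

// Reference coordinates copied from this page, in the order
// EE, AE, UH, AR, ER, AH, OO (the order of the coordinates array).
static const Point2D kVisemeCoords[7] = {
    {350.0f, 3200.0f}, {500.0f, 2700.0f}, {1100.0f, 2700.0f},
    {850.0f, 850.0f},  {1000.0f, 1000.0f}, {900.0f, 2400.0f},
    {600.0f, 600.0f}};

// Returns the index of the reference point closest to (f1, f2).
// Squared distances are compared, so no square root is required.
std::size_t NearestViseme(float f1, float f2) {
    std::size_t best = 0;
    float bestDistSq = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < 7; ++i) {
        const float dx = f1 - kVisemeCoords[i].x;
        const float dy = f2 - kVisemeCoords[i].y;
        const float distSq = dx * dx + dy * dy;
        if (distSq < bestDistSq) {
            bestDistSq = distSq;
            best = i;
        }
    }
    return best;
}
```

With these values, a formant pair such as (360, 3100) lands on index 0 ("EE"). Note that the MouthShape enum also lists SS, which has no coordinate on this page (visemeCount is 7), so the sketch never returns it.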

Constructor & Destructor Documentation

◆ FFTVoiceDetection()

template<size_t peakCount>
FFTVoiceDetection ( )
inline

Constructs a new FFTVoiceDetection instance.

Definition at line 107 of file FFTVoiceDetection.h.

Member Function Documentation

◆ CalculateFormants()

template<size_t peakCount>
void CalculateFormants ( float * peaks,
uint8_t  bandwidth 
)
private

Calculates formant frequencies (F1 and F2) from FFT peaks.

Parameters
peaks  Array of FFT peak values.
bandwidth  Bandwidth of the FFT data.
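
This page does not document how the formants are derived from the peak array. One generic approach, shown here only as a sketch, is to take the two strongest local maxima of the magnitude array and convert their bin indices to Hz. The `Formants` struct, the `EstimateFormants` function, and the assumption of uniform bins spanning 0 to `maxFrequency` are all hypothetical and may differ from what CalculateFormants actually does.

```cpp
#include <cstddef>

// Hypothetical result type: both values are in Hz, 0 when not found.
struct Formants {
    float f1;  // lower of the two detected frequencies
    float f2;  // higher of the two detected frequencies
};

// Picks the two strongest local maxima in `peaks` and maps their bin
// indices to Hz, assuming `count` uniform bins covering 0..maxFrequency.
Formants EstimateFormants(const float* peaks, std::size_t count, float maxFrequency) {
    std::size_t first = count;   // index of strongest maximum; count means "none"
    std::size_t second = count;  // index of second strongest maximum
    for (std::size_t i = 1; i + 1 < count; ++i) {
        const bool isLocalMax = peaks[i] > peaks[i - 1] && peaks[i] >= peaks[i + 1];
        if (!isLocalMax) continue;
        if (first == count || peaks[i] > peaks[first]) {
            second = first;
            first = i;
        } else if (second == count || peaks[i] > peaks[second]) {
            second = i;
        }
    }

    Formants out{0.0f, 0.0f};
    if (first == count) return out;  // no local maximum at all

    const float binWidth = maxFrequency / static_cast<float>(count);
    const float a = static_cast<float>(first) * binWidth;
    if (second == count) {  // only one maximum: report it as f1, leave f2 at 0
        out.f1 = a;
        return out;
    }
    const float b = static_cast<float>(second) * binWidth;
    out.f1 = a < b ? a : b;
    out.f2 = a < b ? b : a;
    return out;
}
```

Ordering the result by frequency rather than by magnitude keeps f1 below f2, which matches the usual convention for formants.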

◆ CalculateVisemeGroup()

template<size_t peakCount>
void CalculateVisemeGroup ( )
private

Calculates the viseme group probabilities based on formants.
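
How the probabilities are computed is not described on this page. As one possible illustration, the sketch below turns distances in formant space into weights that sum to 1 using inverse-distance weighting. The function name `SoftVisemeRatios`, its parameters, and the weighting scheme itself are assumptions chosen for this example and should not be read as the behaviour of CalculateVisemeGroup.

```cpp
#include <cmath>
#include <cstddef>

// Fills `ratios` with weights that sum to 1 and grow as the reference
// point (refF1[i], refF2[i]) gets closer to the measured pair (f1, f2).
// All frequencies are in Hz; `count` is the number of reference points.
void SoftVisemeRatios(float f1, float f2,
                      const float* refF1, const float* refF2,
                      std::size_t count, float* ratios) {
    const float epsilon = 1.0f;  // in Hz; avoids division by zero on an exact match
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        const float dist = std::hypot(f1 - refF1[i], f2 - refF2[i]);
        ratios[i] = 1.0f / (dist + epsilon);
        total += ratios[i];
    }
    for (std::size_t i = 0; i < count; ++i) {
        ratios[i] /= total;  // normalise so the weights sum to 1
    }
}
```

A hard assignment, where only the closest viseme receives a non-zero value, is an equally plausible design. The soft version is shown because it makes the notion of a per-viseme probability explicit.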

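◆ GetViseme()

template<size_t peakCount>
float GetViseme ( MouthShape viseme )

Retrieves the probability of a specific viseme.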

◆ PrintVisemes()

template<size_t peakCount>
void PrintVisemes ( )

Prints the probabilities of all visemes to the serial console.

◆ ResetVisemes()

template<size_t peakCount>
void ResetVisemes ( )

Resets all viseme probabilities to zero.

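◆ SetThreshold()

template<size_t peakCount>
void SetThreshold ( float threshold )

Sets the threshold for formant calculations.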

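◆ Update()

template<size_t peakCount>
void Update ( float * peaks,
float  maxFrequency 
)

Updates the viseme probabilities based on new FFT data.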

Member Data Documentation

◆ coordinates

template<size_t peakCount>
Vector2D* coordinates[visemeCount] = { &visEE, &visAE, &visUH, &visAR, &visER, &visAH, &visOO }
private

Array of viseme coordinates.

Definition at line 66 of file FFTVoiceDetection.h.

◆ f1

template<size_t peakCount>
float f1
private

Formant frequency F1.

Definition at line 85 of file FFTVoiceDetection.h.

◆ f2

template<size_t peakCount>
float f2
private

Formant frequency F2.

Definition at line 86 of file FFTVoiceDetection.h.

◆ peakDensity

template<size_t peakCount>
float peakDensity[peakCount]
private

Array of peak densities.

Definition at line 83 of file FFTVoiceDetection.h.

◆ peakDetection

template<size_t peakCount>
PeakDetection<peakCount> peakDetection = PeakDetection<peakCount>(8, 2.0f, 0.5f)
private

Peak detection instance.

Definition at line 79 of file FFTVoiceDetection.h.

◆ peaksBinary

template<size_t peakCount>
bool peaksBinary[peakCount]
private

Binary array indicating peak presence.

Definition at line 82 of file FFTVoiceDetection.h.

◆ peakSmoothing

template<size_t peakCount>
RunningAverageFilter<10> peakSmoothing = RunningAverageFilter<10>(0.1f)
private

Smoothing filter for peak data.

Definition at line 80 of file FFTVoiceDetection.h.

◆ threshold

template<size_t peakCount>
float threshold = 400.0f
private

Threshold for formant calculations.

Definition at line 88 of file FFTVoiceDetection.h.

◆ visAE

template<size_t peakCount>
Vector2D visAE = Vector2D(500.0f, 2700.0f)
private

Coordinates for "AE".

Definition at line 59 of file FFTVoiceDetection.h.

◆ visAH

template<size_t peakCount>
Vector2D visAH = Vector2D(900.0f, 2400.0f)
private

Coordinates for "AH".

Definition at line 63 of file FFTVoiceDetection.h.

◆ visAR

template<size_t peakCount>
Vector2D visAR = Vector2D(850.0f, 850.0f)
private

Coordinates for "AR".

Definition at line 61 of file FFTVoiceDetection.h.

◆ visEE

template<size_t peakCount>
Vector2D visEE = Vector2D(350.0f, 3200.0f)
private

Coordinates for "EE".

Definition at line 58 of file FFTVoiceDetection.h.

◆ visemeCount

template<size_t peakCount>
const uint8_t visemeCount = 7
static private

Number of supported visemes.

Definition at line 55 of file FFTVoiceDetection.h.

◆ visER

template<size_t peakCount>
Vector2D visER = Vector2D(1000.0f, 1000.0f)
private

Coordinates for "ER".

Definition at line 62 of file FFTVoiceDetection.h.

◆ visOO

template<size_t peakCount>
Vector2D visOO = Vector2D(600.0f, 600.0f)
private

Coordinates for "OO".

Definition at line 64 of file FFTVoiceDetection.h.

◆ visRatioAE

template<size_t peakCount>
float visRatioAE = 0.0f
private

Probability for "AE".

Definition at line 70 of file FFTVoiceDetection.h.

◆ visRatioAH

template<size_t peakCount>
float visRatioAH = 0.0f
private

Probability for "AH".

Definition at line 74 of file FFTVoiceDetection.h.

◆ visRatioAR

template<size_t peakCount>
float visRatioAR = 0.0f
private

Probability for "AR".

Definition at line 72 of file FFTVoiceDetection.h.

◆ visRatioEE

template<size_t peakCount>
float visRatioEE = 0.0f
private

Probability for "EE".

Definition at line 69 of file FFTVoiceDetection.h.

◆ visRatioER

template<size_t peakCount>
float visRatioER = 0.0f
private

Probability for "ER".

Definition at line 73 of file FFTVoiceDetection.h.

◆ visRatioOO

template<size_t peakCount>
float visRatioOO = 0.0f
private

Probability for "OO".

Definition at line 75 of file FFTVoiceDetection.h.

◆ visRatios

template<size_t peakCount>
float* visRatios[visemeCount] = { &visRatioEE, &visRatioAE, &visRatioUH, &visRatioAR, &visRatioER, &visRatioAH, &visRatioOO }
private

Array of viseme probabilities.

Definition at line 77 of file FFTVoiceDetection.h.
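
The pattern of keeping one named float per viseme plus an array of pointers to those floats can be illustrated with a reduced, hypothetical struct. `VisemeRatios` and its three members are invented for this sketch. The point is only that the pointer array lets code loop over values that are also reachable by name.

```cpp
#include <cstddef>

// Hypothetical three-slot version of the named-member plus pointer-array layout.
struct VisemeRatios {
    float ee = 0.0f;
    float ae = 0.0f;
    float uh = 0.0f;

    // Each entry points at one of the named members above.
    float* all[3] = {&ee, &ae, &uh};

    // Zeroes every slot by walking the pointer array.
    void Reset() {
        for (float* ratio : all) {
            *ratio = 0.0f;
        }
    }
};
```

One caveat with this layout in general: a copied object's pointers still refer to the members of the original, so such a type is best kept non-copyable or given a copy constructor that rebuilds the array.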

◆ visRatioUH

template<size_t peakCount>
float visRatioUH = 0.0f
private

Probability for "UH".

Definition at line 71 of file FFTVoiceDetection.h.

◆ visUH

template<size_t peakCount>
Vector2D visUH = Vector2D(1100.0f, 2700.0f)
private

Coordinates for "UH".

Definition at line 60 of file FFTVoiceDetection.h.


The documentation for this class was generated from the following file: