ProtoTracer 1.0
Real-time 3D rendering and animation engine
Detects visemes based on FFT voice analysis.
#include <FFTVoiceDetection.h>
Public Member Functions | |
FFTVoiceDetection () | |
Constructs a new FFTVoiceDetection instance. | |
void | SetThreshold (float threshold) |
Sets the threshold for formant calculations. | |
float | GetViseme (MouthShape viseme) |
Retrieves the probability of a specific viseme. | |
void | PrintVisemes () |
Prints the probabilities of all visemes to the serial console. | |
void | ResetVisemes () |
Resets all viseme probabilities to zero. | |
void | Update (float *peaks, float maxFrequency) |
Updates the viseme probabilities based on new FFT data. | |
Private Member Functions | |
void | CalculateFormants (float *peaks, uint8_t bandwidth) |
Calculates formant frequencies (F1 and F2) from FFT peaks. | |
void | CalculateVisemeGroup () |
Calculates the viseme group probabilities based on formants. | |
Private Attributes | |
Vector2D | visEE = Vector2D(350.0f, 3200.0f) |
Coordinates for "EE". | |
Vector2D | visAE = Vector2D(500.0f, 2700.0f) |
Coordinates for "AE". | |
Vector2D | visUH = Vector2D(1100.0f, 2700.0f) |
Coordinates for "UH". | |
Vector2D | visAR = Vector2D(850.0f, 850.0f) |
Coordinates for "AR". | |
Vector2D | visER = Vector2D(1000.0f, 1000.0f) |
Coordinates for "ER". | |
Vector2D | visAH = Vector2D(900.0f, 2400.0f) |
Coordinates for "AH". | |
Vector2D | visOO = Vector2D(600.0f, 600.0f) |
Coordinates for "OO". | |
Vector2D * | coordinates [visemeCount] = { &visEE, &visAE, &visUH, &visAR, &visER, &visAH, &visOO } |
Array of viseme coordinates. | |
float | visRatioEE = 0.0f |
Probability for "EE". | |
float | visRatioAE = 0.0f |
Probability for "AE". | |
float | visRatioUH = 0.0f |
Probability for "UH". | |
float | visRatioAR = 0.0f |
Probability for "AR". | |
float | visRatioER = 0.0f |
Probability for "ER". | |
float | visRatioAH = 0.0f |
Probability for "AH". | |
float | visRatioOO = 0.0f |
Probability for "OO". | |
float * | visRatios [visemeCount] = { &visRatioEE, &visRatioAE, &visRatioUH, &visRatioAR, &visRatioER, &visRatioAH, &visRatioOO } |
Array of viseme probabilities. | |
PeakDetection< peakCount > | peakDetection = PeakDetection<peakCount>(8, 2.0f, 0.5f) |
Peak detection instance. | |
RunningAverageFilter< 10 > | peakSmoothing = RunningAverageFilter<10>(0.1f) |
Smoothing filter for peak data. | |
bool | peaksBinary [peakCount] |
Binary array indicating peak presence. | |
float | peakDensity [peakCount] |
Array of peak densities. | |
float | f1 |
Formant frequency F1. | |
float | f2 |
Formant frequency F2. | |
float | threshold = 400.0f |
Threshold for formant calculations. | |
Static Private Attributes | |
static const uint8_t | visemeCount = 7 |
Number of supported visemes. | |
Additional Inherited Members | |
enum | MouthShape { EE , AE , UH , AR , ER , AH , OO , SS } |
Enumerates the possible mouth shapes for viseme detection. | |
Detects visemes based on FFT voice analysis.
The FFTVoiceDetection class uses formant frequencies (F1 and F2) derived from FFT peaks to detect and assign probabilities to various mouth shapes (visemes). It employs peak detection, smoothing filters, and threshold-based calculations to determine the most probable viseme.
peakCount | The number of peaks to analyze in the FFT data. |
Definition at line 53 of file FFTVoiceDetection.h.
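As an illustration of the formant-to-viseme step described above, the following sketch picks the reference point nearest to a measured (F1, F2) pair and gives it the full probability. The reference coordinates are the ones listed under Private Attributes. Everything else is an assumption, because this page does not show the body of CalculateVisemeGroup(): the names `FormantPoint` and `ClassifyFormants`, the Euclidean nearest-neighbour rule, and the reading of `threshold` as a silence gate on both formants are all hypothetical.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for the engine's Vector2D: F1 on x, F2 on y, both in Hz.
struct FormantPoint {
    float f1;
    float f2;
};

constexpr std::size_t kVisemeCount = 7;

// Reference points copied from this page, in the order of the coordinates
// array: EE, AE, UH, AR, ER, AH, OO.
const std::array<FormantPoint, kVisemeCount> kVisemeCoordinates = {{
    {350.0f, 3200.0f}, {500.0f, 2700.0f}, {1100.0f, 2700.0f},
    {850.0f, 850.0f}, {1000.0f, 1000.0f}, {900.0f, 2400.0f},
    {600.0f, 600.0f}
}};

// Assumed rule (not taken from the source): if neither formant exceeds the
// threshold, report silence (all zeros); otherwise give probability 1 to the
// viseme whose reference point is closest in the (F1, F2) plane.
std::array<float, kVisemeCount> ClassifyFormants(float f1, float f2, float threshold) {
    std::array<float, kVisemeCount> ratios{};  // value-initialized to 0.0f
    if (f1 <= threshold && f2 <= threshold) {
        return ratios;
    }
    std::size_t closest = 0;
    float bestDistance = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < kVisemeCount; ++i) {
        const float distance = std::hypot(f1 - kVisemeCoordinates[i].f1,
                                          f2 - kVisemeCoordinates[i].f2);
        if (distance < bestDistance) {
            bestDistance = distance;
            closest = i;
        }
    }
    ratios[closest] = 1.0f;
    return ratios;
}
```

With the default threshold of 400.0f, a call such as `ClassifyFormants(620.0f, 590.0f, 400.0f)` selects the last entry ("OO"), because (600, 600) is the nearest reference point.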
inline FFTVoiceDetection ()
Constructs a new FFTVoiceDetection instance.
Definition at line 107 of file FFTVoiceDetection.h.
void CalculateVisemeGroup ()
Calculates the viseme group probabilities based on formants.
float GetViseme (MouthShape viseme)
Retrieves the probability of a specific viseme.
viseme | The viseme to query. |
Referenced by AphoriAnimation::UpdateFFTVisemes(), ArtleckAnimationV2::UpdateFFTVisemes(), BasilGardenAnimation::UpdateFFTVisemes(), BroookAnimation::UpdateFFTVisemes(), ElGatoAnimation::UpdateFFTVisemes(), InfraredAnimation::UpdateFFTVisemes(), LeonHuskyAnimation::UpdateFFTVisemes(), MyntAnimation::UpdateFFTVisemes(), ProtobottAnimation::UpdateFFTVisemes(), SammyAnimation::UpdateFFTVisemes(), SergaliciousAnimation::UpdateFFTVisemes(), StrawberryAnimation::UpdateFFTVisemes(), TamamoAnimation::UpdateFFTVisemes(), TechSaneAnimation::UpdateFFTVisemes(), VesperAnimation::UpdateFFTVisemes(), WaffleDaProtoAnimation::UpdateFFTVisemes(), WarzoneAnimation::UpdateFFTVisemes(), Warzone2Animation::UpdateFFTVisemes(), XenraxAnimation::UpdateFFTVisemes(), HUB75AnimationSplit::UpdateFFTVisemes(), WS35AnimationSplit::UpdateFFTVisemes(), and ProtogenProject::UpdateFFTVisemes().
void PrintVisemes ()
Prints the probabilities of all visemes to the serial console.
void SetThreshold (float threshold)
Sets the threshold for formant calculations.
threshold | The new threshold value. |
Referenced by AphoriAnimation::Update(), BasilGardenAnimation::Update(), BroookAnimation::Update(), HertzzAnimation::Update(), InfraredAnimation::Update(), SammyAnimation::Update(), SergaliciousAnimation::Update(), StrawberryAnimation::Update(), TamamoAnimation::Update(), Warzone2Animation::Update(), XenraxAnimation::Update(), GammaAnimation::Update(), WS35AnimationSplit::Update(), and ProtogenProject::UpdateFace().
void Update (float *peaks, float maxFrequency)
Updates the viseme probabilities based on new FFT data.
peaks | Array of FFT peak values. |
maxFrequency | Maximum frequency represented in the FFT data. |
Referenced by AphoriAnimation::UpdateFFTVisemes(), ArtleckAnimationV2::UpdateFFTVisemes(), BasilGardenAnimation::UpdateFFTVisemes(), BroookAnimation::UpdateFFTVisemes(), ElGatoAnimation::UpdateFFTVisemes(), InfraredAnimation::UpdateFFTVisemes(), LeonHuskyAnimation::UpdateFFTVisemes(), MyntAnimation::UpdateFFTVisemes(), ProtobottAnimation::UpdateFFTVisemes(), SammyAnimation::UpdateFFTVisemes(), SergaliciousAnimation::UpdateFFTVisemes(), StrawberryAnimation::UpdateFFTVisemes(), TamamoAnimation::UpdateFFTVisemes(), TechSaneAnimation::UpdateFFTVisemes(), VesperAnimation::UpdateFFTVisemes(), WaffleDaProtoAnimation::UpdateFFTVisemes(), WarzoneAnimation::UpdateFFTVisemes(), Warzone2Animation::UpdateFFTVisemes(), XenraxAnimation::UpdateFFTVisemes(), AlphaAnimation::UpdateFFTVisemes(), GammaAnimation::UpdateFFTVisemes(), HUB75AnimationSplit::UpdateFFTVisemes(), WS35AnimationSplit::UpdateFFTVisemes(), and ProtogenProject::UpdateFFTVisemes().
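To make the role of `maxFrequency` concrete, the sketch below maps positions in the `peaks` array to frequencies and derives a rough (F1, F2) pair from them. It assumes the bins are evenly spaced from 0 Hz up to `maxFrequency`, and it uses a simplified estimate (the two strongest bins, lower one as F1) in place of the class's private CalculateFormants(), whose body is not shown on this page. `BinToFrequency`, `Formants`, and `EstimateFormants` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumption: bin i of peakCount evenly spaced bins starts at
// i * maxFrequency / peakCount.
float BinToFrequency(std::size_t bin, std::size_t peakCount, float maxFrequency) {
    assert(peakCount > 0 && bin < peakCount);
    return static_cast<float>(bin) * maxFrequency / static_cast<float>(peakCount);
}

struct Formants {
    float f1;  // lower formant in Hz
    float f2;  // higher formant in Hz
};

// Simplified estimate: locate the two largest values in peaks and report the
// lower-frequency bin as F1 and the higher-frequency bin as F2.
Formants EstimateFormants(const float* peaks, std::size_t peakCount, float maxFrequency) {
    assert(peaks != nullptr && peakCount >= 2);
    std::size_t first = 0;   // index of the largest value seen so far
    std::size_t second = 1;  // index of the second largest value seen so far
    if (peaks[second] > peaks[first]) {
        std::swap(first, second);
    }
    for (std::size_t i = 2; i < peakCount; ++i) {
        if (peaks[i] > peaks[first]) {
            second = first;
            first = i;
        } else if (peaks[i] > peaks[second]) {
            second = i;
        }
    }
    const std::size_t low = std::min(first, second);
    const std::size_t high = std::max(first, second);
    return {BinToFrequency(low, peakCount, maxFrequency),
            BinToFrequency(high, peakCount, maxFrequency)};
}
```

For eight bins covering 4000 Hz with the strongest values at indices 2 and 6, this yields F1 = 1000 Hz and F2 = 3000 Hz.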
Array of viseme coordinates.
Definition at line 66 of file FFTVoiceDetection.h.
Formant frequency F1.
Definition at line 85 of file FFTVoiceDetection.h.
Formant frequency F2.
Definition at line 86 of file FFTVoiceDetection.h.
Array of peak densities.
Definition at line 83 of file FFTVoiceDetection.h.
Peak detection instance.
Definition at line 79 of file FFTVoiceDetection.h.
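The constructor arguments of the peak detection instance (8, 2.0f, 0.5f) are not explained on this page. One reading, offered here only as an assumption, is that they are the lag, threshold, and influence of a smoothed z-score peak detector. The sketch below implements that technique under hypothetical names (`WindowStats`, `DetectPeaks`); it is not taken from PeakDetection itself and only flags upward deviations.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean and population standard deviation of values[end - lag, end).
static void WindowStats(const std::vector<float>& values, std::size_t end,
                        std::size_t lag, float& mean, float& stdDev) {
    float sum = 0.0f;
    for (std::size_t i = end - lag; i < end; ++i) {
        sum += values[i];
    }
    mean = sum / static_cast<float>(lag);
    float squares = 0.0f;
    for (std::size_t i = end - lag; i < end; ++i) {
        squares += (values[i] - mean) * (values[i] - mean);
    }
    stdDev = std::sqrt(squares / static_cast<float>(lag));
}

// Marks sample i as a peak when it exceeds the mean of the previous lag
// filtered samples by more than threshold standard deviations.
std::vector<bool> DetectPeaks(const std::vector<float>& data, std::size_t lag,
                              float threshold, float influence) {
    std::vector<bool> peaks(data.size(), false);
    if (lag == 0 || data.size() <= lag) {
        return peaks;  // not enough samples to build a baseline
    }

    // filtered starts as a copy of data; only detected peaks are overwritten.
    std::vector<float> filtered(data);
    float mean = 0.0f;
    float stdDev = 0.0f;
    WindowStats(filtered, lag, lag, mean, stdDev);

    for (std::size_t i = lag; i < data.size(); ++i) {
        if (data[i] - mean > threshold * stdDev) {
            peaks[i] = true;
            // Damp the peak so it has limited influence on the baseline.
            filtered[i] = influence * data[i] + (1.0f - influence) * filtered[i - 1];
        }
        WindowStats(filtered, i + 1, lag, mean, stdDev);
    }
    return peaks;
}
```

Under this reading, a larger lag gives a steadier baseline, a larger threshold makes detection stricter, and an influence closer to 0 keeps detected peaks from shifting the baseline.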
Binary array indicating peak presence.
Definition at line 82 of file FFTVoiceDetection.h.
Smoothing filter for peak data.
Definition at line 80 of file FFTVoiceDetection.h.
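For readers unfamiliar with this kind of filter, here is a minimal sketch of a running average with a gain term. It assumes that the template argument (10 in the declaration above) is the window length and that the constructor argument (0.1f) is a gain that blends the newest sample with the window mean; neither meaning is confirmed on this page, and `RunningAverageSketch` is a hypothetical stand-in, not the engine's RunningAverageFilter.

```cpp
#include <array>
#include <cstddef>

template <std::size_t N>
class RunningAverageSketch {
    static_assert(N > 0, "window must hold at least one sample");

public:
    explicit RunningAverageSketch(float gain) : gain_(gain) {}

    // Stores the sample in a circular buffer, averages the samples held so
    // far, and returns gain * value + (1 - gain) * average.
    float Filter(float value) {
        values_[next_] = value;
        next_ = (next_ + 1) % N;
        if (count_ < N) {
            ++count_;
        }

        float sum = 0.0f;
        for (std::size_t i = 0; i < count_; ++i) {
            sum += values_[i];
        }
        const float average = sum / static_cast<float>(count_);
        return gain_ * value + (1.0f - gain_) * average;
    }

private:
    std::array<float, N> values_{};  // circular buffer of recent samples
    std::size_t next_ = 0;           // slot that the next sample overwrites
    std::size_t count_ = 0;          // number of valid samples, at most N
    float gain_;                     // weight of the newest raw sample
};
```

With a small gain such as 0.1, the output follows the window mean closely, so short spikes in the peak data are largely suppressed.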
Threshold for formant calculations.
Definition at line 88 of file FFTVoiceDetection.h.
Coordinates for "AE".
Definition at line 59 of file FFTVoiceDetection.h.
Coordinates for "AH".
Definition at line 63 of file FFTVoiceDetection.h.
Coordinates for "AR".
Definition at line 61 of file FFTVoiceDetection.h.
Coordinates for "EE".
Definition at line 58 of file FFTVoiceDetection.h.
Number of supported visemes.
Definition at line 55 of file FFTVoiceDetection.h.
Coordinates for "ER".
Definition at line 62 of file FFTVoiceDetection.h.
Coordinates for "OO".
Definition at line 64 of file FFTVoiceDetection.h.
Probability for "AE".
Definition at line 70 of file FFTVoiceDetection.h.
Probability for "AH".
Definition at line 74 of file FFTVoiceDetection.h.
Probability for "AR".
Definition at line 72 of file FFTVoiceDetection.h.
Probability for "EE".
Definition at line 69 of file FFTVoiceDetection.h.
Probability for "ER".
Definition at line 73 of file FFTVoiceDetection.h.
Probability for "OO".
Definition at line 75 of file FFTVoiceDetection.h.
Array of viseme probabilities.
Definition at line 77 of file FFTVoiceDetection.h.
Probability for "UH".
Definition at line 71 of file FFTVoiceDetection.h.
Coordinates for "UH".
Definition at line 60 of file FFTVoiceDetection.h.