---
title: "Chlasta 2021"
source: https://www.jemoka.com/posts/kbhchlasta_2021/
tags: [ntj]
---

DOI: 10.3389/fpsyg.2020.623237

## One-Liner (thrice)

1. Used features extracted by VGGish from raw acoustic audio against an SVM, a Perceptron, and 1NN; got \(59.1\%\) classification accuracy for dementia.
2. Then, trained a CNN on raw waveforms and got \(63.6\%\) accuracy.
3. Then, they fine-tuned a VGGish on the raw waveforms, didn’t report their results, and just said “we discovered that audio transfer learning with a pretrained VGGish feature extractor performs better”. Gah!

## Novelty

Threw the kitchen sink at processing only raw acoustic input, and most of it missed; wanted 0 human involvement. It seems like the last method is promising.

## Notable Methods

Fine-tuning VGGish against raw acoustic waveforms to build a classifier via a CNN.

## Key Figs

### Their fancy network

It's just a CNN afaik, with much maxpooling; could have used some skip connections. I wonder if it overfit?

### Their actual training results

Looks generally pretty bad, but a run of their DemCNN seems to have gotten state-of-the-art results. Not sure where the transfer training data went.

## New Concepts

- VGGish

## Notes

### Accuracy question

According to this the state of the art at the time from pure audio was 56.6%? For a binary classifier, isn’t that just doing nothing? So somebody did get better before?
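A quick way to sanity-check the accuracy question: whether 56.6% counts as "doing nothing" depends on the class balance, since a classifier that always predicts the majority label scores the majority fraction. Below is a minimal sketch of that baseline. The label counts are hypothetical and are not taken from the paper or its dataset, and `majority_baseline_accuracy` is an illustrative helper name.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most common label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Hypothetical label sets for illustration only (NOT the paper's data).
balanced = ["dementia"] * 50 + ["control"] * 50
skewed = ["dementia"] * 30 + ["control"] * 70

print(majority_baseline_accuracy(balanced))  # 0.5
print(majority_baseline_accuracy(skewed))    # 0.7
```

Under the balanced assumption the do-nothing baseline is 50%, so 56.6% would be only a little above chance. With a skewed split the baseline rises, and the same number could fall below it.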