NLP tools
· Word embeddings: GloVe (https://nlp.stanford.edu/projects/glove/), Word2Vec (https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa), BERT (https://pypi.org/project/bert-embedding//), ELMo (https://allennlp.org/elmo), RoBERTa
· Stanford NLP software (https://nlp.stanford.edu/software/)
· Unigrams, bigrams, trigrams
· Linguistic Inquiry and Word Count (LIWC) (https://repositories.lib.utexas.edu/bitstream/handle/2152/31333/LIWC2015_LanguageManual.pdf)
· POS tags: NLTK
· Morphological analysis:
o Polyglot: https://polyglot.readthedocs.io/en/latest/MorphologicalAnalysis.html
o Morfessor: https://morfessor.readthedocs.io/en/latest/
o LegaliPy: http://syllabipy.com
· Flesch reading ease and other readability formulas (Kincaid et al. 1975)
· Speciteller: Specificity score (Li and Nenkova 2015)
· Concreteness score (Brysbaert et al. 2014)
· Dictionary of Affect and revised 2009 version (Whissell 1989, Whissell 2009)
· Hedge words and phrases (Ulinski et al. 2018)
· textstat: tools to extract readability measures from text (readability, complexity, and grade level)
· Tools to restore punctuation in unpunctuated text/ASR results:
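Two of the surface features listed above, n-grams (unigrams, bigrams, trigrams) and Flesch reading ease, can be sketched with the Python standard library alone. This is a minimal illustration, not the implementation used by NLTK or textstat. The tokenizer and the vowel-run syllable counter are simplifying assumptions, and the helper names are hypothetical. The reading-ease constants are the standard Flesch ones.

```python
import re


def tokenize(text):
    """Lowercase the text and return alphabetic word tokens (a simplification)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """206.835 - 1.015 * (words per sentence) - 84.6 * (syllables per word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokenize(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


tokens = tokenize("The cat sat on the mat.")
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
score = flesch_reading_ease("The cat sat on the mat.")
```

For real data, a dictionary-based syllable counter and a proper sentence splitter would likely give more reliable readability scores than these heuristics.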
Speech approaches
· Aeneas: text/speech alignment (https://www.readbeyond.it/aeneas/)
· MFCC features
· Acoustic-prosodic features
o OpenSMILE (https://www.audeering.com/opensmile/)
o Parselmouth (https://parselmouth.readthedocs.io/en/stable/)
o Praat (https://www.fon.hum.uva.nl/praat/)
o Prosodic labeling
o http://www.speech.cs.cmu.edu/tobi/
o https://www.ling.ohio-state.edu/research/phonetics/E_ToBI/
o Prosodic analysis: AuToBI – A Tool for Automatic ToBI annotation (https://github.com/AndrewRosenberg/AuToBI)
o Video series in speech acoustics:
· ASR
o Kaldi (https://github.com/kaldi-asr/kaldi)
o Google Cloud Speech-to-Text (https://cloud.google.com/speech-to-text)
o And more: https://www.goodfirms.co/blog/best-free-open-source-speech-recognition-software
o Basic information: https://cmusphinx.github.io/wiki/tutorialconcepts/
· TTS
o Simon King's Merlin video tutorial: http://www.speech.zone/courses/one-off/merlin-interspeech2017/
o http://www.cs.cmu.edu/~awb/synthesizers.html
· Noise reduction (https://dl.acm.org/doi/10.1145/2964284.2967306)
o Calculating spectral centroids
o MFCCs
o Median filtering
o Spleeter
o Denoising script (multiple methods included)
· Old and new speech software:
o SoX conversion software: http://sox.sourceforge.net
o http://linux-sound.org/speech.html
· Spectrogram reading practice:
o https://home.cc.umanitoba.ca/~robh/howto.html
o https://linguistics.ucla.edu/people/hayes/103/SpectrogramReading/index.htm
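The spectral centroid mentioned under noise reduction is the magnitude-weighted mean frequency of a frame. The sketch below computes it with a naive DFT from the standard library so that it runs without extra packages. In practice an FFT from a numerical library would be used instead. The function names, the frame length, and the test tone are illustrative assumptions and are not taken from the linked paper.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """One-sided DFT magnitudes (bins 0..N/2) via a naive O(N^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz; 0.0 for a silent frame."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    return sum(f * m for f, m in zip(freqs, mags)) / total


# Illustrative input: a 100 Hz sine sampled at 800 Hz, which falls exactly
# on DFT bin 8 of a 64-sample frame, so the centroid should be close to 100 Hz.
sample_rate = 800
frame = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(64)]
centroid = spectral_centroid(frame, sample_rate)
```

Real speech frames are usually windowed (for example with a Hann window) before the transform. That step is omitted here to keep the sketch short.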
Visual features
· Fisher Vector encoding (FV) (https://papers.nips.cc/paper/1998/file/db1915052d15f7815c8b88e879465a1e-Paper.pdf)
· Vector of Locally Aggregated Descriptors (VLAD) (https://lear.inrialpes.fr/pubs/2010/JDSP10/jegou_compactimagerepresentation.pdf)
· Facial expression detection (FED) (https://www.jstor.org/stable/30204706?seq=1#metadata_info_tab_contents)
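As a rough illustration of the VLAD encoding listed above: each local descriptor is assigned to its nearest codebook centroid, the residuals (descriptor minus centroid) are summed per centroid, and the concatenated sums are L2-normalized. The sketch below assumes a tiny hard-coded two-centroid codebook. In practice the codebook is typically learned with k-means over many descriptors. All names and values here are hypothetical.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def vlad(descriptors, centroids):
    """Sum residuals per nearest centroid, concatenate, then L2-normalize."""
    dim = len(centroids[0])
    residual_sums = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        nearest = min(range(len(centroids)),
                      key=lambda k: squared_distance(d, centroids[k]))
        for j in range(dim):
            residual_sums[nearest][j] += d[j] - centroids[nearest][j]
    flat = [v for row in residual_sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Hypothetical 2-D codebook and descriptors, for illustration only.
centroids = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (11.0, 10.0)]
encoding = vlad(descriptors, centroids)
```

The output length is the number of centroids times the descriptor dimension, so it stays fixed regardless of how many descriptors an image yields.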
Statistical measures
· Pearson’s correlation
· Krippendorff’s alpha
· Paired t-tests
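Two of the measures above, Pearson's correlation and the paired t-test, reduce to short formulas. The sketch below computes Pearson's r and the paired t statistic with its degrees of freedom, using only the standard library. It does not compute a p-value, which needs the t distribution from a statistics package, and it does not cover Krippendorff's alpha. The function names and sample data are illustrative assumptions.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [p - q for p, q in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Made-up numbers, for illustration only.
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
t, df = paired_t_statistic([5, 6, 7, 8], [4, 4, 6, 6])
```

The t statistic is then compared against a t distribution with `df` degrees of freedom to obtain a p-value.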
Machine Learning
· Weka
· Scikit-learn (https://scikit-learn.org/stable)
Some other potentially useful papers:
https://www.aclweb.org/anthology/W16-0301.pdf
https://www.aclweb.org/anthology/W17-3101.pdf
http://www.cs.columbia.edu/speech/PaperFiles/2019/clpsych19.pdf
http://www.cs.columbia.edu/speech/PaperFiles/2010/Hirschberg_etal2010.pdf