2022-10-09 Franny Gaede - text analysis at DLF

franny gaede (@mfgaede): #LearnAtDLF Text analysis workshop: starts with a quote from https://aclanthology.org/P99-1001.pdf, used in a variety of disciplines (bunch of citations!) #DLFforum

Twitter link: https://twitter.com/mfgaede/status/1579168675126730753

5 questions that text analysis can help you answer:

  1. what are these texts about?
  2. how are these texts connected?
  3. what emotions (affects) are found within these texts?
  4. what names are used in these texts?
  5. which of these texts are most similar? #DLFforum

Question 1: what are these texts about? Can do word frequency analysis. Example: https://www.cameronblevins.org/posts/text-analysis-of-martha-ballards-diary-part-3/ #DLFforum
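
A minimal sketch of word frequency analysis, not taken from the workshop: it lowercases a text, drops a few stopwords, and counts what is left. The sample sentence and the tiny stopword list are made up for illustration; real projects use much longer stopword lists and more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "and", "of", "to", "in"}

def word_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase word tokens, skipping stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Made-up sample text, not from any of the linked examples.
sample = "The river rose and the river froze. The mill stood by the river."
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

The most frequent remaining words give a rough first answer to "what is this text about?"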

Question 1: what are these texts about? Can do collocation analysis. Example: https://doi.org/10.1093/notesj/gjz084 #DLFforum
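
A rough sketch of the idea behind collocation analysis, assuming the simplest possible version: count which words appear directly next to each other. Collocation studies usually go further and score pairs with an association measure (for example pointwise mutual information or log-likelihood) rather than raw counts. The sample sentence is made up.

```python
import re
from collections import Counter

def bigram_counts(text):
    """Count adjacent word pairs (bigrams) in a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # zip pairs each token with the token that follows it.
    return Counter(zip(tokens, tokens[1:]))

# Made-up sample text for illustration only.
sample = "strong tea and strong coffee, but never strong tea with milk"
counts = bigram_counts(sample)
print(counts.most_common(1))
```

Note that this sketch ignores punctuation, so pairs can span a comma or sentence boundary.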

Question 1: what are these texts about? TF-IDF analysis (term frequency-inverse document frequency analysis) #DLFforum
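
A small sketch of TF-IDF, assuming one common textbook variant: term frequency is the term's share of the document's tokens, and inverse document frequency is log(N / df), where N is the number of documents and df is the number of documents containing the term. Libraries often use smoothed or normalized variants, so their numbers will differ. The three documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Made-up documents for illustration only.
docs = [
    "the diary records the weather".split(),
    "the diary records a birth".split(),
    "the ledger records a debt".split(),
]
scores = tf_idf(docs)
for term, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {score:.3f}")
```

Words that appear in every document ("the", "records") score zero, while a word unique to one document ("weather") scores highest there, which is why TF-IDF surfaces what is distinctive about each text.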

Question 1: what are these texts about? Topic modeling. Example: http://signsat40.signsjournal.org/topic-model/ #DLFforum

Question 2: how are these texts connected? Concordance. Example: https://www.loc.gov/item/31013167/, https://www.jstor.org/understand
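
A minimal keyword-in-context (KWIC) sketch of what a concordance does: list every occurrence of a word together with a few words on either side. The function name, the `width` parameter, and the sample text are assumptions for illustration, not anything from the linked examples.

```python
import re

def concordance(text, keyword, width=2):
    """Return (left context, keyword, right context) for every occurrence."""
    tokens = re.findall(r"[a-z']+", text.lower())
    target = keyword.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

# Made-up sample text for illustration only.
sample = "The river rose and the river froze. The mill stood by the river."
for left, kw, right in concordance(sample, "river"):
    print(f"{left:>10} [{kw}] {right}")
```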

Question 2: how are these texts connected? Network analysis. Example: Twitter's "Suggested for you" recommendations #DLFforum

Question 2: How are these texts connected? Network analysis. Example: https://journals.sagepub.com/doi/full/10.1177/20563051211055442 #DLFforum

Question 2: how are these texts connected? Bibliometric analysis. https://doi.org/10.1007/s10551-019-04129-4 #DLFforum

Question 3: what emotions (affects) are found within these texts? Sentiment analysis. Example: https://pmj.bmj.com/content/98/1161/544 #DLFforum

Question 4: What names are used in these texts? Named entity recognition. #DLFforum

Question 5: Which of these texts are most similar? Authorship attribution. Example: https://www.scientificamerican.com/article/how-a-computer-program-helped-show-jk-rowling-write-a-cuckoos-calling/ #DLFforum

Question 5: Which of these texts are most similar? Clustering. Examples: topic modeling is a specialized type! also: market segmentation, social network analysis, search result grouping, medical imaging, image segmentation, anomaly detection #DLFforum
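
Clustering depends on some measure of how alike two texts are. As a hedged sketch (one common choice, not necessarily what the workshop used), cosine similarity over raw word counts can rank which pair of texts is most similar. The three named documents are made up.

```python
import math
from collections import Counter
from itertools import combinations

def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the word-count vectors of two token lists."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Made-up documents for illustration only.
docs = {
    "diary_1": "snow fell and the river froze".split(),
    "diary_2": "rain fell and the river rose".split(),
    "ledger": "paid two shillings for flour".split(),
}
# Compare every pair and keep the one with the highest similarity.
best_pair = max(
    combinations(docs, 2),
    key=lambda pair: cosine_similarity(docs[pair[0]], docs[pair[1]]),
)
print(best_pair)
```

A clustering algorithm would then group texts so that members of a group are more similar to each other than to texts in other groups.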

Question 5: Which of these texts are most similar? Supervised machine learning. Examples: Image and object recognition, predictive analytics, customer sentiment analysis, spam detection (known/predetermined topics) #DLFforum

