For Immediate Release, April 1, 2025: University of Michigan Press will publish all of the content on Meta platforms as a series of printed books.
https://www.linkedin.com/posts/charles-watkinson-7553a257_amphibians-and-reptiles-of-the-great-lakes-activity-7312775744932179968-sLSu
Resulting from an @snsf_ch SPARK grant, this took some time to mature, but the outcome is very informative and builds a foundation for where to head next: how to liberate facts/information locked in the published literature #textmining #biodiversity https://preprints.arphahub.com/article/153174/
How does academic language shape a discipline? Daniel Erdmann @bbf_dipfberlin uses #TextMining to analyze the history of educational sciences in Germany after 1945—tracking language patterns, discourse shifts & conceptual trends in scholarly journals.
#DigitalHumanities #rstats
Read more: https://href.hypotheses.org/4324 @NFDI4Memory @VHD
Our colleague Hidir Arras from patent4science research is co-organizing the 6th PatentSemTech Workshop at #SIGIR2025 in the beautiful city of Padua, Italy! Call for Papers is open 'til April 23: http://ifs.tuwien.ac.at/patentsemtech/
Submit your cutting-edge research, case studies, and demos exploring #AI, #NLP, and #TextMining innovations applied to #IP and related domains.
From Aachen to Zwickau: Mapping Correspondence in the Wiener Zeitung
At #Dhd2025 Nina C. Rastinger & Claudia Resch explore semi-automated methods for identifying correspondence locations in historical press research.
Key insights:
Transkribus enables tailored transcription with Field Model Training
21,793 headlines & 129,326 tokens analyzed
Uneven network density: Strong news flow from London (enemy) & Paris (ally)
Automating Nature Detection in Historical Travelogues?
At #Dhd2025 Michela Vignoli & Doris Gruber (ONiT Project) explore how #LLM Llama 3.1 70B can analyze nature representations in multilingual travel reports
Challenges remain:
LLMs always produce results, even with flawed data
LLM-corrected texts did not improve searchability in vector databases (3–14% drop)
Conclusion: LLMs aid discovery, but manual review is essential for a reliable dataset.
[Data Workshop] The INA lab is holding a workshop @iscpif on March 12 at 5:30 pm devoted to exploring (#statistique, #TAL…) transcripts of the TF1 and FR2 television news broadcasts
A few places are still available: https://framaforms.org/atelier-donnees-ina-1739180738
Some independence with quantitative analysis tools (Python or R, CSV, etc.) is required to get the most out of the workshop.
Resulting from a Swiss National Science Foundation (SNSF) SPARK grant, this took some time to mature, but the outcome is very informative and builds a foundation for where to head next: how to liberate facts/information locked in the published literature #textmining #biodiversity https://www.biorxiv.org/content/10.1101/2025.02.18.638830v1
New tools for text mining!
The Istex infrastructure is launching 2 powerful new services for document analysis:
Teeft: fast extraction of key terms
TermSuite: advanced terminology extraction
#TextMining #DataMining #ScienceOuverte
https://www.inist.fr/nos-actualites/nouveaux-outils-pour-la-fouille-de-textes-decouvrez-teeft-et-termsuite/
hi hivemind. let's say we want millions of unique IUPAC names from the (organic chemistry) literature, how would you do this? Nothing is needed other than the full, complete IUPAC name.