#machinelearning


Tonight at 7 PM Central, I'm hosting an event on our Discord. We'll be talking about AI stuff and just hanging out together.
It's a relaxed space, so whether you're really into AI or just curious, you're welcome to join.
Here's the link to join the event:
discord.gg/Gdu6vvmM?event=1397
Come hang out, learn something new, and have some fun.
#AIChat #TechTalk #DiscordEvent #AICommunity #MachineLearning #BlindTech #AccessibilityMatters #NeurodivergentInTech

Discord · Join the Techopolis Discord Server! Check out the Techopolis community on Discord - hang out with 205 other members and enjoy free voice and text chat.

AI: Explainable Enough

“They look really juicy,” she said. I was sitting in a small room with a faint chemical smell, doing one of my first customer interviews. There is a sweet spot between going too deep into detail and simply asserting a conclusion. Good AI has to be just explainable enough to satisfy the user without overwhelming them with information. Luckily, I wasn’t new to the problem.

Nuthatcher atop Persimmons (ca. 1910) by Ohara Koson. Original from The Clark Art Institute. Digitally enhanced by rawpixel.

Coming from a microscopy and bio background with a strong inclination towards image analysis, I had picked up deep learning as a way to be lazy in the lab. Why bother figuring out features of interest when you can have a computer do it for you? That was my angle. The issue was that in 2015 no biologist would accept any kind of deep learning analysis, and definitely not if you couldn’t explain the details.

What the domain expert user doesn’t want:
– How a convolutional neural network works. Confidence scores, loss, and AUC are all meaningless to a biologist, and to a doctor as well.

What the domain expert desires: 
– Help at the lowest level of detail that they care about. 
– An AI that identifies features A, B, and C, and explains that when you see A, B, and C together it is likely to be disease X.

Most users don’t care how deep learning really works. So if you start giving them details like the IoU score of the object detection bounding box, or whether it was YOLO or R-CNN that you used, their eyes will glaze over and you will never get a customer. Draw a bounding box, heat map, or outline with the predicted label, and stop there. It’s also bad to go to the other extreme. If the AI just states the diagnosis for the whole image, the AI might be right, but the user does not get to participate in the process. Not to mention that regulatory risk goes way up.
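To make that concrete, here is a rough sketch of what “explainable enough” output could look like in code. The Detection record, the feature names, and the report function are all hypothetical, invented for illustration; the point is only that the user sees named features and a plain-language summary while the model internals stay internal.

```python
# Hypothetical sketch: show the domain expert named features and a plain-language
# summary; keep model internals (confidence, IoU, architecture) out of view.
from dataclasses import dataclass

@dataclass
class Detection:
    feature: str   # a named, domain-meaningful feature, e.g. "A"
    box: tuple     # (x, y, width, height), enough to draw an outline
    score: float   # internal confidence, used only for our own filtering

def report(detections, threshold=0.5):
    """Print what the expert cares about; hide how the model got there."""
    kept = [d for d in detections if d.score >= threshold]
    for d in kept:
        print(f"Feature found: {d.feature} at {d.box}")
    if {"A", "B", "C"} <= {d.feature for d in kept}:
        print("Features A, B and C seen together: likely disease X.")

report([Detection("A", (10, 20, 30, 30), 0.91),
        Detection("B", (50, 60, 25, 25), 0.77),
        Detection("C", (80, 15, 40, 35), 0.64)])
```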

This applies beyond images; consider LLMs. No one with any expertise likes a black box. Today, why do LLMs generate code instead of directly doing the thing the programmer asks them to do? It’s because the programmer wants to ensure that the code “works”, and they have the expertise to figure out if and when it goes wrong. It’s the same reason that vibe coding is great for prototyping but not for production, and why frequent readers can spot AI patterns, ahem, easily. So, in a Betty Crocker cake mix kind of way, let the user add the egg.

Building explainable-enough AI takes immense effort. It is actually easier to train an AI to diagnose the whole image or to give every detail. Generating high-quality data at that just-right level is difficult and expensive. However, do it right and the effort pays off. The outcome is an AI-human causal prediction machine, where the causes, i.e. the mid-level features, inform the user and build confidence towards the final outcome. The deep learning part is still a black box, but the user doesn’t mind because you aid their thinking.

I’m excited by some new developments like REX, which retrofit causality onto standard deep learning models. With improvements in performance, user preferences for detail may change, but I suspect the need for AI to be explainable enough will remain. Perhaps we will even have custom labels like ‘juicy’.

2025 summed up in one headline…

Spotify Publishes AI-Generated Songs From Dead Artists Without Permission

404media.co/spotify-publishes-

404 Media · Spotify Publishes AI-Generated Songs From Dead Artists Without Permission: "They could fix this problem. One of their talented software engineers could stop this fraudulent practice in its tracks, if they had the will to do so."
#AI #Spotify #music

On reflection, I think the big mistake is the conflation of #AI with #LLM and #MachineLearning.
There are genuinely exciting advances in ML with applications all over the place: in science (not least in my own research group, looking at high-resolution regional climate downscaling), health diagnostics, defence, etc. But these are not the AIs that journalists are talking about, nor are they really related to LLMs.
They're still good uses of GPUs and will probably produce economic benefits, but probably not the multi-trillion ones the pundits seem to be expecting.

fediscience.org/@Ruth_Mottram/
Ruth_Mottram - My main problem with @edzitron.com's piece on the #AIbubble is that I agree with so much of it.
I'm now wondering if I've missed something about #LLMs? The numbers and implications for stock markets are terrifyingly huge!

wheresyoured.at/the-haters-gui


My Road to Bayesian Stats

By 2015, I had heard of Bayesian stats but didn’t bother to go deeper into it. After all, significance stars and p-values worked fine. I started to explore Bayesian statistics when considering small sample sizes in biological experiments. How much can you say when you are comparing means of 6 or even 60 observations? This is the nature of work at the edge of knowledge. Not knowing what to expect is normal. Multiple possible routes to an observed result are normal. Not knowing how to pick among those routes is also normal. Yet our statistics fail to capture this reality and the associated uncertainties. There must be a way, I thought.

Free Curve to the Point: Accompanying Sound of Geometric Curves (1925) print in high resolution by Wassily Kandinsky. Original from The MET Museum. Digitally enhanced by rawpixel.

I started by searching for ways to overcome small sample sizes. There are minimum sample sizes recommended for t-tests; thirty is an often-quoted number, with qualifiers. Bayesian stats does not have a minimum sample size. This had me intrigued. Surely this can’t be a thing. But it is. Bayesian stats builds a mathematical model from your observations and then samples from that model to make comparisons. If you have any exposure to AI, you can think of this a bit like training an AI model. Of course, the more data you have, the better the model can be. But even with a little data we can make progress.
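As a minimal sketch of that idea, assuming PyMC: the data and priors below are invented for illustration, six observations per group, but the shape of the workflow is the point: write down a model, condition it on the data you have, then sample from it.

```python
# Minimal sketch with PyMC; the data and priors are illustrative only.
import numpy as np
import pymc as pm

control   = np.array([4.1, 5.0, 4.6, 5.2, 4.8, 4.4])   # n = 6
treatment = np.array([5.3, 6.1, 5.8, 5.0, 6.4, 5.6])   # n = 6

with pm.Model() as model:
    # Priors: what we consider plausible before seeing the data.
    mu_c  = pm.Normal("mu_control", mu=5, sigma=2)
    mu_t  = pm.Normal("mu_treatment", mu=5, sigma=2)
    sigma = pm.HalfNormal("sigma", sigma=2)

    # Likelihood: the few observations we do have constrain the model.
    pm.Normal("obs_control", mu=mu_c, sigma=sigma, observed=control)
    pm.Normal("obs_treatment", mu=mu_t, sigma=sigma, observed=treatment)

    # Track the quantity we actually care about.
    pm.Deterministic("difference", mu_t - mu_c)

    # Sampling from the posterior is the "training" step.
    trace = pm.sample(2000, tune=1000, chains=4, random_seed=1)
```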

How do you say, “there is something happening and it’s interesting, but we are only x% sure”? Frequentist stats has no way through. All I knew was to apply the t-test, and if there were “***” in the plot, I was golden. That isn’t accurate though. A low p-value indicates the strength of evidence against the null hypothesis. Let’s take a minute to unpack that. The null hypothesis is that nothing is happening. If you have a control set and apply a treatment to the other set, the null hypothesis says that there is no difference between them. A low p-value says that data like ours would be unlikely if the null hypothesis were true. But that does not imply that the alternative hypothesis is true. What’s worse is that there is no way for us to say that the control and treatment have no difference: we can’t accept the null hypothesis using p-values either.

Guess what? Bayesian stats can do all of those things. It can measure differences, accept and reject both the null and the alternative hypotheses, and even communicate how uncertain we are (more on this later). All without forcing rigid, unexamined assumptions onto our data.
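Continuing the hypothetical sketch above, each of those statements can be read directly off the posterior samples. The ROPE (region of practical equivalence) bounds are a domain-specific choice you have to defend, not a universal constant.

```python
# Reads the posterior from the sketch above (trace) and turns it into statements.
import arviz as az

diff = trace.posterior["difference"].values.ravel()

# "We are x% sure the treatment mean is higher" is a direct probability here.
print(f"P(treatment > control) = {(diff > 0).mean():.2f}")

# Accepting "no difference": how much posterior mass sits inside a region we
# consider practically equivalent to zero (ROPE bounds chosen for the domain).
rope = (-0.2, 0.2)
in_rope = ((diff > rope[0]) & (diff < rope[1])).mean()
print(f"P(difference is practically zero) = {in_rope:.2f}")

# Credible intervals communicate how uncertain we remain.
print(az.summary(trace, var_names=["difference"], hdi_prob=0.95))
```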

It’s often overlooked, but frequentist analysis also requires the data to have certain properties, like normality and equal variance. Biological processes have complex behavior and, unless observed, assuming normality and equal variance is perilous. The danger only goes up with small sample sizes. Bayes, again, does not impose these assumptions on your data. Whatever shape the distribution has, so-called outliers and all, it all goes into the model. Small sample sets do produce weaker fits, but this is kept transparent.
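One way to let "outliers and all" into the model is to swap the Normal likelihood in the earlier sketch for a Student-t whose tail weight is itself estimated from the data. This particular choice follows Kruschke's BEST model and is offered here only as an illustration; the priors are again made up.

```python
# Variation on the earlier sketch: a heavier-tailed likelihood that is less
# easily thrown off by outliers. Reuses the `control` and `treatment` arrays.
with pm.Model() as robust_model:
    mu_c  = pm.Normal("mu_control", mu=5, sigma=2)
    mu_t  = pm.Normal("mu_treatment", mu=5, sigma=2)
    sigma = pm.HalfNormal("sigma", sigma=2)

    # Degrees of freedom: small values mean heavy tails, large values approach a Normal.
    nu = pm.Deterministic("nu", pm.Exponential("nu_minus_one", 1 / 29) + 1)

    pm.StudentT("obs_control", nu=nu, mu=mu_c, sigma=sigma, observed=control)
    pm.StudentT("obs_treatment", nu=nu, mu=mu_t, sigma=sigma, observed=treatment)
    pm.Deterministic("difference", mu_t - mu_c)

    robust_trace = pm.sample(2000, tune=1000, chains=4, random_seed=1)
```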

Transparency is one of the key strengths of Bayesian stats. It does require you to work a little harder on two fronts, though. First, you have to think about your data generating process (DGP): how the data points you observe came to be. As we said, the process is often unknown; we have at best some guesses about how it could happen. Thankfully, we have a nice way to represent this. DAGs, directed acyclic graphs, are a fancy name for a simple diagram showing what affects what. Most of the time we are trying to discover the DAG, i.e. the pathway to a biological outcome. Even if you don’t do Bayesian stats, using DAGs to lay out your thoughts is a great habit. In Bayesian stats, DAGs can be used to test whether your model fits the data we observe: if the DAG captures the data generating process, the fit is good; if it doesn’t, the fit is poor.
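A DAG needs nothing more than a list of arrows. The pathway below is hypothetical, and the networkx usage is just one convenient way to write it down and sanity-check it:

```python
# A DAG is just "what affects what". Hypothetical pathway, for illustration.
import networkx as nx

dag = nx.DiGraph()
dag.add_edges_from([
    ("Treatment", "GeneExpression"),    # treatment changes expression...
    ("GeneExpression", "Phenotype"),    # ...which drives the outcome we measure
    ("BatchEffect", "GeneExpression"),  # a nuisance cause we should adjust for
    ("BatchEffect", "Phenotype"),
])

assert nx.is_directed_acyclic_graph(dag)   # no feedback loops allowed
print(list(nx.topological_sort(dag)))      # causes listed before their effects
```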

The other hard bit is doing the analysis and communicating the results. Bayesian stats forces you to be explicit about the assumptions in your model. This part is almost magicked away in t-tests. Frequentist stats also makes assumptions about the model your data is supposed to follow, but it all happens so quickly that there isn’t even a second to think about it. You put in your data, click t-test, and whoosh! You see stars. In Bayesian stats, stating the assumptions you make in your model (using DAGs and hypotheses about the DGP) communicates to the world what you think is happening and why.

Discovering causality is the whole reason for doing science. Knowing the causality allows us to intervene, in the form of treatments and drugs. But if my tools don’t allow me to be transparent, and worse, if they block people from correcting me, why bother?

Richard McElreath says it best:

There is no method for making causal models other than science. There is no method to science other than honest anarchy.