Hello Reader,

Welcome to another edition of the PYCAD newsletter, where we cover interesting topics in Machine Learning and Computer Vision applied to Medical Imaging. The goal of this newsletter is to help you stay up-to-date and learn important concepts in this amazing field! I've got some cool insights for you below ↓

Top ML Papers of the Week

Here's a non-exhaustive list of the latest ML papers to keep you up-to-date with the field.

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

MM-Vet is an evaluation benchmark that examines large multimodal models (LMMs) on complicated multimodal tasks.

Paper: https://arxiv.org/pdf/2308.02490.pdf

On the Transition from Neural Representation to Symbolic Knowledge

A Neural-Symbolic Transitional Dictionary Learning (TDL) framework that employs an EM algorithm to learn a transitional representation of data. It compresses the high-dimensional information of the visual parts of an input into a set of tensors as neural variables and discovers the implicit predicate structure in a self-supervised way.

Paper: https://arxiv.org/pdf/2308.02000.pdf

SoK: Assessing the State of Applied Federated Machine Learning

A study that aims to explore the current state of applied Federated Machine Learning and identify the challenges hindering its practical adoption.

Paper: https://arxiv.org/pdf/2308.02454.pdf

A large language model-assisted education tool to provide feedback on open-ended responses

A tool that uses large language models (LLMs), guided by instructor-defined criteria, to automate feedback on open-ended responses.

Paper: https://arxiv.org/pdf/2308.02439.pdf

Unlocking the Potential of Similarity Matching: Scalability, Supervision and Pre-training

A biologically plausible learning approach for machine learning that's different from backpropagation.

Paper: https://arxiv.org/pdf/2308.02427.pdf

Self-Supervised Learning for WiFi CSI-Based Human Activity Recognition: A Systematic Study

A systematic study of self-supervised deep learning on WiFi channel state information (CSI) for human activity recognition.

Paper: https://arxiv.org/pdf/2308.02412.pdf

ML Deep Dive: LayoutLMv3 for Document Understanding

sample output of LayoutLMv3 on a legal document

A lot of businesses produce a ton of documents every day, which are in turn consumed by other businesses. These include legal firms, accounting firms and e-commerce companies.

This requires a ton of manual labor to read, understand and extract the right information.

We can definitely do better.

Here’s one of the best approaches out there for document understanding, which I personally tried.

Introducing LayoutLMv3.

Here’s the good and the bad about this method.

The good.

LayoutLMv3 is a multimodal Transformer for Document AI that’s pre-trained with unified text and image masking.

LayoutLMv3 is pre-trained with a word-patch alignment objective to learn crossmodal alignment by predicting whether the corresponding image patch of a text word is masked.

This unified architecture and these training objectives make LayoutLMv3 a general-purpose pre-trained model for both text-centric and image-centric Document AI tasks.
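To make the word-patch alignment objective a bit more concrete, here is a minimal sketch of how the training labels could be built. It is not the authors' implementation. I'm assuming a 224x224 image cut into 16x16 patches, I map each word to the single patch under the center of its bounding box (a simplification), and I assume masked text tokens are left out of the loss. The names patch_index and wpa_labels are made up for this example.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels of the resized image


def patch_index(box: Box, image_size: int = 224, patch_size: int = 16) -> int:
    """Flat index of the image patch that contains the center of the box."""
    per_row = image_size // patch_size
    cx = (box[0] + box[2]) // 2
    cy = (box[1] + box[3]) // 2
    col = min(cx // patch_size, per_row - 1)
    row = min(cy // patch_size, per_row - 1)
    return row * per_row + col


def wpa_labels(
    boxes: Sequence[Box],
    text_masked: Sequence[bool],
    patch_masked: Sequence[bool],
    image_size: int = 224,
    patch_size: int = 16,
) -> List[Optional[int]]:
    """Return 1 (aligned), 0 (unaligned) or None (ignored) for each word."""
    labels: List[Optional[int]] = []
    for box, is_masked in zip(boxes, text_masked):
        if is_masked:
            # Assumption: masked text tokens do not contribute to the loss.
            labels.append(None)
            continue
        idx = patch_index(box, image_size, patch_size)
        # Aligned only if the word's image patch is still visible.
        labels.append(0 if patch_masked[idx] else 1)
    return labels


# Toy example: three words, one masked image patch, one masked word.
boxes = [(0, 0, 30, 14), (100, 40, 140, 56), (200, 200, 223, 223)]
text_masked = [False, False, True]
patch_masked = [False] * 196
patch_masked[49] = True  # the patch under the second word

print(wpa_labels(boxes, text_masked, patch_masked))  # [1, 0, None]
```

During pre-training, a classifier head would then be trained to predict these labels from the model's text token outputs, which pushes the model to connect each word with the image region it sits on.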

Experimental results show that LayoutLMv3 achieves state-of-the-art performance on:

Text-centric tasks such as:

form understanding,
receipt understanding,
and document visual question answering.

Image-centric tasks such as:

document image classification
and document layout analysis.

Here’s the bad.

LayoutLMv3 is very dependent on OCR engines.

This means that you can’t use it without a prior OCR model that does text detection and extraction.
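In practice, the OCR step has to hand the model a list of words plus one bounding box per word. Here is a minimal sketch of that glue code. It assumes the OCR engine returns pixel boxes as (x0, y0, x1, y1) and that the boxes get rescaled to a 0-1000 range, which is the convention used by the LayoutLM family in Hugging Face Transformers. The input format (a list of dicts with "text" and "box" keys) and the function names are made up for this example, so adapt them to whatever your OCR engine returns.

```python
from typing import Dict, List, Sequence, Tuple


def normalize_box(box: Sequence[int], width: int, height: int) -> List[int]:
    """Rescale a pixel box (x0, y0, x1, y1) to the 0-1000 range."""
    x0, y0, x1, y1 = box
    scaled = [
        1000 * x0 // width,
        1000 * y0 // height,
        1000 * x1 // width,
        1000 * y1 // height,
    ]
    # Clamp in case the OCR engine reports boxes slightly outside the page.
    return [max(0, min(1000, v)) for v in scaled]


def prepare_ocr_output(
    ocr_words: Sequence[Dict], width: int, height: int
) -> Tuple[List[str], List[List[int]]]:
    """Split hypothetical OCR results into parallel lists of words and boxes."""
    words: List[str] = []
    boxes: List[List[int]] = []
    for item in ocr_words:
        text = item["text"].strip()
        if not text:
            continue  # skip empty detections
        words.append(text)
        boxes.append(normalize_box(item["box"], width, height))
    return words, boxes


# Toy example on a 1000x2000 pixel page.
ocr_words = [
    {"text": "Invoice", "box": (50, 100, 250, 150)},
    {"text": " ", "box": (0, 0, 1, 1)},
]
print(prepare_ocr_output(ocr_words, width=1000, height=2000))
# (['Invoice'], [[50, 50, 250, 75]])
```

The words and boxes lists can then be passed, together with the page image, to the model's preprocessing step.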

Also, if you want to train your own model, then the annotation of your dataset may not be straightforward.

You basically have to use an OCR engine to do the extraction of the text.

Then you have to specify which texts represent which entity: invoice date, invoice number, customer name, customer address, …
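One common way to store this kind of annotation is one tag per OCR word in the BIO scheme (B- for the first word of an entity, I- for the following words, O for everything else). That choice is my assumption here, not something the model requires. The sketch below turns annotated word spans into tags, and the entity names and the bio_tags function are made up for this example.

```python
from typing import List, Sequence, Tuple

# (start_word_index, end_word_index_exclusive, entity_name)
Span = Tuple[int, int, str]


def bio_tags(num_words: int, spans: Sequence[Span]) -> List[str]:
    """Convert annotated word spans into one BIO tag per OCR word."""
    tags = ["O"] * num_words
    for start, end, entity in spans:
        if not 0 <= start < end <= num_words:
            raise ValueError(f"span {(start, end, entity)} is out of range")
        tags[start] = f"B-{entity}"
        for i in range(start + 1, end):
            tags[i] = f"I-{entity}"
    return tags


# Toy example: words as they might come out of an OCR engine.
words = ["Invoice", "No:", "INV-001", "Date:", "12", "Aug", "2023"]
spans = [(2, 3, "INVOICE_NUMBER"), (4, 7, "INVOICE_DATE")]

for word, tag in zip(words, bio_tags(len(words), spans)):
    print(f"{word}\t{tag}")
```

An annotation tool essentially has to let you draw or select those spans on top of the OCR output and export them in a format like this.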

There aren’t that many annotation tools out there to help you do this.

I personally had to build my own annotation tool because I needed to integrate LayoutLMv3 with a proprietary OCR engine.

Below is a sample output of how LayoutLMv3 can do question answering on a document.

To help you understand more about this model and even train it on your own data, here are some resources:

You can check the original paper of LayoutLMv3.
You can also check the GitHub repo.
If you want to train and test the model yourself, check out this Colab.
To annotate your data and prepare it for training with LayoutLMv3, you can check these annotation tools: this and this.

People of ML : Taha from Eden AI

A fellow ML engineer and data scientist, Taha Zemmouri, and his team have created a very cool tool called Eden AI that aggregates the latest machine learning models in the form of APIs. You can access several cutting-edge ML models using their easy-to-use API calls.

This is very inspiring for people who are trying to build ML products (such as yours truly 😁).

Would you like to know more about how he created this product and what his vision for the future is? Let me know by clicking below ↓

If I see enough interest, I will reach out to him and get you some juicy responses!

Meme of the Day!

Tweet of the Day!

Yann LeCun

@ylecun

A talk I gave at MIT recently.
"Objective-Driven AI: towards AI systems that can learn, remember, plan, reason, have common sense, yet are steerable and safe"
Slides: https://drive.google.com/file/d/1wzHohvoSgKGZvzOWqZybjm4M4veKR6t3/view?usp=drivesdk
Video: https://youtu.be/vyqXLJsmsrk?list=PLKemzYMx2_Ot1MZ_er2vFiINdJEgDO8Hg

2:40 PM • Aug 6, 2023

What'd you think of today's edition?

That's it for this week's edition, I hope you enjoyed it!

Machine Learning for Medical Imaging

Legal Documents Understanding with ML
