Hello Reader,
Welcome to another edition of the PYCAD newsletter where we cover interesting topics in Machine Learning and Computer Vision applied to Medical Imaging. The goal of this newsletter is to help you stay up-to-date and learn important concepts in this amazing field! I've got some cool insights for you below.
Here's a non-exhaustive list of the latest ML papers to keep you up-to-date with the field.
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
MM-Vet is an evaluation benchmark that examines large multimodal models (LMMs) on complicated multimodal tasks.
Paper: https://arxiv.org/pdf/2308.02490.pdf
On the Transition from Neural Representation to Symbolic Knowledge
A Neural-Symbolic Transitional Dictionary Learning (TDL) framework that uses an EM algorithm to learn a transitional representation of data. It compresses high-dimensional information about the visual parts of an input into a set of tensors as neural variables and discovers the implicit predicate structure in a self-supervised way.
Paper: https://arxiv.org/pdf/2308.02000.pdf
SoK: Assessing the State of Applied Federated Machine Learning
A study that aims to explore the current state of applied Federated Machine Learning and identify the challenges hindering its practical adoption.
Paper: https://arxiv.org/pdf/2308.02454.pdf
A large language model-assisted education tool to provide feedback on open-ended responses
A tool that uses large language models (LLMs), guided by instructor-defined criteria, to automate feedback on responses to open-ended questions.
Paper: https://arxiv.org/pdf/2308.02439.pdf
Unlocking the Potential of Similarity Matching: Scalability, Supervision and Pre-training
A biologically plausible learning approach for machine learning that's different from backpropagation.
Paper: https://arxiv.org/pdf/2308.02427.pdf
Self-Supervised Learning for WiFi CSI-Based Human Activity Recognition: A Systematic Study
A systematic study of self-supervised deep learning on WiFi channel state information (CSI) for human activity recognition.
Paper: https://arxiv.org/pdf/2308.02412.pdf
A lot of businesses produce a ton of documents every day which in turn are consumed by other businesses. Some of these businesses include: legal firms, accounting firms and e-commerce.
This requires a ton of manual labor to read, understand and extract the right information.
We can definitely do better.
Here's one of the best approaches out there for document understanding, which I personally tried.
Introducing LayoutLMv3.
Here's the good and the bad about this method.
The good.
LayoutLMv3 is a multimodal Transformer for Document AI that's pre-trained with unified text and image masking.
LayoutLMv3 is pre-trained with a word-patch alignment objective to learn crossmodal alignment by predicting whether the corresponding image patch of a text word is masked.
This unified architecture and training objectives make LayoutLMv3 a general-purpose pretrained model for both text-centric and image-centric Document AI tasks.
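Experimental results show that LayoutLMv3 achieves state-of-the-art performance on:
Text-centric tasks such as: form understanding, receipt understanding and document visual question answering.
Image-centric tasks such as: document image classification and document layout analysis.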
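To make the word-patch alignment objective a bit more concrete, here is a rough Python sketch of how the training labels could be derived. This is not the official LayoutLMv3 code. The 224x224 image size, the 16-pixel patches, the "aligned"/"unaligned" label names, and the choice to skip masked words are all assumptions made for illustration, and the function names are made up.

```python
from typing import List, Optional, Set, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def patches_for_box(box: Box, image_size: int = 224, patch_size: int = 16) -> Set[int]:
    """Return the indices of the image patches that a word box overlaps."""
    per_row = image_size // patch_size
    x0, y0, x1, y1 = box
    # Clamp to the image, then convert pixel coordinates to grid coordinates.
    col0 = max(0, x0) // patch_size
    row0 = max(0, y0) // patch_size
    col1 = (min(image_size, x1) - 1) // patch_size
    row1 = (min(image_size, y1) - 1) // patch_size
    return {
        row * per_row + col
        for row in range(row0, row1 + 1)
        for col in range(col0, col1 + 1)
    }


def wpa_labels(
    word_boxes: List[Box],
    masked_patches: Set[int],
    masked_words: Set[int],
) -> List[Optional[str]]:
    """Label each word by whether any patch under it has been masked."""
    labels: List[Optional[str]] = []
    for index, box in enumerate(word_boxes):
        if index in masked_words:
            # Assumption: masked words are left out of the objective.
            labels.append(None)
        elif patches_for_box(box) & masked_patches:
            labels.append("unaligned")
        else:
            labels.append("aligned")
    return labels


# Made-up example: three words, one masked patch, one masked word.
example_boxes = [(0, 0, 16, 16), (20, 20, 30, 30), (100, 100, 120, 110)]
example_labels = wpa_labels(example_boxes, masked_patches={15}, masked_words={2})
print(example_labels)
```

The model is then trained to predict these labels from the text side, which pushes it to connect each word with the image region it sits on.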
Here's the bad.
LayoutLMv3 is very dependent on OCR engines.
This means that you can't use it without a prior OCR model that does text detection and extraction.
Also, if you want to train your own model, then the annotation of your dataset may not be straightforward.
You basically have to use an OCR engine to do the extraction of the text.
Then you have to specify which texts represent which entity: invoice date, invoice number, customer name, customer address, and so on.
There aren't that many annotation tools out there to help you do this.
I personally had to build my own annotation tool because I needed to integrate LayoutLMv3 with a proprietary OCR engine.
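As an illustration of what such an annotation step has to produce, here is a minimal Python sketch that turns OCR output plus entity annotations into a token-classification example. It assumes the common convention for LayoutLM-style models of scaling boxes to a 0-1000 range and uses BIO tags. The function names, entity names, and sample words are made up, and this is not the annotation tool mentioned above.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def normalize_box(box: Box, width: int, height: int) -> List[int]:
    """Scale a pixel box to the 0-1000 range (assumed model convention)."""
    x0, y0, x1, y1 = box
    return [
        1000 * x0 // width,
        1000 * y0 // height,
        1000 * x1 // width,
        1000 * y1 // height,
    ]


def build_example(
    ocr_words: List[Tuple[str, Box]],
    entities: Dict[str, List[int]],
    width: int,
    height: int,
) -> Dict[str, list]:
    """Combine OCR words with entity annotations into one training example.

    `entities` maps an entity name to the indices of the OCR words it covers.
    """
    labels = ["O"] * len(ocr_words)  # "O" means the word is not part of any entity
    for name, indices in entities.items():
        for position, word_index in enumerate(sorted(indices)):
            prefix = "B-" if position == 0 else "I-"
            labels[word_index] = prefix + name
    return {
        "words": [text for text, _ in ocr_words],
        "boxes": [normalize_box(box, width, height) for _, box in ocr_words],
        "labels": labels,
    }


# Made-up OCR output for a 500x200 pixel invoice snippet.
sample_words = [
    ("Invoice", (50, 40, 150, 60)),
    ("No.", (160, 40, 200, 60)),
    ("12345", (210, 40, 300, 60)),
    ("Customer:", (50, 80, 140, 100)),
    ("Jane", (150, 80, 200, 100)),
    ("Doe", (210, 80, 250, 100)),
]
sample_entities = {"INVOICE_NUMBER": [2], "CUSTOMER_NAME": [4, 5]}
example = build_example(sample_words, sample_entities, width=500, height=200)
print(example["labels"])
```

Whatever OCR engine you use, the hard part is the `entities` mapping: someone (or some tool) has to say which OCR words belong to which field.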
Below is a sample output of how LayoutLMv3 can do question answering on a document.
To help you understand more about this model and even train it on your own data, here are some resources:
A fellow ML engineer and data scientist, Taha Zemmouri, and his team have created a very cool tool called Eden AI that aggregates the latest machine learning models behind a set of APIs. You can access several cutting-edge ML models using their easy-to-use API calls.
This is very inspiring for people who are trying to build ML products (such as yours truly).
Would you like to know more about how he created this product and what his vision for the future is? Let me know by clicking below.
If I see enough interest, I will reach out to him and get you some juicy responses!
That's it for this week's edition, I hope you enjoyed it!
Learn how to build AI systems for the medical imaging domain by leveraging tools and techniques that I share with you! | The newsletter is read by people from: Nvidia, Baker Hughes, Harvard, NYU, Columbia University, University of Toronto and more!