Control of some of Spotify's functionality by voice
A gesture-recognition challenge for the course "Multimodal Processing Recognition and Interaction" at HES-SO (Switzerland)
A multimodal interactive quiz app that lets users select answers via hand gestures. The UI/UX was designed in Figma, the front end built with React Native with MongoDB for data management, and the backend implemented with Express and Node.js; CNN models trained in Python handle the gesture recognition.
Group project for the MPRI course at HES-SO
A multimodal skill built with Amazon Alexa Skills Kit that educates children on the importance of numbers and dates.
Control of some of Spotify's functionality with gestures
Control of some of Spotify's functionality with gestures and speech
A multimodal AI assistant built with Google Gemini-1.5-pro, gTTS, PIL, and SpeechRecognition
Repository for MMI development during the UTA CSE REU 2019. Uses Open-Myo, a module for reading data from a Myo armband over a generic BLE interface.
GesturePad, a project for the Multimodal Interaction course (A.Y. 2019/2020)
Technical Draft: A platform to augment web applications with multimodal interactions
Code for ICMI2020 and ICMI2021 papers: "Studying Person-Specific Pointing and Gaze Behavior for Multimodal Referencing of Outside Objects from a Moving Vehicle" and "ML-PersRef: A Machine Learning-based Personalized Multimodal Fusion Approach for Referencing Outside Objects From a Moving Vehicle"
A multimodal face liveness detection module for use in face anti-spoofing
Use voice and pen to draw diagrams quickly, with icons and text automatically suggested by AI as you talk
Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations
Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances (ACL 2024)
Multimodal sentiment analysis using hierarchical fusion with context modeling
Mobile application for exploring fitness data using both speech and touch interaction.
Context-Dependent Sentiment Analysis in User-Generated Videos
A comprehensive reading list for Emotion Recognition in Conversations