29
11
Indonesia
4 years of experience
I'm a programmer with over 4 years of experience building computer science projects. I specialize in Artificial Intelligence. Through Omdena projects, I have collaborated with others to create AI solutions to real-world problems. I have also built a couple of AI products of my own, which I publish when participating in hackathons. Currently, my research revolves around medical imaging in Computer Vision and Large Language Models for NLP.
Talking Teddy is a smart, AI-powered companion designed to promote child well-being and entertain children when parents are at work. It acts as a child's interactive friend, capable of hearing, talking, and observing their daily activities, offering both companionship and support. The teddy bear is equipped with several key features, including monitoring the child's activities through a camera, playing fun and engaging music, sending emergency notifications to parents, and checking reminders left by the parent. Talking Teddy is connected to a dashboard that allows parents to view conversation logs, gain insights into their child's well-being and engagement, track key discussion topics, and send reminders to their child. The AI agent behind Talking Teddy is built with the LangChain framework, which integrates external services such as Twilio, Gemini-Vision, Supabase, and ElevenLabs. The parental dashboard is developed in Next.js and styled with Tailwind CSS.
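At its core, the LangChain agent picks a tool (music, alerts, reminders) based on what the child says. A minimal sketch of that routing idea, with hypothetical tool names and a keyword router standing in for the real LLM-driven tool selection:

```python
# Illustrative sketch only: a minimal tool-routing loop like the one a
# LangChain agent performs for Talking Teddy. The tool names and the
# keyword-based router are hypothetical stand-ins for the real LLM router.

def play_music(utterance: str) -> str:
    return f"playing '{utterance}'"

def send_alert(utterance: str) -> str:
    return f"alert sent to parents: {utterance}"

def check_reminders(utterance: str) -> str:
    return "reminder: finish homework at 4pm"

TOOLS = {
    "music": play_music,
    "emergency": send_alert,
    "reminder": check_reminders,
}

def route(utterance: str) -> str:
    """Dispatch to the first tool whose trigger word appears in the utterance."""
    for trigger, tool in TOOLS.items():
        if trigger in utterance.lower():
            return tool(utterance)
    return "let's just chat!"   # fallback: plain conversation
```

In the real agent the routing decision is made by the LLM rather than keyword matching, but the dispatch structure is the same.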
16 Sep 2024
Our app is an AI-driven content creation tool designed for small businesses and individual users, featuring a user-friendly interface. It can create stories from personal photos and search for relevant content on the internet. Powered by FastAPI and Next.js, the app integrates several Google services, including the Google Speech API, the Gemini Flash model, Cloud Run services, and Firebase Buckets. Video editing is handled by MoviePy, and we source royalty-free stock footage from Pexels. The application is versatile enough for travel diaries, food vlogs, promotional ads, journalistic content, and explainer videos. Future plans include more integrations, AI artwork generation, and user project libraries to further enhance the product.
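One step a photo-to-video pipeline like this must get right is the clip timeline: each photo's clip has to start exactly when the previous narration segment ends before MoviePy-style concatenation. A small illustrative sketch (the function name and inputs are assumptions, not the app's actual code):

```python
# Hypothetical sketch of one pipeline step: given the narration duration
# for each photo, compute every clip's (start, end) time so back-to-back
# concatenation stays in sync with the audio track.

def clip_timeline(durations):
    """Return (start, end) pairs for clips played back-to-back."""
    timeline, t = [], 0.0
    for d in durations:
        timeline.append((t, t + d))
        t += d
    return timeline
```

For example, narration segments of 2.0s, 3.5s, and 1.5s yield clips covering 0.0-2.0, 2.0-5.5, and 5.5-7.0 seconds.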
4 Jul 2024
The app is designed to enhance tourism in Benin by providing tools for captioning and dubbing videos in Yoruba and French. It leverages advanced speech recognition and translation technologies to deliver accurate, high-quality results, making it easier for tourists to understand local attractions and cultural events, and helping bridge communication gaps so a broader audience can engage with Benin's rich cultural heritage. In addition to its captioning and dubbing capabilities, the app features an interactive chatbot powered by Retrieval-Augmented Generation (RAG). The chatbot provides detailed, accurate information about Benin's landmarks, offering insights into historical sites, cultural festivities, and popular tourist destinations; users can ask questions, get recommendations, and receive personalized assistance. By combining multimedia translation tools with interactive information services, the app serves as a comprehensive travel companion for individual travelers and content creators alike, making Benin's cultural and historical sites more accessible and enjoyable to explore.
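The chatbot's RAG step boils down to retrieving the stored passage most relevant to a question before the LLM answers. A toy illustration of that retrieval idea, using word overlap in place of real embeddings (the passages here are made-up examples, not the app's actual knowledge base):

```python
# Toy retrieval step illustrating the RAG idea: score each stored passage
# by word overlap with the question and return the best match. Production
# systems use embedding similarity; the documents below are illustrative.

DOCS = [
    "The Royal Palaces of Abomey are a UNESCO World Heritage site.",
    "Ganvie is a lake village on Lake Nokoue near Cotonou.",
    "The annual Vodun festival takes place every January in Ouidah.",
]

def retrieve(question: str) -> str:
    """Return the passage sharing the most words with the question."""
    q = set(question.lower().split())
    return max(DOCS, key=lambda d: len(q & set(d.lower().split())))
```

The retrieved passage is then handed to the LLM as grounding context for its answer.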
16 May 2024
The envisioned mobile app, tailored for elders coping with dementia, integrates a chat interface and flashcards to offer a personalized and interactive therapeutic experience. Through the chat interface, users engage with an AI companion capable of simulating meaningful conversations, providing companionship, and guiding users through various cognitive exercises and reminiscence activities. The flashcards, on the other hand, are designed to stimulate memory and cognitive functions by presenting personalized content such as historical facts, personal memories, or trivia, adapting in complexity based on the user's responses. This combination of features aims to not only refresh and maintain the user's memories but also to foster a sense of connection and well-being, making technology a bridge to their past and a support in their present.
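The "adapting in complexity based on the user's responses" behavior can be sketched as a small difficulty policy. The thresholds and level range below are illustrative assumptions, not the app's actual tuning:

```python
# Hedged sketch of the adaptive-difficulty idea: raise flashcard complexity
# after a streak of correct answers, lower it gently after a miss.

def next_level(level: int, correct_streak: int, answered_correctly: bool,
               max_level: int = 5) -> int:
    """Return the difficulty level for the next flashcard."""
    if not answered_correctly:
        return max(1, level - 1)          # ease off after a miss
    if correct_streak >= 2:               # two in a row earns a harder card
        return min(max_level, level + 1)
    return level
```

Keeping the step size at one level in either direction avoids abrupt jumps, which matters for users coping with dementia.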
23 Feb 2024
RecoverPal revolutionizes addiction recovery with a blend of AI and empathetic design. Built with Flutter and Python, the app delivers a seamless experience, integrating OpenAI's GPT-4 for customized affirmations and journal prompts, and GPT-4 Vision for art interpretation. DALL·E enhances artistic expression, while Text-to-Speech technology personalizes meditation sessions. Key features include daily emotional check-ins, art integration, and meditation customization, tailored to users' specific needs. Hosted on Google Cloud Platform, RecoverPal prioritizes privacy and data security. Its evolving design, driven by user feedback, continuously refines the recovery journey, making it a versatile tool for emotional and mental health support.
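The "customized affirmations" feature amounts to turning the daily check-in into a structured LLM prompt. A hedged sketch of how such a prompt might be composed; the field names and wording are assumptions, not RecoverPal's actual prompt template:

```python
# Illustrative only: composing a personalized GPT-4 prompt from a daily
# emotional check-in. Field names (name, mood, trigger) are hypothetical.

def affirmation_prompt(name: str, mood: str, trigger: str) -> str:
    """Build an LLM prompt asking for an affirmation plus a journal prompt."""
    return (
        f"User {name} checked in feeling '{mood}' after encountering "
        f"'{trigger}'. Write one short, compassionate affirmation and one "
        f"journal prompt that acknowledge this feeling without judgment."
    )
```

Structuring the check-in fields this way keeps the model's output grounded in what the user actually reported that day.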
22 Jan 2024
In my country, there exists an inequality affecting visually impaired individuals who lack access to essential accessibility services. This has driven me to create a mobile app that harnesses the power of AI to offer a transformative solution. By enabling the visually impaired to understand the world around them, this app directly aligns with two crucial United Nations Sustainable Development Goals (UNSDGs): Goal 3, "Good Health and Well-being," and Goal 10, "Reduced Inequalities." The app's frontend is built with the Flutter framework, while the backend relies on Google AI Studio's Gemini Pro-Vision model, with continuous TruLens evaluation of LLM performance, served from Gemini-Lens's FastAPI server on the Google Cloud Run service. The app has a very simple user interface: a camera preview and a mic button. The user speaks, the query is transcribed and processed together with the captured image, and the response is spoken back to the blind user.
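The whole app is one loop: transcribe the spoken query, combine it with the camera frame, get a description, speak it back. A stubbed sketch of that flow; all three stage functions are placeholders for the real Google Speech, Gemini Pro-Vision, and text-to-speech calls:

```python
# Stubbed sketch of the app's single assist loop. Each stage returns a
# canned value here; in the real app these are external service calls.

def transcribe(audio: bytes) -> str:
    return "what is in front of me?"                     # stub: speech-to-text

def describe(image: bytes, query: str) -> str:
    return f"Answering '{query}': a door two steps ahead."  # stub: Gemini Pro-Vision

def speak(text: str) -> str:
    return f"[spoken] {text}"                            # stub: text-to-speech

def assist(audio: bytes, image: bytes) -> str:
    """One full interaction: mic audio + camera frame in, spoken answer out."""
    return speak(describe(image, transcribe(audio)))
```

Keeping the loop this small is what makes the single-button interface possible for blind users.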
22 Dec 2023
Senior-Digest is an innovative web application designed to simplify news consumption, making it more accessible and personalized. Imagine a dashboard where, in one tab labeled "Summary," the app neatly displays the top 10 news headlines with generated summaries. Users can then switch to the "Query" tab to ask questions that dive into particular topics, either by typing or speaking. The app processes the queries using language models, quickly fetches relevant answers from its database, and checks them for accuracy and relevance before showing them to the user. Senior-Digest relies on several key technologies: Streamlit makes the interface easy to use and interactive; Google Cloud provides the Vertex AI models the application needs, including the LLM, embeddings, and speech-to-text and text-to-speech APIs; gspread connects the app to Google Sheets for retrieval of daily news summaries; Pinecone stores news embeddings over time; LangChain integrates the summarization and RAG components; and TruLens evaluates queries and responses for relevance and groundedness. These features are especially beneficial for the elderly: quick access to the top 10 daily stories with AI-generated summaries reduces the need for lengthy reading, ideal for those who struggle with small text, while the voice-based query system and audio outputs accommodate potential visual challenges.
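The "check answers before showing them" step is essentially a groundedness test: does the response stay within the retrieved context? A toy version of that idea, using word overlap in place of TruLens's LLM-based scoring (the 0.5 threshold is an arbitrary illustration):

```python
# Sketch of a groundedness check in the spirit of the app's TruLens step:
# flag any response whose words barely overlap the retrieved context.

def grounded(response: str, context: str, threshold: float = 0.5) -> bool:
    """True if at least `threshold` of the response's words appear in context."""
    resp = set(response.lower().split())
    ctx = set(context.lower().split())
    return len(resp & ctx) / max(1, len(resp)) >= threshold
```

A response failing the check would be regenerated or withheld rather than shown to the user.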
11 Dec 2023
Our platform, empowered by Vectara's Retrieval-Augmented Generation (RAG) pipeline, marks a significant advancement in customer service technology. It addresses the key limitation of traditional chatbots, which require constant retraining on updated QnA pairs, by automatically refreshing its knowledge base through the ingestion of diverse multimedia content like documents, YouTube videos, and webpages. Currently in its prototype phase on Streamlit, users can test the system by inputting their personal Vectara database credentials, enjoying a conversational interface designed for friendly and efficient interactions. This initial setup paves the way for a broader application, where the platform can be expanded into a network encompassing both client and company sides, creating a comprehensive, AI-powered customer service ecosystem.
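Before documents, transcripts, and webpages can refresh the knowledge base, they are typically split into overlapping chunks for indexing. A small illustration of that ingestion step (chunk sizes are arbitrary; Vectara's own ingestion handles this server-side):

```python
# Illustrative ingestion step: split text into overlapping word chunks
# before indexing into a RAG knowledge base. Overlap preserves context
# across chunk boundaries.

def chunk(text: str, size: int = 50, overlap: int = 10):
    """Return word chunks of `size` words, each overlapping the last by `overlap`."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(1, len(words)), step)]
```

For example, with `size=4` and `overlap=2`, the text "a b c d e f" becomes the chunks "a b c d", "c d e f", and "e f".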
9 Nov 2023
Around the world, there exists an inequality affecting visually impaired individuals who lack access to essential accessibility services. This has driven me to develop software for smart glasses that blind people can wear. The SightCom2 software uses OpenAI technologies, namely Whisper for speech transcription, GPT-3.5 as an LLM, and DALL-E for image generation, alongside image-captioning, OCR, and color-recognition models from the Clarifai API. The software is served on Streamlit Cloud and is a prototype that could potentially be deployed on a microprocessor, assembled in an integrated circuit between input devices such as a camera and microphone and output devices such as speakers.
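Among those capabilities, color recognition is the simplest to illustrate: map a sampled RGB value to the nearest named color. A toy version; the palette below is a made-up stand-in for the Clarifai model's actual vocabulary:

```python
# Toy color recognition: nearest named color by squared Euclidean distance
# in RGB space. The palette is illustrative, not the Clarifai model's.

PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}

def name_color(rgb):
    """Return the palette name closest to the given (r, g, b) sample."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(rgb, c))
    return min(PALETTE, key=lambda n: dist(PALETTE[n]))
```

The recognized name would then be spoken back to the wearer through the glasses' speaker, like the other capabilities.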
16 Oct 2023