natural language processing | text-to-speech | web development
A simple tool that helps users learn new vocabulary by providing definitions, example sentences, and high-quality, natural-sounding audio pronunciation.
Enter a few words, or paste in a large block of text. Generation speed may vary; for reference, 20 words take around 7 seconds to generate the definitions and example sentences, and another 10 seconds to generate the audio file.
This is a small project I built in early 2024 to help me learn vocabulary—particularly rare philosophy terms I seldom encounter in everyday reading. My method relies on natural repetition, so I can learn these words in context without formal study sessions.
The early version runs locally on my computer through a terminal interface, relying on API calls to ChatGPT-4 and a text-to-speech model. For this demo, I made some modifications to run it on this webpage and added a text-generation animation to simulate the text being written out live (the API actually returns the whole block at once). This also makes the wait for text-to-speech generation a bit less dull.
For example, I can generate audio files to listen to while I am walking between school buildings or whenever I have free time, reinforcing the words through repeated exposure.
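Under the hood, the tool is just two chained API calls: a chat completion that produces the definitions and example sentences, followed by a text-to-speech call that turns that text into audio. Below is a minimal TypeScript sketch of that flow using the openai Node package; the prompt wording, model names, and file handling are illustrative assumptions rather than the exact code behind this page.

```ts
import OpenAI from "openai";
import { writeFile } from "node:fs/promises";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Step 1: ask the chat model for definitions and example sentences.
async function defineWords(words: string[]): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4", // illustrative; any capable chat model works
    messages: [
      {
        role: "user",
        content:
          "For each word, give a short definition and one example sentence:\n" +
          words.join(", "),
      },
    ],
  });
  return completion.choices[0].message.content ?? "";
}

// Step 2: convert the generated text into an audio file.
async function synthesize(text: string, outPath: string): Promise<void> {
  const speech = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy", // illustrative voice choice
    input: text,
  });
  await writeFile(outPath, Buffer.from(await speech.arrayBuffer()));
}

// Chain the two steps: words in, narrated vocabulary lesson out.
const text = await defineWords(["magnanimity", "credulity", "countenance"]);
await synthesize(text, "vocab.mp3");
```

In practice, longer outputs may need to be split into chunks before synthesis, since the text-to-speech endpoint limits input length.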
Example input text and output audio file:
"Magnanimity Abstinence Pragmatism Rigorously Disquieting Capriciousness Credulity Stringency Circuitry Countenance Befuddlement Vilifiers Belittlers Luminaries Inconsequential Imperceptible Erratic Wondrous Susceptible Invisibility Fluctuating Ecological Correlates Lonsdale Influxes Phenomenon Deducing Susceptibility Nonnative Prevalence"
Although this is a simple project, it holds a special place for me because I use it frequently, and it's one of my first projects that truly utilizes API calls to meet a real need. I do have a few other OpenAI-related projects, but they were mostly just experiments with this relatively new technology.
game development | unreal engine | blueprint
Sep 2024 - Nov 2024
A music rhythm game that combines combat elements to create cinematic, music-driven fights controlled by the player. Since I had no prior game development experience, I developed this game as a learning project for Unreal Engine and Blueprint. It also serves as a proof of concept for the game I envisioned for future development, featuring core mechanics such as music synchronization, motion warping, note spawning, and dynamic camera systems.
The video above plays in 4K; if it lags, a 1080p version is available here.
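The actual implementation lives in Unreal Blueprints, but the timing idea behind music synchronization and note spawning is simple to sketch: given the track's BPM and how long a note takes to travel to the hit line, spawn each note early by exactly that travel time so it lands on the beat. The TypeScript snippet below is a language-agnostic illustration of that arithmetic, not a port of the Blueprint graph; the tempo and approach time are made-up numbers.

```ts
// Minimal beat-synchronization sketch (illustrative, not the Blueprint logic).
const bpm = 120;                 // assumed tempo of the track
const secondsPerBeat = 60 / bpm; // 0.5 s between beats at 120 BPM
const approachTime = 1.2;        // seconds a note takes to reach the hit line

declare function spawnNote(beatIndex: number): void; // placeholder for the engine-side spawn call

// For a note that should be hit on a given beat index,
// compute when it must be spawned relative to the song's start.
function spawnTimeForBeat(beatIndex: number): number {
  const hitTime = beatIndex * secondsPerBeat;
  return hitTime - approachTime; // spawn early so the note arrives exactly on the beat
}

// Drive spawning off the audio clock rather than frame time,
// so notes stay in sync even if the frame rate fluctuates.
function update(songTime: number, nextBeat: { index: number }): void {
  while (spawnTimeForBeat(nextBeat.index) <= songTime) {
    spawnNote(nextBeat.index);
    nextBeat.index += 1;
  }
}
```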
While gathering game development information and ideas, I realized I may have to tackle multiple physics implementations. I also noticed that certain development processes—such as AI and in-game events—are quite similar to robotics.
Since our school does not offer a game design course, I asked a physics professor (who also excels in robotics and programming) to help me create one. Together, we designed a two-credit independent study course in game design. The course allows for weekly meetings, during which I share my progress and request assistance whenever his expertise might help me solve a problem or spark a new idea, or simply to learn general physics knowledge and development processes.
The following game documentation and recordings chronicle my Unreal Engine 5 and game design learning process for this course. They showcase my progress from basic blueprinting to designing game mechanics, the various features I tested (but did not implement), and the many unsuccessful attempts along the way. Feel free to click around or view whichever part of the process interests you.
(Some recordings include voiceovers from tutorials displayed on a separate screen, sometimes played at a faster speed. The recording you see is my own screen and development process, captured and played back in real time. These recordings cover the majority of the development process, though some parts were not recorded—especially for my stone game, where recording attempts caused crashes and engine lag.)
Learning and development recordings (Rhythm Game and Stone Game)→
game development | unreal engine | level design
Nov 2024
The aim was to create a short AAA-style game in five days, focusing on a consistent art style, engaging gameplay mechanics, and storytelling without text or voice acting. The game features a unique soul-switching mechanic that replaces the classic respawn mechanic, along with environmental storytelling.
The video above plays in 4K; if it lags, a 1080p version is available here.
Two of my favorite games—Cyberpunk 2077 (which I first played after the 2.0 update) and my personally beloved Black Myth: Wukong—both offer an immersive personal journey through a cinematic experience. Both provide realism in a fantasy world: Cyberpunk 2077 excels in immersion and storytelling, while Black Myth: Wukong showcases remarkable realism, intricate details, and historical accuracy. Simply walking around in both games and admiring the surroundings is already a treat.
These experiences have motivated me to attempt my own short prototype, aiming to capture a similar sense of visuals, depth, and immersion.
I initially designed the game to serve as a philosophical testing ground—a thought experiment that challenges our understanding of self and merit. Due to my skill and time limitations, this aspect of the game was not fully realized, but I hope it still provokes some deep questions and reflections: What does "self" mean in this context? How does a group function under a shared consciousness? Is identity merely a shifting perspective? When my consciousness moves to a new body, what becomes of the broken one left behind? These questions add layers of depth to the experience, enriching the game beyond its mechanics.
I aspire to explore these ideas further in my future academic journey, where I can refine my skills and dedicate more time to future projects.
This game was also developed as part of the two-credit independent study course in game design described above; the learning and development recordings linked there chronicle its development process as well.
entrepreneurship | UI design | finance
Aug 2024 - Ongoing
An entrepreneurial project involving 27 members, most of whom are pursuing master's degrees at top universities in fields including computer science, law, business, finance, and music.
(After discussing with my team, we decided that detailed project information will remain private; I can showcase only the public deliverables prepared for investment, shown below, along with a general description of the project and my role within the team.
February 4 update: I am currently working on a design mockup that showcases some of the interactive features independently of their intended purpose.)
I am incredibly proud to be part of this group. Many team members have accomplished impressive feats across different industries, including internships at Nvidia, professional roles at Apple, and producing song clips with over 100 million plays in China—one of which was featured in the Spring Festival Gala, a nationally recognized event watched by millions during the New Year.
I initially became involved in this project when a high school friend on the finance team asked me to design a logo. Soon after, I was invited to a pitch deck presentation as a potential investor. Once I gained deeper insight into the project, I was eager to join. I presented my combined skill set in computer science, design, and finance, along with connections to potential future investors. I am now working closely with the tech and finance teams on building a data training tool as well as investment-related projects.
My Contributions (in chronological order)
Generation Results Demo Page →
(Shown for reference; I am not directly involved in this research.)
Attention Variant Experiment Preprint Paper →
(Shown for reference; I am not directly involved in this research.)
Chatbot Interface Prototype Demo →
Fully designed and developed by me, this is my second-ever web development attempt (the first being this personal webpage). This was the first demo for the chatbot; the other features have not yet been implemented, and the music interface has undergone drastic design changes since this version.
Design mockup for an interactive feature currently under development
I’m incredibly fortunate to work with two venture leaders who are not only supportive but also deeply talented. Collaborating with them energizes me: we constantly identify what’s possible and needed, figure out how to achieve it, and do our best to deliver results—an unmatched way to learn and grow.
Beyond their technical know-how, they communicate clearly, patiently, and considerately. They also devote substantial time to supporting each team member. Their extensive knowledge of each member’s work underscores their dedication. The more I learn about them, the more impressed I become.
Having attempted and canceled an entrepreneurial venture of my own—with a much smaller team—just a year ago, I understand the workload and challenges they have overcome and continue to manage so well.
My first entrepreneurial attempt—“FootPrint” in spring 2024—was both painful and rewarding. Not reaching my goal with something I fully believed in and gave my all to hurt, but it remains a priceless journey.
However, my first venture led me to believe that undertaking a serious, long-term entrepreneurial project is so challenging it’s nearly impossible to manage during school without compromising either academics or the venture itself.
I feel very fortunate that this new venture, NeuroMuse, proved me wrong. Despite the demands of a large-scale project, we integrate our academic pursuits rather than treating them as separate tasks. For example, our LLM team recently published a preprint on an Attention Variant Experiment, and we’re already folding that new knowledge back into our work. I’ll soon contribute to another paper in phase two, focusing on the Arrangement Model (Unified Reasoning Model).
computer vision | machine learning | accessibility
Summer 2024 - Ongoing
An ongoing project for a real-time sign language translation tool focused on bridging communication between people with hearing or speech impairments and those who do not understand sign language.
This is a research project I proposed and initiated under the guidance of Prof. Eckmann.
Recipient of a $4,500 Summer Experience Award from Skidmore College, recognized for outstanding academic merit and the potential to pursue meaningful summer research experiences. The project was later presented at the Billie Tisch Center for Integrated Sciences Celebration in Fall 2024 to students, faculty, and external community members.
The Sign Language Translation Tool aims to bridge communication gaps between individuals who are deaf or mute and those unfamiliar with sign language. By offering real-time translation, the tool promotes greater accessibility and inclusion in everyday interactions. The initial deployment will focus on VR/XR devices like the Apple Vision Pro, with future plans to expand to smartphones.
I built a dictionary tool that detects hand gestures and maps them to their corresponding meanings; this forms the backbone of the translation system.
It leverages an understanding of 3D space from 2D video and captures crucial details such as handshape, palm orientation, location, and movement. Future implementations will also include non-manual markers (e.g., facial expressions, posture, and mouth movements).
The dictionary tool addresses the challenges posed by limited sign language training data. Because existing datasets vary in format and detail, the dictionary approach enables using any publicly available 2D sign language video to build robust training data.
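To make the dictionary idea concrete: each entry stores a normalized hand-landmark feature vector alongside its meaning, and recognition is essentially a nearest-neighbor lookup over those entries. The TypeScript sketch below shows that lookup under simplifying assumptions (one hand, landmarks already extracted by a hand-tracking model, plain Euclidean distance); the real pipeline also has to account for palm orientation, location, and movement over time.

```ts
// One 3D landmark from a hand-tracking model (e.g. 21 per hand).
interface Landmark { x: number; y: number; z: number; }

// A dictionary entry: a gesture's feature vector and its meaning.
interface SignEntry { meaning: string; features: number[]; }

// Normalize landmarks so position and scale don't matter:
// translate to the wrist and divide by the hand's overall size.
function toFeatureVector(landmarks: Landmark[]): number[] {
  const wrist = landmarks[0];
  const centered = landmarks.map(p => [p.x - wrist.x, p.y - wrist.y, p.z - wrist.z]);
  const scale = Math.max(...centered.map(([x, y, z]) => Math.hypot(x, y, z)), 1e-6);
  return centered.flatMap(v => v.map(c => c / scale));
}

// Find the dictionary entry closest to the observed hand shape.
function lookupSign(landmarks: Landmark[], dictionary: SignEntry[]): SignEntry | null {
  const query = toFeatureVector(landmarks);
  let best: SignEntry | null = null;
  let bestDist = Infinity;
  for (const entry of dictionary) {
    const dist = Math.sqrt(
      entry.features.reduce((sum, f, i) => sum + (f - query[i]) ** 2, 0),
    );
    if (dist < bestDist) {
      bestDist = dist;
      best = entry;
    }
  }
  return best;
}
```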
Video of me testing the accuracy of hand gesture detection for the dictionary data mentioned below:
Above is an interface for testing and demos. The data generation will use this tool to repeat the following steps (without a UI) and store the data collected for each sign gesture with its meaning.
Achieving sign-to-speech involves the following key steps:
To create the dictionary, we gather extensive data on the core elements of sign language:
Text-to-Speech Demo
This demo showcases the text-to-speech functionality that will be integrated into the sign language translation tool. Enter any text below to generate natural-sounding speech using OpenAI's text-to-speech API. This is a simple setup to confirm the feasibility of this step.
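Because the demo runs in the browser and the API key cannot live client-side, the page only needs to send the entered text to a small server route that forwards it to the text-to-speech endpoint and returns the audio. A minimal client-side sketch of that round trip (the /api/tts route name is an assumption for illustration):

```ts
// Browser-side sketch: send text to a hypothetical /api/tts route
// (which calls the text-to-speech API server-side) and play the result.
async function speak(text: string): Promise<void> {
  const response = await fetch("/api/tts", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
  if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);

  const audioBlob = await response.blob();            // audio bytes returned by the server
  const audio = new Audio(URL.createObjectURL(audioBlob));
  await audio.play();                                 // requires a prior user gesture in most browsers
}
```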
Current speech-to-text models are already well developed; for this section I am simply adapting the available models to my purpose.
Achieving Speech-to-Text generation involves the following key steps:
This demo showcases the speech-to-text functionality that will be integrated into the sign language translation tool. Click the microphone button and speak to see your words transcribed in real time using the Web Speech API. This is a simple setup to confirm the feasibility of this step.
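For reference, the transcription relies on the browser's built-in Web Speech API rather than a hosted model. A minimal sketch of the recognition setup looks roughly like this (browser support varies, and Chromium-based browsers expose the prefixed webkitSpeechRecognition):

```ts
// Minimal Web Speech API sketch: live transcription in the browser.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionImpl();
recognition.continuous = true;      // keep listening across pauses
recognition.interimResults = true;  // show partial results while speaking
recognition.lang = "en-US";

recognition.onresult = (event: any) => {
  // Concatenate everything recognized so far into one transcript.
  const transcript = Array.from(event.results)
    .map((result: any) => result[0].transcript)
    .join(" ");
  document.querySelector("#transcript")!.textContent = transcript; // assumed output element
};

recognition.start(); // must be triggered by a user gesture (e.g. the microphone button)
```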
The potential benefits for sign language education, translation accuracy, and accessibility make it a compelling direction for future development.
interactive audio | web development | music production
An interactive music mixer that provides precise control over a multi-track musical composition. This project is a proof of concept that demonstrates simple audio processing and real-time mixing capabilities, which can be easily implemented into games.
This project explores dynamic music implementation by adjusting the volume of individual stems (instrument tracks) based on in-game actions. This technique instantly changes the musical atmosphere while maintaining coherence. Inspired by my work on NeuroMuse and prior experience in piano and cello, this proof of concept demonstrates how simple adjustments can create immersive audio experiences. I was solely responsible for the development of this project.
An example use case could be a boss fight, with each mix of the music representing a different state of the protagonist or the enemy; the mix would switch rapidly to reflect changes in movement, health, or attack patterns.
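Under the hood, this kind of stem mixing is just one gain node per track feeding a shared output, with gains ramped smoothly whenever the game state changes. Below is a minimal Web Audio sketch of that idea in TypeScript; the stem file names and ramp time are placeholders, not the assets used in the demo.

```ts
// Minimal stem mixer sketch using the Web Audio API.
const ctx = new AudioContext(); // may need ctx.resume() after a user gesture

interface Stem { source: AudioBufferSourceNode; gain: GainNode; }

async function loadStem(url: string): Promise<Stem> {
  const response = await fetch(url);
  const buffer = await ctx.decodeAudioData(await response.arrayBuffer());
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.loop = true;
  const gain = ctx.createGain();
  source.connect(gain).connect(ctx.destination);
  return { source, gain };
}

// Load all stems (placeholder file names), then start them on the same clock
// so they stay aligned with each other.
const stems = await Promise.all(
  ["drums.mp3", "bass.mp3", "strings.mp3"].map(loadStem),
);
const startAt = ctx.currentTime + 0.1;
stems.forEach(s => s.source.start(startAt));

// Fade a single stem in or out over half a second, e.g. when the boss
// changes phase, without interrupting the other tracks.
function setStemLevel(stem: Stem, level: number): void {
  const now = ctx.currentTime;
  stem.gain.gain.cancelScheduledValues(now);
  stem.gain.gain.setValueAtTime(stem.gain.gain.value, now);
  stem.gain.gain.linearRampToValueAtTime(level, now + 0.5);
}
```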
robotics | PID control | embedded systems | course work
Fall 2024
A collection of robotics projects focusing on control systems, sensor integration, and autonomous navigation. These projects demonstrate practical applications of PID control, embedded systems programming, and real-time sensor processing.
A precision control system that maintains a ball's position on a platform using PID control algorithms. The system demonstrates real-time sensor feedback and dynamic adjustment capabilities.
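As a quick refresher on the control law involved: the platform's tilt command is computed from the ball's position error using proportional, integral, and derivative terms. A compact TypeScript sketch of that loop (the gains and time step are illustrative, not my tuned values):

```ts
// Minimal PID controller sketch: error in, corrective output out.
class PID {
  private integral = 0;
  private previousError = 0;

  constructor(
    private kp: number, // proportional gain
    private ki: number, // integral gain
    private kd: number, // derivative gain
  ) {}

  update(setpoint: number, measured: number, dt: number): number {
    const error = setpoint - measured;
    this.integral += error * dt;
    const derivative = (error - this.previousError) / dt;
    this.previousError = error;
    return this.kp * error + this.ki * this.integral + this.kd * derivative;
  }
}

// Each control cycle: read the ball position, compute a tilt command.
const pid = new PID(2.0, 0.1, 0.5); // illustrative gains
const tilt = pid.update(/* target */ 0, /* measured position */ 3.2, /* dt */ 0.02);
```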
View PID Ball Balance Documentation
An autonomous robot capable of following line patterns using infrared sensors and precise motor control. The project showcases sensor integration and motion control algorithms.
View Line Follower Documentation
A maze-solving robot that uses advanced pathfinding algorithms and sensor arrays to navigate through unknown maze environments. This project combines elements of autonomous navigation, mapping, and optimal path planning.
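A standard micromouse approach is flood fill: repeatedly assign each cell its distance to the goal given the walls discovered so far, and always step toward a neighbor with a smaller distance. The sketch below shows the flood-fill pass over a known wall map; the grid size and wall representation are simplified assumptions, not my competition code.

```ts
// Flood-fill sketch for a micromouse maze: compute each cell's distance
// to the goal, given the walls discovered so far.
type Cell = { x: number; y: number };

const SIZE = 16; // classic micromouse mazes are 16x16

// openNeighbors[y][x] lists the wall-free neighbors of each cell,
// as currently known from the robot's sensors.
function floodFill(openNeighbors: Cell[][][], goal: Cell): number[][] {
  const dist = Array.from({ length: SIZE }, () => Array(SIZE).fill(Infinity));
  dist[goal.y][goal.x] = 0;
  const queue: Cell[] = [goal];

  // Breadth-first expansion from the goal outward.
  while (queue.length > 0) {
    const { x, y } = queue.shift()!;
    for (const next of openNeighbors[y][x]) {
      if (dist[next.y][next.x] === Infinity) {
        dist[next.y][next.x] = dist[y][x] + 1;
        queue.push(next);
      }
    }
  }
  return dist; // the robot then always moves to a neighbor with a smaller distance
}
```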
View Micromouse Documentation
Entrepreneurship Competition
Spring 2024
Footprint is a web browser I developed in 10 days for an entrepreneurship competition to enhance productivity and engagement in web browsing. Its standout feature is an integrated social media platform linked to each webpage for dynamic interaction. Additional tools, such as a built-in note-taking system and AI-powered summaries, are also linked to each webpage, giving users a more comprehensive and interactive browsing experience.
This project showcases my skills in web development and user experience design. In addition, the competition involved business planning, financial planning, market research, and presentations to judges. I was solely responsible for the development of this project. More on this project is available in the Google file.
course work
Summer 2023
This script is a project from a screenwriting course, where the writing evolved through continuous feedback from three peers and one professor. The structure of the course allowed only minimal edits to previous sections, encouraging growth with each new submission. The final assignment was written without external feedback, showcasing my development as a screenwriter.
View Screenplay PDF
web development | UI/UX | interactive design
Aug 2024 - Present
This is my first attempt at web development. The goal is to create a minimal, interactive portfolio website that showcases projects through a simple and user-friendly interface. It features a responsive design with smooth transitions, dynamic content loading, and interactive components. I was solely responsible for the development of this personal webpage.