Fanzhuo Ai

Vocabulary Learning Tool

natural language processing | text-to-speech | web development

A simple tool that helps users learn new vocabulary by providing definitions, example sentences, and high-quality, natural-sounding audio pronunciation.

Feel free to try it out!

Enter a few words, or paste in a large block of text. Generation speed may vary; for reference, 20 words take around 7 seconds to generate definitions and example sentences, and another 10 seconds to generate the audio file.

Generating content...

Purpose & Background

This is a small project I built in early 2024 to help me learn vocabulary—particularly rare philosophy terms that seldom come up in everyday reading. My approach relies on natural repetition, so I can learn these words in context without formal study sessions.

The early version ran locally on my computer through a terminal interface, relying on API calls to ChatGPT-4 for text and to a text-to-speech service for audio. For this demo, I made some modifications to run it on this webpage and added a text-generation animation to simulate the text being written out (the API actually returns the whole block at once). This also makes the wait for text-to-speech generation a bit less dull.
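
For reference, the core of the tool is just two API calls: one to generate definitions and example sentences, and one to narrate the result. Below is a minimal TypeScript sketch of that flow, assuming the public OpenAI chat-completions and speech endpoints; the helper names and the prompt are illustrative, not the exact code I run.

  // Minimal sketch: generate definitions/examples for a word list, then narrate them.
  // Assumes an OpenAI API key in OPENAI_API_KEY; helper names are illustrative.
  const API_KEY = process.env.OPENAI_API_KEY;

  async function generateEntries(words: string[]): Promise<string> {
    const res = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${API_KEY}` },
      body: JSON.stringify({
        model: "gpt-4",
        messages: [
          { role: "system", content: "For each word, give a short definition and one example sentence." },
          { role: "user", content: words.join(", ") },
        ],
      }),
    });
    const data = await res.json();
    return data.choices[0].message.content; // the full block of text, returned all at once
  }

  async function narrate(text: string): Promise<ArrayBuffer> {
    const res = await fetch("https://api.openai.com/v1/audio/speech", {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${API_KEY}` },
      body: JSON.stringify({ model: "tts-1", voice: "alloy", input: text }),
    });
    return res.arrayBuffer(); // audio bytes, ready to save as a file or play back
  }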

Learning Method

I generate audio files to listen to while walking between school buildings or whenever I have free time, reinforcing the words through repeated exposure.

Example input text and output audio file:

"Magnanimity Abstinence Pragmatism Rigorously Disquieting Capriciousness Credulity Stringency Circuitry Countenance Befuddlement Vilifiers Belittlers Luminaries Inconsequential Imperceptible Erratic Wondrous Susceptible Invisibility Fluctuating Ecological Correlates Lonsdale Influxes Phenomenon Deducing Susceptibility Nonnative Prevalence"

Although this is a simple project, it holds a special place for me because I use it frequently, and it's one of my first projects that truly utilizes API calls to meet a real need. I do have a few other OpenAI-related projects, but they were mostly just experiments with this relatively new technology.

Rhythm Game

game development | unreal engine | blueprint

Sep 2024 - Nov 2024

A music rhythm game that combines combat elements to create cinematic, music-driven fights controlled by the player. I developed it as a learning project for Unreal Engine and Blueprint, since I had no prior game development experience. It also serves as a proof of concept for the game I envision for future development, featuring core mechanics such as music synchronization, motion warping, note spawning, and dynamic camera systems.

Demo Video

The video above plays in 4K; if it lags, a 1080p version is available here.

Ball (Music Node) Features

  • Hit Feedback: Fracture and explosion effects using Chaos Destruction, with dynamic impact based on player distance and angle
  • Timing Indicator: Color-shifting outer layer from blue to red indicating perfect beat timing
  • Movement System: Constant-speed movement towards platform center with precise collision detection

Player Mechanics

  • Combat System: Free-flow combat synchronized with music beats
  • Motion Warping: Responsive character movement with optimized animation transitions
  • Instant Feedback: Synchronized sound effects and visual cues for seamless gameplay

Spawner System

  • Music Sync: Ball spawning synchronized with music markers exported from Adobe Audition (see the sketch after this list)
  • Dynamic Placement: Randomized spawn locations around the platform
  • Easy Configuration: Simplified system for setting spawn times with different music tracks
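
The spawner itself lives in Blueprint, so the snippet below is only an illustrative TypeScript sketch of the idea: marker timestamps exported from Adobe Audition drive spawning against the song's playback position. In practice the spawn time would also be offset by the ball's travel time so it reaches the platform on the beat.

  // Illustrative sketch (the actual system is built in Unreal Blueprint):
  // spawn a ball whenever playback passes the next marker from Adobe Audition.
  type Marker = { time: number }; // marker time in seconds

  class BeatSpawner {
    private nextIndex = 0;
    constructor(private markers: Marker[], private spawnBall: () => void) {}

    // Called every frame with the music track's current playback position.
    update(playbackTime: number): void {
      while (
        this.nextIndex < this.markers.length &&
        playbackTime >= this.markers[this.nextIndex].time
      ) {
        this.spawnBall(); // e.g. at a randomized location around the platform
        this.nextIndex++;
      }
    }
  }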

Technical Implementation

  • Blueprint Systems: Custom event handling for player interactions and ball behavior
  • Animation: Optimized combat animations with edited transitions for responsive feedback
  • Camera System: Cinematic camera movement using Cine Camera Actor on Spline

While gathering game development information and ideas, I realized I may have to tackle multiple physics implementations. I also noticed that certain development processes—such as AI and in-game events—are quite similar to robotics.

Since our school does not offer a game design course, I asked a physics professor (who also excels in robotics and programming) to help me create one. Together, we designed a two-credit independent study course in game design. The course includes weekly meetings, during which I share my progress and ask for help whenever his expertise might solve a problem or spark a new idea, or simply to learn general physics concepts and development practices.

The following game documentation and recordings chronicle my Unreal Engine 5 and game design learning process for this course. They show my progress from basic blueprinting to designing game mechanics, various features I tested (but did not implement), and several unsuccessful attempts along the way. Feel free to browse around or jump to any part of the process that interests you.

(Some recordings include voiceovers from tutorials displayed on a separate screen, sometimes played at increased speed. The recording you see is my own screen and development process, captured and played back in real time. These recordings cover the majority of the development process, though some parts were not recorded—especially for my stone game, where recording attempts caused crashes and engine lag.)

Learning and development recordings (Rhythm Game and Stone Game)→

View Game Design Document

Stone Game

game development | unreal engine | level design

Nov 2024

The aim was to create a short AAA-style game in 5 days, focusing on a consistent art style, engaging gameplay mechanics, and storytelling without text or voice acting. The game features a unique soul-switching mechanic that replaces the classic respawn mechanic, along with environmental storytelling.

Demo for game and respawn mechanic

The video above plays in 4K; if it lags, a 1080p version is available here.

Two of my favorite games—Cyberpunk 2077 (which I first played after the 2.0 update) and Black Myth: Wukong—offer an immersive personal journey through a cinematic experience. Both provide realism in a fantasy world: Cyberpunk 2077 excels in immersion and storytelling, while Black Myth: Wukong showcases remarkable realism, intricate details, and historical accuracy. Simply walking around in both games and admiring the surroundings is already a treat.

These experiences have motivated me to attempt my own short prototype, aiming to capture a similar sense of visuals, depth, and immersion.

I initially designed the game to serve as a philosophical testing ground—a thought experiment that challenges our understanding of self and merit. Due to my skill and time limitations, this aspect of the game was not fully realized, but I hope it still provokes some deep questions and reflections: What does "self" mean in this context? How does a group function under a shared consciousness? Is identity merely a shifting perspective? When my consciousness moves to a new body, what becomes of the broken one left behind? These questions add layers of depth to the experience, enriching the game beyond its mechanics.

I aspire to explore these ideas further in my future academic journey, where I can refine my skills and dedicate more time to future projects.

Game Design

  • Art Direction: Consistent stone-based aesthetic ensuring visual cohesion
  • Storytelling: Environmental narrative without relying on text or voice acting
  • Core Mechanic: Innovative soul/body switching system for puzzle-solving and exploration

Visual Design

  • Character Design: Modified default character models with stone skin and distinctive red cores
  • Environment: Detailed stone landscapes using Nanite technology
  • Lighting: Character cores serve as dynamic light sources and visual guides

Technical Features

  • Soul Switching: Complex camera system with smooth transitions between characters
  • Visual Effects: Wet surface transitions and detailed stone texturing
  • Level Design: Carefully mixed stone models for natural-looking environments

Player Systems

  • Character Control: Smooth transitions between possessed characters
  • Respawn System: Cinematic soul transfer with particle effects and camera work
  • AI Integration: Dynamic AI behavior updating with character switches

As with the Rhythm Game above, this project was built for the two-credit independent study course in game design that I arranged with a physics professor. The documentation and recordings linked below chronicle the learning and development process for both games; some parts of the Stone Game were not recorded because recording caused crashes and engine lag.

Learning and development recordings (Rhythm Game and Stone Game)→

View Game Design Document

NeuroMuse Logo

NeuroMuse

Entrepreneurship | UI design | finance


Aug 2024 - Ongoing

An entrepreneurial project involving 27 members, most of whom are pursuing master's degrees at top universities in fields including computer science, law, business, finance, and music.

(After discussing with my team, we decided that the project's details will remain private. I can showcase the public deliverables prepared for investment, shown below, and provide a general description of the project and my role within the team.

February 4 update: I am currently working on a design mockup that showcases some interactive features independently of their intended purpose.)

Team Composition

  • 27 members, with two-thirds pursuing master’s degrees in computer science or technology-related fields at top institutions such as Stanford and the University of Michigan.
  • The remaining team members have backgrounds in law, business, finance, and music production.

I am incredibly proud to be part of this group. Many team members have accomplished impressive feats across different industries, including internships at Nvidia, professional roles at Apple, and producing song clips with over 100 million plays in China—one of which was featured in the Spring Festival Gala, a nationally recognized event watched by millions during the New Year.

Personal Experience and Contributions

I initially became involved in this project when a high school friend on the finance team asked me to design a logo. Soon after, I was invited to a pitch deck presentation as a potential investor. Once I gained deeper insight into the project, I was eager to join. I presented my combined skill set in computer science, design, and finance, along with connections to potential future investors. I am now working closely with the tech and finance teams on building a data training tool as well as investment-related projects.

My Contributions (in chronological order)

  • Logo Design
  • Speaker Icon
  • Leading and Building the Current UI: Spearheading the interface's development.
  • Investment Pitch for Competitions/Grants: Assisted in pitch presentations for various university funding opportunities, securing $2,200 (still far less than our total training costs).
  • Current Focus and Upcoming Contributions: My primary focus now is developing an annotation tool for training data with eight other team members—an urgent requirement for our second phase. Additionally, I am organizing upcoming investment meetings to secure further funding.

Deliverable Results

Generation Results Demo Page →

(For showcase. I am not directly involved in this research).

Attention Variant Experiment Preprint Paper →

(For showcase. I am not directly involved in this research).

Chatbot Interface Prototype Demo →

Fully designed and developed by me, this is my second-ever web development attempt (the first being this personal webpage). This is the first demo of the chatbot; the other features have not yet been implemented, and the music interface has undergone drastic design changes since this version.

Design mockup for an interactive feature currently under development

Thoughts

I’m incredibly fortunate to work with two venture leaders who are not only supportive but also deeply talented. Collaborating with them energizes me: we constantly identify what’s possible and needed, figure out how to achieve it, and do our best to deliver results—an unmatched way to learn and grow.

Beyond their technical know-how, they communicate clearly, patiently, and considerately. They also devote substantial time to supporting each team member. Their extensive knowledge of each member’s work underscores their dedication. The more I learn about them, the more impressed I become.

Having attempted and then shut down a much smaller entrepreneurial venture of my own just a year ago, I understand the workload and challenges they have overcome and continue to manage so well.

My first entrepreneurial attempt—"FootPrint" in spring 2024—was both painful and rewarding. Falling short of a goal I fully believed in and gave my all to hurt, but it remains a priceless journey.

However, my first venture led me to believe that undertaking a serious, long-term entrepreneurial project is so challenging it’s nearly impossible to manage during school without compromising either academics or the venture itself.

I feel very fortunate that this new venture, NeuroMuse, proved me wrong. Despite the demands of a large-scale project, we integrate our academic pursuits rather than treating them as separate tasks. For example, our LLM team recently published a preprint on an Attention Variant Experiment, and we’re already folding that new knowledge back into our work. I’ll soon contribute to another paper in phase two, focusing on the Arrangement Model (Unified Reasoning Model).

Sign Language Translation

computer vision | machine learning | accessibility


Summer 2024 - Ongoing

An ongoing project for a real-time sign language translation tool focused on bridging communication between people with hearing or speech impairments and those who do not understand sign language.

This is a research project I proposed and initiated under the guidance of Prof. Eckmann.

Recipient of a $4,500 Summer Experience Award from Skidmore College, recognized for outstanding academic merit and the potential to pursue meaningful summer research. The project was later presented at the Billie Tisch Center for Integrated Sciences Celebration in Fall 2024 to students, faculty, and external community members.

Introduction

The Sign Language Translation Tool aims to bridge communication gaps between individuals who are deaf or mute and those unfamiliar with sign language. By offering real-time translation, the tool promotes greater accessibility and inclusion in everyday interactions. The initial deployment will focus on VR/XR devices like the Apple Vision Pro, with future plans to expand to smartphones.

Objectives

  1. Sign-to-Speech Generation: Converts sign language into spoken language (and generates audio), allowing individuals unfamiliar with sign language to understand it in real time.
  2. Speech-to-Text Generation: Translates spoken language into text, making it accessible for users who are deaf or mute. A text bubble should appear next to the speaker.

Dictionary Tool Details (current progress)

I built a dictionary tool that detects hand gestures and maps them to their corresponding meanings; this forms the backbone of the translation system.

It leverages an understanding of 3D space recovered from 2D video and captures crucial details such as handshape, palm orientation, location, and movement. Future implementations will also include non-manual markers (e.g., facial expressions, posture, and mouth movements).

The dictionary tool addresses the challenges posed by limited sign language training data. Because existing datasets vary in format and detail, the dictionary approach enables using any publicly available 2D sign language video to build robust training data.
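
As a rough illustration of the kind of per-frame features the dictionary stores, the sketch below computes finger-joint angles from 3D hand landmarks. It assumes a hand-tracking library such as MediaPipe Hands, which returns 21 landmarks per hand; the landmark indices follow MediaPipe's convention, and the function names are mine, not the tool's actual code.

  // Sketch of per-frame feature extraction from 3D hand landmarks
  // (assuming 21 landmarks per hand, as produced by MediaPipe Hands).
  type Vec3 = { x: number; y: number; z: number };

  const sub = (a: Vec3, b: Vec3): Vec3 => ({ x: a.x - b.x, y: a.y - b.y, z: a.z - b.z });
  const dot = (a: Vec3, b: Vec3) => a.x * b.x + a.y * b.y + a.z * b.z;
  const norm = (a: Vec3) => Math.sqrt(dot(a, a));

  // Angle (in degrees) at joint b, formed by the points a, b, c.
  function jointAngle(a: Vec3, b: Vec3, c: Vec3): number {
    const u = sub(a, b);
    const v = sub(c, b);
    return (Math.acos(dot(u, v) / (norm(u) * norm(v))) * 180) / Math.PI;
  }

  // Index-finger joint angles for one frame; MediaPipe chain: 5 (MCP) -> 6 (PIP) -> 7 (DIP) -> 8 (tip).
  function indexFingerAngles(landmarks: Vec3[]): number[] {
    return [
      jointAngle(landmarks[5], landmarks[6], landmarks[7]),
      jointAngle(landmarks[6], landmarks[7], landmarks[8]),
    ];
  }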

The tool I built supports:

  • Dictionary Development: Recognizes hand shape, finger angle, and palm orientation from 2D video input, mapped into 3D space.
  • Data Generation: Produces detailed, frame-by-frame information such as finger angles, hand orientation, and relevant motion cues.
  • Example Output: Generates a structured dataset capturing each gesture’s unique characteristics.

Video of me testing the accuracy of hand gesture detection for the dictionary data mentioned below:

Above is an interface for testing and demos. Data generation will use this tool to repeat the same steps (without a UI) and store the data collected for each sign gesture alongside its meaning.

Example of Dictionary Data (Single gesture)

For each frame of the video, the tool captures:

  • Finger States: Each finger joint's angle.
  • Hand Orientation: The rotation of the hand.
  • Movement Indicators: Any detected motion compared to previous frames.

Example input video for a single gesture and its dictionary output

Future Plan: 1. Sign-to-Speech Generation

Achieving sign-to-speech involves the following key steps:

  1. Dictionary Tool (where I am now)

      To create the dictionary, we gather extensive data on the core elements of sign language:

      • Handshapes: Each unique configuration of fingers.
      • Palm Orientation: The direction in which the palm faces.
      • Location: Where the hand is positioned relative to the body.
      • Movement (partly implemented): The direction, speed, and style of hand motion.
      • Non-Manual Markers (not yet implemented): Facial expressions, body posture, and mouth movements.
  2. Model Training
    • Once the dictionary tool generates structured data, a model is trained to recognize sign gestures and match them to their intended meanings.
  3. Language Processing
    • A sign language sentence is a sequence of meanings with grammar that differs from natural language. Using available LLM APIs, the recognized sign meanings are converted into natural language (e.g., ASL to English).
  4. Text-to-Speech
    • The natural language output will then be fed into a text-to-speech module, producing spoken language that accurately reflects the signed message.
    • Current text-to-speech models are already mature; this step mainly involves integrating an existing model for my purposes.
    • Text-to-Speech Demo

      This demo showcases the text-to-speech functionality that will be integrated into the sign language translation tool. Enter any text below to generate natural-sounding speech using OpenAI's text-to-speech API. This is a simple setup to confirm the feasibility of this step.

      Generating audio...

Future Plan: 2. Speech-to-Text Generation Proposal

Current speech-to-text models are already mature; this step mainly involves integrating an existing model for my purposes.

Achieving Speech-to-Text generation involves the following key steps:

  1. Speech Recognition
    • A speech-to-text engine (such as Google Cloud Speech-to-Text, Amazon Transcribe, or a custom ASR model) converts spoken language into written text.
  2. Accessibility
    • The resulting text can be displayed in real time, making spoken language content readily available for individuals who are deaf or mute.
  3. Speech-to-Text Demo

    This demo showcases the speech-to-text functionality that will be integrated into the sign language translation tool. Click the microphone button and speak to see your words transcribed in real time using the Web Speech API. This is a simple setup to confirm the feasibility of this step (a minimal sketch of the wiring follows this list).

    Click microphone to start speaking

    Your transcribed text will appear here...
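
The demo above only needs the browser's built-in speech recognition; here is a minimal sketch of that wiring. The element ID and the use of the webkit-prefixed constructor are illustrative details, not the page's actual code.

  // Minimal Web Speech API wiring: stream interim transcripts into the page.
  const SpeechRecognitionCtor =
    (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

  const recognition = new SpeechRecognitionCtor();
  recognition.continuous = true;     // keep listening across pauses
  recognition.interimResults = true; // show partial transcripts in real time

  recognition.onresult = (event: any) => {
    let transcript = "";
    for (let i = event.resultIndex; i < event.results.length; i++) {
      transcript += event.results[i][0].transcript;
    }
    document.getElementById("transcript")!.textContent = transcript; // "#transcript" is illustrative
  };

  recognition.start(); // e.g. called from the microphone button's click handler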

Future Plan: 3. Implementation

Step-by-Step Translation Process and VR Implementation

  1. Handshape Detection: Hand tracking plus facial expressions, body posture, and mouth movements (the Apple Vision Pro is ideal hardware for this).
  2. Dictionary Lookup: Compares detected handshapes with dictionary entries, which include variations in handshape, palm orientation, location, movement, and non-manual markers.
  3. Handshape Matching: Matches observed handshapes with the closest dictionary entries.
  4. I implemented HandVector, an open-source GitHub project, on the Vision Pro; it fits steps 1 through 3 of this project well. This setup confirms the feasibility of those steps. Video of me testing:

  5. Speech Generation: Converts matched signs into a spoken language sentence, accommodating sign language grammar where needed. Uses a text-to-speech module to produce audible output.
  6. Speaker Icon
  7. Text Bubble Generation: Speech-to-text output displayed as a text bubble next to the speaker within the UI.
  8. User Icon

Why Apple Vision Pro for Initial Deployment

  • Natural Behavior: Conversations using this device can feel seamless, enabling natural day-to-day behavior and casual conversation.
  • Consistent Tracking: Apple Vision Pro's hand-tracking technology can capture the full three-dimensional aspects of sign language, including depth, orientation, and movement paths, which are difficult to capture naturally and consistently with other options such as a phone or laptop. Meanwhile, the device's inner cameras capture non-manual markers such as facial expressions, body posture, and mouth movements.
  • Spatial Context: Signs that rely on spatial relationships and positioning relative to the body can be more accurately captured and interpreted.

The potential benefits for sign language education, translation accuracy, and accessibility make it a compelling direction for future development.

Simple Dynamic Music

interactive audio | web development | music production

An interactive music mixer that provides precise control over a multi-track musical composition. This project is a proof of concept demonstrating simple audio processing and real-time mixing capabilities that can easily be integrated into games.

This project explores dynamic music implementation by adjusting the volume of individual stems (instrument tracks) based on in-game actions. This technique instantly changes the musical atmosphere while maintaining coherence. Inspired by my work on NeuroMuse and prior experience in piano and cello, this proof of concept demonstrates how simple adjustments can create immersive audio experiences. I was solely responsible for the development of this project.
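
The core technique is routing each stem through its own gain node, so gameplay events (or the sliders below) can reweight the mix instantly while the tracks stay in sync. Here is a minimal Web Audio API sketch of that idea; the function names and default values are illustrative rather than the mixer's exact code.

  // Each stem gets its own GainNode; all stems start on the same clock so they stay aligned.
  const ctx = new AudioContext();
  const gains = new Map<string, GainNode>();
  const sources: AudioBufferSourceNode[] = [];

  async function loadStem(name: string, url: string): Promise<void> {
    const buffer = await ctx.decodeAudioData(await (await fetch(url)).arrayBuffer());
    const source = ctx.createBufferSource();
    source.buffer = buffer;
    const gain = ctx.createGain();
    gain.gain.value = 0.8; // default 80%, matching the mixer UI
    source.connect(gain).connect(ctx.destination);
    gains.set(name, gain);
    sources.push(source);
  }

  // Start every stem at the same time once all buffers are loaded.
  function startAll(): void {
    const startTime = ctx.currentTime + 0.1;
    sources.forEach((s) => s.start(startTime));
  }

  // Smoothly fade one stem, e.g. when a boss fight enters a new phase.
  function setStemVolume(name: string, volume: number, fadeSeconds = 0.25): void {
    gains.get(name)?.gain.linearRampToValueAtTime(volume, ctx.currentTime + fadeSeconds);
  }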

Example use case

An example use case could be a boss fight, with each version of the music representing a different state for the protagonist or the enemy. Different versions of the music would rapidly switch, reflecting changes in movement, health, or attack patterns.

Key Features

  • Real-time volume control for 14 individual tracks
  • Preset mixing configurations for different musical themes
  • Organized track groups for intuitive mixing (the 14 tracks are combinations of 60 stems from https://multitracksearch.cambridge-mt.com/ms-mtk-search.htm)
Mixer tracks (each defaulting to 80% volume): Kicks, Snares, Clap, Hats, Cymbals, Ride Hit, Bass, Piano, Synth Leads, Synth Fills, Synth Pad/Saw, Horns, Tubas, Violin

Robotics

robotics | PID control | embedded systems | course work


Fall 2024

A collection of robotics projects focusing on control systems, sensor integration, and autonomous navigation. These projects demonstrate practical applications of PID control, embedded systems programming, and real-time sensor processing.

PID Ball Balance System

A precision control system that maintains a ball's position on a platform using PID control algorithms. The system demonstrates real-time sensor feedback and dynamic adjustment capabilities.

View PID Ball Balance Documentation
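
For context, the controller follows the standard PID form: the platform's tilt is driven by a weighted sum of the position error, its accumulated integral, and its rate of change. The sketch below shows that loop in TypeScript; the gains and variable names are illustrative, not the tuned values from the project.

  // Standard PID update: output = Kp*error + Ki*integral(error) + Kd*d(error)/dt.
  class PID {
    private integral = 0;
    private prevError = 0;
    constructor(private kp: number, private ki: number, private kd: number) {}

    // error = target position - measured ball position; dt = loop period in seconds.
    update(error: number, dt: number): number {
      this.integral += error * dt;
      const derivative = (error - this.prevError) / dt;
      this.prevError = error;
      return this.kp * error + this.ki * this.integral + this.kd * derivative;
    }
  }

  // Each control cycle: read the ball position, compute a correction, tilt the platform.
  const controller = new PID(2.0, 0.1, 0.5); // illustrative gains
  // servoAngle = clamp(controller.update(targetPosition - sensorReading, dt));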

Line Follower Robot

An autonomous robot capable of following line patterns using infrared sensors and precise motor control. The project showcases sensor integration and motion control algorithms.

View Line Follower Documentation

Final Project: Micromouse Project

A maze-solving robot that uses advanced pathfinding algorithms and sensor arrays to navigate through unknown maze environments. This project combines elements of autonomous navigation, mapping, and optimal path planning.

View Micromouse Documentation
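
As a rough illustration of the pathfinding involved, a common micromouse approach is flood fill: label every cell with its step distance to the goal given the walls discovered so far, then always move toward a lower-valued neighbor, re-running the fill whenever a new wall is found. The sketch below shows that pass; it is a generic illustration, not necessarily the exact algorithm documented above.

  // Flood fill over an N x N maze: breadth-first search from the goal using only known walls.
  type Cell = { x: number; y: number };

  function floodFill(
    size: number,
    goal: Cell,
    hasWall: (from: Cell, to: Cell) => boolean, // true if a known wall blocks this move
  ): number[][] {
    const dist = Array.from({ length: size }, () => Array(size).fill(Infinity));
    dist[goal.y][goal.x] = 0;
    const queue: Cell[] = [goal];
    const moves = [{ x: 1, y: 0 }, { x: -1, y: 0 }, { x: 0, y: 1 }, { x: 0, y: -1 }];

    while (queue.length > 0) {
      const cur = queue.shift()!;
      for (const m of moves) {
        const next = { x: cur.x + m.x, y: cur.y + m.y };
        if (next.x < 0 || next.y < 0 || next.x >= size || next.y >= size) continue;
        if (hasWall(cur, next)) continue;
        if (dist[next.y][next.x] > dist[cur.y][cur.x] + 1) {
          dist[next.y][next.x] = dist[cur.y][cur.x] + 1;
          queue.push(next);
        }
      }
    }
    return dist; // the mouse moves to whichever neighbor has the smallest value
  }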

Footprint

Entrepreneurship Competition

Spring 2024

Footprint is a web browser I developed in 10 days for an entrepreneurship competition, aimed at enhancing productivity and engagement in web browsing. Its standout feature is an integrated social media platform linked to each webpage for dynamic interaction. Additional tools, such as a built-in note-taking system and AI-powered summaries, are also tied to each webpage, giving users a more comprehensive and interactive browsing experience.

This project showcases my skills in web development and user experience design. The competition also involved business planning, financial planning, market research, and presentations to judges. I was solely responsible for the development of this project. More on this project is available in the linked Google file.

Screen Writing

course work

Summer 2023

This script is a project from a screenwriting course, where the writing evolved through continuous feedback from three peers and one professor. The structure of the course allowed only minimal edits to previous sections, encouraging growth with each new submission. The final assignment was written without external feedback, showcasing my development as a screenwriter.

View Screenplay PDF

Personal Website

web development | UI/UX | interactive design


Aug 2024 - Present

This is my first attempt at web development. The goal is to create a minimal, interactive portfolio website that showcases projects through a simple and user-friendly interface. It features a responsive design with smooth transitions, dynamic content loading, and interactive components.  I was solely responsible for the development of this personal webpage.

Design Features

  • Navigation: Intuitive side progress bar for quick section access with active state tracking (a minimal sketch follows this list)
  • Visual Hierarchy: Clean typography and consistent spacing for optimal readability
  • Interactive Elements: Expandable project cards with smooth animations and dynamic positioning
  • Progress Tracking: Visual indicators showing current section and reading progress
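
The active-state tracking mentioned above can be done with an IntersectionObserver that highlights the nav link of whichever section is currently in view. Below is a minimal sketch; the selectors and class name are illustrative, not the site's actual markup.

  // Highlight the progress-bar link for the section currently in the viewport.
  const sections = document.querySelectorAll<HTMLElement>("section[id]");

  const observer = new IntersectionObserver(
    (entries) => {
      for (const entry of entries) {
        const link = document.querySelector(`nav a[href="#${entry.target.id}"]`);
        link?.classList.toggle("active", entry.isIntersecting);
      }
    },
    { rootMargin: "-40% 0px -40% 0px" } // treat the middle band of the viewport as "current"
  );

  sections.forEach((section) => observer.observe(section));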

Technical Implementation

  • Smooth Transitions: Custom page transitions and scroll-based animations
  • Dynamic Content: Collapsible sections with optimized performance
  • Responsive Design: Fluid layouts adapting to all screen sizes
  • Performance: Optimized asset loading and smooth scrolling implementation

Interactive Components

  • Project Cards: Expandable content with dynamic button positioning
  • Navigation System: Scroll-based highlighting with smooth animations
  • Media Integration: Support for various media types including audio and images
  • User Experience: Intuitive interfaces for tools like the audio mixer and vocabulary learning system