Steven Yan

Using my data science powers for social good

PROFESSIONAL SUMMARY:

I am a data scientist with a proven track record in data analysis and impactful projects. My career objective is to leverage my expertise and passion for data-driven insights to excel in a challenging and dynamic Data Scientist role. I am committed to contributing my diverse skill set, technical proficiency, and hands-on experience to drive innovation and make a meaningful impact within an organization that values data-driven decision-making and problem-solving.

My career objective encompasses the following key aspirations:

  • Impactful Data Analysis: To continue applying cutting-edge techniques and methodologies to extract actionable insights from complex datasets, enabling informed decision-making and the development of strategic recommendations.
  • Innovation in Machine Learning: To stay at the forefront of machine learning and deep learning technologies, consistently pushing the boundaries of what is possible to tackle real-world challenges and create solutions that have a positive social and economic impact.
  • Collaborative Problem-Solving: To collaborate with cross-functional teams, working closely with experts from diverse backgrounds to identify and solve complex problems that require multidisciplinary insights and expertise.
  • Mentorship and Knowledge Sharing: To play an active role in mentorship and knowledge sharing, fostering an environment of continuous learning and growth for myself and those around me.
  • Ethical and Inclusive Data Practices: To promote ethical data handling, ensure data privacy, and champion diversity and inclusivity in data science practices and applications.

In pursuit of these objectives, I am eager to contribute my skills in data manipulation, statistical analysis, machine learning, and data visualization, along with my proficiency in programming languages such as Python and SQL. My prior experiences as a Data Scientist at StartOut, a Machine Learning Engineer at Omdena, and a Teaching Professional have equipped me with the expertise to excel in a data-intensive role while effectively communicating findings to both technical and non-technical stakeholders.



RELEVANT EXPERIENCE:

  • Implemented Sparse Principal Component Analysis and Difference-in-Differences models to identify the policy recommendations highest in magnitude and significance for each state and 120 selected metro areas
  • Developed algorithms for inferring gender and race and for resolving conflicting information in entrepreneurial data, and wrote code to automate the quarterly data update process
  • Analyzed data to identify top insights on LGBTQ+ entrepreneurship for website posting and reporting to key stakeholders
  • Collaborated with my manager and Dr. Vivienne Ming (Index creator) to co-author the State of LGBTQ+ Entrepreneurship Report
  • Served as technical lead for a small team of college interns who researched data and performed data entry and cleaning tasks


VOLUNTEER EXPERIENCES:

Role: Team Lead

The goal of the project is to democratize access to resources by developing a freely accessible web app that screens submitted chest X-ray images for the following lung disorders: tuberculosis, lung cancer, pneumonia, and COVID-19.

Four teams, one per disease, worked in parallel for 8 weeks to build models, and each team selected its best model for deployment.

I led the Tuberculosis team through the project tasks: data collection, EDA and data preprocessing, model building and evaluation, and model deployment.

Links:

  • Official Project Repo: https://github.com/OmdenaAI/myanmar-chapter-chest-x-rays

  • My Project Repo: https://tinyurl.com/Tuberculosis-Detector
  • Streamlit App: https://chest-xrays-detection-system.streamlit.app/

Role: Lead Machine Learning Engineer

For two months, we collaborated in this OmdenaLore AI challenge with the Giga team, a joint initiative between UNICEF and ITU. Using datasets provided by the Giga team, we analyzed the data thoroughly and built several Computer Vision and Deep Learning models to detect school locations in Sudan from satellite imagery.

Role: Data Research Analyst

Attended the DataDive hackathon and contributed to ongoing data wrangling and visualization work for a project sponsored by CDAC at UChicago to develop new tools for measuring broadband access in the US.



OTHER EXPERIENCES:

  • Developed detailed, individualized study plans for premedical students based on academic history and an initial in-person assessment, providing targeted content review, MCAT-style practice, and expert insight into how each topic is likely to be tested on Test Day
  • Provided insight into the mindset of top scorers and promoted study skills necessary for success on the exam
  • Advised students through application process and provided essay writing and editing services
  • Offered sliding-scale services to any student qualifying for financial assistance from AAMC; passionate about working with underrepresented and underserved communities (including URMs and students with LDs) and equalizing access to resources
  • Worked on a team of 25 content writers selected through writing competition to develop open-source high-quality test prep materials in effort to equalize access to resources
  • Developed Physical Science questions and passages and submitted for review by official AAMC Content Experts prior to publication on website
  • Project led by Khan Academy and the Association of American Medical Colleges (AAMC) and funded by the Robert Wood Johnson Foundation
  • Designed and delivered 20-session bootcamp to prepare URM students for the MCAT in the Mentoring In Medicine Program
  • Created content lessons focusing on interaction and engagement, which included student pair, student group, and game activities, as well as individual exercises
  • Collaborated with alumni in conducting review sessions and in working with difficult students
  • Met with Executive Director weekly to discuss any curriculum changes or major student concerns
  • Created specialized Verbal program for JAMP, a program for premeds from minority backgrounds in Texas
  • Developed materials for different initiatives, e.g., an MCAT Tutoring Guide used as a handbook for newly trained tutors
  • Managed team of 15-20 instructors in Manhattan/Bronx, including assigning classes, conducting class observations, and providing specific feedback; increased excellence ratings in area by 5%


EDUCATION:

  • Completed Data Science Online Immersive Bootcamp in June 2021
  • Degree: Bachelor of Arts in Biological Sciences
  • Awards: College Honor Scholarship (full academic scholarship), Dean's List (all quarters)
  • Cumulative GPA: 3.7


Skills:

  • Programming Skills: Python, SQL, HTML/CSS
  • Machine Learning, Predictive Modeling: Scikit-learn, TensorFlow
  • Data Cleaning or Munging: NumPy, Pandas
  • Statistics, Statistical Analysis: Excel, SciPy
  • Data Mining and Interpretation
  • Data Visualization: Matplotlib, Seaborn, Tableau
  • Version Control: Git, GitHub
  • Cloud Platform: Google Cloud Platform


Hobbies:

  • Home Gardening
  • Piano Playing & Performance
  • Hiking
  • Learning Languages
  • Volunteerism & Social Impact
  • Exploring World Cultures

Last Revised on 12/29/23