Danny Ki


Data Scientist

Portfolio


About


I’m a self-taught programmer and data scientist committed to using data analytics to solve problems. I approach each problem with a creative, persistent mindset, aiming to uncover the knowledge hidden in the data. I embrace the unknown, risk, and challenge, and I enjoy learning new skills, which helps me develop insightful analytics with an artistic approach.

For the last six years, I worked in sales and purchasing, where I was responsible for analyzing financial reports and market data to maximize efficiency and minimize costs. During this time, I became fascinated by how data science can improve business decisions. This fascination led me to earn several data science certificates and to pursue further study by enrolling in Fast Campus (Data Science School).

I firmly believe that businesses will increasingly rely on data technologies to make executive decisions and to maintain healthy relationships with their customers by analyzing data to understand customer needs. I am currently looking for an exciting role with potential for growth where I can use my skills to provide detailed and insightful data analysis.

Contact Me


Email Address : kish1919@gmail.com

Phone Number : 678-850-4240

Predict Used Car Price

[Service Website]




(1) Subject : Vehicle Value Forecast Web Services based on Machine Learning
(2) Period : 2018. 03 - 2018. 04
(3) Tech : Python, Data Crawling, AWS, Flask, MySQL, Bootstrap
(4) Model : XGBoost (Accuracy : 85%)
(5) Structure :

(6) Website : http://dannyki.ga/
Fill in the vehicle information and press the Submit button.

The site returns the predicted price of the used car you selected, along with the average price of the same model across different years.

(7) Github : Github Website


Predict House Price

[Kaggle Competition]



(1) Subject : Predict house prices in Ames, Iowa
(2) Period : 2018. 01 - 2018. 03
(3) Data : Train Data - 81 variables and 1,460 houses
                  Test Data - 80 variables and 1,459 houses
(4) Python : Preprocessing - Numpy, Pandas
                      Graph - Matplotlib, Seaborn
(5) Model : Ordinary Least Squares Model

(6) Kaggle Score : 0.12384 / Kaggle rank : 1042 / 4548 (22.9%)
(7) Github : Github Website
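
For context on the Kaggle Score above: as far as I understand the competition rules, submissions are scored by the root-mean-squared error between the logarithm of the predicted price and the logarithm of the actual sale price, so lower is better. The snippet below is a minimal sketch of that metric, not code from the project. The function name `rmse_log` and the sample prices are made up for illustration.

```python
import math

def rmse_log(predicted, actual):
    """Root-mean-squared error between log(predicted) and log(actual)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [
        (math.log(p) - math.log(a)) ** 2 for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up prices, purely for illustration (not taken from the dataset).
predicted_prices = [200000.0, 150000.0, 310000.0]
actual_prices = [208500.0, 140000.0, 300000.0]
score = rmse_log(predicted_prices, actual_prices)
print(round(score, 5))
```

Working in log space means a 10% miss on a cheap house costs about the same as a 10% miss on an expensive one.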


Spooky Author Identification

[Kaggle Competition]



(1) Subject : Identify the author of a sentence from its text
(2) Period : 2018. 03 - 2018. 04
(3) Data : Train Data - 3 variables and 19,579 text samples
                  Test Data - 2 variables and 8,392 text samples
(4) Python : Natural Language Processing - Stop-word removal, Stemming
                      Vectorization - CountVectorizer
                      Models - Random Forest, AdaBoost, SVM, Naive Bayes Classification
(5) Model : Naive Bayes Classification

(6) Kaggle Score : 0.48767 / Kaggle rank : 793 / 1244 (63.7%)
(7) Github : Github Website
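
The word-count plus Naive Bayes approach listed above can be sketched in plain Python. This is a simplified illustration, not the project code: it assumes whitespace tokenization, skips the stop-word removal and stemming steps, and uses hypothetical labels (`author_a`, `author_b`) and made-up sentences.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and split on whitespace (stop-word removal and stemming omitted).
    return text.lower().split()

def train_naive_bayes(samples):
    """samples: list of (text, label) pairs. Returns the fitted count tables."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary

def predict(text, model):
    """Return the label with the highest log-posterior for the given text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(doc_count / total_docs)  # log prior
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training sentences with hypothetical author labels.
toy_samples = [
    ("the raven cried nevermore", "author_a"),
    ("the raven sat still", "author_a"),
    ("the monster fled the lab", "author_b"),
    ("my creature fled north", "author_b"),
]
model = train_naive_bayes(toy_samples)
print(predict("raven nevermore", model))
print(predict("creature fled", model))
```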


Titanic Machine Learning from Disaster

[Kaggle Competition]



(1) Subject : Predict survival on the Titanic
(2) Period : 2018. 03 - 2018. 04
(3) Data : Train Data - 12 variables and 891 passengers
                  Test Data - 11 variables and 418 passengers
(4) Python : Preprocessing - Numpy, Pandas
                      Graph - Matplotlib, Seaborn
                      Models - Decision Tree, Random Forest, AdaBoost, Support Vector Machine, Naive Bayes Classification, VotingClassifier
(5) Model : VotingClassifier Model

(6) Kaggle Score : 0.78468 / Kaggle rank : 4304 / 10676 (40.3%)
(7) Github : Github Website
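
The idea behind a voting ensemble can be sketched in a few lines. This assumes hard (majority) voting; the entry above does not say whether hard or soft voting was used, and the helper `hard_vote` and the sample predictions are hypothetical, not taken from the project.

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, one sample at a time."""
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(preds) != n_samples for preds in predictions_per_model):
        raise ValueError("every model must predict the same number of samples")
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the most votes for this sample.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted

# Made-up survival predictions (1 = survived) from three hypothetical models.
model_outputs = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
print(hard_vote(model_outputs))
```

Using an odd number of models avoids ties in a two-class problem such as this one.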


Bike Sharing Demand

[Kaggle Competition]



(1) Subject : Predict bike rental demand
(2) Period : 2018. 03 - 2018. 04
(3) Data : Train Data - 12 variables and 10,886 records
                  Test Data - 9 variables and 6,493 records
(4) R : Preprocessing - dplyr
            Graph - ggplot2
            Model - Random Forest
(5) Model : Random Forest

(6) Kaggle Score : 0.48613 / Kaggle rank : 1,357 / 3,251 (41.7%)
(7) Github : Github Website

Digit Recognizer

[Kaggle Competition]



(1) Subject : Identify digits from a dataset of tens of thousands of handwritten images
(2) Period : 2018. 04 - 2018. 04
(3) Data : Train Data - 42,000 images
                  Test Data - 28,000 images
(4) Python : Preprocessing - Numpy, Pandas
                      Graph - Matplotlib, Seaborn
                      Models - Keras (Sequential)
(5) Model : Keras (Sequential) Model

(6) Kaggle Score : 0.98271 / Kaggle rank : 1,139 / 2,279 (49.9%)
(7) Github : Github Website
