Data Science Notebook

Category Archives: statistic

My Data Science & Data Engineer Project: Distributed computing with 120 CPUs using H2O

I want to share a data science project I completed recently, which integrates data engineering concepts into data science.

Data Engineer, data science, H2O, python
data cleansing, data retrieve, Deploy on Linux, jupyter notebook, Python, statistic, visualization
Posted on May 15, 2018

unsupervised learning-3: Dimension reduction (PCA), tf-idf, sparse matrix, Twitter posts clustering
Intrinsic dimension, text mining, word frequency arrays, csr_matrix, TruncatedSVD

Dimension reduction: PCA, intrinsic dimension

tf-idf, word frequency arrays

sparse matrix, csr_matrix, TruncatedSVD
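The topics listed above (tf-idf word frequency arrays, sparse matrices, and TruncatedSVD for clustering short posts) fit together roughly as in the sketch below. This is only an illustration assuming scikit-learn and SciPy are installed; the `posts` list is made-up sample text, not the Twitter data from the post, and the choice of two components and two clusters is arbitrary.

```python
from scipy.sparse import issparse
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-ins for short posts (not the original data).
posts = [
    "cats purr and sleep all day",
    "my cat chased the other cats",
    "kittens and cats love to sleep",
    "python code for data science",
    "writing python code with pandas",
    "data science needs clean python code",
]

# tf-idf turns the posts into a word frequency array stored as a sparse matrix.
tfidf = TfidfVectorizer()
word_matrix = tfidf.fit_transform(posts)

# TruncatedSVD reduces dimension directly on sparse input
# (unlike PCA, it does not center the data first).
svd = TruncatedSVD(n_components=2, random_state=0)
reduced = svd.fit_transform(word_matrix)

# Cluster the posts in the reduced space.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(reduced)

print(word_matrix.shape, issparse(word_matrix))
print(reduced.shape)
print(labels)
```

TruncatedSVD is used here instead of PCA because it accepts the sparse tf-idf matrix without converting it to a dense array.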

fun, pca, text mining, tf-idf
data cleansing, jupyter notebook, Python, statistic, text mining, unsupervised learning
Posted on February 18, 2017

Visualization with Seaborn: statistic, Python, Seaborn

visualizing regressions

group by categorical feature

plot Residuals

Higher-order regressions

Visualizing univariate distributions

Visualizing multivariate distributions

 

learning, seaborn
data cleansing, jupyter notebook, Python, statistic, visualization
Posted on February 17, 2017

data exploration -2: Seaborn, statistic

data exploration and visualization with Python Seaborn

Constructing plots by defining reusable functions is a good learning point.

learning
data cleansing, jupyter notebook, Pandas, Python, statistic, visualization
Posted on October 21, 2016

data exploration -1: Famous Titanic dataset

sample data exploration and visualization

Using the famous Titanic dataset

fun, learning
data cleansing, jupyter notebook, matplotlib, Pandas, Python, statistic, visualization
Posted on October 21, 2016

statistical thinking 2-5 DataCamp course note

stat
jupyter notebook, statistic, visualization
Posted on February 26, 2016

statistical thinking 2-4 DataCamp course note

stat
practice, statistic, visualization
Posted on February 24, 2016

statistical thinking 2-3 DataCamp course note

stat
practice, statistic, visualization
Posted on February 22, 2016

statistical thinking 2-2 DataCamp course note

stat
statistic, visualization
Posted on February 20, 2016

Richard Ji
