
Daniel Whitenack (@dwhitena): Data Science language GO, Containers, Reproducibility

By Rajib Bahar at June 17, 2017 06:29

Daniel (@dwhitena) is a Ph.D.-trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines that include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (ODSC, Spark Summit, Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science and engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and actively helps organize contributions to various open source data science projects.

Interviewer: Rajib Bahar

Agenda:
- Many of us may or may not be aware of "Jupyter Notebook", a web application for writing code in various languages such as R, Python, Julia, Node.js, Go, Ruby, and Scala. The application in turn talks to a separate kernel process, which runs the code and returns the output to the web application. One of the coolest things you do is maintain the Jupyter kernel for Go (aka Golang). Currently, data scientists tend to gravitate toward either R or Python. You're working with a more modern language for data science. Why Go? How is it more useful for statistical analysis or data visualization?
- How do you achieve reproducibility in data science?
- Most of us have heard of virtual machine tools such as VMware, Virtual PC, and VirtualBox. This is the first time I have heard of containers. What are some of their key benefits? Are there websites such as TurnKey Hub where you can get good images of various OS, software, and DBMS platforms?
- What are some best practices for deploying data science models? Do you do something similar to DBAs or data engineers and schedule jobs to run at a set frequency, such as daily or hourly?
- How do you use data pipelines in your projects? Are they used in ETL-like data-wrangling processes?
- Where can we find you on social media?

Tags: DataScience, Go, Reproducibility, DataPipelines



Rajib Bahar's Blog

Thoughts on Big Data, Data Science, BI, AI, SQL, Data, Podcast, & Other stuff...
