Pedro Medina (@haystack_data, @analyzethisTC) - Data Science Community, AI competition & TensorFlow

By Rajib Bahar at June 28, 2017 06:35
Filed Under: Data, Data Podcast, Data Science

Pedro Alexander Medina is Founder & Chief Analytics Officer at Haystack, LLC - an Advanced Analytics Agency specializing in custom managed solutions across the data value chain. With deep expertise in information management and advanced analytics, his mission is to help organizations optimize their strategic data assets by converting complex data into intelligence; intelligence into innovation; innovation into success.

Interviewer: Rajib Bahar

Colin Bartol (@colin_bartol): Big Data in Health Care, Tricare West, Data Science Project Affinity

By Rajib Bahar at June 19, 2017 06:29
Filed Under: Data, Data Podcast, Data Science

Colin Bartol led a team that built 46% of the servers for Tricare West, which covers 2.6 million people for the military; the work required meeting NIST, PCI, and HIPAA security requirements. He holds an MBA from the Carlson School of Management and a CISSP, and is a SME for the CompTIA Project+ exam. Having consulted at five Fortune 100 companies in the e-commerce, financial services, and retail sectors, Colin has wide experience of what happens in information technology. He is currently employed in telecommunications at a major health insurance company.
Interviewers: Rajib Bahar, Shabnam Khan

- Your background is largely in computer security infrastructure. Tell us about your Data Science-related experience. We would like to hear about the Affinity project.
- What is a Data Lake? How have you utilized it?
- I noticed the health care industry is hiring Big Data experts like crazy. Are you involved in a similar project in your department?
- We are interested in learning about Tricare West.
- As it relates to computer security, what tips or recommendations do you have on securing data and infrastructure in general? And what about social media presence?

Daniel Whitenack (@dwhitena): Data Science Language Go, Containers, Reproducibility

By Rajib Bahar at June 17, 2017 06:29
Filed Under:

Daniel (@dwhitena) is a Ph.D.-trained data scientist working with Pachyderm (@pachydermIO). Daniel develops innovative, distributed data pipelines that include predictive models, data visualizations, statistical analyses, and more. He has spoken at conferences around the world (ODSC, Spark Summit, Datapalooza, DevFest Siberia, GopherCon, and more), teaches data science and engineering with Ardan Labs (@ardanlabs), maintains the Go kernel for Jupyter, and is actively helping to organize contributions to various open source data science projects.

Interviewer: Rajib Bahar

- Many of us may or may not be aware of "Jupyter Notebook", a web application for writing code in various languages such as R, Python, Julia, Node.js, Go, Ruby, and Scala. That application in turn starts a separate kernel process that runs the code and returns the output to the web application. One of the coolest things you do is maintain the Go kernel. Currently, data scientists tend to gravitate toward either R or Python as their language. You're working with a more modern language in data science. Why Go? How is it more useful in statistical analysis or data visualization?
- How do you achieve reproducibility in data science?
- Most of us have heard of virtual machine tools such as VMware, Virtual PC, and VirtualBox. This is the first time I have heard of containers. What are some of their key benefits? Are there websites such as TurnKey Hub where you can get good images of various OS, software, and DBMS platforms?
- What are some best practices around deploying data science models? Do you do something similar to DBAs or data engineers and run a job at a certain frequency, such as hourly or daily?
- How do you use data pipelines in your projects? Is that something used in ETL, like a data-wrangling process?
- Please tell us where we can find you on social media.

Data Podcast

By Rajib Bahar at June 15, 2017 02:32
Filed Under: Data, Data Podcast, Database, Podcast, SQL, SQL Server, Big Data, Data Science, Analytics

Over the last few months, Shabnam and I have been working on creating a podcast. Our podcast brings in industry experts working in various data practices. We focus on topics related to Big Data, Data Science, and database technologies, including RDBMS. Many thanks to our colleagues and friends for their support in this initiative.


Here are links to our podcast:

Soundcloud ->

iTunes ->
