Sunday, October 9, 2016

PyData DC - Links to 2016 Talks


PyData DC 2016 was held October 7-9, 2016.

Check out the schedule: http://pydata.org/dc2016/schedule/

Here are the links to some of the talks.

The talks have been grouped by the following categories:
Directly Python Related | Machine Learning Related | Data Exploration & Analysis | Database Related |
Natural Language Processing | Scheduling Related | Security | Other Topics

Talks

Directly Python Related

The 5 Kinds of Python Functions: Steven Lott
  Slides: https://slott56.github.io/five-kinds-of-python-functions/assets/player/KeynoteDHTMLPlayer.html


Learn How to Make Life Easier with Anaconda: Dhavide Aruliah
  Slides (pdf)
  Jupyter Notebooks & Data: https://github.com/dhavide/PyData-DC-2016-Anaconda


Sustainable Scrapers: David Eads / @eads
  Google Docs


Open Data Dashboards & Python Web Scraping: Marie Whittaker
  Twitter: @MarieCWhittaker
  Presentation: https://github.com/mseew/Presentation-Slides/blob/master/pyData_MCW.pdf
  Github: https://github.com/mseew/DM-Dashboard


Agent-based Modeling in Python: Jackie Kazil
  (Mesa Framework)
  Twitter: @JackieKazil
  Github: https://github.com/projectmesa/Mesa



Machine Learning Related


Variational Inference in Python: Austin Rochford
  Twitter: @AustinRochford
  Slides: http://austinrochford.com/resources/talks/dydata-dc-2016-variational-python.slides.html#/ …
  Jupyter Notebook 1: https://nbviewer.jupyter.org/gist/AustinRochford/91cabfd2e1eecf9049774ce529ba4c16
  Jupyter Notebook 2: Dependent Dirichlet Process Regression


Clustering talk (McInnes & Healy) 
HDBSCAN
Twitter: @dvbalen
Jupyter Notebooks: https://github.com/scikit-learn-contrib/hdbscan

Logistic Regression: Behind The Scenes: Chris White
Twitter: @markov_gainz
Slides: http://www.slideshare.net/ChrisWhite249/logistic-regression-behind-the-scenes

Visual Diagnostics for more informed Machine Learning: Rebecca Bilbro
Yellowbrick
Twitter: @RebeccaBilbro
Slides: https://rebeccabilbro.github.io/pydata/#/
Github: https://github.com/DistrictDataLabs/yellowbrick

Building Serverless ML Models in the Cloud: Alex Casalboni
Github: https://github.com/cloudacademy/sentiment-analysis-aws-lambda



Data Exploration & Analysis


Building Your First Data Pipelines: Hunter Owens
Presentation: http://hunterowens.net/data-pipelines/presentation/#/
Github: https://github.com/hunterowens/data-pipelines

Creating Python Data Pipelines in the Cloud: Femi Anthony
Twitter: @DataPhanatik
Slides: https://github.com/femibyte/data-eng/blob/master/PyData2016-DataPipelinesCloud.pdf
Github: See the references (last slide) in the presentation above

Parallel Python - Analyzing Large Data Sets: Aron Ahmadia, Matthew Rocklin
Github: https://github.com/mrocklin/scipy-2016-parallel

Transforming Data to Unlock Its Latent Value: Tony Ojeda 
EDA Framework
Twitter: @tonyojeda3
Jupyter Notebook

Time series exploration with matplotlib: Thomas Caswell
Twitter: @tacaswell
Github: https://github.com/tacaswell/interactive_mpl_tutorial

Forecasting Critical Food Violations at Restaurants using Open Data: Nicole Donnelly
Twitter: @NicoleADonnelly
Presentation: Slideshare
Github: https://github.com/nd1/DC_RestaurantViolationForecasting

Doing Frequentist Statistics in Python: Gustavo A. Patino
Github: https://github.com/gapatino/Doing-frequentist-statistics-with-Scipy


Database Related


NoSQL doesn't mean No Schema: Steven Lott
Twitter: @s_lott
Presentation


GraphGen: Conducting Graph Analytics Over Relational Databases
http://konstantinosx.github.io/graphgen-project/


Natural Language Processing

What you can learn about food by analyzing a million Yelp reviews: Patrick Harrison
  (Modern NLP in Python)

Machine Learning with Text in scikit-learn: Kevin Markham
Github: https://github.com/justmarkham/pydata-dc-2016-tutorial



Scheduling Related

Dask: Fine Grained Task Parallelism: Matthew Rocklin




Security

Eat Your Vegetables: Data Security for Data Scientists: William Voorhees
http://www.slideshare.net/WilliamVoorhees1/eat-your-vegetables-data-security-for-data-scientists



Other Topics


Keynote: A Dept of Commerce Conundrum: Star Ying
Slides: http://www.slideshare.net/StarYing1/pydata-dc-2016-a-doc-conundrum

Becoming a Data Scientist: Advice From My Podcast Guests: Renee Teate
Twitter: @BecomingDataSci
Slides & The slide notes are here.

Python Users: Daniel Chen
Github: https://github.com/chendaniely/2016-pydata-dc-python_useRs


Semi-autonomous Drone: YHat
https://github.com/yhat/semi-autonomous-drone

Data Sciencing while Female: Amanda Traud
Slides: Google Doc 
Shiny App: https://netmandi.shinyapps.io/DSMeetups/

Julia Tutorial: Chase Coleman
Github: https://github.com/cc7768/PyDataDC_julia



Thanks to Bhavika Tekwani & Renee Teate for help with a number of these links.

#PyDataDC2016